Saipem

Optimize your document search process with the Smart Data Hub

  • AI & Data Solutions
industry
Energy & Utilities
know how
  • AI & Machine Learning

1 Starting Point

1 Need

Saipem, a leading company in advanced engineering and energy transition, had the following needs:

  • Enable fast, accurate search across more than 100,000 documents spread over multiple repositories
  • Extract relevant insights from documents, capitalizing on hidden corporate knowledge
  • Integrate the solution with cloud (SharePoint Online) and on-premise (SharePoint On-Premise and Documentum) systems
  • Reduce time and costs associated with information search, improving decision-making processes

2 Discovery

2 Direction

Dinova collaborated with Saipem to create a Smart Data Hub, an advanced document search solution based on Natural Language Processing (NLP) technologies. The system makes it possible to:

  • Perform concept searches, highlighting relevant text snippets
  • Automate document classification and reclassification
  • Automatically suggest tags and metadata using machine learning techniques
  • Save recurring queries to your favorites for quick access
  • Manage large volumes of multilingual, organizational and technical documents from heterogeneous sources

3 How

3 The challenge

With over 100,000 documents spread across multiple repositories, Saipem needed to find specific information quickly and easily, and to extract insights relevant to the search context, unlocking the value of corporate knowledge that often remains hidden.

This called for a dedicated solution to improve the search and dissemination of corporate information. In this case, the solution had to be implemented on Microsoft Azure technology and be capable of integrating with multiple document sources.

4 What

4 Solution by Dinova

In collaboration with Dinova, Saipem implemented a Smart Data Hub to optimize document and information search in terms of the relevance and accuracy of the results. It makes it possible to search for very specific information within any document, even those that operators do not yet know they have available, by leveraging state-of-the-art Natural Language Processing (NLP) models.

The designed solution is:

  • multicloud
  • multiplatform
  • containerized
  • replicable
  • scalable

 

The Smart Data Hub is capable of managing a large volume of organizational, qualitative, and technical documents, such as manuals and regulations, originating from numerous sources and divided into four categories in different languages. The supported documents come from cloud systems (such as SharePoint Online) or on-premise systems (such as SharePoint On-Premise and Documentum).

The implemented NLP techniques make it possible to read and interpret natural language through machine learning models. The solution is therefore able not only to return the document most relevant to the search but also to highlight personalized insights related to the searched concept.

Its main advanced features are:

  • Classification and reclassification automation
  • Search by concepts
  • Cosine similarity
  • Automatic suggestion of tags and metadata
  • Text snippet of the searched concept
  • Saving of frequently used queries as favorites
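
As a rough illustration of how cosine similarity can drive both document ranking and snippet highlighting, here is a minimal sketch. It is not the Smart Data Hub implementation: it assumes simple word-count vectors in place of the machine learning models the solution relies on, and the function names (`rank_documents`, `best_snippet`) and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_a * norm_b)


def rank_documents(query, documents):
    """Return (score, name) pairs, most similar document first."""
    q = Counter(tokenize(query))
    scored = [
        (cosine_similarity(q, Counter(tokenize(text))), name)
        for name, text in documents.items()
    ]
    return sorted(scored, reverse=True)


def best_snippet(query, text):
    """Pick the sentence of `text` most similar to the query."""
    q = Counter(tokenize(query))
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return max(
        sentences,
        key=lambda s: cosine_similarity(q, Counter(tokenize(s))),
        default="",
    )


# Hypothetical sample data, for illustration only.
documents = {
    "valve_manual": "Inspect the pressure valve weekly. Replace worn seals.",
    "hr_policy": "Employees must submit leave requests in advance.",
}

ranking = rank_documents("pressure valve inspection", documents)
top_name = ranking[0][1]
snippet = best_snippet("pressure valve inspection", documents[top_name])
```

A concept-based search would typically replace the word counts with dense embeddings produced by a language model, so that related terms score as similar even when the exact words differ; the similarity computation itself stays the same.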

 

This project was carried out following the Cloud Native DevOps methodology and is based on PaaS and IaaS services. The implementation roadmap can be summarized in these four steps:

  • Implementation of a 3-month PoC aimed at validating the model
  • Implementation of the solution on 3 environments (Dev, Test and Prod) following a DevOps logic
  • Deployment of the main release to production
  • Continuous improvement through key user analysis

 

The project progresses through incremental releases, optimizing the solution based on user experience. The next step involves introducing text summarization and virtual assistant features.

5 Why

5 Why Dinova?

A new way of doing research and working together

The Smart Data Hub represents for Saipem a strategic asset capable of generating greater awareness of the documents and information available within the company.

This solution enables more thorough data cleansing and the removal of duplicates, and gives operators documents of hundreds of pages summarized in just a few lines. It complies with Saipem’s security policies and offers an intuitive user experience that simplifies operators’ work.

Thanks to the platform’s scalability and replicability, the system adapts to the specific needs of the company and its people. The project roadmap, structured in incremental phases, ensures continuous improvement of the features offered.

This project demonstrates the positive impact of digital innovation in improving daily work and making organizations more efficient and aware.
