Case study

Large-scale data wrangling and data structuring for a major O&G company’s Petroleum Engineering Department


The client gained instant access to its data by digitizing key data sets from its legacy reports, images, logs, and charts, and obtained structured data for its major reports in less than 12 weeks. This made accessing and searching for information 55% more efficient and unlocked the potential of previously untapped static data.


The client’s technical department wanted to capture static data from its major reports and restructure it into more accessible data in motion that can be used across multiple levels for analytics and reporting. The challenges lay in the types of documents and their volume. The project scope revolved around retrieving information from reports in three separate domains within the department: Production Technology, Petrophysics, and Reservoir Summation.

The client had difficulty accessing the information in its reports and running further analytics on it, because most of that information resided in legacy reports in the form of non-searchable PDFs, images, and similar formats. This heavily affected its correlation, study, and research processes. The technical challenges stemmed from the volume, more than 200 documents, and from the variety: the information was spread across multiple document types with different variations and formats, originating from typewritten, machine-printed, and scanned documents. Before users could proceed with their studies and analytics, the information first had to be identified, cleaned, extracted, validated, and structured.


The client engaged Net Geometry’s DataGeometry-BPA Data Wrangling Services to structure the data residing in its many legacy documents.

Implementing Project Management Institute (PMI) and Project Management Body of Knowledge (PMBOK) certified project phases and processes, and leveraging DataGeometry-BPA, a custom solution was built to automatically identify the targeted information, digitize and extract it, and store it in a structured database once validated.
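The validate-then-store step described above can be illustrated with a minimal Python sketch. It is not the DataGeometry-BPA implementation, whose internals are not described here. The record fields (`source_doc`, `domain`, `field`, `value`), the table name `extracted`, and the use of SQLite are all assumptions made for illustration; only the three domain names come from the project scope.

```python
import sqlite3

# Domain names are taken from the project scope; every other name in this
# sketch (fields, table, functions) is hypothetical.
DOMAINS = {"Production Technology", "Petrophysics", "Reservoir Summation"}


def validate(record):
    """Return a list of problems found in one extracted record (empty if valid)."""
    problems = []
    if not str(record.get("source_doc", "")).strip():
        problems.append("missing source document")
    if record.get("domain") not in DOMAINS:
        problems.append("unknown domain")
    if not record.get("field"):
        problems.append("missing field name")
    try:
        float(record.get("value"))
    except (TypeError, ValueError):
        problems.append("value is not numeric")
    return problems


def store_validated(conn, records):
    """Insert only the valid records; return rejected ones with their reasons."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS extracted ("
        "source_doc TEXT, domain TEXT, field TEXT, value REAL)"
    )
    rejected = []
    for rec in records:
        problems = validate(rec)
        if problems:
            # Keep rejected records for manual review instead of dropping them.
            rejected.append((rec, problems))
            continue
        conn.execute(
            "INSERT INTO extracted VALUES (?, ?, ?, ?)",
            (rec["source_doc"], rec["domain"], rec["field"], float(rec["value"])),
        )
    conn.commit()
    return rejected
```

In this sketch, records that fail validation are returned to the caller with their reasons rather than silently discarded, so a reviewer can correct them and resubmit.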

High Level Scope of Work:


High Level Scope of Digitization:

Business Impacts

DataGeometry-BPA’s solution has allowed subsurface technical users to work effectively and collaboratively to produce valuable insights and reviews on all of the client’s assets in Malaysia, regardless of location, and has made this legacy information more accessible. This first end-to-end service, from data identification through structuring to storing directly into the database deployed at the client’s premises, is considered the benchmark. In scalability and performance, DataGeometry-BPA’s Data Wrangling Services significantly outperform manual human extraction.

  • Reduces operational time for extraction by almost 75%.
  • Runs extraction, validation, standardization, and database population end to end in one pass, reducing the overall manpower required compared to manual processes.
  • Provides custom-built solutions that can be reused whenever new documents become available.
  • Eliminates human errors during the review and extraction process.
  • Improves the user experience for the Management and Technical community.


Agile, Rapid and Proven

What’s Next?

Keep connected with us

Suites C-5-16, Metropolitan Square Commerce, Damansara Perdana, 47820, Petaling Jaya, Selangor, Malaysia

+60 37625 3153
