The Power of Structured Data
OVERVIEW
Traceable, Verifiable, and Complete (TVC) compliance is achievable through multiple approaches and data formats. Traditional documentation and record keeping can meet these requirements by saving unstructured data (i.e., PDFs and other forms of "digital paper") in centralized folder-based repositories such as SharePoint. This practice allows operators to meet PHMSA requirements in 49 CFR 192.67, 192.205, 192.227, and related sections.
While this strategy ticks the TVC box, it creates hurdles for an operator trying to ensure that supplier records are accessible and accurate. Even small projects result in thousands of TVC-applicable assets (e.g., joints, fittings, welds), each of which has dozens of associated documents stored in a maze of folders. The maze gets even trickier because each member of the supply chain uses non-standardized spreadsheets and document formats. As if ensuring these documents are accessible to the right stakeholders (including PHMSA) weren’t difficult enough, the 2019 Final Rule also states that all TVC records must be complete and accurate. Inspectors help our industry build the safest form of energy transportation; however, asking anyone to manually cross-verify every digital paper PDF for accuracy is a tall order.
Operators have now started to embrace solutions that address these challenges and provide significant returns at every stage of an infrastructure’s lifespan. Requiring API 5MT compliance provides an operator with a digital birth certificate for every component moving forward. Additional requirements can request structured data (spreadsheets and database outputs) alongside the signed and stamped PDF documents. Structured data can be uploaded to platforms that cross-verify the material data against its engineering, procurement, and traceability requirements. By applying concepts like hydrotest record validation in key areas such as material and construction documentation, energy companies can optimize asset value during a transfer and streamline regulatory responses to audits or project modifications.
This article will include recent results on capital pipelines and facilities, and the transfer of existing infrastructure. These results include stopping untraceable materials from ever reaching the ROW, reducing a current project’s material moves by over 3,000 miles, an operator’s ability to sell cancelled project materials at near new value, and another operator’s expedited transfer of a pipeline while reducing risk for indemnity/warranty claims post sale.
INTRODUCTION
For energy operators, data management has transitioned from a peripheral competitive advantage to a primary business requirement. Effectively managing data comes with significant challenges because of the sheer breadth of the subject. Ever-expanding sources of data, along with evolving types of data, make even communications about data management complicated. However, for today’s energy organizations, clearly understanding types of data and how to achieve optimal data formats is critical knowledge required to comply with regulations and elevate competitiveness.
To bridge the knowledge gap faced by most owner-operators, energy firms are advised to work with a partner that specializes in data management. Such a partner can place appropriate focus on conditioning data from all sources of a project, including engineering, procurement, manufacturing, construction, and transfer of physical infrastructure such as facilities, pipelines, and terminals.
TYPES OF DATA
There are three categories of data: unstructured, semi-structured, and structured.
Unstructured Data
A large percentage of all data in the energy sector is unstructured data. Often simply referred to as documents or “digital paper,” unstructured data represents almost 80 percent of all data shared across supply chains with the bulk being PDFs. This type of data cannot be contained in a row-column database and doesn’t have an associated data model. This lack of structure makes it very difficult to search, manage, and analyze. These records require a significant manual effort to extract value while maintaining the data context.
Other examples of unstructured data include photos, audio/video files, text files, social media content, satellite imagery, presentations, open-ended survey responses, websites, and call centre transcripts.
Semi-Structured Data
Semi-structured data possesses some defining or consistent characteristics but doesn’t conform to a structure such as a relational database. As a result, there are some organizational properties such as metadata to make it easier to organize, but there’s still fluidity in the data. Metadata is data that describes other data. It’s an underlying basic summary or definition. For example, a description of an image that’s associated with the image file is metadata. Metadata information helps systems uncover what users are looking for.
E-mail messages are a good example of semi-structured data. While the actual content is unstructured, each message carries enough structured data, such as the name and e-mail address of the sender and recipient, the time sent, and so forth. Another example is a digital photograph. The image itself is unstructured, but if the photo was taken on a smartphone, it would be date and time stamped, geotagged, and have a device ID.
These additional pieces of information (metadata) tagged to the files in question permit a degree of searchability, making these records more accessible than files that fall into the unstructured data bucket.
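The e-mail example can be illustrated with a short Python sketch. It uses only the standard library `email` package; the message text, addresses, and heat number are made up for illustration. The headers come out as searchable fields, while the body remains free text.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical message; addresses and content are illustrative only.
RAW_MESSAGE = """\
From: Supplier QA <qa@supplier.example>
To: Project Records <records@operator.example>
Date: Mon, 05 Apr 2021 14:30:00 +0000
Subject: MTR for heat 12345

Please find the mill test report attached.
"""


def extract_metadata(raw: str) -> dict:
    """Split a plain-text e-mail into structured headers and unstructured body."""
    msg = message_from_string(raw)
    return {
        # Structured part: fixed, named fields that can be indexed and searched.
        "sender": msg["From"],
        "recipient": msg["To"],
        "sent_at": parsedate_to_datetime(msg["Date"]),
        "subject": msg["Subject"],
        # Unstructured part: free text with no data model behind it.
        "body": msg.get_payload().strip(),
    }


record = extract_metadata(RAW_MESSAGE)
print(record["sender"], record["sent_at"].isoformat())
```

A records system could index `sender` and `sent_at` directly, but would still need a person or a text-analysis step to interpret `body`.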
Although most data is unstructured, a fair amount of what people classify as unstructured data is indeed semi-structured, because it contains some classifying characteristics.
Structured Data
This type of data is the easiest to search and organize, because it is usually contained in rows and columns and its elements can be mapped into fixed pre-defined fields. It consists of alphanumeric values which are sourced through spreadsheet uploads, platform integrations, and properly formatted digital forms.
In structured data, information can be grouped together to form relations, which makes it easy to store, analyze, and search. As explained below, although the benefits of using structured data for business are immense, estimates indicate that it accounts for less than 15 percent of all data.
It’s important to note, however, that while structured data’s rows and columns improve the data’s accessibility, the framework isn’t standardized, making for complicated consolidation of data from separate locations and multiple stakeholders across the supply chain and throughout the life of the project. For example, if an owner-operator is constructing a new pipeline, structured data provided by numerous suppliers will not feature a standardized framework without data management planning and coordination.
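As a rough illustration of the consolidation problem, the Python sketch below maps two suppliers' differently labelled spreadsheets onto one shared set of field names. The column names, alias table, and sample rows are all hypothetical; a real project would agree on the schema as part of its data management planning.

```python
import csv
import io

# Hypothetical alias table: shared field name -> labels suppliers might use.
COLUMN_ALIASES = {
    "heat_number": {"heat no", "heat #", "heat_number"},
    "wall_thickness_in": {"wt (in)", "wall thickness", "wall_thickness_in"},
    "grade": {"grade", "steel grade"},
}


def normalize_header(name: str) -> str:
    """Return the shared field name for a supplier's column label."""
    key = name.strip().lower()
    for canonical, aliases in COLUMN_ALIASES.items():
        if key in aliases:
            return canonical
    # Fail loudly so unmapped columns are reviewed rather than silently dropped.
    raise KeyError(f"unmapped column: {name!r}")


def load_supplier_csv(text: str) -> list:
    """Read one supplier's CSV export and re-key every row to the shared schema."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {normalize_header(col): value.strip() for col, value in row.items()}
        for row in reader
    ]


# Two illustrative exports with different labels and column order.
SUPPLIER_A = "Heat No,WT (in),Grade\nH1001,0.375,X52\n"
SUPPLIER_B = "steel grade,heat #,wall thickness\nX52,H2002,0.500\n"

combined = load_supplier_csv(SUPPLIER_A) + load_supplier_csv(SUPPLIER_B)
for row in combined:
    print(row)
```

Raising an error on an unknown column is a deliberate choice in this sketch: it surfaces a new supplier format at load time instead of leaving a gap in the consolidated data.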
ACHIEVING TVC COMPLIANCE
The Traceable, Verifiable, and Complete (TVC) rules were initially introduced by PHMSA in an Advisory Bulletin in 2012 and further developed as 49 CFR 192.607 in the recently adopted gas rules in 2020.
To be TVC-compliant, energy companies must meet the following criteria.
Traceable: Records can be clearly linked to all original asset and component information.
Verifiable: Information can be confirmed by other complementary but separate documentation.
Complete: Each record is finalized as evidenced by signature, date, or other appropriate marking.
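Once records are held as structured data, the three criteria above can be screened automatically. The Python sketch below is a deliberately simplified reading of them, not a regulatory test: the `MaterialRecord` fields and the `tvc_gaps` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MaterialRecord:
    """Hypothetical structured record for one asset document."""
    document_id: str
    asset_id: Optional[str] = None            # link back to the physical asset
    corroborating_docs: List[str] = field(default_factory=list)
    finalization_mark: Optional[str] = None   # signature, date, or other marking


def tvc_gaps(record: MaterialRecord) -> List[str]:
    """List which of the three TVC criteria the record fails in this simplified check."""
    gaps = []
    if not record.asset_id:
        gaps.append("traceable")    # no link to the original asset
    if not record.corroborating_docs:
        gaps.append("verifiable")   # nothing separate to confirm it against
    if not record.finalization_mark:
        gaps.append("complete")     # no signature, date, or other marking
    return gaps


good = MaterialRecord("MTR-001", asset_id="JOINT-0042",
                      corroborating_docs=["PO-778"],
                      finalization_mark="signed 2021-04-05")
weak = MaterialRecord("MTR-002", asset_id="JOINT-0043")

print(tvc_gaps(good))
print(tvc_gaps(weak))
```

A screen like this only flags missing links and markings; deciding whether a corroborating document actually agrees with the record still requires comparing their contents.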
Energy companies may achieve TVC compliance by using any type of data as a starting point, but the level of work involved varies considerably between the three categories. Elevating digital paper (unstructured data) to semi-structured data by assigning metadata can be very time consuming.
The practical solution is to use a combination of semi-structured and structured data to achieve TVC compliance and sustained operational excellence.
However, the reality is that the energy industry possesses a complex global supply chain with little to no standardized formats or models for stakeholders to use when sharing data with each other or their end customer. Further complications arise when stakeholders convert structured data into unstructured PDFs for sharing across the supply chain, under the misperception that this is the most secure and cost-effective method.
OTHER BENEFITS OF STRUCTURED DATA
The benefits of structured data are numerous. There are three key areas in the energy supply chain where structured data can make a significant positive impact on energy businesses: Asset value maximization, CapEx/OpEx efficiency, and risk mitigation.
Asset Value Maximization
In any transaction relating to energy infrastructure, due diligence is rigorous and time consuming. However, case studies have shown that when an asset is accompanied by trustworthy and accessible structured data, the due diligence time required as part of the transaction can be drastically reduced. Thus, costs are reduced, and the asset’s value is preserved. Without highly accessible and trustworthy data, purchasers of assets often need to perform re-certification processes in the field, with these costs passed back to the seller of the asset.
CapEx and OpEx Efficiencies
Accessibility of relevant and trustworthy information is becoming more of a requirement every day. Regulatory changes continue to push organizations towards “near on-demand” availability of information for audits and any sort of incident response. Organizations that have focused on data management and a transition to as much structured data as possible have shown a reduction in time spent gathering information in these circumstances.
Further, energy infrastructure construction has shown that trustworthy and accessible structured data embedded into the day-to-day construction process can help mitigate costly field issues while uncovering new efficiencies, such as reduced stockpile moves, streamlined stringing, reduced inspection data entry time, and more.
Regulatory Risk Mitigation
TVC compliance is now standard in regulatory models around North America. Though compliance is achievable through status quo record-keeping, such as file boxes or unstructured data hubs, it is extremely challenging to identify gaps and potential weak spots without some form of automated verification. Introducing structured data to an overall regulatory risk mitigation process allows organizations to manage their risks appropriately and make data-driven decisions around the safety and reliability of their infrastructure.
RECENT CASES EXEMPLIFYING THE POWER OF STRUCTURED DATA
Vintri Technologies has worked firsthand with several owner-operators that have seen significant gains in operational efficiency, increased value of existing infrastructure, and risk mitigation as a direct result of establishing and utilizing structured data.
Major Operational Efficiency Improvements
In April 2021, a major North American Energy Company reported that they established full digital pipe tracing capability as a direct result of implementing measures to achieve structured data.
Before this measure, the company’s inspectors would manually process asset information, spending a minimum of 2-3 hours a day documenting pipe that was received on site. It would take them at least 2-3 weeks for that information to be updated into the company’s database systems.
Now, information about materials that used to take 2 weeks to 3 months to find can be found almost instantly, sometimes even before the pipe is shipped to stockpile sites.
Increased Value of Existing Infrastructure
In 2021, a Houston-based midstream infrastructure company undertook measures to transform all data related to a 14-mile section of NPS 4 and NPS 8 pipelines to a structured format.
This brought the project into TVC compliance (verified through hydrotest records) and made all project data readily accessible. Establishing structured data for the project also directly supported a subsequent transfer of assets at a high value, which required the shortest due diligence period in the history of the company.
Risk Mitigation
In 2020, a major North American midstream and utility company undertook measures to optimize its asset data for five new compressor stations (valued at $500,000,000) as part of a pipeline enhancement project based in British Columbia.
As a direct result of conditioning its data to become structured and undertaking subsequent analysis (that was now feasible because of the data being structured), the following discoveries were noted:
• One-hundred-and-twenty-six items were identified as surplus materials. The company confirmed that the surplus materials were identified at least 18 months earlier than their status quo process would have permitted with inferior data.
• The surplus material was identified in a near-new state permitting redistribution or sale as opposed to the material becoming damaged as a result of sitting for a lengthy period at a laydown yard and being exposed to moisture.
• Twenty-eight items were found to possess conflicting procurement product codes, which would have negatively impacted the eventual upload of data to the client’s ERP system of record.
• Prior to undertaking data integrity management initiatives, the client had initially forecasted 5 full days for crate inspection. In fact, with the use of accessible, traceable, and verifiable data, the 786 crates were fully inspected in only 1.5 days, or 30 percent of the budgeted time.
• Structured data permitted a timeline review of asset availability. By examining the curated data, it was found that two months prior to construction mobilization for a compressor station, the project was missing 60 percent of large spool turnover packages.
• Thirty days prior to mobilization, the project was still missing 14 percent of the required turnover packages.
• Of the documents and data received from suppliers, 43 percent of the spool packages had at least one traceability issue requiring physical inspection and resolution.
• Of these identified issues, 71 percent were critical enough in nature to warrant non-conformance reports (NCRs), each of which would have halted workflow if discovered during construction.
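Checks such as the conflicting product code finding in the list above become simple queries once the data is structured. The Python sketch below shows one possible way to flag items carrying more than one procurement product code; the item IDs, codes, and function name are hypothetical and do not describe the method used on the project.

```python
from collections import defaultdict


def find_conflicting_codes(rows):
    """Return items that appear with more than one distinct product code.

    rows: iterable of (item_id, product_code) pairs, for example one pair per
    line of a purchase order or packing list export.
    """
    codes_by_item = defaultdict(set)
    for item_id, product_code in rows:
        codes_by_item[item_id].add(product_code)
    return {
        item: sorted(codes)
        for item, codes in codes_by_item.items()
        if len(codes) > 1
    }


# Illustrative rows only; not data from the project described above.
ROWS = [
    ("VALVE-001", "PC-100"),
    ("VALVE-001", "PC-100"),   # repeated but consistent
    ("SPOOL-014", "PC-220"),
    ("SPOOL-014", "PC-221"),   # same item, different code
    ("FLANGE-007", "PC-310"),
]

conflicts = find_conflicting_codes(ROWS)
print(conflicts)
```

Running the same check against a folder of PDFs would first require someone to read and transcribe each document, which is where the manual effort described earlier comes from.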
ENHANCING YOUR ORGANIZATION’S DATA
Initiating measures to optimize your company’s data formats has never been easier. Contact us to schedule a complimentary conversation. We’ll assess your current data situation and determine recommended next steps.
NOTE: This post was written by Vintri’s Sales & Marketing team. In the event of any discrepancy pertaining to technical details included in the above article, our Data Services team will be consulted and issue any needed clarification.