AI-Based System Automates Textual Classification of Daily Drilling Reports

Journal of Petroleum Technology, Pub Date: 2024-02-01, DOI: 10.2118/0224-0055-jpt

C. Carpenter


Citations: 0

Abstract

This article, written by JPT Technology Editor Chris Carpenter, contains highlights of paper OTC 32978, “Development and Implementation of an AI-Based System To Automate Textual Classification on Daily Drilling Reports,” by Stephan Perrout, Aliel F. Riente, and Guilherme S.F. Vanni, SPE, Petrobras, et al. The paper has not been peer reviewed. Copyright 2023 Offshore Technology Conference. Reproduced by permission.

Structured daily drilling reports (DDRs) are a rich source of information that supports better planning, more-accurate risk analysis, and improved key performance indicators and contracts. However, this information is originally stored as unstructured free text, which makes efficient data mining difficult. With the advance of artificial intelligence (AI) technologies, particularly AI language models, applying such techniques to unstructured data has become critical to digital transformation. The complete paper presents an approach for automatic DDR classification that incorporates recent AI techniques.

This work addresses the complex task of automatically classifying DDRs according to a newly proposed ontology. The ontology follows a hierarchical model that classifies actions into three or four levels depending on the intervention type: drilling, completion, or abandonment. Each event has an ontology built and reviewed by experts in oil and gas.

Classifying DDRs is a demanding task, and AI-based models are a promising solution. This work addresses that need by proposing a classifier based on transformers along with recurrent neural networks (RNNs) to classify reported events described in unstructured text related to drilling, completion, and abandonment interventions. A large number of DDRs were used for training and validation of the proposed classifier, yielding promising results for key processes in the company.

Bidirectional Long Short-Term Memory. Early neural-network models require inputs of fixed length. This is a drawback when working with text, because sentences vary in their number of words. RNNs were proposed to overcome this issue and to process data sequentially. An RNN has a set of parameters inherited from the early models plus an internal memory (a hidden or internal state) responsible for storing the context of the sequence being processed.

Long short-term memory (LSTM) is a variation of the RNN proposed to mitigate two problems: information can easily be lost when processing very long sequences, and the gradient can become very small while still far from reaching the threshold, because of the large number of mathematical operations performed during processing. LSTM adds a set of parameters, called the input gate, forget gate, and output gate, that control information flow through the network. Besides controlling the output, these additional parameters help the network retain only what is important for its internal state.

Bidirectional LSTM (BiLSTM) is a variant of LSTM that comprises two LSTMs. One processes text from left to right, and the other processes it from right to left. This allows “future” elements to be part of the model’s decision process for “past” elements. The final classification combines the outputs of both LSTMs.
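
The gate mechanics and the two-direction pass described above can be illustrated with a minimal sketch in plain Python. It is not the classifier from the paper: the weights are random and untrained, the sizes (`INPUT_SIZE`, `HIDDEN_SIZE`) and all function names are hypothetical, and concatenation is assumed as the way to combine the two directions, since the summary says only that the outputs are combined. In a real classifier, a trained output layer would follow the combined vector.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_params(input_size, hidden_size, rng):
    """Random (untrained) weights and zero biases for the three gates and the candidate state."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {name: (matrix(hidden_size, input_size + hidden_size), [0.0] * hidden_size)
            for name in ("i", "f", "o", "g")}


def affine(weights, bias, vec):
    """Matrix-vector product plus bias."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def lstm_step(params, x, h, c):
    """One LSTM time step: the gates decide what to write, keep, and expose."""
    z = x + h  # concatenate the input vector with the previous hidden state
    i = [sigmoid(v) for v in affine(*params["i"], z)]    # input gate
    f = [sigmoid(v) for v in affine(*params["f"], z)]    # forget gate
    o = [sigmoid(v) for v in affine(*params["o"], z)]    # output gate
    g = [math.tanh(v) for v in affine(*params["g"], z)]  # candidate state
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def run_lstm(params, sequence, hidden_size):
    """Process a sequence of any length and return the final hidden state."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    for x in sequence:
        h, c = lstm_step(params, x, h, c)
    return h


def bilstm_encode(fwd, bwd, sequence, hidden_size):
    """Run one LSTM left to right and another right to left, then combine.

    Concatenation is an assumption made for this sketch.
    """
    left_to_right = run_lstm(fwd, sequence, hidden_size)
    right_to_left = run_lstm(bwd, list(reversed(sequence)), hidden_size)
    return left_to_right + right_to_left


rng = random.Random(0)
INPUT_SIZE, HIDDEN_SIZE = 3, 4  # hypothetical sizes chosen for illustration
forward_params = make_params(INPUT_SIZE, HIDDEN_SIZE, rng)
backward_params = make_params(INPUT_SIZE, HIDDEN_SIZE, rng)

# Two "sentences" of different lengths, each token a made-up 3-dimensional vector.
short_text = [[rng.random() for _ in range(INPUT_SIZE)] for _ in range(2)]
long_text = [[rng.random() for _ in range(INPUT_SIZE)] for _ in range(7)]
short_vec = bilstm_encode(forward_params, backward_params, short_text, HIDDEN_SIZE)
long_vec = bilstm_encode(forward_params, backward_params, long_text, HIDDEN_SIZE)
print(len(short_vec), len(long_vec))  # → 8 8
```

Both inputs yield a vector of the same size even though one has two tokens and the other seven, which is the property that lets a recurrent model handle sentences of varying length.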
