Implementing a Drilling Reporting Data Mining Tool Using Natural Language Processing Sentiment Analysis Techniques

P. Kowalchuk
DOI: 10.2118/194961-MS (https://doi.org/10.2118/194961-MS)
Published in: Day 2 Tue, March 19, 2019
Publication date: 2019-03-15
Citations: 3

Abstract

Drilling operations generate a great deal of information, such as daily drilling reports and reports produced by service companies, support personnel, and other stakeholders. These reports can be unstructured, with information presented in a variety of formats. Extracting this information is frequently challenging, which limits its use in future projects. Natural language processing provides an efficient way of mining and obtaining knowledge. This paper demonstrates how these techniques were used to analyze vast amounts of historical documents to quickly rank well complexity and determine which aspects of drilling operations were most critical.

Sentiment analysis can be used to classify documents and other pieces of information into separate categories. In social media, it is used to analyze the collective perception of a given trending item. The technique was used here to classify wells into two ranked, categorized lists. First, a classification listed wells by drilling issues. Second, a complexity ranking was defined so that each well could be classified as easy or difficult to drill. To build the sentiment analysis tool, a random set of training wells and their respective documents were selected. From these documents, a list of words was identified in what became known as highlighting sessions. During these sessions, subject matter experts (SMEs) classified words found in the documents. This "bag of words" was then used to train a classifier capable of ranking the wells related to the documents. A probability was associated with each well, providing a likelihood of inclusion in a given category.

The methodology proved to be successful, ranking drilling documents in both defined category sets. Results show that the list of ranked wells can be used by SMEs to identify which wells are relevant and deserve detailed analysis. The list generated for both categories provided a guideline for further analysis, particularly identifying wells with little value. Results also showed the importance of correctly developing a list of words, an adequate training set, and the language used, as well as the need for SMEs to produce the final analysis. The technology showed promising results, with real-world applications being conceivable at its current level of maturity. However, the results also indicated room for improving its effectiveness by refining the highlighting sessions, word lists, types of classifiers used, and final ranking methodology.

The use of methods and technology to help improve and enable the analysis of unstructured data in the drilling space should increase over time. This paper shows how current technology can already be used in practical real-life cases to produce tangible value.
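
The workflow described in the abstract (an SME word list, a bag-of-words representation, a trained classifier, and a per-well probability used for ranking) can be illustrated with a small sketch. The abstract does not name the classifier or list the words chosen in the highlighting sessions, so everything specific here is an assumption made for illustration: the multinomial naive Bayes model with Laplace smoothing, the vocabulary, the report snippets, the well names, and the function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical SME word list from a "highlighting session" (invented for illustration).
VOCAB = ["stuck", "losses", "kick", "sidetrack", "smooth", "ahead", "nominal"]


def bag_of_words(text):
    """Count only the tokens that appear in the SME vocabulary."""
    tokens = text.lower().split()
    return Counter(t for t in tokens if t in VOCAB)


def train(labelled_docs, alpha=1.0):
    """Fit a multinomial naive Bayes model (an assumed classifier) with Laplace smoothing."""
    labels = sorted({label for _, label in labelled_docs})
    word_counts = {label: Counter() for label in labels}
    doc_counts = Counter()
    for text, label in labelled_docs:
        word_counts[label].update(bag_of_words(text))
        doc_counts[label] += 1
    total_docs = sum(doc_counts.values())
    model = {}
    for label in labels:
        denom = sum(word_counts[label].values()) + alpha * len(VOCAB)
        model[label] = {
            "log_prior": math.log(doc_counts[label] / total_docs),
            "log_like": {
                w: math.log((word_counts[label][w] + alpha) / denom) for w in VOCAB
            },
        }
    return model


def class_probabilities(model, text):
    """Return P(label | text) for every label, normalised to sum to 1."""
    bow = bag_of_words(text)
    log_scores = {
        label: p["log_prior"] + sum(n * p["log_like"][w] for w, n in bow.items())
        for label, p in model.items()
    }
    peak = max(log_scores.values())  # subtract the peak for numerical stability
    exp_scores = {label: math.exp(s - peak) for label, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {label: v / norm for label, v in exp_scores.items()}


def rank_wells(model, well_docs, label="difficult"):
    """Sort wells by descending probability of belonging to `label`."""
    scored = [
        (well, class_probabilities(model, text)[label])
        for well, text in well_docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented training snippets standing in for reports from the training wells.
TRAINING = [
    ("pipe stuck at 2100 m severe losses while drilling", "difficult"),
    ("took a kick then had to sidetrack after stuck pipe", "difficult"),
    ("drilled ahead smooth trip all parameters nominal", "easy"),
    ("smooth run drilled ahead to section depth", "easy"),
]

# Invented documents for the wells to be ranked.
WELLS = {
    "WELL-A": "smooth trip drilled ahead nominal torque",
    "WELL-B": "stuck pipe and losses sidetrack planned",
}

MODEL = train(TRAINING)
RANKING = rank_wells(MODEL, WELLS)
for well, prob in RANKING:
    print(f"{well}: P(difficult) = {prob:.2f}")
```

Restricting the features to the SME vocabulary mirrors the role the abstract gives to the highlighting sessions: words outside the list carry no weight, so the quality of the ranking depends directly on how well that list is built. The same structure would apply to the drilling-issue categories by replacing the two labels with one label per issue.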