Uncertainty and explainable analysis of machine learning model for reconstruction of sonic slowness logs

Hua Wang, Yuqiong Wu, Yushun Zhang, Fuqiang Lai, Zhou Feng, Bing Xie, Ailin Zhao

Artificial Intelligence in Geosciences, Volume 4, Pages 182-198. Published 2023-12-01.
DOI: 10.1016/j.aiig.2023.11.002
Article page: https://www.sciencedirect.com/science/article/pii/S2666544123000321
Citations: 0

Abstract

Well logs are valuable for oil and gas fields because they help determine the lithology of the formations surrounding the borehole and the location and reserves of subsurface oil and gas reservoirs. However, important logs are often missing in horizontal or old wells, which poses a challenge in field applications. Conventional methods supplement the missing logs either by combining geological experience with data from nearby boreholes or by reconstructing them directly from the remaining logs in the same borehole. Nevertheless, there is currently no quantitative evaluation of the quality and rationality of the reconstructed log. In this paper, we use data from the 2020 machine learning competition of the Society of Petrophysicists and Well Log Analysts (SPWLA), which aims to predict the missing compressional wave slowness (DTC) and shear wave slowness (DTS) logs from other logs in the same borehole. We employ the natural gradient boosting (NGBoost) algorithm to construct an ensemble learning model that predicts both the results and their uncertainty. Furthermore, we apply the SHAP (SHapley Additive exPlanations) method to investigate the interpretability of the machine learning model. We compare the performance of the NGBoost model with four other commonly used ensemble learning methods: Random Forest, GBDT, XGBoost, and LightGBM. The results show that the NGBoost model performs well on the testing set and provides a probability distribution for the predicted results. This distribution allows petrophysicists to quantitatively analyze the confidence interval of the reconstructed log. In addition, the variance of the probability distribution of the predicted log can be used to assess the quality of the reconstructed log. Using the SHAP explainable machine learning method, we calculate the importance of each input log to the predicted results as well as the coupling relationships among input logs. Our findings reveal that the NGBoost model tends to predict greater slowness when the neutron porosity (CNC) and gamma ray (GR) are large, which is consistent with petrophysical models. Furthermore, the machine learning model can capture the influence of the changing borehole caliper on slowness, an influence that is complex and for which a direct relationship is not easy to establish. These findings are in line with the physical principles of borehole acoustics. Finally, using the explainable machine learning model, we observe that although we did not correct the effect of borehole caliper on the neutron porosity log through preprocessing, the machine learning model assigned greater importance to the caliper, achieving the same effect as caliper correction.
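
To illustrate how a predicted probability distribution can yield a confidence interval and a quality flag, here is a minimal sketch. It assumes the predictive distribution at each depth is Gaussian, which the abstract does not state. The function names, the DTC values, and the 5.0 threshold are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist


def prediction_interval(mean, std, level=0.95):
    """Central interval of a Normal(mean, std) predictive distribution.

    Assumption: the model outputs a Gaussian mean and standard deviation
    per depth sample. This is an illustrative choice, not the paper's setup.
    """
    if std <= 0:
        raise ValueError("std must be positive")
    if not 0 < level < 1:
        raise ValueError("level must lie strictly between 0 and 1")
    dist = NormalDist(mu=mean, sigma=std)
    tail = (1 - level) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


def flag_uncertain(stds, max_std):
    """Indices of depth samples whose predictive std exceeds max_std."""
    return [i for i, s in enumerate(stds) if s > max_std]


# Hypothetical (mean, std) pairs for reconstructed DTC, in us/ft.
predicted = [(85.0, 1.5), (92.0, 2.0), (110.0, 9.0)]

intervals = [prediction_interval(m, s) for m, s in predicted]
suspect = flag_uncertain([s for _, s in predicted], max_std=5.0)

for (m, s), (lo, hi) in zip(predicted, intervals):
    print(f"DTC = {m:.1f} us/ft, 95% interval = [{lo:.2f}, {hi:.2f}]")
print("Depth indices with wide predictive spread:", suspect)
```

A wide interval at a given depth suggests the reconstructed value there deserves less trust. A suitable threshold would depend on the field and the log being reconstructed.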

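The idea behind SHAP-style attribution can be sketched with exact Shapley values on a toy model. This is an illustration under stated assumptions, not the paper's model or the SHAP library's algorithm. The function `toy_slowness`, its coefficients, and the sample and baseline values are all hypothetical. Features outside a coalition are set to a single baseline value, which is a simplification of how SHAP handles background data.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition take their baseline value (an assumed,
    simplified way of "removing" a feature).
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        inputs = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(inputs)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_slowness(f):
    """Hypothetical slowness model with a CNC-caliper interaction term."""
    return 50.0 + 100.0 * f["CNC"] + 0.1 * f["GR"] + 2.0 * f["CNC"] * f["CAL"]


baseline = {"CNC": 0.10, "GR": 50.0, "CAL": 8.5}
sample = {"CNC": 0.30, "GR": 120.0, "CAL": 10.0}

phi = shapley_values(toy_slowness, sample, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f}")
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline. The interaction term is shared between CNC and CAL, which is one simple way to picture the coupling among input logs that the abstract mentions.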