Enhanced Lithology Classification Using an Interpretable SHAP Model Integrating Semi-Supervised Contrastive Learning and Transformer with Well Logging Data

Natural Resources Research | IF 5 | Zone 2 (Earth Sciences) | JCR Q1 (Geosciences, Multidisciplinary) | Pub Date: 2025-01-17 | DOI: 10.1007/s11053-024-10452-z

Youzhuang Sun, Shanchen Pang, Hengxiao Li, Sibo Qiao, Yongan Zhang

Abstract

In petroleum and natural gas exploration, lithology identification (determining the rock types beneath the Earth’s surface) is crucial for assessing hydrocarbon reservoirs and optimizing drilling strategies. Traditionally, this process relies on logging data such as gamma-ray and resistivity logs, which often require manual interpretation, making it labor-intensive and error-prone. To address these challenges, we propose a novel machine learning framework, the contrastive learning-transformer, which leverages self-attention mechanisms to improve the accuracy of lithology identification. Our method first extracts unlabeled samples from logging data while obtaining labeled core sample data. The model is pretrained through self-supervised contrastive learning on a transformer backbone network, with techniques such as batch normalization used to optimize performance. After pretraining, the model is fine-tuned with a limited number of labeled samples, which improves accuracy and significantly reduces reliance on large labeled datasets, thereby lowering the cost of annotating drill cores. Additionally, our research incorporates Shapley additive explanations (SHAP) to make the model’s decision-making process more transparent and to facilitate analysis of each feature’s contribution to lithology predictions. The model also learns time-reversal invariance by reversing sequential data, ensuring reliable identification even with variations in data sequences. Experimental results demonstrate that our transformer model, combined with semi-supervised contrastive learning, significantly outperforms traditional methods, achieving more precise lithology identification, especially in complex geological environments.
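
The abstract does not give the contrastive objective or the augmentation details, so the following is only an illustrative sketch of how the pretraining step might look. It assumes an NT-Xent (normalized temperature-scaled cross-entropy) loss, which is a common choice for contrastive pretraining and not confirmed as the authors' loss, and it uses depth-axis reversal of a log window as the second view, loosely following the reversal idea described above. The names `reverse_depth_window`, `toy_encoder`, and `nt_xent_loss`, the temperature value, and the flatten-and-project encoder (a stand-in for the transformer backbone) are all hypothetical.

```python
import numpy as np


def reverse_depth_window(window):
    """Return a copy of a (depth_steps, n_logs) window with the depth axis reversed."""
    return window[::-1].copy()


def toy_encoder(windows, weights):
    """Hypothetical stand-in for the transformer backbone.

    Flattens each (depth_steps, n_logs) window and applies a linear projection,
    so the embedding depends on sample order (unlike mean pooling).
    """
    flat = windows.reshape(windows.shape[0], -1)
    return flat @ weights


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss for two batches of paired embeddings, each of shape (batch, dim).

    Row i of z_a and row i of z_b are treated as a positive pair; every other
    embedding in the combined batch acts as a negative.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity via unit vectors
    sim = (z @ z.T) / temperature
    np.fill_diagonal(sim, -np.inf)  # an embedding is never its own candidate

    # Index of the positive partner for each of the 2n rows.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable row-wise log-softmax.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()


# Toy usage: 8 unlabeled windows, 16 depth steps, 4 log curves (synthetic data).
rng = np.random.default_rng(0)
windows = rng.normal(size=(8, 16, 4))
weights = rng.normal(size=(16 * 4, 32))
reversed_windows = np.stack([reverse_depth_window(w) for w in windows])
loss = nt_xent_loss(toy_encoder(windows, weights), toy_encoder(reversed_windows, weights))
```

Minimizing this loss pulls the embedding of each window toward the embedding of its reversed copy and pushes it away from the other windows in the batch, which is one way an encoder could acquire the reversal invariance the abstract mentions. In a semi-supervised setup, the pretrained encoder would then be fine-tuned on the small labeled core set.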

Source journal

Natural Resources Research (Environmental Science: General Environmental Science)

CiteScore: 11.90

Self-citation rate: 11.10%

Articles published: 151

Journal description: This journal publishes quantitative studies of natural (mainly but not limited to mineral) resources exploration, evaluation and exploitation, including environmental and risk-related aspects. Typical articles use geoscientific data or analyses to assess, test, or compare resource-related aspects. NRR covers a wide variety of resources including minerals, coal, hydrocarbon, geothermal, water, and vegetation. Case studies are welcome.