Enhanced Lithology Classification Using an Interpretable SHAP Model Integrating Semi-Supervised Contrastive Learning and Transformer with Well Logging Data
Authors: Youzhuang Sun, Shanchen Pang, Hengxiao Li, Sibo Qiao, Yongan Zhang
Journal: Natural Resources Research (impact factor 4.8; JCR Q1, Geosciences, Multidisciplinary)
Publication date: 2025-01-17
DOI: 10.1007/s11053-024-10452-z (https://doi.org/10.1007/s11053-024-10452-z)
Citations: 0
Abstract
In petroleum and natural gas exploration, lithology identification (determining the rock types beneath the Earth’s surface) is crucial for assessing hydrocarbon reservoirs and optimizing drilling strategies. Traditionally, this process relies on well-logging data such as gamma-ray and resistivity measurements, which often require manual interpretation, making it labor-intensive and error-prone. To address these challenges, we propose a machine learning framework, the contrastive learning-transformer, which uses self-attention mechanisms to improve the accuracy of lithology identification. Our method first extracts unlabeled samples from logging data and obtains labeled core sample data. The model is pretrained with self-supervised contrastive learning on a transformer backbone network, with techniques such as batch normalization used to optimize performance. After pretraining, the model is fine-tuned with a limited number of labeled samples, which improves accuracy and significantly reduces reliance on large labeled datasets, thereby lowering the cost of annotating drill cores. Additionally, we incorporate Shapley additive explanations (SHAP) to make the model’s decision-making process more transparent and to analyze the contribution of each feature to lithology predictions. The model also learns time-reversal invariance by reversing sequential data, so that identification remains reliable when the order of the data sequence varies. Experimental results demonstrate that our transformer model, combined with semi-supervised contrastive learning, significantly outperforms traditional methods and achieves more precise lithology identification, especially in complex geological environments.
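The abstract does not give the contrastive objective or the augmentation details, so the following is only a minimal sketch of how a sequence-reversed view could act as the positive pair in contrastive pretraining. It assumes an NT-Xent (SimCLR-style) loss and a temperature of 0.5, neither of which is confirmed by the source. The names `reverse_view`, `toy_encode`, and `nt_xent_loss` are hypothetical, and `toy_encode` is a fixed stand-in for the transformer backbone, not the authors' model.

```python
import math
import random


def reverse_view(window):
    """Return a copy of a log window (list of per-depth feature rows) in reversed order."""
    return window[::-1]


def toy_encode(window):
    """Hypothetical stand-in for the transformer backbone (assumption, not the paper's model).

    Position-weighted mean of each log curve, L2-normalised. It is order-sensitive,
    so a window and its reversed copy generally get different embeddings.
    """
    length = len(window)
    n_curves = len(window[0])
    pooled = [
        sum((t + 1) / length * window[t][j] for t in range(length))
        for j in range(n_curves)
    ]
    norm = math.sqrt(sum(v * v for v in pooled)) or 1.0
    return [v / norm for v in pooled]


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss for paired, L2-normalised embeddings where z_a[i] matches z_b[i]."""
    z = z_a + z_b  # concatenate both views: 2n embeddings
    n = len(z_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same window
        sims = [
            sum(a * b for a, b in zip(z[i], z[k])) / temperature
            for k in range(2 * n)
        ]
        # Log-sum-exp over every other embedding (self-similarity excluded).
        others = [sims[k] for k in range(2 * n) if k != i]
        m = max(others)
        log_denominator = m + math.log(sum(math.exp(s - m) for s in others))
        total += log_denominator - sims[pos]
    return total / (2 * n)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic batch: 4 windows, 8 depth steps, 3 log curves (made-up data).
    batch = [[[rng.gauss(0, 1) for _ in range(3)] for _ in range(8)] for _ in range(4)]
    z_orig = [toy_encode(w) for w in batch]
    z_rev = [toy_encode(reverse_view(w)) for w in batch]
    print("NT-Xent loss:", nt_xent_loss(z_orig, z_rev))
```

Minimizing a loss of this kind pulls each window's embedding toward that of its reversed copy and away from other windows in the batch, which is one way an encoder could acquire the reversal invariance the abstract describes.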
About the journal:
This journal publishes quantitative studies of natural (mainly but not limited to mineral) resources exploration, evaluation and exploitation, including environmental and risk-related aspects. Typical articles use geoscientific data or analyses to assess, test, or compare resource-related aspects. NRR covers a wide variety of resources including minerals, coal, hydrocarbon, geothermal, water, and vegetation. Case studies are welcome.