Enhancing lithofacies machine learning predictions with gamma-ray attributes for boreholes with limited diversity of recorded well logs
David A. Wood
Artificial Intelligence in Geosciences, volume 2, pages 148-164. Published 2021-12-01.
DOI: 10.1016/j.aiig.2022.02.007
URL: https://www.sciencedirect.com/science/article/pii/S2666544122000077
Citations: 8
Abstract
Derivative and volatility attributes can be usefully calculated from recorded gamma ray (GR) data to enhance lithofacies classification in wellbores penetrating multiple lithologies. Such attributes extract information about the log curve shape that cannot be readily discerned from the recorded well log data. A logged wellbore section for which 8911 data records are available for the three recorded logs (GR, sonic (DT) and bulk density (PB)) is evaluated. That section demonstrates the value of the GR attributes for machine learning (ML) lithofacies predictions. Five feature selection configurations are considered. The 9-var configuration, comprising GR, DT, PB and six GR attributes, and the 7-var configuration, comprising GR and the six GR attributes, provide the most accurate and reproducible lithofacies predictions. The other three feature configurations evaluated do not include the GR attributes, only one to three of the recorded log features. The results of seven ML models and two regression models reveal that K-nearest neighbor (KNN), random forest (RF) and extreme gradient boosting (XGB) are the best performing models. They generate between 14 and 23 misclassifications from 8911 data records for the 9-var configuration. Multi-layer perceptron (MLP) and support vector classification (SVC) do not perform well with the 7-var configuration, which lacks PB, the feature displaying the highest correlation with facies class. Annotated confusion matrices reveal that KNN, RF and XGB models can effectively distinguish all facies classes for the 9-var and 7-var configurations (which include the GR attributes), whereas none of the models can achieve that outcome for the 3-var configuration (which excludes the GR attributes). Accurately distinguishing lithofacies using well-log data in sedimentary sections is an important objective in applied geoscience. The straightforward GR-attribute method proposed improves confidence in ML lithofacies classifications based on limited recorded well-log data.
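To make the idea of derivative and volatility attributes concrete, the following is a minimal Python sketch of how such features might be computed from a single GR curve. The abstract does not define the six GR attributes, so the three shown here (first derivative, second derivative, and a trailing moving standard deviation of the sample-to-sample change), the function name `gr_attributes`, and the `window` length are all assumptions made for illustration, not the paper's method.

```python
import numpy as np


def gr_attributes(gr, window=5):
    """Build an illustrative feature matrix from a gamma-ray (GR) log.

    Columns: GR, first derivative, second derivative, trailing volatility.
    The attribute choice and window length are assumptions for this sketch;
    they are not the six attributes evaluated in the paper.
    """
    gr = np.asarray(gr, dtype=float)

    # Derivatives with respect to sample index (uniform depth spacing assumed).
    d1 = np.gradient(gr)
    d2 = np.gradient(d1)

    # Sample-to-sample change; the first sample has no predecessor, so use 0.
    diff = np.diff(gr, prepend=gr[0])

    # Trailing moving standard deviation of the change as a volatility proxy.
    vol = np.empty_like(gr)
    for i in range(gr.size):
        start = max(0, i - window + 1)
        vol[i] = diff[start:i + 1].std()

    return np.column_stack([gr, d1, d2, vol])


if __name__ == "__main__":
    # Synthetic GR curve: a smooth low-GR interval followed by a noisy high-GR one.
    rng = np.random.default_rng(0)
    gr = np.concatenate([
        30 + rng.normal(0, 1, 50),
        110 + rng.normal(0, 8, 50),
    ])
    features = gr_attributes(gr, window=7)
    print(features.shape)
```

The derivative columns respond to the boundary between the two intervals, while the volatility column separates the smooth interval from the noisy one. This is the kind of curve-shape information that the raw GR value at a single depth does not carry.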