Machine Learning model interpretability using SHAP values: Application to Igneous Rock Classification task

Applied Computing and Geosciences (IF 2.6; JCR Q2, Computer Science, Interdisciplinary Applications). Pub Date: 2024-07-22. DOI: 10.1016/j.acags.2024.100178
Antonella S. Antonini, Juan Tanzola, Lucía Asiain, Gabriela R. Ferracutti, Silvia M. Castro, Ernesto A. Bjerg, María Luján Ganuza
Volume 23, Article 100178. Full text: https://www.sciencedirect.com/science/article/pii/S2590197424000259

Abstract

The El Fierro intrusive body is one of the bodies that compose the La Jovita–Las Aguilas mafic–ultramafic belt, located in the Sierra Grande de San Luis, Argentina. The units of this belt carry base metal sulfide (BMS) mineralization and platinum group minerals (PGM). The macroscopic description of mafic and ultramafic rocks, as usually done by mining exploration companies, leads to an imprecise modal classification of the rocks. In this study, we develop a random forest-based prediction model that uses geochemical parameters to classify mafic and ultramafic rocks intercepted by drill cores. This model showed an accuracy between 86% and 94%, and an F1 score of 96%. Random forest classification is a widely adopted Machine Learning approach for constructing predictive models across various research domains. However, as models become more complex, interpreting them can become considerably more difficult. To interpret the model results, we use both global and local perspectives, incorporating the SHAP (SHapley Additive exPlanations) method. The SHAP technique allows us to analyze individual samples using force plots, and provides a measure of the importance of each geochemical input attribute to the model output. Analyzing the contribution of each input feature to the model identified the three variables with the highest contributions, in the following order: Al₂O₃, MgO, and Sr.
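
To make the local, per-sample view in the abstract concrete, the following is a minimal sketch of how Shapley values attribute one prediction to its input features. It is not the authors' pipeline: the scoring function `toy_model`, the reference values in `BACKGROUND`, and the sample in `SAMPLE` are all invented for illustration, and only the three feature names come from the abstract. It also assumes a single reference point instead of a background dataset, and computes the values exactly by enumerating coalitions, which is practical only for a handful of features.

```python
from itertools import combinations
from math import factorial

# Feature names taken from the abstract's top three variables.
FEATURES = ["Al2O3", "MgO", "Sr"]

# Hypothetical reference values and one sample to explain.
# These numbers are invented; they are not the study's data.
BACKGROUND = {"Al2O3": 10.0, "MgO": 20.0, "Sr": 100.0}
SAMPLE = {"Al2O3": 4.0, "MgO": 30.0, "Sr": 40.0}


def toy_model(x):
    """Invented score standing in for a classifier's output for one class."""
    score = 0.5 - 0.03 * (x["Al2O3"] - 10.0) + 0.02 * (x["MgO"] - 20.0)
    # Interaction term so the attribution is not purely additive.
    if x["Sr"] < 50.0 and x["MgO"] > 25.0:
        score += 0.1
    return score


def value(coalition, sample, background):
    """Model output when only the features in `coalition` take the sample's values."""
    mixed = {f: (sample[f] if f in coalition else background[f]) for f in FEATURES}
    return toy_model(mixed)


def shapley_values(sample, background):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                gain = value(s | {f}, sample, background) - value(s, sample, background)
                total += weight * gain
        phi[f] = total
    return phi


phi = shapley_values(SAMPLE, BACKGROUND)
base = toy_model(BACKGROUND)      # output at the reference point
prediction = toy_model(SAMPLE)    # output for the explained sample

# Text stand-in for a force plot: base value plus per-feature pushes.
print(f"base value: {base:.2f}")
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>6}: {contribution:+.2f}")
print(f"prediction: {prediction:.2f}")
```

The contributions sum to the difference between the sample's prediction and the base value, which is the additivity property a force plot visualizes. A global ranking of the kind reported in the abstract is commonly obtained by averaging the absolute per-sample contributions over many samples.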

Source journal: Applied Computing and Geosciences (Computer Science: General Computer Science)
CiteScore: 5.50
Self-citation rate: 0.00%
Articles published: 23
Review time: 5 weeks