Authors: Mónica Ágreda-López, Valerio Parodi, Alessandro Musu, Corin Jorgenson, Alessandro Carfì, Fulvio Mastrogiovanni, Luca Caricchi, Diego Perugini, Maurizio Petrelli
Journal: Computers & Geosciences, volume 193, Article 105707 (impact factor 4.2; JCR Q1, Computer Science, Interdisciplinary Applications)
Published: 2024-08-31
DOI: 10.1016/j.cageo.2024.105707
URL: https://www.sciencedirect.com/science/article/pii/S0098300424001900
Citations: 0
Enhancing machine learning thermobarometry for clinopyroxene-bearing magmas
In this study, we propose a general workflow to enhance machine learning (ML)-based geothermobarometer modelling. The workflow focuses on three key areas. First, we developed a robust pre-processing pipeline that addresses data imbalance, feature engineering, and data augmentation. Second, we assessed modelling errors using a Monte Carlo approach to quantify the impact of analytical uncertainties on the final pressure and temperature estimates. Third, we implemented a robust strategy for validating and testing the ML models that avoids over- and under-fitting while correcting biases associated with specific ML models (i.e., tree-based ensembles).
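The Monte Carlo step can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian error model, the relative uncertainties, the made-up analysis values, and the stand-in `toy_pressure_model` are all assumptions chosen only to show how analytical uncertainty might be propagated to a pressure estimate.

```python
import numpy as np


def monte_carlo_predict(predict, composition, rel_sigma, n_draws=1000, seed=0):
    """Propagate analytical uncertainty through a predictor by resampling.

    Assumes independent Gaussian errors on each measured component
    (an assumption of this sketch, not a statement about the paper).
    """
    rng = np.random.default_rng(seed)
    composition = np.asarray(composition, dtype=float)
    sigma = np.abs(composition) * np.asarray(rel_sigma, dtype=float)
    # One row per Monte Carlo replicate of the perturbed analysis.
    draws = rng.normal(loc=composition, scale=sigma,
                       size=(n_draws, composition.size))
    draws = np.clip(draws, 0.0, None)  # concentrations cannot be negative
    preds = np.asarray([predict(row) for row in draws])
    # The mean is the estimate; the spread reflects analytical uncertainty.
    return float(preds.mean()), float(preds.std(ddof=1))


def toy_pressure_model(x):
    """Hypothetical stand-in for a trained ML barometer."""
    return 0.5 * x[0] + 2.0 * x[1]


analysis = [50.0, 4.0, 15.0]    # made-up component values
rel_sigma = [0.01, 0.05, 0.02]  # assumed relative 1-sigma uncertainties
p_mean, p_std = monte_carlo_predict(toy_pressure_model, analysis, rel_sigma)
print(f"P = {p_mean:.2f} +/- {p_std:.2f} (arbitrary units)")
```

In practice the stand-in function would be replaced by the `predict` call of a trained model, and the uncertainties by those reported for the analytical instrument.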
To facilitate the use of our workflow, we developed a web app (https://bit.ly/ml-pt-web) and a Python module (https://bit.ly/ml-pt-py). The robustness of this strategy was tested on two calibrations: clinopyroxene (cpx) and clinopyroxene-liquid (cpx-liq). Our results show a significant reduction in errors compared to the baseline model, as well as good generalization on an independent external dataset. The root mean squared errors (RMSE) are 57 °C and 2.5 kbar for the cpx calibration, and 36 °C and 2.1 kbar for the cpx-liq calibration. Finally, our models outperform existing ML and classical cpx and cpx-liq thermobarometers on the external dataset.
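For readers unfamiliar with the metric, the RMSE quoted above is the square root of the mean squared difference between predicted and known values. The sketch below uses made-up pressures purely for illustration; the numbers are not taken from the paper's datasets.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between known and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


# Made-up experimental and predicted pressures (kbar), for illustration only.
known = [2.0, 5.0, 8.0, 12.0]
predicted = [3.0, 4.0, 10.0, 12.0]
pressure_rmse = rmse(known, predicted)
print(f"RMSE = {pressure_rmse:.3f} kbar")
```

Because the errors are squared before averaging, a few large misfits raise the RMSE more than many small ones.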
About the journal:
Computers & Geosciences publishes high-impact, original research at the interface between Computer Sciences and Geosciences. Publications should apply modern computer science paradigms, whether computational or informatics-based, to address problems in the geosciences.