Uncertainty quantification of geochemical data imputation using Monte Carlo dropout
Authors: Vladimir Puzyrev, Paul Duuring
Journal of Geochemical Exploration, volume 272, Article 107695
Published: 2025-02-03
DOI: 10.1016/j.gexplo.2025.107695
URL: https://www.sciencedirect.com/science/article/pii/S0375674225000275
Journal impact factor: 3.4 (JCR Q1, Geochemistry & Geophysics)
Citation count: 0
Abstract
Machine learning models have shown promise in geochemical data imputation tasks. However, because these models are black-box solvers, users need greater confidence in their predictions. Applying uncertainty quantification methods to deep neural networks can increase the reliability of those predictions. In this paper, we use Monte Carlo Dropout to estimate uncertainty in geochemical data imputation. Multiple forward passes with different dropout configurations yield a predictive distribution for the unknown analytes. The mean of this distribution is used as the prediction, while the standard deviation expresses the uncertainty of the neural networks. Two scenarios, based on the WACHEM and WAMEX databases containing multi-element geochemical data for rock samples, illustrate the predictive accuracy of the method and its capability to measure the associated uncertainty. Dropout values of 0.1–0.2 were identified as a good balance between prediction accuracy and model uncertainty.
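The procedure described in the abstract (repeated stochastic forward passes, then the mean as the prediction and the standard deviation as the uncertainty) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `mc_dropout_predict`, the layer sizes, the ReLU activations, and the random untrained weights are all hypothetical and are not taken from the paper, whose actual architecture and training setup are not given here.

```python
import numpy as np

def mc_dropout_predict(x, weights, biases, dropout_rate=0.1, n_passes=100, seed=0):
    """Run n_passes stochastic forward passes through a small ReLU MLP.

    Dropout stays active at inference time; each pass samples new masks.
    Returns (mean, std) over passes, each of shape (n_samples, n_outputs).
    """
    rng = np.random.default_rng(seed)
    keep = 1.0 - dropout_rate
    outputs = []
    for _ in range(n_passes):
        h = x
        for W, b in zip(weights[:-1], biases[:-1]):
            h = np.maximum(h @ W + b, 0.0)       # hidden layer with ReLU
            mask = rng.random(h.shape) < keep    # fresh dropout mask for this pass
            h = h * mask / keep                  # inverted-dropout rescaling
        outputs.append(h @ weights[-1] + biases[-1])  # linear output layer
    outputs = np.stack(outputs)                  # (n_passes, n_samples, n_outputs)
    return outputs.mean(axis=0), outputs.std(axis=0)

# Toy, untrained network (hypothetical sizes): 5 known analytes in, 2 imputed out.
rng = np.random.default_rng(42)
sizes = [5, 16, 16, 2]
weights = [rng.normal(0.0, 0.5, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
x = rng.normal(size=(3, 5))                      # 3 synthetic samples

mean, std = mc_dropout_predict(x, weights, biases, dropout_rate=0.2, n_passes=200)
print("prediction (mean):", mean)
print("uncertainty (std):", std)
```

With `dropout_rate=0.0` every pass is identical, so the standard deviation collapses to zero; larger rates widen the predictive distribution, which is the accuracy versus uncertainty trade-off the abstract refers to when it reports 0.1–0.2 as a good balance.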
About the journal:
Journal of Geochemical Exploration is dedicated mainly to the publication of original studies in exploration and environmental geochemistry and related topics.
Contributions of particular interest to the journal include research that applies innovative methods to:
define the genesis and the evolution of mineral deposits including transfer of elements in large-scale mineralized areas.
analyze complex systems at the boundaries between bio-geochemistry, metal transport and mineral accumulation.
evaluate effects of historical mining activities on the surface environment.
trace pollutant sources and define their fate and transport models in the near-surface and surface environments involving solid, fluid and aerial matrices.
assess and quantify natural and technogenic radioactivity in the environment.
determine geochemical anomalies and set baseline reference values using compositional data analysis, multivariate statistics and geo-spatial analysis.
assess the impacts of anthropogenic contamination on ecosystems and human health at local and regional scale to prioritize and classify risks through deterministic and stochastic approaches.
Papers presenting newly developed methods in analytical geochemistry, for application in the field or in the laboratory, also fall within the journal's topics of interest.