A high-performance extreme gradient boosting outlier detection framework for integrating the outputs of diverse anomaly detectors for detecting mineralization-related geochemical anomalies
Authors: Sheng He, Yongliang Chen
Journal: Journal of Geochemical Exploration, volume 273, Article 107741
Publication date: 2025-02-26
DOI: 10.1016/j.gexplo.2025.107741
URL: https://www.sciencedirect.com/science/article/pii/S0375674225000731
Impact factor: 3.4 (JCR Q1, Geochemistry & Geophysics)
Citations: 0
Abstract
In geochemical exploration, different unsupervised anomaly detection models often identify quite divergent geochemical anomalies in the same area. How to combine these divergent anomalies into reliable mineral prospecting targets is a problem worth studying. To address it, the extreme gradient boosting outlier detection (XGBOD) framework was adopted to integrate the anomaly scores produced by diverse unsupervised anomaly detection models into a high-performance semi-supervised anomaly detection ensemble for detecting mineralization-related geochemical anomalies. In the XGBOD framework, various unsupervised anomaly detection models are first built and used to transform the input variables into transformed outlier scores (TOSs). The important TOSs are then selected and appended to the original input data, and the augmented data are used to train an extreme gradient boosting (XGBoost) model, which yields a semi-supervised XGBoost model for detecting mineralization-related geochemical anomalies. The superiority of the XGBOD framework was demonstrated by a case study in the Baishan area (Jilin, China). K-nearest neighbor, local outlier factor, histogram-based outlier score, one-class support vector machine, and isolation forest models were used to transform element concentrations into TOSs, and the TOSs were used together with the original element concentration data as the input of the XGBoost model. The results show that the semi-supervised XGBoost model performs significantly better than the five unsupervised anomaly detection models. Therefore, the XGBOD framework is a viable tool for combining the diverse anomaly scores produced by various anomaly detectors to build a high-performance semi-supervised ensemble for detecting mineralization-related geochemical anomalies.
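The workflow described in the abstract (unsupervised detectors produce TOSs, selected TOSs are appended to the original variables, and a boosted-tree classifier is trained on the augmented data) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the synthetic data, the function names (`hbos_scores`, `transformed_outlier_scores`, `xgbod_sketch`), the simplified histogram-based score, the ROC-AUC ranking used to select TOSs, all hyperparameters, and the use of scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost (`xgboost.XGBClassifier` offers the same `fit`/`predict_proba` interface and could be swapped in).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, IsolationForest
from sklearn.metrics import roc_auc_score
from sklearn.neighbors import LocalOutlierFactor, NearestNeighbors
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM


def hbos_scores(X, n_bins=10):
    """Simplified histogram-based outlier score (illustrative, not a library API)."""
    scores = np.zeros(X.shape[0])
    for j in range(X.shape[1]):
        density, edges = np.histogram(X[:, j], bins=n_bins, density=True)
        # Interior edges map every value to a bin index in 0..n_bins-1.
        idx = np.clip(np.digitize(X[:, j], edges[1:-1]), 0, n_bins - 1)
        # Low density in a variable's histogram adds to the outlier score.
        scores += -np.log(density[idx] + 1e-12)
    return scores


def transformed_outlier_scores(X, k=5, seed=0):
    """One TOS column per detector; higher values mean more anomalous."""
    Xs = StandardScaler().fit_transform(X)
    # kNN score: distance to the k-th neighbour (k + 1 because each
    # sample is returned as its own nearest neighbour).
    dist, _ = NearestNeighbors(n_neighbors=k + 1).fit(Xs).kneighbors(Xs)
    knn = dist[:, -1]
    lof = -LocalOutlierFactor(n_neighbors=k).fit(Xs).negative_outlier_factor_
    hbos = hbos_scores(Xs)
    ocsvm = -OneClassSVM(gamma="scale", nu=0.1).fit(Xs).decision_function(Xs)
    iforest = -IsolationForest(random_state=seed).fit(Xs).decision_function(Xs)
    return np.column_stack([knn, lof, hbos, ocsvm, iforest])


def xgbod_sketch(X, y, n_keep=3, seed=0):
    """Augment X with selected TOSs and fit a boosted-tree classifier."""
    tos = transformed_outlier_scores(X, seed=seed)
    # Assumed selection rule: keep the TOSs that best separate the labels.
    auc = np.array([roc_auc_score(y, tos[:, j]) for j in range(tos.shape[1])])
    keep = np.argsort(auc)[::-1][:n_keep]
    X_aug = np.hstack([X, tos[:, keep]])
    # Stand-in for XGBoost; the labels make the ensemble (semi-)supervised.
    model = GradientBoostingClassifier(random_state=seed).fit(X_aug, y)
    return model, X_aug, keep


# Synthetic stand-in for element concentrations: 200 background samples
# and 20 samples labelled as mineralization-related (hypothetical data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 4)), rng.normal(3.0, 1.0, (20, 4))])
y = np.concatenate([np.zeros(200, dtype=int), np.ones(20, dtype=int)])

model, X_aug, keep = xgbod_sketch(X, y)
anomaly_probability = model.predict_proba(X_aug)[:, 1]
```

In this sketch the scores are computed on the same samples used for training, which is enough to show the data flow but not to assess performance; a comparison against the individual detectors would need held-out samples.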
About the journal:
Journal of Geochemical Exploration is mainly dedicated to the publication of original studies in exploration and environmental geochemistry and related topics.
Contributions of particular interest to the journal include research based on the application of innovative methods to:
define the genesis and the evolution of mineral deposits including transfer of elements in large-scale mineralized areas.
analyze complex systems at the boundaries between bio-geochemistry, metal transport and mineral accumulation.
evaluate effects of historical mining activities on the surface environment.
trace pollutant sources and define their fate and transport models in the near-surface and surface environments involving solid, fluid and aerial matrices.
assess and quantify natural and technogenic radioactivity in the environment.
determine geochemical anomalies and set baseline reference values using compositional data analysis, multivariate statistics and geo-spatial analysis.
assess the impacts of anthropogenic contamination on ecosystems and human health at local and regional scale to prioritize and classify risks through deterministic and stochastic approaches.
Papers presenting newly developed methods in analytical geochemistry for application in the field or in the laboratory are also within the journal's scope.