Anomaly Detection-Oriented Positive-Unlabeled Metric Learning for Extracting High-Dimensional Geochemical Anomalies Linked to Mineralization

IF 4.8 | CAS Zone 2, Earth Sciences | JCR Q1, GEOSCIENCES, MULTIDISCIPLINARY | Natural Resources Research | Pub Date: 2025-02-05 | DOI: 10.1007/s11053-025-10464-3
Zhaorui Yang, Yongliang Chen
Citations: 0

Abstract

In geochemical exploration, a small number of positive samples and a large number of unlabeled samples can be defined from the geochemical exploration data and the mineral deposits (occurrences) found in the exploration area. The positive samples usually comprise multiple types of mineral deposits (occurrences), while the unlabeled samples usually comprise many background samples and some unknown positive samples. Accurately recognizing the unknown positive samples among the unlabeled samples is a challenge in exploration geochemistry. To address this challenge, positive-unlabeled (PU) metric learning for anomaly detection (PUMAD) is developed to model positive-unlabeled geochemical exploration data and detect mineralization-related anomalies. PUMAD is a novel PU learning algorithm that combines artificial neural networks with distance hashing-based filtering (DHF) and deep metric learning (DML) to establish an anomaly detection model for datasets with positive and unlabeled samples. To test the effectiveness and robustness of PUMAD in identifying mineralization-related geochemical anomalies, the Baishan area of Jilin Province (China) was chosen as the case study area, and a dataset with positive and unlabeled samples was constructed from the stream sediment geochemical survey data of four 1:200,000 scale geological maps and the spatial locations of more than 30 discovered polymetallic deposits. The PUMAD model, a PU learning model and a DML model were established on the constructed dataset and used to identify the geochemical anomalies linked to known polymetallic mineralization.

A comparative analysis of the three models showed that the PUMAD model performed much better than the other two in identifying mineralization-related geochemical anomalies. The receiver operating characteristic (ROC) curve of the PUMAD model was closer to the upper left corner of the ROC space than those of the PU learning model and the DML model. The area under the ROC curve (AUC) of the PUMAD model was 0.9626, substantially exceeding those of the PU learning model (0.8493) and the DML model (0.7542). The geochemical anomalies linked to polymetallic mineralization recognized by the PUMAD model comprised 10.89% of the Baishan exploration area and encompassed all the discovered polymetallic deposits within the area, while those recognized by the PU learning model and the DML model comprised 16.87% and 25.29% of the study area and encompassed 90% and 87% of the discovered polymetallic deposits, respectively. The recognized mineralization-related geochemical anomalies are spatially linked to the regional geological factors that controlled polymetallic mineralization in the Baishan exploration area. It can therefore be concluded that PUMAD is an effective technique for detecting mineralization-related anomalies within an exploration area. Its validity for mapping mineralization-related geochemical anomalies should be tested further in other exploration areas.
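
The evaluation quantities reported above (AUC, the share of the area flagged as anomalous, and the share of known deposits falling inside the anomalies) can be illustrated with a minimal sketch. It is not the authors' code: the function names `roc_auc` and `anomaly_summary`, the toy scores, and the threshold are hypothetical, and it assumes that unlabeled cells are treated as negatives when computing the AUC, which the abstract does not state. The final ratio (deposits captured per percent of area) is a derived illustration computed from the reported figures, not a metric used in the paper.

```python
from typing import Dict, Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive cell outscores an unlabeled cell.

    Ties count as 0.5. Assumption: unlabeled cells (label 0) act as negatives.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one unlabeled sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def anomaly_summary(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (fraction of cells flagged anomalous, fraction of known deposits flagged)."""
    flagged = [s >= threshold for s in scores]
    area_fraction = sum(flagged) / len(scores)
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one known deposit cell")
    captured = sum(f for f, y in zip(flagged, labels) if y == 1) / n_pos
    return area_fraction, captured


# Toy data (hypothetical, not from the paper): 1 = known deposit cell, 0 = unlabeled cell.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 1]

auc = roc_auc(toy_scores, toy_labels)
area, captured = anomaly_summary(toy_scores, toy_labels, threshold=0.6)
print(f"AUC={auc:.4f}  area flagged={area:.2%}  deposits captured={captured:.0%}")

# Figures reported in the abstract: (percent of area flagged, percent of deposits captured).
reported: Dict[str, Tuple[float, float]] = {
    "PUMAD": (10.89, 100.0),
    "PU learning": (16.87, 90.0),
    "DML": (25.29, 87.0),
}
# Derived illustration only: percent of deposits captured per percent of area flagged.
efficiency = {name: cap / area_pct for name, (area_pct, cap) in reported.items()}
for name, value in efficiency.items():
    print(f"{name}: {value:.2f}")
```

A smaller anomalous area that still contains every known deposit gives a higher ratio, which is the sense in which the PUMAD anomalies are more tightly focused than those of the other two models.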

Source journal
Natural Resources Research (Environmental Science: General Environmental Science)
CiteScore: 11.90
Self-citation rate: 11.10%
Articles published: 151
About the journal: This journal publishes quantitative studies of natural (mainly but not limited to mineral) resources exploration, evaluation and exploitation, including environmental and risk-related aspects. Typical articles use geoscientific data or analyses to assess, test, or compare resource-related aspects. NRR covers a wide variety of resources including minerals, coal, hydrocarbon, geothermal, water, and vegetation. Case studies are welcome.