Title: A deep autoencoder network connected to geographical random forest for spatially aware geochemical anomaly detection
Authors: Zeinab Soltani, Hossein Hassani, Saeid Esmaeiloghli
Journal: Computers & Geosciences, volume 190, Article 105657 (impact factor 4.2; JCR Q1, Computer Science, Interdisciplinary Applications)
Published: 2024-06-18
DOI: 10.1016/j.cageo.2024.105657
URL: https://www.sciencedirect.com/science/article/pii/S0098300424001407
Citations: 0
Abstract
Machine learning (ML) and deep learning (DL) techniques have recently shown encouraging performance in recognizing metal-vectoring geochemical anomalies within complex Earth systems. However, their ability to detect subtle anomalies may be limited because they overlook the non-stationary spatial structures and intra-pattern local dependencies contained in geochemical exploration data. Motivated by this, we propose an algorithm that connects a DL architecture to a spatial ML processor to account for local neighborhood information and spatial non-stationarity in support of spatially aware anomaly detection. A deep autoencoder network (DAN) is trained to abstract deep feature codings (DFCs) of multi-element input data. The encoded DFCs represent the typical behavior of a nonlinear Earth system, i.e., multi-element signatures of geochemical background populations developed by different geo-processes. A local version of the random forest algorithm, geographical random forest (GRF), is then connected to the input and code layers of the DAN to establish nonlinear, spatially aware regressions between the original geochemical signals (dependent variables) and the DFCs (independent variables). Once the contributions of the DFCs to the original signals are determined, the residuals of the GRF regressions are quantified and interpreted as spatially aware anomaly scores related to mineralization. The proposed algorithm (DAN-GRF) is implemented in the R language environment and examined in a case study with stream sediment geochemical data from the Takht-e-Soleyman district, Iran. Compared with those from the stand-alone DAN technique, the high-scoring anomalies mapped by DAN-GRF showed a stronger spatial correlation with the locations of known metal occurrences, which was statistically confirmed by success-rate curves, the Student's t-statistic method, and prediction-area plots. The findings suggest that the proposed algorithm has an enhanced capability to recognize subtle multi-element geochemical anomalies and to extract reliable insights for metal exploration targeting.
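The scoring idea in the abstract (regress each original signal on the codes using only nearby samples, then treat the residual as the anomaly score) can be illustrated with a minimal Python sketch. This is not the authors' R implementation, and it makes several labeled assumptions: the autoencoder step is skipped and a synthetic one-dimensional code stands in for the DFCs, a straight-line fit over the k nearest neighbours stands in for the local random forest, the absolute residual is used as the score, and every name (`local_residual_scores`, `make_synthetic_survey`, `k`, `boost`) is hypothetical.

```python
import math


def local_residual_scores(coords, codes, signal, k=8):
    """Score each sample by |observed - locally predicted| signal.

    For sample i, a straight line signal ~ code is fitted to its k nearest
    spatial neighbours (sample i itself is left out) and evaluated at
    codes[i]. The linear fit is a stand-in for a local random forest.
    """
    scores = []
    for i in range(len(coords)):
        # k nearest neighbours of sample i by Euclidean distance
        others = [j for j in range(len(coords)) if j != i]
        others.sort(key=lambda j: math.dist(coords[i], coords[j]))
        near = others[:k]

        xs = [codes[j] for j in near]
        ys = [signal[j] for j in near]
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

        # Ordinary least squares; use the local mean if the codes are constant
        slope = sxy / sxx if sxx > 0 else 0.0
        intercept = mean_y - slope * mean_x
        predicted = intercept + slope * codes[i]
        scores.append(abs(signal[i] - predicted))
    return scores


def make_synthetic_survey(size=10, anomaly_at=(5, 5), boost=5.0):
    """Build a toy grid whose code-to-signal slope drifts with gx."""
    coords, codes, signal = [], [], []
    for gx in range(size):
        for gy in range(size):
            code = ((3 * gx + 7 * gy) % 10) / 10  # hypothetical 1-D feature code
            value = (1.0 + 0.1 * gx) * code        # non-stationary background
            if (gx, gy) == anomaly_at:
                value += boost                     # injected anomaly
            coords.append((float(gx), float(gy)))
            codes.append(code)
            signal.append(value)
    return coords, codes, signal


coords, codes, signal = make_synthetic_survey()
scores = local_residual_scores(coords, codes, signal)
top = max(range(len(scores)), key=scores.__getitem__)
print("highest-scoring sample:", coords[top], "score:", round(scores[top], 3))
```

Because each sample is left out of its own local fit, the injected value at (5, 5) cannot pull its own prediction toward itself, so it receives the largest residual on this toy grid. Fitting locally also lets the model follow the drifting code-to-signal slope, which a single global regression would average away.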
About the journal:
Computers & Geosciences publishes high impact, original research at the interface between Computer Sciences and Geosciences. Publications should apply modern computer science paradigms, whether computational or informatics-based, to address problems in the geosciences.