Earth Science Informatics: Latest Articles

Estimation of the elastic modulus of basaltic rocks using machine learning methods

IF 2.8, Zone 4 (Earth Sciences), Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Earth Science Informatics

Pub Date : 2024-09-19 DOI: 10.1007/s12145-024-01472-7

Nurgul Yesiloglu-Gultekin, Ayhan Dogan

The elastic modulus of basalt is a significant engineering parameter required for many projects. In this study, a total of 137 datasets of basalts from Digor-Kilittasi, Turkey, were used to predict the elastic modulus of intact rock (E_i). P-wave velocity, S-wave velocity, apparent porosity, and dry density were employed as input parameters. To predict E_i, seven different models with two or three inputs were constructed using four machine learning methods: Support Vector Machine (SVM), Gaussian Process Regression (GPR), Ensembles of Tree (ET), and Regression Trees (RT). The performance of the datasets, models, and methods was evaluated using the coefficient of determination (R²), Root Mean Squared Error (RMSE), Mean Squared Error (MSE), and Mean Absolute Error (MAE), and a ranking approach was employed to determine the best-performing method and dataset. Based on these evaluations, all four machine learning techniques effectively estimate E_i. While any of them can be an appropriate choice for estimating the elastic modulus of basaltic rocks, the ET approach appears to be the most successful method, whereas GPR performs worst according to the model assessments. The average R² values for Models 1 through 7 of the ET method for the five test datasets are 0.97, 0.93, 0.89, 0.97, 0.91, 0.99, and 0.99, respectively. The average R² values for GPR from Models 1 to 7 for the five test datasets are 0.73, 0.55, 0.69, 0.48, 0.47, 0.73, and 0.56, respectively. The Taylor diagram, which makes it simple to determine how well the model predictions match the observations, provided an additional indication that ET performed better than all the other methods. Furthermore, these findings validate the performance of the machine learning techniques employed in this study as valuable instruments for future investigations into the modeling of complex engineering issues. The results of this study suggest that machine learning algorithms can help reduce the need for high-quality core samples and labor-intensive procedures in predicting the elastic modulus of basaltic rocks, resulting in time and cost savings.
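
The four evaluation metrics named in this abstract (R², RMSE, MSE, and MAE) have standard definitions, and a short sketch may help readers see how they relate. The function name and the sample values below are hypothetical and are not taken from the paper; the code only illustrates the textbook formulas.

```python
import math


def regression_metrics(observed, predicted):
    """Return R^2, RMSE, MSE and MAE for paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    mse = ss_res / n                                  # mean squared error
    mae = sum(abs(r) for r in residuals) / n          # mean absolute error
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # R^2 is undefined when the observations have zero variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"r2": r2, "rmse": math.sqrt(mse), "mse": mse, "mae": mae}


# Hypothetical elastic-modulus values (GPa), for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Because RMSE is the square root of MSE, the two always rank models identically; reporting both mainly helps with units, since RMSE is in the same units as E_i.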

Citations: 0

Feature-adaptive FPN with multiscale context integration for underwater object detection

IF 2.8, Zone 4 (Earth Sciences), Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Earth Science Informatics

Pub Date : 2024-09-18 DOI: 10.1007/s12145-024-01473-6

Shikha Bhalla, Ashish Kumar, Riti Kushwaha

Underwater object detection is vital for diverse applications, from studies in marine biology to underwater robotics. However, underwater environments pose unique challenges, including reduced visibility due to color distortion, light attenuation, and complex backgrounds. Traditional computer vision methods have limitations, prompting the adoption of deep learning for underwater object detection. Despite progress, challenges persist, such as visual degradation, scale variations, diverse marine species, and complex backgrounds. To address these issues, we propose Feature-Adaptive FPN with Multiscale Context Integration (FA-FPN-MCI), a novel deep-learning algorithm aimed at enhancing both detection and domain generalization performance. We integrate the Style Normalization and Restitution (SNR) module for domain generalization, Receptive Field Blocks (RFBs) for fine-grained detail capture, and a twin-branch Global Context Module (TBGCM) for multiscale context information. We enhance lateral connections within the Feature Pyramid Network (FPN) with deformable convolution. Experimental results show that the proposed method attains a mean average precision of 84.2%. Other performance metrics were also evaluated, and the proposed method outperforms all other methods used for comparison.
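
The headline figure in this abstract is a mean average precision (mAP). The abstract does not state the IoU threshold or the interpolation scheme used, so the following is only a generic sketch of one common variant (all-point interpolation over the precision-recall curve). The function names and the sample detections are hypothetical.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class.

    scores: confidence of each detection.
    is_true_positive: whether each detection matched a ground-truth object
        (the matching rule, e.g. an IoU threshold, is decided elsewhere).
    num_ground_truth: number of ground-truth objects of this class.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Make precision non-increasing in recall (interpolation envelope).
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


# Three hypothetical detections for one class with two ground-truth objects.
ap = average_precision([0.9, 0.8, 0.7], [True, False, True], 2)
print(round(ap, 4))
```

Different benchmarks define the matching rule and interpolation differently, so an mAP value is only comparable across methods evaluated under the same protocol.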

Citations: 0

Autoregressive modelling of tropospheric radio refractivity over selected locations in tropical Nigeria using artificial neural network

IF 2.8, Zone 4 (Earth Sciences), Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Earth Science Informatics

Pub Date : 2024-09-17 DOI: 10.1007/s12145-024-01489-y

Ayodeji Gabriel Ashidi

Tropospheric radio refractivity is a significant atmospheric phenomenon that affects the propagation of radio signals and can impact the design and operation of wireless communication systems. This study focuses on the development of an autoregressive model of tropospheric radio refractivity in Nigeria using artificial neural networks (ANNs). The proposed model uses atmospheric variables (temperature, pressure, and humidity) as inputs and predicts refractivity values with high accuracy. Descriptive statistics and data visualization techniques were used to gain insights into the relationships between the atmospheric variables and the computed radio refractivity. The results indicate that the developed ANN model accurately predicts tropospheric radio refractivity, with satisfactory performance indicators that include standard error (SE), root mean square error (RMSE), and correlation coefficient (R). They also demonstrate the reliability and robustness of the developed model, which could play an important role in improving the preparation and implementation routines of wireless communication systems. The study also identifies areas for further study, such as data availability, model complexity, and interpretability. Lastly, this work further validates the suitability of applying ANNs to tropospheric radio refractivity model optimization and provides insights into the potential of the non-linear autoregressive modeling (NARX-ANN) approach for improving wireless communication systems.
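
The abstract refers to radio refractivity computed from temperature, pressure, and humidity but does not give the formula. A minimal sketch follows, assuming the widely used ITU-R P.453 expression N = (77.6 / T)(P + 4810 e / T), with T in kelvin and P and e in hPa. The saturation vapour pressure coefficients are those commonly quoted for water from that recommendation and should be checked against its current edition; the function names and sample inputs are hypothetical, and the paper may have used a different formulation.

```python
import math


def saturation_vapour_pressure_hpa(temp_c):
    """Saturation vapour pressure over water (hPa) for a temperature in Celsius.

    Coefficients are an assumption (commonly quoted from ITU-R P.453).
    """
    a, b, c = 6.1121, 17.502, 240.97
    return a * math.exp(b * temp_c / (temp_c + c))


def radio_refractivity(temp_k, pressure_hpa, vapour_pressure_hpa):
    """Refractivity N (N-units): dry term plus wet term."""
    return (77.6 / temp_k) * (pressure_hpa + 4810.0 * vapour_pressure_hpa / temp_k)


def refractivity_from_humidity(temp_c, pressure_hpa, rel_humidity_pct):
    """Convenience wrapper taking relative humidity in percent."""
    e = rel_humidity_pct / 100.0 * saturation_vapour_pressure_hpa(temp_c)
    return radio_refractivity(temp_c + 273.15, pressure_hpa, e)


# Hypothetical surface conditions for a humid tropical site.
print(refractivity_from_humidity(27.0, 1010.0, 80.0))
```

The wet term grows quickly with vapour pressure, which is why humidity tends to dominate refractivity variations in humid tropical climates.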

Citations: 0

Time series land subsidence monitoring and prediction based on SBAS-InSAR and GeoTemporal transformer model

IF 2.8, Zone 4 (Earth Sciences), Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Earth Science Informatics

Pub Date : 2024-09-16 DOI: 10.1007/s12145-024-01487-0

Jiayi Zhang, Jian Gao, Fanzong Gao

Land subsidence, the loss of elevation of the earth's surface caused by natural and human-induced factors, has become a significant global concern. It poses substantial threats to urban planning, construction, and sustainable development, so monitoring and predicting regional land subsidence are particularly crucial. Interferometric Synthetic Aperture Radar (InSAR) and deep learning provide valuable insights into monitoring and predicting land subsidence. However, methods for accurate, long-term monitoring and prediction of time series land subsidence still have limitations. Firstly, most models only utilize historical data and overlook the combined effects of various factors, including human activities and urbanization. Secondly, the spatiotemporal correlation of subsidence across different locations and times is underestimated. Thirdly, the nonlinearity of land subsidence is not adequately addressed. To address these challenges, this study assesses land deformation patterns from January 2018 to December 2022, using Sentinel-1 InSAR data processed through Small Baseline Subset-InSAR (SBAS-InSAR). The results show that the annual average deformation rate ranged from -6.39 to 8.27 mm/year, with maximum cumulative subsidence and uplift of 27.62 mm and 36.62 mm, respectively. Subsequently, a GeoTemporal Transformer (GTformer) model based on the Transformer model is proposed. It captures nonlinearities and spatiotemporal correlations between land subsidence and influencing factors by generating spatiotemporal distance matrices. The results demonstrate the efficacy of the GTformer model in improving prediction accuracy by incorporating urbanization factors and constructing spatiotemporal distance matrices. The R² of GTformer is at least 14.6% higher than that of traditional machine learning models and 4% higher than that of the standard Transformer. The predictions closely align with observed subsidence patterns, highlighting the model's reliability. Moreover, this study underscores the critical role of urbanization factors in land subsidence mechanisms. The GTformer model provides a novel approach that integrates multiple factors and spatiotemporal correlation to predict land subsidence. The methodology offers a valuable tool for urban planners and decision-makers to effectively manage urban development and mitigate geological disaster risks.
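
The SBAS-InSAR processing mentioned in this abstract rests on forming interferograms only between acquisitions that are close in both time and orbital position. The sketch below illustrates that pair-selection idea in isolation. The function name, the thresholds, and the acquisition list are hypothetical; the abstract does not report the baselines or thresholds actually used.

```python
from datetime import date
from itertools import combinations


def select_small_baseline_pairs(acquisitions, max_days, max_perp_m):
    """Return date pairs whose temporal and perpendicular baselines are small.

    acquisitions: list of (acquisition_date, perpendicular_baseline_m) tuples,
        with baselines measured relative to one common reference scene.
    max_days: largest allowed temporal separation in days (assumed threshold).
    max_perp_m: largest allowed perpendicular-baseline difference in metres
        (assumed threshold).
    """
    pairs = []
    ordered = sorted(acquisitions)  # chronological order
    for (d1, b1), (d2, b2) in combinations(ordered, 2):
        if (d2 - d1).days <= max_days and abs(b2 - b1) <= max_perp_m:
            pairs.append((d1, d2))
    return pairs


# Four hypothetical Sentinel-1 acquisitions.
scenes = [
    (date(2018, 1, 1), 0.0),
    (date(2018, 1, 13), 40.0),
    (date(2018, 1, 25), -50.0),
    (date(2018, 3, 2), 10.0),
]
pairs = select_small_baseline_pairs(scenes, max_days=36, max_perp_m=100.0)
for first, second in pairs:
    print(first, second)
```

Keeping baselines small limits decorrelation in each interferogram; the resulting network of pairs is then inverted to recover a deformation time series, which is the part a full SBAS workflow adds on top of this selection step.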

Citations: 0

Drought index time series forecasting via three-in-one machine learning concept for the Euphrates basin

IF 2.8, Zone 4 (Earth Sciences), Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Earth Science Informatics

Pub Date : 2024-09-16 DOI: 10.1007/s12145-024-01471-8

Levent Latifoğlu, Savaş Bayram, Gaye Aktürk, Hatice Citakoglu

Droughts are among the most hazardous and costly natural disasters and are hard to quantify and characterize. Accurate drought forecasting reduces droughts' devastating economic effects on ecosystems and people. Eastern Anatolia is the largest and coldest geographical region of Türkiye. Previous studies have not addressed drought forecasting in the Eastern Anatolia (Upper Mesopotamia) Region, where agriculture is limited because the land is under snow for most of the year. This study focuses on the Euphrates basin, specifically the Tercan and Tunceli meteorological stations of the Karasu River sub-basin, a vital water resource of the Eastern Anatolia Region. In this context, time series of 1-, 3-, 6-, 9-, and 12-month Standardized Precipitation Index (SPI) and Standardized Precipitation Evapotranspiration Index (SPEI) values were created. The Tuned Q-factor Wavelet Transform (TQWT) method and Univariate Feature Ranking Using F-Tests (FSRFtest) were used for pre-processing and feature selection. Stand-alone, hybrid, and tribrid models were created. Machine Learning (ML) methods, namely Artificial Neural Networks (ANN), Gaussian Process Regression (GPR), and Support Vector Machine (SVM), were applied in the time series analyses. At the Tercan station, GPR was found to perform better than ANN and SVM, outperforming them in 80% of cases. At the Tunceli station for the SPI output, SVM, which had superior performance in 60% of the cases, demonstrated performance comparable to GPR, while ANN once again exhibited inferior performance. Similarly, for the SPEI output at the Tunceli station, no clear superiority was observed between the GPR and ANN methods, because both methods were successful in 40% of cases. This study contributes by introducing a third concept, tribrid models, to the stand-alone and hybrid model comparison of drought forecasting. The hybrid and tribrid ML methods led to 91% and 64% decreases in relative root mean square error percentage compared with stand-alone ML methods for SPEI and SPI at the two stations. While the hybrid model at Tercan station was more successful in 80% of the cases, the hybrid model at Tercan station was more successful in 90% of the cases. While hybrid models were observed to be superior, tribrid models not only demonstrated performance close to the hybrid models but also provided advantages such as reducing computational load and shortening calculation time.
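
The 1-, 3-, 6-, 9-, and 12-month index series mentioned in this abstract all start from the same step: accumulating the monthly input over a moving window of the chosen length. The sketch below shows only that aggregation step, with a hypothetical function name and made-up precipitation values. The full SPI additionally fits a probability distribution (commonly a gamma distribution) to the accumulated totals and maps them to a standard normal variate, which is not shown here.

```python
def rolling_accumulation(monthly_values, window):
    """Sum each run of `window` consecutive months.

    The first total covers months 0..window-1, so the output is shorter than
    the input by window - 1 entries.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    totals = []
    for end in range(window - 1, len(monthly_values)):
        totals.append(sum(monthly_values[end - window + 1 : end + 1]))
    return totals


# Hypothetical monthly precipitation totals (mm).
precip = [10.0, 0.0, 5.0, 20.0, 15.0, 30.0]
for months in (1, 3, 6):
    print(months, rolling_accumulation(precip, months))
```

Longer windows smooth out short dry spells, which is why short time scales are usually read as meteorological drought and longer ones as hydrological drought.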

Citations: 0

A hybrid approach consisting of 3D depthwise separable convolution and depthwise squeeze-and-excitation network for hyperspectral image classification

IF 2.8, Zone 4 (Earth Sciences), Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Earth Science Informatics

Pub Date : 2024-09-12 DOI: 10.1007/s12145-024-01469-2

Mehmet Emin Asker, Mustafa Güngör

Hyperspectral image classification is crucial for a wide range of applications, including environmental monitoring, precision agriculture, and mining, due to its ability to capture detailed spectral information across numerous wavelengths. However, the high dimensionality and complex spatial-spectral relationships in hyperspectral data pose significant challenges. Deep learning, particularly Convolutional Neural Networks (CNNs), has shown remarkable success in automatically extracting relevant features from high-dimensional data, making it well-suited for handling the intricate spatial-spectral relationships in hyperspectral images. This study presents a hybrid approach for hyperspectral image classification, combining 3D Depthwise Separable Convolution (3D DSC) and Depthwise Squeeze-and-Excitation Network (DSENet). The 3D DSC efficiently captures spatial-spectral features, reducing computational complexity while preserving essential information. The DSENet further refines these features by applying channel-wise attention, enhancing the model's ability to focus on the most informative features. To assess the performance of the proposed hybrid model, extensive experimental studies were carried out on four commonly utilized HSI datasets, namely HyRANK-Loukia and WHU-Hi (including HongHu, HanChuan, and LongKou). In these experiments, the model achieved an accuracy of 90.9% on HyRANK-Loukia, an 8.86% increase over the previous highest accuracy. Similarly, for the WHU-Hi datasets, HongHu achieved an accuracy of 97.49%, reflecting a 2.11% improvement over its previous highest accuracy; HanChuan achieved an accuracy of 97.49%, showing a 2.4% improvement; and LongKou achieved an accuracy of 99.79%, providing a 0.15% improvement compared to its previous highest accuracy. Comparative analysis highlights the superiority of the proposed model, emphasizing improved classification accuracy with lower computational costs.
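
The claim that 3D depthwise separable convolution reduces computational complexity can be illustrated by counting weights. The sketch below compares a standard 3D convolution with a depthwise-plus-pointwise factorization for a cubic kernel, ignoring biases. The function names and the channel sizes are hypothetical and are not taken from the paper's architecture.

```python
def conv3d_params(in_channels, out_channels, kernel_size):
    """Weights in a standard 3D convolution with a cubic kernel (no bias)."""
    return kernel_size ** 3 * in_channels * out_channels


def depthwise_separable_conv3d_params(in_channels, out_channels, kernel_size):
    """Weights in a depthwise 3D convolution followed by a 1x1x1 pointwise one."""
    depthwise = kernel_size ** 3 * in_channels   # one k*k*k filter per input channel
    pointwise = in_channels * out_channels       # 1x1x1 convolution mixes channels
    return depthwise + pointwise


# Hypothetical layer: 32 input channels, 64 output channels, 3x3x3 kernel.
standard = conv3d_params(32, 64, 3)
separable = depthwise_separable_conv3d_params(32, 64, 3)
print(standard, separable, separable / standard)
```

The ratio of the two counts equals 1/out_channels + 1/kernel_size³, so the saving grows with both the number of output channels and the kernel volume.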

高光谱图像分类对环境监测、精准农业和采矿等广泛应用至关重要，因为它能够捕捉到众多波长的详细光谱信息。然而，高光谱数据的高维度和复杂的空间光谱关系带来了巨大的挑战。深度学习，尤其是卷积神经网络（CNN），在自动从高维数据中提取相关特征方面取得了显著的成功，因此非常适合处理高光谱图像中错综复杂的空间光谱关系。本研究提出了一种混合方法，将三维深度可分离卷积（3D DSC）和深度挤压激发网络（DSENet）相结合，用于高光谱图像分类。3D DSC 能有效捕捉空间光谱特征，降低计算复杂度，同时保留基本信息。DSENet 通过应用信道关注进一步完善了这些特征，增强了模型关注信息量最大的特征的能力。为了评估所提出的混合模型的性能，我们在四个常用的人脸识别数据集（即 HyRANK-Loukia 和 WHU-Hi，包括洪湖、汉川和龙口）上进行了广泛的实验研究。实验研究结果表明，HyRANK-Loukia 的准确率达到了 90.9%，与之前的最高准确率相比提高了 8.86%。同样，在 WHU-Hi 数据集上，洪湖的准确率达到了 97.49%，比之前的最高准确率提高了 2.11%；汉川的准确率达到了 97.49%，提高了 2.4%；龙口的准确率达到了 99.79%，比之前的最高准确率提高了 0.15%。对比分析凸显了所提模型的优越性，强调了分类准确率的提高和计算成本的降低。


Citations: 0

A framework for microscopic grains segmentation and Classification for Minerals Recognition using hybrid features

IF 2.8 Zone 4 Earth Sciences Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Earth Science Informatics

Pub Date : 2024-09-12 DOI: 10.1007/s12145-024-01478-1

Ghazanfar Latif, Kévin Bouchard, Julien Maitre, Arnaud Back, Léo Paul Bédard

Mineral grain recognition is an extremely important task in many fields, especially in mineral exploration, where the goal is to identify locations in which precious minerals may be found. The usual manual method is to collect samples, after which a specialist using expensive equipment identifies and counts the mineral grains in each sample. This task is tedious, time-consuming, and expensive. It is also limited because only small portions of an area can be surveyed, and even then it may require extremely long periods. In addition, the process is prone to human error. An automatic system that identifies, recognizes, and counts mineral grains in samples from images would allow for more precise results in less time than human analysts require. In addition, such systems can be fitted on robots that collect samples, take images of the samples, and then run the automated recognition and counting algorithm without human intervention. Vast amounts of land can be surveyed in this way. This paper proposes a modified approach for microscopic mineral grain recognition and classification from images using hybrid features and ensemble algorithms. The approach also includes a modified segmentation method, which improved the results. For 10 classes of microscopic mineral grains, the modified approach with the ensemble algorithm achieved an average accuracy of 84.01%. For 8 classes, the average reported accuracy is 94.93% using Boosting ensemble learning with the C4.5 classifier. The results obtained outperform similar methods reported in the extant literature.



Citations: 0

A novel data-driven model for real-time prediction of static Young's modulus applying mud-logging data

IF 2.8 Zone 4 Earth Sciences Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Earth Science Informatics

Pub Date : 2024-09-11 DOI: 10.1007/s12145-024-01474-5

Shadfar Davoodi, Mohammad Mehrad, David A. Wood, Mohammed Al-Shargabi, Grachik Eremyan, Tamara Shulgina

Effective drilling planning relies on understanding the rock mechanical properties, typically estimated from petrophysical data. Real-time estimation of these properties, especially static Young's modulus (E_sta), is crucial for geomechanical modeling, wellbore stability, and cost-effective decision-making. In this study, predictive models of E_sta were developed using mudlogging data from two vertically drilled wells (A and B) in the same field. E_sta was estimated from petrophysical data across the studied depth range in both wells using a field-specific equation. Outlier data were identified and removed by evaluating the cross plot of mechanical specific energy and drilling rate for Well A. The data from Well A were then randomly divided into training and testing sets. The algorithms, multi-layer perceptron neural networks, random forests, Gaussian process regression (GPR), and support vector regression, were adjusted and applied to the training data. The resulting models were evaluated on the test data. The GPR model demonstrated the lowest RMSE values in both the training (0.0075 GPa) and testing (0.4577 GPa) phases, indicating superior performance. To further assess the models, the overfitting index and scoring techniques were employed, revealing that the GPR model exhibited the lowest overfitting value and outperformed the other models. Consequently, the GPR model was selected as the best-performing model and was analyzed using Shapley additive explanation to evaluate the influence of each input feature on the output. This analysis indicated that depth had the greatest effect, while rotation speed had the least impact on the model's output. The application of the GPR model to predict E_sta in Well B demonstrated its high generalization capability. Therefore, it can be confidently stated that with additional data, this model could be effectively applied to similar depth ranges in other wells within the field. The study introduces innovations by applying GPR to predict E_sta from mudlogging data, addressing outlier impact on predictions, and developing a real-time E_sta prediction model for drilling.
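
The evaluation procedure described above (a random train/test split followed by RMSE on each subset) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the split fraction, the seed, and the sample values are hypothetical, since the abstract does not report them.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the inputs (e.g. GPa)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def random_split(rows, test_fraction, seed):
    """Shuffle a copy of rows and return (train, test) lists."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up modulus values (GPa), for illustration only.
measured = [20.0, 25.0, 30.0, 35.0]
predicted = [21.0, 24.0, 30.0, 37.0]
print(rmse(measured, predicted))

# Hypothetical 80/20 split of ten sample indices.
train, test = random_split(range(10), test_fraction=0.2, seed=0)
print(len(train), len(test))
```

Comparing the RMSE on the training subset with the RMSE on the testing subset is one simple way to see the gap that an overfitting index is meant to summarize; the abstract does not define the index itself.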



Citations: 0

Determination of the stress concentration factor adjacent an extracted underground coal panel using the CART and MARS algorithms

IF 2.8 Zone 4 Earth Sciences Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Earth Science Informatics

Pub Date : 2024-09-09 DOI: 10.1007/s12145-024-01476-3

Mohammad Rezaei, Hazhar Habibi, Mostafa Asadizadeh

In this study, classification and regression tree (CART) and multivariate adaptive regression spline (MARS) models are proposed to predict the stress concentration factor (SCF) around an extracted underground coal panel. The models are trained and tested using 120 collected datasets, with 100 allocated for model training and 20 reserved for testing. For SCF prediction using the CART and MARS models, the input parameters are overburden thickness (H), specific gravity of rock mass (γ), straight distance from the panel edge (D), and height of disturbed zone over the mined panel (H_d), with principal component analysis (PCA) employed to remove correlations. A predictive tree graph and 17 if–then rules with quantitative outputs are generated from the CART model, while a predictive equation is derived from the MARS technique for SCF prediction. The coefficients of determination (R²) achieved by the CART and MARS models are 0.940 and 0.957, respectively. Furthermore, the normalized root mean square error (NRMSE), variance accounted for (VAF), and performance index (PI) for CART are 0.043, 92.473%, and 1.82, respectively. For the MARS model, these values are 0.035, 95.419%, and 1.876. Additionally, performance evaluations of the models using the Wilcoxon signed-rank and Friedman non-parametric tests, along with Taylor diagrams and error analysis, demonstrate the reliability and suitability of the proposed models for SCF prediction. However, error and accuracy analyses confirm that the MARS model yields more precise outputs, achieving 2.57% greater accuracy and 10.84% lower error than the CART model. Furthermore, the importance analysis demonstrated that H and H_d both have the highest importance for the SCF, while γ has the lowest, with importance values of 33.33% and 11.11%, respectively. Model verification based on field SCF measurements confirms the models' validity, as indicated by relative errors of 6.83 for the MARS model and 7.05 for the CART model. Finally, a comparative analysis based on case study data validates the practical application of the proposed models.
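
The performance measures named above can be computed as in the following sketch. The abstract does not give the formulas, so the definitions here are common textbook forms and should be read as assumptions: R² as one minus the ratio of residual to total sum of squares, NRMSE as RMSE divided by the range of the observed values, and VAF as one minus the ratio of residual variance to observed variance, expressed in percent. The sample values are made up.

```python
import math
from statistics import pvariance


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot


def nrmse(y, y_hat):
    """RMSE normalized by the observed range (an assumed normalization)."""
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))
    return rmse / (max(y) - min(y))


def vaf(y, y_hat):
    """Variance accounted for, in percent."""
    residuals = [a - b for a, b in zip(y, y_hat)]
    return (1 - pvariance(residuals) / pvariance(y)) * 100


# Made-up observed and predicted SCF values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(observed, predicted), nrmse(observed, predicted), vaf(observed, predicted))
```

Other normalizations of RMSE (for example by the mean of the observations) are also in use, so values computed this way are not necessarily comparable with the figures reported in the abstract.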



Citations: 0

A novel technology for unraveling the spatial risk of Natech disasters based on machine learning and GIS: a case study from the city of Changzhou, China

IF 2.8 Zone 4 Earth Sciences Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Earth Science Informatics

Pub Date : 2024-09-09 DOI: 10.1007/s12145-024-01484-3

Weiyi Ju, Zhixiang Xing

In recent years, technical accidents caused by natural disasters have caused huge losses. The purpose of this study is to develop a mathematical model to predict and prevent the risk of such accidents. The model applies machine learning to predict the risk of such accidents, with the aim of providing risk visualization results for local governments. The research is expected to benefit residents and public welfare organizations. In this study, Random Forest (RF), K-Nearest Neighbor (KNN), the Back Propagation (BP) neural network, Adaptive Boosting (AdaBoost), Gradient Boosting Decision Tree (GBDT), and Extreme Gradient Boosting (XGBoost) were applied to predict the risk value. In addition, ArcGIS was used to spatially interpolate the predicted risk values and generate the risk map. The results demonstrated that the RF algorithm achieved the highest classification performance among the five algorithms tested. Specifically, the RF algorithm attained an accuracy of 0.874, an F1-Score of 0.887, and an Area Under the Curve (AUC) of 0.984. The three townships with the highest risk were Xueyan, Daibu, and Shanghuang, with risk areas accounting for 48.39%, 44.34%, and 79.64% of their areas, respectively. This study provides a reference for the local government, which can take targeted prevention and control measures. Disaster managers should pay sufficient attention to the risks in these high-risk areas. The government should establish a disaster database that is updated in real time to monitor how the situation develops. Moreover, the development and acquisition of historical disaster data should be encouraged.
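
Two of the classification measures reported above, accuracy and F1-score, can be derived from confusion counts as in this sketch. It assumes a binary labelling (1 for high risk, 0 otherwise) purely for illustration; the abstract does not state how many risk classes the study used or how multi-class scores were averaged, and the labels below are made up.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Share of samples whose predicted label matches the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Made-up labels: 1 = high risk, 0 = not high risk.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
print(accuracy(actual, predicted), f1_score(actual, predicted))
```

AUC is omitted because it requires predicted probabilities or scores rather than hard labels.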



Citations: 0