Using wavelet transform to analyze the dynamics of climatic variables; to assess the status of available water resources in Iran (1961–2020)
Pub Date: 2024-08-31 | DOI: 10.1007/s12145-024-01433-0
Ali Rezaee, Abolfazl Mosaedi, Aliasghar Beheshti, Azar Zarrin
In recent years, the effects and consequences of climate change have manifested themselves as irregularities and trends in essential climatic variables. In most cases, the trend of climatic variables is associated with periodicity. In this study, the trends and periodicity of four variables (precipitation, temperature, evapotranspiration, and net available water (NWA)) were investigated over a 60-year period in Iran. The Mann–Kendall trend test and Sen's slope estimator are applied to analyze the trend and its magnitude. Wavelet transform is used to detect the periodicity of the time series and to determine the correlation between NWA and temperature, precipitation, and evapotranspiration in their common periodicities. The results show that the stations located in eastern and western Iran have the most significant increasing/decreasing trends. Evapotranspiration shows the strongest increasing trend at most stations, followed by temperature, while NWA and precipitation have decreasing trends at lower significance levels. The examination of periodicity showed that, among all the studied stations, evapotranspiration has the longest periodicity, with an average length of 8.3 years, followed by NWA, temperature, and precipitation with 7.3, 5.8, and 5.5 years, respectively. The correlation analysis showed that at about 80% of the stations there is a high correlation between precipitation and NWA in the short-term periodicity and at the end of the studied period. At most stations, evapotranspiration is highly correlated with NWA across different periodicities.
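As a rough illustration of the two trend statistics applied in this study, the sketch below computes the Mann–Kendall S statistic, its normal-approximation Z and two-sided p-value (tie correction omitted for brevity), and Sen's slope for a synthetic 60-year annual series; the data and parameters are placeholders, not the study's records.

```python
import numpy as np
from scipy import stats

def mann_kendall(x):
    """Mann-Kendall S statistic, normal-approximation Z, and two-sided p-value."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    s = 0.0
    for i in range(n - 1):
        s += np.sum(np.sign(x[i + 1:] - x[i]))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance of S, ties ignored
    if s > 0:
        z = (s - 1) / np.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / np.sqrt(var_s)
    else:
        z = 0.0
    p = 2 * (1 - stats.norm.cdf(abs(z)))
    return s, z, p

def sens_slope(x):
    """Sen's slope estimator: the median of all pairwise slopes."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    slopes = [(x[j] - x[i]) / (j - i) for i in range(n - 1) for j in range(i + 1, n)]
    return np.median(slopes)

# Synthetic 60-year annual series with a weak positive trend plus noise
rng = np.random.default_rng(0)
series = np.arange(60) * 0.05 + rng.normal(0, 1, 60)
print(mann_kendall(series), sens_slope(series))
```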
MSCANet: A multi-scale context-aware network for remote sensing object detection
Pub Date: 2024-08-31 | DOI: 10.1007/s12145-024-01447-8
Huaping Zhou, Weidong Liu, Kelei Sun, Jin Wu, Tao Wu
With the rapid development of remote sensing technology and the widespread application of remote sensing images, remote sensing object detection has become a hot research direction. However, we observe three primary challenges in remote sensing object detection: scale variations, small objects, and complex backgrounds. To address these challenges, we propose a novel detector, the Multi-Scale Context-Aware Network (MSCANet). First, we introduce a Multi-Scale Fusion Module (MSFM) that provides various scales of receptive fields to adequately extract contextual information of objects at different scales. Second, the Multi-Scale Guidance Module (MSGM) is proposed, which fuses deep and shallow feature maps from multiple scales, reducing the loss of feature information for small objects. Finally, we introduce the Context-Aware DownSampling Module (CADM). It dynamically adjusts context information weights at different scales, effectively reducing interference from complex backgrounds. Experimental results demonstrate that the proposed MSCANet achieves superior performance, with mean average precision (mAP) of 97.1% and 73.4% on the challenging RSOD and DIOR datasets, respectively, which indicates that the proposed network is suitable for remote sensing object detection and is of great reference value.
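The abstract does not give the internal layout of the MSFM, so the following is only a generic sketch of how parallel branches with different dilation rates can supply several receptive-field scales that are then fused by a 1×1 convolution; the class name, channel sizes, and dilation rates are illustrative assumptions, not the authors' design.

```python
import torch
import torch.nn as nn

class MultiScaleFusionSketch(nn.Module):
    """Parallel 3x3 branches with different dilation rates give different
    receptive fields; their outputs are concatenated and fused by a 1x1 conv."""
    def __init__(self, in_ch, out_ch, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        ])
        self.fuse = nn.Conv2d(out_ch * len(dilations), out_ch, kernel_size=1)

    def forward(self, x):
        feats = [branch(x) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))

x = torch.randn(1, 64, 128, 128)
print(MultiScaleFusionSketch(64, 64)(x).shape)  # torch.Size([1, 64, 128, 128])
```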
Estimating soil organic carbon using sentinel-2 data under zero tillage agriculture: a machine learning approach
Pub Date: 2024-08-30 | DOI: 10.1007/s12145-024-01427-y
Lawrence Mango, Nuthammachot Narissara, Som-ard Jaturong
Soil organic carbon (SOC) is the main component of soil organic matter (SOM) and constitutes a crucial component of the soil. It supports key soil functions, stabilizes soil structure, aids in plant-nutrient retention and release, and promotes water infiltration and storage. Predicting SOC using Sentinel-2 data integrated with machine learning algorithms under zero-tillage practice is inadequately documented for developing countries like Zimbabwe. The purpose of this study is to evaluate the performance of support vector machine (SVM), artificial neural network (ANN), and partial least squares regression (PLSR) algorithms for SOC estimation from Sentinel-2 data. The SVM, ANN, and PLSR models were used with cross-validation to estimate the SOC content based on 50 georeferenced calibration samples under a zero-tillage practice. The ANN model outperformed the other two models, explaining between 55 and 60% of SOC variability (coefficient of determination, R2) with RMSE between 5.01 and 8.78%, whereas for the SVM, R2 varied between 0.53 and 0.57 and RMSE varied between 6.25 and 11.39%. The weakest SOC estimates were provided by the PLSR algorithm, with R2 = 0.44–0.49 and RMSE = 7.59–12.42% for the top 15 cm depth. The R2, root mean square error (RMSE), and mean absolute error (MAE) results for SVM, ANN, and PLSR show that the ANN model is highly capable of capturing SOC variability. Although the ANN algorithm provides more accurate SOC estimates than the SVM algorithm, the difference in accuracy is not significant. Results revealed a satisfactory agreement between the SOC content and the zero-tillage practice (R2, coefficient of variation (CV), MAE, and RMSE) using SVM, ANN, and PLSR for the validation dataset with four predictor variables. The calibration results indicated that the mean SOC was 15.83% and the validation mean SOC was 17.02%. The SOC validation dataset (34.17%) had a higher degree of variation around its mean than the calibration dataset (29.86%). The SOC prediction results can serve as an important tool for informed decisions about soil health and productivity by farmers, land managers, and policy makers.
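A minimal sketch of the three-model comparison described above, using scikit-learn's SVR, MLPRegressor, and PLSRegression with five-fold cross-validation; the predictor values, target, and hyperparameters are synthetic placeholders rather than the paper's 50-sample calibration set.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for 50 georeferenced samples: four Sentinel-2-derived
# predictors and SOC (%) as the target.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(50, 4))
y = 10 + 20 * X[:, 0] - 5 * X[:, 1] + rng.normal(0, 2, 50)

models = {
    "SVM": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
    "ANN": make_pipeline(StandardScaler(),
                         MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=5000,
                                      random_state=0)),
    "PLSR": PLSRegression(n_components=2),
}

cv = KFold(n_splits=5, shuffle=True, random_state=0)
for name, model in models.items():
    r2 = cross_val_score(model, X, y, cv=cv, scoring="r2")
    rmse = -cross_val_score(model, X, y, cv=cv, scoring="neg_root_mean_squared_error")
    print(f"{name}: R2={r2.mean():.2f}  RMSE={rmse.mean():.2f}")
```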
A method for landslide identification and detection in high-precision aerial imagery: progressive CBAM-U-net model
Pub Date: 2024-08-30 | DOI: 10.1007/s12145-024-01465-6
Hanjie Lin, Li Li, Yue Qiang, Xinlong Xu, Siyu Liang, Tao Chen, Wenjun Yang, Yi Zhang
Rapid identification and detection of landslides is of great significance for disaster damage assessment and post-disaster relief. However, U-net for rapid landslide identification and detection suffers from a semantic gap and loss of spatial information. For this purpose, this paper proposes a U-net with a progressive Convolutional Block Attention Module (CBAM-U-net) for landslide boundary identification and extraction from high-precision aerial imagery. Firstly, 109 high-precision aerial landslide images were collected, and the original database was extended by data augmentation to strengthen the generalization ability of the models. Subsequently, the CBAM-U-net was constructed by introducing a spatial attention module and a channel attention module into each down-sampling stage of the U-net. Meanwhile, U-net, FCN, and DeepLabv3+ were used as comparison models. Finally, six evaluation metrics were used to comprehensively assess the ability of the models for landslide identification and segmentation. The results show that CBAM-U-net exhibited better recognition and segmentation accuracy than the other models, with optimal values of average row correct, dice coefficient, global correct, IoU, and mean IoU of 98.3, 0.877, 95, 88.5, and 90.2, respectively. U-net, DeepLabv3+, and FCN tend to confuse bare ground and roads with landslides. In contrast, CBAM-U-net has a stronger ability for feature learning, feature representation, feature refinement, and adaptation. The proposed method mitigates the semantic gap and spatial information loss in U-net and shows better accuracy and robustness in recognizing and segmenting high-precision landslide images, providing a useful reference for research on rapid landslide recognition and detection.
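The CBAM referenced here follows the well-known channel-then-spatial attention design; the sketch below shows that generic block as it might be attached to an encoder feature map before down-sampling, with channel counts and the reduction ratio chosen arbitrarily rather than taken from the paper.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))       # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))        # global max pooling branch
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """Channel attention followed by spatial attention, applied to an
    encoder feature map before it is passed down the U-net."""
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))

print(CBAM(64)(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```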
Prediction of soil classification in a metro line from seismic wave velocities using soft computing techniques
Pub Date: 2024-08-28 | DOI: 10.1007/s12145-024-01435-y
Hosein Chatrayi, Farnusch Hajizadeh, Behzad Shakouri
Geotechnical measurements of soil properties at particular locations on the ground provide information for infrastructure design. Design uncertainty may increase, and dependability may suffer, when only sparse point data are used. Geophysical techniques offer spatially continuous information about the soil and are less time-consuming and intrusive. Geophysical data, however, are not expressed in terms of technical specifications. To enable the use of geophysical data in geotechnical designs, correlations between geotechnical and geophysical characteristics are required. The S- and P-wave seismic velocities are the main focus of the present geophysical investigation. Artificial neural network (ANN) models are developed using published data to predict seismic wave velocity and soil classification for seismic site effect evaluation. The results of ANN models using publicly available data demonstrate that seismic wave velocity predicts soil classification with moderate to high accuracy. Regression is not as effective as artificial neural networks (ANN) in terms of overall performance. To confirm this, enclosed areas were evaluated to accurately predict soil classification and to assess the performance of both the ANN and regression models. The artificial neural network predicted the enclosed areas with much higher accuracy.
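As a hedged illustration of comparing an ANN against regression for predicting a site class from seismic wave velocities, the sketch below uses scikit-learn on synthetic velocities; the two-class labelling rule, network size, and data ranges are assumptions for demonstration, not the study's scheme.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: shear- and compressional-wave velocities (m/s) as
# predictors, a hypothetical two-class site label as the target.
rng = np.random.default_rng(1)
vs = rng.uniform(150, 760, 300)
vp = vs * rng.uniform(1.5, 2.5, 300)
X = np.column_stack([vs, vp])
y = (vs > 360).astype(int)          # illustrative class boundary only

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

ann = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0))
reg = make_pipeline(StandardScaler(), LogisticRegression())

for name, model in [("ANN", ann), ("Regression", reg)]:
    model.fit(X_tr, y_tr)
    print(name, "test accuracy:", round(model.score(X_te, y_te), 3))
```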
Application of SVC, k-NN, and LDA machine learning algorithms for improved prediction of Bioturbation: Example from the Subei Basin, China
Pub Date: 2024-08-28 | DOI: 10.1007/s12145-024-01450-z
Jonathan Atuquaye Quaye, Kwame Sarkodie, Zaixing Jiang, Chenlin Hu, Joshua Agbanu, Stephen Adjei, Baiqiang Li
Three supervised machine learning (ML) classification algorithms, the Support Vector Classifier (SVC), K-Nearest Neighbour (K-NN), and Linear Discriminant Analysis (LDA), are combined with seventy-six (76) data points from nine (9) core sample datasets retrieved from five (5) selected wells in oilfields of the Subei Basin to delineate bioturbation. Feature selection via p-scores and f-scores reduced the number of relevant features to 7 of the 12 considered. Each classifier underwent model training and testing, with 80% of the data allocated for training and the remaining 20% for testing. During model training, the hyperparameters of the SVC (C, gamma, and kernel) and K-NN (K value) were optimized via grid search to find the form of the decision boundaries that provides optimal accuracy for predicting bioturbation. The optimized SVC hyperparameters (a linear kernel, C = 1000, and gamma = 0.10) provided a training accuracy of 96.17%. The optimized K-NN classifier, obtained with K = 5 nearest neighbours, achieved a training accuracy of 73.28%. The training accuracy of the LDA classifier was 67.36%, making it the worst-performing classifier in this work. Further cross-validation based on fivefold stratification was performed on each classifier to ascertain model generalization and stability for the prediction of unseen test data. The test performance of each classifier indicated that the SVC was the best predictor of the bioturbation index at 92.86% accuracy, followed by the K-NN model at 90.48%, and then the LDA classifier, which gave the lowest test accuracy at 76.2%. The results of this work indicate that bioturbation can be predicted via ML methods, which is a more efficient and effective means of rock characterization compared to conventional methods used in the oil and gas industry.
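A compact sketch of the workflow described above: grid search over SVC (kernel, C, gamma) and K-NN (K) hyperparameters with stratified five-fold cross-validation, plus an LDA baseline, on synthetic data shaped like the 76-sample, 7-feature set; the parameter grids, labels, and values are illustrative assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for 76 core-sample points with 7 selected features and a
# binary bioturbated / non-bioturbated label.
rng = np.random.default_rng(7)
X = rng.normal(size=(76, 7))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, 76) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

svc = GridSearchCV(make_pipeline(StandardScaler(), SVC()),
                   {"svc__kernel": ["linear", "rbf"],
                    "svc__C": [1, 10, 100, 1000],
                    "svc__gamma": [0.01, 0.1, 1.0]}, cv=cv)
knn = GridSearchCV(make_pipeline(StandardScaler(), KNeighborsClassifier()),
                   {"kneighborsclassifier__n_neighbors": [3, 5, 7, 9]}, cv=cv)
lda = make_pipeline(StandardScaler(), LinearDiscriminantAnalysis())

for name, model in [("SVC", svc), ("K-NN", knn), ("LDA", lda)]:
    model.fit(X_tr, y_tr)
    print(name, "test accuracy:", round(model.score(X_te, y_te), 3))
```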
A debris flow susceptibility mapping study considering sample heterogeneity
Pub Date: 2024-08-28 | DOI: 10.1007/s12145-024-01453-w
Ruiyuan Gao, Di Wu, Hailiang Liu, Xiaoyang Liu
Susceptibility mapping has been an effective approach to manage the threat of debris flows. However, the sample heterogeneity problem has rarely been considered in previous studies. This paper explores the effect of sample heterogeneity on susceptibility mapping and proposes corresponding solutions. Two unsupervised clustering approaches, K-means clustering and fuzzy C-means clustering, were introduced to divide the study area into several homogeneous regions; each region was then processed independently to address the sample heterogeneity problem. The information gain ratio method was used to evaluate the predictive ability of the conditioning factors in the total dataset before clustering and in the homogeneous datasets after clustering. The total dataset and the homogeneous datasets were then used in random forest modeling. Receiver operating characteristic curves and related statistical results were employed to evaluate model performance. The results showed that there was a significant sample heterogeneity problem in the study area, and the fuzzy C-means algorithm can play an important role in solving this problem. By dividing the study area into several homogeneous regions and processing them independently, conditioning factors with better predictive ability, better-performing models, and higher-quality debris flow susceptibility maps could be obtained.
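A minimal sketch of the region-wise strategy: cluster the mapping units into homogeneous regions, then fit and evaluate a separate random forest in each. K-means stands in for both clustering variants here (fuzzy C-means would require an extra library such as scikit-fuzzy), and the data, cluster count, and forest size are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: conditioning factors per mapping unit plus a binary
# debris-flow / non-debris-flow label.
rng = np.random.default_rng(3)
X = rng.normal(size=(600, 8))
y = (X[:, 0] * X[:, 1] + rng.normal(0, 0.5, 600) > 0).astype(int)

# Step 1: partition the study area into homogeneous regions.
regions = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Step 2: train and evaluate a separate random forest within each region.
for r in np.unique(regions):
    mask = regions == r
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    auc = cross_val_score(rf, X[mask], y[mask], cv=5, scoring="roc_auc")
    print(f"region {r}: n={mask.sum()}, mean CV AUC={auc.mean():.3f}")
```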
EAS²KAM: enhanced adaptive source-selection kernel with attention mechanism for hyperspectral image classification
Pub Date: 2024-08-27 | DOI: 10.1007/s12145-024-01466-5
Ahmed R. El-gabri, Hussein A. Aly, Mohamed A. Elshafey, Tarek S. Ghoniemy
Hyperspectral Images (HSIs) possess extensive applications in remote sensing, especially material discrimination and earth observation monitoring. However, constraints in spatial resolution increase sensitivity to spectral noise, limiting the ability to adjust Receptive Fields (RFs). Convolutional Neural Networks (CNNs) with fixed RFs are a common choice for HSI classification tasks. However, their potential in leveraging the appropriate RF remains under-exploited, thus affecting feature discriminative capabilities. This study introduces an Enhanced Adaptive Source-Selection Kernel with Attention Mechanism (EAS²KAM) for HSI classification. The model incorporates a Three Dimensional Enhanced Function Mixture (3D-EFM) with a distinct RF for local low-rank contextual exploitation. Furthermore, it incorporates diverse global RF branches enriched with spectral attention and an additional spectral-spatial mixing branch to adjust RFs, enhancing multiscale feature discrimination. The 3D-EFM is integrated with a 3D Residual Network (3D ResNet) that includes a Channel-Pixel Attention Module (CPAM) in each segment, improving spectral-spatial feature utilization. Comprehensive experiments on four benchmark datasets show marked advancements, including a maximum rise of 0.67% in Overall Accuracy (OA), 0.87% in Average Accuracy (AA), and 1.33% in the Kappa Coefficient (κ), outperforming the top two HSI classifiers from a list of eleven state-of-the-art deep learning models. A detailed ablation study evaluates model complexity and runtime, confirming the superior performance of the proposed model.
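For reference, the three headline metrics reported above (OA, AA, and the Kappa coefficient) can be computed from a confusion matrix as in the short sketch below; the toy labels are illustrative only and unrelated to the benchmark datasets.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

def oa_aa_kappa(y_true, y_pred):
    """Overall accuracy, average (per-class) accuracy, and Cohen's kappa."""
    cm = confusion_matrix(y_true, y_pred)
    oa = np.trace(cm) / cm.sum()                 # fraction of correctly labelled samples
    aa = np.mean(np.diag(cm) / cm.sum(axis=1))   # mean of per-class recalls
    kappa = cohen_kappa_score(y_true, y_pred)
    return oa, aa, kappa

# Toy example with three classes
y_true = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 0])
y_pred = np.array([0, 1, 1, 1, 1, 2, 2, 0, 2, 0])
print([round(v, 3) for v in oa_aa_kappa(y_true, y_pred)])
```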
Enriching building function classification using Large Language Model embeddings of OpenStreetMap Tags
Pub Date: 2024-08-27 | DOI: 10.1007/s12145-024-01463-8
Abdulkadir Memduhoğlu, Nir Fulman, Alexander Zipf
Automated methods for building function classification are essential due to restricted access to official building use data. Existing approaches utilize traditional Natural Language Processing (NLP) techniques to analyze textual data representing human activities, but they struggle with the ambiguity of semantic contexts. In contrast, Large Language Models (LLMs) excel at capturing the broader context of language. This study presents a method that uses LLMs to interpret OpenStreetMap (OSM) tags, combining them with physical and spatial metrics to classify urban building functions. We employed an XGBoost model trained on 32 features from six city datasets to classify urban building functions, with F1 scores ranging from 67.80% in Madrid to 91.59% in Liberec. Integrating LLM embeddings enhanced the model's performance by an average of 12.5% across all cities compared to models using only physical and spatial metrics. Moreover, integrating LLM embeddings improved the model's performance by 6.2% over models that incorporate OSM tags as one-hot encodings, and when predicting based solely on OSM tags, the LLM approach outperforms traditional NLP methods in 5 out of 6 cities. These results suggest that the deep contextual understanding captured by LLM embeddings, more effectively than by traditional NLP approaches, is beneficial for classification. Finally, a Pearson correlation coefficient of approximately -0.858 between population density and F1 scores suggests that denser areas present greater classification challenges. Moving forward, we recommend investigating discrepancies in model performance across and within cities, aiming to identify generalized models.
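A schematic sketch of the feature-fusion idea: tag embeddings concatenated with physical and spatial metrics and fed to an XGBoost classifier. The embed_tags function is a hypothetical placeholder for whichever LLM embedding endpoint is used, and the tags, metrics, and labels are synthetic stand-ins.

```python
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split

def embed_tags(tag_strings, dim=32):
    """Placeholder for an LLM embedding call; here it simply returns a
    deterministic pseudo-random vector per distinct tag string."""
    rng = np.random.default_rng(0)
    vocab = {t: rng.normal(size=dim) for t in set(tag_strings)}
    return np.vstack([vocab[t] for t in tag_strings])

# Synthetic stand-in: OSM tag strings plus physical/spatial metrics
# (e.g. footprint area, height, neighbour count), binary residential label.
rng = np.random.default_rng(1)
tags = rng.choice(["building=house", "building=retail", "building=yes"], size=400)
metrics = rng.normal(size=(400, 3))
y = (tags == "building=house").astype(int)

X = np.hstack([embed_tags(tags), metrics])   # concatenate embeddings and metrics
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = xgb.XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1)
clf.fit(X_tr, y_tr)
print("test accuracy:", round(clf.score(X_te, y_te), 3))
```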
Exploring advanced machine learning techniques for landslide susceptibility mapping in Yanchuan County, China
Pub Date: 2024-08-27 | DOI: 10.1007/s12145-024-01455-8
Wei Chen, Chao Guo, Fanghao Lin, Ruixin Zhao, Tao Li, Paraskevas Tsangaratos, Ioanna Ilia
Many landslides occur every year in China, causing extensive property losses and casualties. Landslide susceptibility mapping is crucial for disaster prevention by the government or related organizations to protect people's lives and property. This study compared the performance of random forest (RF), classification and regression trees (CART), Bayesian network (BN), and logistic model trees (LMT) methods in generating landslide susceptibility maps in Yanchuan County using an optimization strategy. A field survey was conducted to map 311 landslides. The dataset was divided into a training dataset and a validation dataset with a ratio of 7:3. Sixteen factors influencing landslides were identified based on a geological survey of the study area, including elevation, plan curvature, profile curvature, slope aspect, slope angle, slope length, topographic position index (TPI), terrain ruggedness index (TRI), convergence index, normalized difference vegetation index (NDVI), distance to roads, distance to rivers, rainfall, soil type, lithology, and land use. The training dataset was used to train the models in Weka software, and landslide susceptibility maps were generated in GIS software. The performance of the four models was evaluated by receiver operating characteristic (ROC) curves, confusion matrices, chi-square tests, and other statistical analysis methods. The comparison results show that all four machine learning models are suitable for evaluating landslide susceptibility in the study area. The performances of the RF and LMT methods are more stable than those of the other two models; thus, they are particularly suitable for landslide susceptibility mapping.
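The sketch below illustrates the evaluation pattern described above (7:3 split, ROC/AUC comparison on the validation set) for the two tree-based models that have direct scikit-learn counterparts, RF and CART; BN and LMT are Weka-specific and are not reproduced here, and the conditioning-factor data are synthetic stand-ins.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 16 conditioning factors per mapping unit and a binary
# landslide / non-landslide label, split 7:3 as in the paper.
rng = np.random.default_rng(5)
X = rng.normal(size=(622, 16))
y = (X[:, 0] + 0.7 * X[:, 4] + rng.normal(0, 0.8, 622) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.7, stratify=y, random_state=0)

models = {
    "RF": RandomForestClassifier(n_estimators=500, random_state=0),
    "CART": DecisionTreeClassifier(max_depth=6, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: validation AUC = {auc:.3f}")
```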