Comparative study on landslide susceptibility mapping based on different ratios of training samples and testing samples by using RF and FR-RF models

Ke Xu, Zhou Zhao, Wei Chen, Jianquan Ma, Fei Liu, Yihao Zhang, Zijun Ren
Natural Hazards Research, Volume 4, Issue 1, Pages 62-74. Published 2024-03-01. DOI: 10.1016/j.nhres.2023.07.004
https://www.sciencedirect.com/science/article/pii/S2666592123000732

Abstract

Evaluating landslide susceptibility is essential to land-use and spatial planning. This paper presents a case study from Fugu County, Shaanxi Province, China. First, the geological environment and the current state of landslides in Fugu County were investigated. Then, slope, aspect, terrain relief, curvature, lithology, land type, and normalized difference vegetation index (NDVI) were selected as landslide conditioning factors, and the correlation among them was assessed using multicollinearity analysis. Next, landslide and non-landslide samples were divided into training and testing samples at ratios of 8/2, 7/3, 6/4, and 5/5. Landslide susceptibility maps were produced with a Random Forest (RF) model and a Frequency Ratio coupled with Random Forest (FR-RF) model. Finally, landslide density (LD), landslide frequency ratio (LFR), the area under the receiver operating characteristic (ROC) curve (AUC), and other indicators were used to validate the rationality, accuracy, and performance of the maps produced by the different models and ratios. The results indicate that all maps are reasonable except the one produced at a ratio of 5/5. For each map, regardless of ratio, LD and LFR are greatest in the very high susceptibility zone, followed by the high, moderate, low, and very low zones.
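The sample-splitting step described above can be sketched as follows. This is a minimal illustration only: the abstract does not give the inventory size or the splitting procedure, so the sample counts, the `split_samples` helper, the fixed random seed, and the choice to split landslide and non-landslide samples separately are all assumptions.

```python
import random

def split_samples(samples, train_fraction, seed=0):
    """Shuffle a copy of `samples` and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical sample identifiers; the real inventory is not given in the abstract.
landslides = [("landslide", i) for i in range(100)]
non_landslides = [("non_landslide", i) for i in range(100)]

# Training fraction for each training/testing ratio named in the abstract.
RATIOS = {"8/2": 0.8, "7/3": 0.7, "6/4": 0.6, "5/5": 0.5}

splits = {}
for label, fraction in RATIOS.items():
    # Split each class separately so both sets keep the same class balance.
    ls_train, ls_test = split_samples(landslides, fraction)
    nl_train, nl_test = split_samples(non_landslides, fraction)
    splits[label] = (ls_train + nl_train, ls_test + nl_test)
    print(label, "train:", len(splits[label][0]), "test:", len(splits[label][1]))
```

Each entry of `splits` could then be passed to a classifier of choice; model training itself is outside this sketch.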

In the Random Forest (RF) model, the LD in the very high susceptibility zone varies with the training/testing ratio: 7/3 (201.026), 8/2 (154.440), 6/4 (93.696), and 5/5 (136.364), with the maximum at 7/3; the corresponding LFR values are 7/3 (4.806) > 8/2 (3.692) > 6/4 (3.260) > 5/5 (2.240). In the Frequency Ratio coupled with Random Forest (FR-RF) model, the LD values are 7/3 (145.693) > 6/4 (127.151) > 5/5 (122.857) > 8/2 (113.263) and the LFR values are 7/3 (3.334) > 6/4 (3.073) > 5/5 (2.811) > 8/2 (2.592). In addition, comparison of the ROC curves shows that both models are more accurate at a ratio of 7/3 than at the other ratios. Moreover, the results of the ensemble model (a combination of two models with different learning abilities) are not more reasonable than those of the single model, which suggests that combining a weaker learner (here, the Frequency Ratio model) with a stronger learner (here, the Random Forest model) can diminish the performance of the stronger model.
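To make the LD and LFR indicators quoted above concrete, here is a small sketch under assumed definitions: LD is taken as the number of landslides per unit area of a susceptibility class, and LFR as the class's share of landslides divided by its share of area. The abstract does not state the formulas or units, and the class counts and areas below are invented purely for illustration; they are not the study's data.

```python
def landslide_density(n_landslides, area):
    """Assumed definition: landslides per unit area within one class."""
    return n_landslides / area

def landslide_frequency_ratio(n_landslides, area, total_landslides, total_area):
    """Assumed definition: share of landslides divided by share of area."""
    return (n_landslides / total_landslides) / (area / total_area)

# Hypothetical (landslide count, area) per susceptibility class.
classes = {
    "very low": (2, 400.0),
    "low": (5, 300.0),
    "moderate": (10, 150.0),
    "high": (23, 100.0),
    "very high": (60, 50.0),
}

total_landslides = sum(count for count, _ in classes.values())
total_area = sum(area for _, area in classes.values())

ld = {}
lfr = {}
for name, (count, area) in classes.items():
    ld[name] = landslide_density(count, area)
    lfr[name] = landslide_frequency_ratio(count, area, total_landslides, total_area)
    print(f"{name:>9}: LD = {ld[name]:.3f}, LFR = {lfr[name]:.3f}")
```

Under these definitions, an LFR above 1 means a class holds more landslides than its area share alone would suggest, which is why a sound map is expected to show LD and LFR rising from the very low to the very high class.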
