Title: A debris flow susceptibility mapping study considering sample heterogeneity
Authors: Ruiyuan Gao, Di Wu, Hailiang Liu, Xiaoyang Liu
Journal: Earth Science Informatics (JCR Q2, Computer Science, Interdisciplinary Applications)
Published: 2024-08-28
DOI: 10.1007/s12145-024-01453-w (https://doi.org/10.1007/s12145-024-01453-w)
Citations: 0
Abstract
Susceptibility mapping is an effective approach to managing the threat of debris flows. However, the sample heterogeneity problem has rarely been considered in previous studies. This paper explores the effect of sample heterogeneity on susceptibility mapping and proposes corresponding solutions. Two unsupervised clustering approaches, K-means clustering and fuzzy C-means clustering, were used to divide the study area into several homogeneous regions, and each region was processed independently to address the sample heterogeneity problem. The information gain ratio method was used to evaluate the predictive ability of the conditioning factors in the total dataset before clustering and in the homogeneous datasets after clustering. Random forest models were then built on both the total dataset and the homogeneous datasets. Receiver operating characteristic curves and related statistics were used to evaluate model performance. The results showed a significant sample heterogeneity problem in the study area, and the fuzzy C-means algorithm can play an important role in solving it. Dividing the study area into several homogeneous regions and processing each one independently made it possible to obtain conditioning factors with better predictive ability, models with better performance, and debris flow susceptibility maps of higher quality.
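To make the heterogeneity argument concrete, the sketch below computes the information gain ratio of a single conditioning factor, first on a pooled dataset and then on two regions taken separately. It is only an illustration under stated assumptions: the function names, the `slope` factor, the two regions, and all sample values are hypothetical toy data rather than anything from the study, and the formula is the standard gain ratio (information gain divided by split information), which may differ in detail from the authors' implementation.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def information_gain_ratio(factor_classes, labels):
    """Gain ratio of one discretized conditioning factor with respect to the labels."""
    total = len(labels)
    # Group the labels by the class of the conditioning factor.
    groups = {}
    for value, label in zip(factor_classes, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the factor.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes factors that have many small classes.
    split_info = entropy(factor_classes)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy samples (1 = debris flow, 0 = no debris flow).
# In region A steep slopes coincide with debris flows; in region B the
# relation is reversed, so the two regions are heterogeneous.
slope_a, labels_a = ["steep", "steep", "gentle", "gentle"], [1, 1, 0, 0]
slope_b, labels_b = ["steep", "steep", "gentle", "gentle"], [0, 0, 1, 1]

# Pooling both regions mimics the "total dataset before clustering".
slope_total = slope_a + slope_b
labels_total = labels_a + labels_b

print(information_gain_ratio(slope_total, labels_total))  # 0.0
print(information_gain_ratio(slope_a, labels_a))          # 1.0
print(information_gain_ratio(slope_b, labels_b))          # 1.0
```

In this constructed case the factor looks uninformative in the pooled data but is perfectly informative inside each region. That is the kind of effect the abstract attributes to sample heterogeneity, although real data would show a much weaker contrast.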
About the journal:
The Earth Science Informatics [ESIN] journal aims at rapid publication of high-quality, current, cutting-edge, and provocative scientific work in the area of Earth Science Informatics as it relates to Earth systems science and space science. This includes articles on the application of formal and computational methods, computational Earth science, spatial and temporal analyses, and all aspects of computer applications to the acquisition, storage, processing, interchange, and visualization of data and information about the materials, properties, processes, features, and phenomena that occur at all scales and locations in the Earth system’s five components (atmosphere, hydrosphere, geosphere, biosphere, cryosphere) and in space (see "About this journal" for more detail). The quarterly journal publishes research, methodology, and software articles, as well as editorials, comments, and book and software reviews. Review articles of relevant findings, topics, and methodologies are also considered.