This study developed a modeling methodology for geologic hazard susceptibility assessment based on statistical optimization, aiming to improve both the overall performance and the classification accuracy of the assessment models. First, the cumulative probability method revealed that, between any two geologic hazard points, there was only a low probability (15%) of geologic hazards occurring outside a buffer zone with a radius of 2297 m (i.e., the distance threshold). A training dataset was then established, consisting of positive samples (historical hazards), negative samples (non-hazard points) randomly generated based on the distance threshold, and 13 conditioning factors. Next, models were built using five machine learning algorithms: random forest (RF), gradient boosting decision tree (GBDT), naive Bayes (NB), logistic regression (LR), and support vector machine (SVM). The overall performance of the models was assessed using the area under the receiver operating characteristic curve (AUC) and overall accuracy (OA) as indicators. RF performed best, with OA and AUC values of 2.7127 and 0.981, respectively. Furthermore, the machine learning models constructed with the distance threshold outperformed those built using the unoptimized dataset. The conditioning factors were ranked using the mutual information method, with their scores decreasing in the order of rainfall (0.1616), altitude (0.06), normalized difference vegetation index (NDVI; 0.04), and distance from roads (0.03). Finally, geologic hazard susceptibility was classified using the natural breaks method and a clustering algorithm. The results indicate that the clustering algorithm achieved higher classification accuracy than the natural breaks method. These findings demonstrate that the proposed model optimization scheme can provide a scientific basis for the prevention and control of geologic hazards.
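The distance-threshold step described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: coordinates are planar and in metres, the cumulative probability is computed over nearest-neighbour distances between hazard points, the threshold is read off at a cumulative probability of 85% (the complement of the 15% quoted above), and negative samples are drawn by simple rejection sampling inside a rectangular study area. The function names and parameters are hypothetical.

```python
import math
import random


def nearest_neighbor_distances(points):
    """Distance from each hazard point to its closest other hazard point.

    Assumption: nearest-neighbour distances are used; the abstract does not
    say which inter-point distances enter the cumulative probability curve.
    """
    dists = []
    for i, (xi, yi) in enumerate(points):
        best = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        dists.append(best)
    return dists


def distance_threshold(points, cumulative_prob=0.85):
    """Smallest observed distance at which the empirical cumulative
    probability reaches `cumulative_prob` (assumed value: 0.85)."""
    dists = sorted(nearest_neighbor_distances(points))
    rank = math.ceil(cumulative_prob * len(dists))  # 1-based rank
    return dists[max(rank, 1) - 1]


def sample_negatives(points, threshold, n, bounds, seed=0, max_tries=100_000):
    """Rejection-sample n random non-hazard points that lie at least
    `threshold` away from every hazard point.

    bounds = (xmin, ymin, xmax, ymax) of a hypothetical rectangular study area.
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds
    negatives = []
    tries = 0
    while len(negatives) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place enough negative samples")
        x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        # Keep the candidate only if it is outside every hazard buffer.
        if all(math.hypot(x - px, y - py) >= threshold for px, py in points):
            negatives.append((x, y))
    return negatives
```

In this sketch the threshold is derived from the hazard inventory itself, so the buffer radius adapts to how densely hazards cluster in the study area. A real workflow would typically operate on projected GIS coordinates and restrict candidates to the actual study-area polygon rather than a bounding rectangle.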
