Ming Tao, Qizheng Zhao, Rui Zhao, Memon Muhammad Burhan
Natural Resources Research, published 2024-09-24. DOI: 10.1007/s11053-024-10412-7 (https://doi.org/10.1007/s11053-024-10412-7). Impact factor 4.8; JCR Q1 (Geosciences, Multidisciplinary).
A New Method of Rockburst Prediction for Categories with Sparse Data Using Improved XGBoost Algorithm
Rockburst prediction significantly affects the development and utilization of underground resources. An increasing number of artificial intelligence algorithms are being applied to rockburst prediction. However, because data for certain rockburst grades are scarce, machine learning models have struggled to learn the characteristics of those grades, resulting in bias or overfitting. In this study, 321 worldwide cases of rockbursts were collected. Seven indices covering both rock mechanics and stress conditions were selected as input parameters for the model. To address the limited data for certain rockburst grades, the Synthetic Minority Over-sampling TEchnique (SMOTE) algorithm was used to oversample and synthesize rockburst data. The theoretical rationality of this method was corroborated by Spearman's correlation coefficient. Additionally, the model hyperparameters were optimized using the Bayesian optimization method, and an improved eXtreme gradient boosting (XGBoost) rockburst prediction model (SM–BO–XGBoost) was established. The SM–BO–XGBoost model was compared with decision tree, random forest, support vector machine, and k-nearest neighbor classification models. The results showed a significant improvement in prediction accuracy for the None and Strong rockburst categories, which had limited data in the original rockburst dataset. To address the poor interpretability of the XGBoost model, the SHapley Additive exPlanations (SHAP) method was introduced to explain the constructed model and to analyze the marginal contributions of different features to the model output across the rockburst grades. The SM–BO–XGBoost model was validated using field rockburst records from the Xincheng and Sanshandao gold mines. The results indicate that the model performed well on these records and has wide potential for predicting rockbursts in engineering practice.
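The oversampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest neighbours in the same class. This is not the authors' code, and the abstract does not say which implementation they used. The function names `smote` and `balance`, the default `k=5`, and the toy two-feature data are assumptions made for illustration only; the Bayesian optimization, XGBoost, and SHAP stages are not covered here.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority-class rows."""
    if n_new <= 0:
        return []
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two samples in the class")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other rows of the same class by Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        new_rows = smote(rows, target - len(rows), k=k, seed=seed)
        X_out.extend(new_rows)
        y_out.extend([label] * len(new_rows))
    return X_out, y_out


# Toy, made-up data (not from the paper): two features, "Strong" is sparse.
X = [[0.1, 1.0], [0.2, 1.1], [0.3, 0.9], [0.2, 0.8], [2.0, 3.0], [2.2, 3.1]]
y = ["Light"] * 4 + ["Strong"] * 2
X_bal, y_bal = balance(X, y)
print({label: y_bal.count(label) for label in sorted(set(y_bal))})
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic samples do not leak into the data used for evaluation; whether the study did this is not stated in the abstract.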
About the journal:
This journal publishes quantitative studies of natural (mainly but not limited to mineral) resources exploration, evaluation and exploitation, including environmental and risk-related aspects. Typical articles use geoscientific data or analyses to assess, test, or compare resource-related aspects. NRR covers a wide variety of resources including minerals, coal, hydrocarbon, geothermal, water, and vegetation. Case studies are welcome.