The article investigates the possibility of predicting soil quality from the main agrochemical indicators using machine learning methods. The experimental base consisted of 768 soil samples collected from the territory of the Rozsoshansk community and 192 additional samples from the neighboring territory of the Izyaslav community, Khmelnytskyi region, Ukraine, in the autumn and spring of 2022-2023. For each sample, we determined exchangeable acidity, organic carbon, ammonium and nitrate nitrogen, mobile phosphorus, exchangeable calcium, and potassium. Based on these indicators, a generalized approach to assessing fertility levels was proposed, categorizing soil quality into three classes. The following machine learning methods were used to predict soil quality: Gaussian NB, Multinomial NB, Logistic Regression, Ridge Classifier, SGDC, Random Forest, XGBoost, kNN, SVM, and an MLP neural network. Random Forest, XGBoost, and MLP demonstrated the highest accuracy on the test dataset. When tested on an independent dataset of 192 new samples, the MLP model maintained the best balance of classification performance metrics. It achieved G-Mean values of 0.894 for class 1, 0.915 for class 2, and 0.903 for class 3, indicating that the model is effective both at detecting the target class and at correctly identifying the remaining classes. The corresponding F1-scores were 0.884, 0.921, and 0.773, respectively. The ROC and Precision–Recall curves further confirmed the generalization capability of the proposed model. To interpret the operation of the neural network, the SHAP method was applied. Global SHAP analysis identified available phosphorus, soil acidity, and organic carbon as the most influential input features. Local SHAP explanations for sample No. 162 demonstrated physically meaningful and consistent model responses.
The SHAP analysis of the MLP neural network made it possible to quantify the contribution of individual input parameters to the prediction outcomes, which increased the interpretability of the model and the level of confidence in the results. The approach proposed in this study not only improves the accuracy of soil quality classification but also provides an agrochemical interpretation of the results, thereby creating a basis for the development of rational, efficient, and precision land use systems relevant to agronomists, land managers, and farmers.
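
For readers unfamiliar with the per-class G-Mean reported above, the following is a minimal sketch of how such a value can be computed in a one-vs-rest setting. It assumes the common definition of G-Mean as the geometric mean of sensitivity and specificity, which the abstract suggests but does not state explicitly. The function name `per_class_gmean` and the toy labels are hypothetical and are not taken from the study's data.

```python
from math import sqrt


def per_class_gmean(y_true, y_pred, labels):
    """One-vs-rest G-Mean per class: sqrt(sensitivity * specificity).

    Assumed definition; the study may use a different variant.
    """
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)  # target class found
        fn = sum(1 for t, p in pairs if t == c and p != c)  # target class missed
        tn = sum(1 for t, p in pairs if t != c and p != c)  # other classes kept apart
        fp = sum(1 for t, p in pairs if t != c and p == c)  # other classes mislabeled as c
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        scores[c] = sqrt(sensitivity * specificity)
    return scores


# Toy example with three soil-quality classes (illustrative values only).
y_true = [1, 1, 2, 2, 2, 3, 3, 1]
y_pred = [1, 2, 2, 2, 3, 3, 3, 1]
gmeans = per_class_gmean(y_true, y_pred, labels=[1, 2, 3])
for label, value in gmeans.items():
    print(f"class {label}: G-Mean = {value:.3f}")
```

A high G-Mean requires both sensitivity and specificity to be high at once, which is why it is read as a balance between detecting the target class and correctly rejecting the remaining ones.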
