Reliable river-water quality assessment is increasingly important in North Africa, where pollution pressures and data limitations complicate monitoring. This research therefore developed a principal-component-analysis-based water quality index (WQI_P) designed to address the eclipsing, multicollinearity, and subjectively assigned weights that affect traditional indices such as the weighted-arithmetic WQI (WA_WQI). The objective was to evaluate whether PCA-derived weights and objective parameter selection improve reliability and classification stability and reduce uncertainty. A dataset of 159 river-water samples from the Skikda region (Algeria) was analyzed. After screening correlated variables and extracting PCA contributions, WQI_P was constructed from the retained components. Eight machine-learning algorithms and a stacked ensemble were evaluated under 10-fold cross-validation to compare the prediction performance and uncertainty of WQI_P and WA_WQI. Agreement metrics, PREI scores, confidence intervals, and class-transition analysis were used to assess the differences between the two indices. Predictive uncertainty was quantified using a Gaussian Monte Carlo simulation, which propagates variability by repeatedly perturbing model residuals to generate distributions of index predictions. The WQI_P consistently produced lower prediction errors (stacked RMSE = 2.74; MAE = 1.75) than the WA_WQI (RMSE = 3.16; MAE = 2.21), together with narrower 95% confidence intervals and reduced predictive uncertainty. The classification outcomes shifted toward a stricter and more balanced assessment: the proportion of samples classified as "Excellent" decreased (30 to 7), "Good" increased (55 to 88), and "Unsuitable" declined (40 to 12). These results indicated that grounding weights in the multivariate structure enhances stability and reduces dependence on a small set of dominant parameters.
The findings demonstrated that the WQI_P can improve transparency, objectivity, and monitoring efficiency by focusing on the most informative variables. The index is applicable to data-scarce regions where objective weighting and uncertainty control are essential. Future work should test WQI_P across larger and more heterogeneous basins, extend validation using spatial-temporal blocking, and explore its integration into operational monitoring frameworks.
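The Gaussian Monte Carlo step described above can be illustrated with a minimal sketch. It assumes that uncertainty is propagated by adding zero-mean Gaussian noise, scaled to the spread of cross-validation residuals, to a single index prediction; the function name, the residual values, and the prediction of 62.0 are hypothetical and are not taken from the study.

```python
import random
import statistics


def monte_carlo_interval(prediction, residuals, n_draws=10000, seed=0):
    """Return an approximate 95% interval for one prediction.

    Assumption: residuals are roughly Gaussian, so each draw perturbs the
    prediction with noise whose standard deviation equals the sample
    standard deviation of the residuals.
    """
    rng = random.Random(seed)
    sigma = statistics.stdev(residuals)
    draws = sorted(prediction + rng.gauss(0.0, sigma) for _ in range(n_draws))
    lower = draws[int(0.025 * n_draws)]      # empirical 2.5th percentile
    upper = draws[int(0.975 * n_draws) - 1]  # empirical 97.5th percentile
    return lower, upper, sigma


# Hypothetical residuals (observed minus predicted index values).
residuals = [-3.1, 1.2, 0.4, -0.8, 2.5, -1.9, 0.7, 1.0, -2.2, 2.2]
low, high, sigma = monte_carlo_interval(62.0, residuals)
print(f"95% interval: [{low:.2f}, {high:.2f}], residual sd = {sigma:.2f}")
```

Under these assumptions, a model with smaller residuals yields a narrower simulated interval, which is the sense in which the two indices can be compared on predictive uncertainty.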
Using a machine learning framework, this study investigates the spatial distribution and key environmental drivers of heavy metal contamination (iron, nickel, lead, copper) in four groundwater aquifers of Isfahan province during the 2023-2024 water year. A total of 150 wells were sampled, and metal concentrations were determined using ICP-MS, AAS, VGA, and Mercury Analyzer methods in accordance with WHO and Iranian standards. The maximum observed concentrations of iron, nickel, lead, and copper were approximately 48, 44.1, 2.9, and 11.2 mg/L, respectively, with the peak concentrations of iron and copper in the Damaneh - Daran aquifers, nickel in Bouin, and lead in Chadegan. Random Forest (RF) and Support Vector Machine (SVM) models were applied; the RF model used 100 trees. Multicollinearity among the environmental predictors, including soil properties, unsaturated and saturated zones, hydraulic parameters, slope, groundwater level, and aquifer depth, was assessed through the variance inflation factor (VIF); all VIF values were below 10. Model interpretation showed that soil properties and groundwater level had the greatest influence in RF, while the unsaturated layer was dominant in SVM. Iron decreased with increasing aquifer depth, pore thickness, and water table, while soil permeability and slope increased iron accumulation. Nickel was higher in shallow, low-conductivity areas, while lead increased with depth and slope, indicating a nonlinear dependence on hydraulic and soil properties. Copper was positively correlated with soil permeability and negatively correlated with water table. Spatial predictions indicated that the Bouin aquifer had the highest iron and nickel (more than 40 and more than 30 mg/L, respectively), lead reached about 44 mg/L in Chadegan, and copper peaked in Bouin from southeast to northwest. RF outperformed SVM, achieving an accuracy of 0.7874, sensitivity of 0.7448, and specificity of 0.8243, while SVM performed poorly.
This study innovatively combines machine learning models with the parameters of the DRASTIC analytical model to assess and predict heavy metal contamination in the aquifers of Isfahan province. Overall, the results confirm the nonlinear hydrogeological controls on heavy metal distribution and demonstrate the high capability of RF for reliable prediction of groundwater contamination. This approach provides a transferable method for groundwater quality assessment and supports sustainable aquifer management in arid and semi-arid regions.
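For readers less familiar with the reported accuracy, sensitivity, and specificity, the sketch below shows how these three scores follow from a binary confusion matrix. The abstract does not state how the classification target was defined, so the labels here (1 = contaminated, 0 = not contaminated) and the toy data are hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # share of positives correctly detected
    specificity = tn / (tn + fp)  # share of negatives correctly rejected
    return accuracy, sensitivity, specificity


# Hypothetical labels for ten wells (1 = contaminated, 0 = not contaminated).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
acc, sens, spec = binary_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f}, sensitivity={sens:.4f}, specificity={spec:.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are unbalanced, because accuracy alone can hide poor detection of the rarer class.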
Few studies have evaluated variations in riverine Eco-Surpluses and Eco-Deficits (ES and ED) from the perspective of flow processes. Therefore, this study evaluated changes in river ES (ED) at multiple time scales through the intra-annual flow process and assessed the synergistic relationship between ES (ED) and various hydro-meteorological factors using Copula models. The study also constructed an ES (ED) prediction system through the Variable Infiltration Capacity (VIC) model and the Light Gradient Boosting Machine (LightGBM), analyzing each influencing factor's role in the LightGBM simulation through the Shapley Additive exPlanations (SHAP) method. The main findings were: (1) the ES (ED) based on the river flow process was more in line with the actual situation of the river, as validated by Dynamic Time Warping (DTW) and other methods; (2) the two-dimensional and three-dimensional Copula models showed significant discrepancies in the joint return periods among ES (ED) and various hydro-meteorological factors; (3) the prediction system constructed by coupling the baseflow extracted from the VIC model with the LightGBM algorithm exhibited better performance, with simulation accuracy (R²) exceeding 0.85; and (4) the SHAP results showed that baseflow variations exert a significant influence on ES and ED dynamics, followed by the various hydro-meteorological factors. The study results can provide a reference for scheduling cascade reservoirs and planning regional water resources.
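Dynamic Time Warping, mentioned above as a validation tool, compares two series while tolerating shifts in timing. The sketch below is the textbook dynamic-programming form with an absolute-difference cost; it is not the study's implementation, and the flow values are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    cost[i][j] holds the minimum cumulative cost of aligning the first i
    values of `a` with the first j values of `b`.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also aligned with b[j-1]'s neighbor
                cost[i][j - 1],      # b[j-1] also aligned with a[i-1]'s neighbor
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


# Hypothetical flow series: the second repeats the first value, so the
# hydrograph has the same shape but arrives one time step later.
observed = [10, 12, 20, 35, 30, 18]
delayed = [10, 10, 12, 20, 35, 30, 18]
print(dtw_distance(observed, delayed))  # 0.0: identical shape after warping
print(dtw_distance([0, 0], [1, 1]))     # 2.0: a constant offset is penalized
```

Because warping absorbs pure timing shifts, a small DTW distance indicates that two flow processes share the same shape even when their peaks do not coincide in time.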

