Wastewater (WW) is one of the most significant environmental contaminants and can impede global sustainable development. Visible-near infrared spectroscopy can support the management, efficiency, and wise use of water resources. However, noise and the high dimensionality of spectral data frequently limit the accuracy of spectral models for water quality metrics. This study proposes the rPCA-MARS model, a novel analytical technique that uses visible-near infrared spectral data to estimate the biological oxygen demand (BOD), chemical oxygen demand (COD), and NH3-N content of WW. The rPCA algorithm was first applied to the spectral data to obtain principal component (PC) scores, and a multivariate adaptive regression splines (MARS) model was then built with six PC scores as its input variables. Piecewise-linear and piecewise-cubic MARS models were used to establish a mathematical relationship between the data matrix (X) and the content of each component, BOD, COD, and NH3-N (Y). The rPCA-MARS model was calibrated on a set of 42 samples, and an independent test set of 16 samples was used to evaluate its performance. The duplex algorithm was used to select the calibration and prediction sets from the data matrix. Before the rPCA-MARS model was applied, the spectral data were preprocessed with moving average smoothing and the standard normal variate (SNV) transformation. The coefficient of determination (R2), adjusted R-squared (R2adj), R2 estimated by generalized cross-validation (R2GCV), and mean square error (MSE) were used to assess the effectiveness of the rPCA-MARS model. Both the piecewise-linear and piecewise-cubic rPCA-MARS models performed well for BOD, COD, and NH3-N determination on the calibration and test sets. High R2 values (> 0.93) in both datasets indicate a strong correlation between predicted and observed values. Additionally, the high adjusted R2 (0.93) suggests that the model is not substantially overfitted. Furthermore, the relatively high R2GCV (0.90) supports both the model's accuracy and its generalizability.
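To make the preprocessing and evaluation steps named above more concrete, the following is a minimal pure-Python sketch of moving average smoothing, the SNV transformation, and three of the reported metrics (MSE, R2, and adjusted R2). It is an illustration under stated assumptions, not the study's implementation: the function names, the window size, the handling of spectrum edges, and the example values are all hypothetical, and the adjusted R2 uses the textbook formula with the parameter count taken to be the six PC scores. The rPCA, duplex, MARS, and generalized cross-validation steps are not reproduced here.

```python
from statistics import mean, stdev


def moving_average(spectrum, window=5):
    """Smooth one spectrum with a centered moving average.

    Assumption: edge points are averaged over the part of the
    window that fits inside the spectrum.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(spectrum)
    return [
        mean(spectrum[max(0, i - half): min(n, i + half + 1)])
        for i in range(n)
    ]


def snv(spectrum):
    """Standard normal variate: center and scale a single spectrum."""
    mu = mean(spectrum)
    sd = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("a constant spectrum cannot be SNV-scaled")
    return [(x - mu) / sd for x in spectrum]


def mse(y_true, y_pred):
    """Mean square error between observed and predicted values."""
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(r2_value, n_samples, n_params):
    """Textbook adjusted R2; n_params is an assumed parameter count."""
    return 1.0 - (1.0 - r2_value) * (n_samples - 1) / (n_samples - n_params - 1)


if __name__ == "__main__":
    # Made-up absorbance values, for illustration only.
    raw = [0.10, 0.12, 0.50, 0.13, 0.11, 0.14, 0.12]
    processed = snv(moving_average(raw, window=3))
    print(processed)
    # Hypothetical call using the calibration set size and six PC scores.
    print(adjusted_r2(0.93, n_samples=42, n_params=6))
```

Smoothing is applied before SNV in this sketch so that the scaling is computed on the denoised spectrum; the source does not state the order, so this ordering is also an assumption.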