Predicting total upland sediment yield using regression and machine learning models for improved land management and water conservation

Journal of Hydroinformatics, Pub Date: 2024-08-08, DOI: 10.2166/hydro.2024.159

Shravan Kumar S. M., Manish Pandey, N. V. Umamahesh



Abstract

This study applied a range of regression models to predict total sediment yield (in tons) and evaluated each for accuracy and reliability. The dataset contains numerous predictors, which were standardized and processed through principal component analysis (PCA) to improve model performance. The models evaluated include linear regression, normalized linear regression, PCA, PCC with generalized ridge regression, kernel ridge regression, multivariate regression, lasso regression, and neural-network approaches such as CA-ANN and ANN, among others. The artificial neural network (ANN) model achieved the lowest mean squared error (MSE), 113.641, suggesting better predictive capability than the other models. Despite the complexity of the environmental data and of the relationships within them, the ANN showed the least error, followed closely by CA-ANN with an MSE of 124.83. Traditional models such as linear and lasso regression produced larger errors and negative R-squared values, indicating poor fits to the data. The analysis illustrates the potential of machine-learning techniques in environmental modeling and underscores the importance of selecting models according to the characteristics of the data and the environmental phenomena under study. Its findings could help environmental planners and advocates better predict and manage soil erosion and sediment transport for planning and conservation.
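A minimal sketch of the two fit metrics discussed in the abstract: MSE and the coefficient of determination (R², assumed here to be the "squared" statistic that turned negative for the traditional models). The function names and all numeric values are hypothetical and chosen only for illustration; none of them come from the study.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical sediment yields in tons (illustrative values, not study data).
observed = [10.0, 20.0, 30.0, 40.0]
good_fit = [12.0, 18.0, 31.0, 39.0]  # predictions that track the observations
poor_fit = [40.0, 10.0, 45.0, 5.0]   # predictions worse than the plain mean

for name, pred in [("good_fit", good_fit), ("poor_fit", poor_fit)]:
    print(name, "MSE:", mse(observed, pred), "R^2:", r_squared(observed, pred))
```

R² falls below zero whenever a model's squared error exceeds that of simply predicting the mean of the observations, which is why a negative value signals a poor fit. Here `poor_fit` has a residual sum of squares of 2450 against a total sum of squares of 500, so its R² is about -3.9.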


