Toward Large-Scale Riverine Phosphorus Estimation Using Remote Sensing and Machine Learning
Authors: Pradeep Ramtel, Dongmei Feng, John Gardner
Journal of Geophysical Research: Biogeosciences, volume 129, issue 8. Published 2024-08-09.
DOI: 10.1029/2024JG008121 (https://onlinelibrary.wiley.com/doi/10.1029/2024JG008121)
Phosphorus pollution is a major water quality issue that affects the environment and human health. Traditional methods limit the frequency and extent of total phosphorus (TP) measurements across many rivers. Remote sensing can accurately estimate riverine TP, yet no large-scale assessment of riverine TP using remote sensing exists. Large-scale remote sensing models can provide a fast and consistent method for TP measurement, which is important for data generalization and for assessing extensive spatiotemporal change in TP. Our study uses remote sensing and machine learning to estimate TP in rivers across the contiguous United States (CONUS). First, we developed a national-scale matchup data set for Landsat-detectable rivers (river width >30 m) by pairing in situ TP with surface reflectance. The in situ data come from the Water Quality Portal (WQP), and the water surface reflectance data come from Landsat 5, 7, and 8, spanning 1984 to 2021. We then used this data set to develop a machine learning (ML) model, testing different preprocessing methods and algorithms. We found that using high-level vegetation in the clustering approach, and over-sampling or under-sampling our training data in the sampling approach, improved model estimation accuracy. We compared the XGBLinear, XGBTree, Regularized Random Forest (RRF), and K-Nearest Neighbors ML algorithms and selected XGBLinear as the best model, with an R² of 0.604, RMSE of 0.103 mg/L, mean average error of 0.83, and NSE of 0.602. Finally, we identified human footprint, elevation, river area, and soil erosion as the main attributes influencing the accuracy of the TP estimated by the ML model.
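The accuracy metrics quoted in the abstract (R², RMSE, mean error, NSE) can be illustrated with a minimal sketch. This is not the authors' code. The function names are hypothetical, the toy values are invented for illustration and are not from the study, and two definitions are assumptions because the abstract does not state them: R² is taken as the squared Pearson correlation, and "mean average error" is read as mean absolute error.

```python
import math


def rmse(obs, pred):
    """Root mean square error, in the same units as the inputs (e.g. mg/L)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error (assumed reading of 'mean average error')."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / (sum of squares about the observed mean)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r_squared(obs, pred):
    """Squared Pearson correlation (an assumed definition of R^2)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov ** 2 / (var_obs * var_pred)


# Toy example (made-up numbers): predictions with a constant bias of +1.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]

print("RMSE:", rmse(observed, predicted))
print("MAE: ", mae(observed, predicted))
print("NSE: ", nse(observed, predicted))
print("R^2: ", r_squared(observed, predicted))
```

Under these definitions, a constant bias leaves the squared correlation at 1 but lowers NSE, so the two scores measure different things and need not agree.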
About the journal:
JGR-Biogeosciences focuses on biogeosciences of the Earth system in the past, present, and future, and on the extension of this research to planetary studies. The emerging field of biogeosciences spans the intellectual interface between biology and the geosciences and attempts to understand the functions of the Earth system across multiple spatial and temporal scales. Studies in biogeosciences may use multiple lines of evidence drawn from diverse fields to gain a holistic understanding of terrestrial, freshwater, and marine ecosystems and extreme environments. Specific topics within the scope of the section include process-based theoretical, experimental, and field studies of biogeochemistry, biogeophysics, atmosphere-, land-, and ocean-ecosystem interactions, biomineralization, life in extreme environments, astrobiology, microbial processes, geomicrobiology, and evolutionary geobiology.