Assessment of advanced neural networks for the dual estimation of water quality indicators and their uncertainties

Arun M. Saranathan, Mortimer Werther, S. Balasubramanian, Daniel Odermatt, N. Pahlevan
Frontiers in Remote Sensing. Published 2024-07-18. DOI: 10.3389/frsen.2024.1383147 (https://doi.org/10.3389/frsen.2024.1383147)

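Abstract

Given the use of machine learning-based tools for monitoring Water Quality Indicators (WQIs) over lakes and coastal waters, understanding the properties of such models, including the uncertainties inherent in their predictions, is essential. This has led to the development of two probabilistic neural network (NN) algorithms: the Mixture Density Network (MDN) and the Bayesian Neural Network via Monte Carlo Dropout (BNN-MCD). These NNs are complex, featuring thousands of trainable parameters and modifiable hyper-parameters, and have so far been trained and tested independently of each other. The model uncertainty metric captures the uncertainty present in each prediction based on the properties of the model, namely the model architecture and the training data distribution. We analyze MDN and BNN-MCD under near-identical conditions (model architecture, training and test sets, etc.) to retrieve the concentration of chlorophyll-a pigments (Chl a), total suspended solids (TSS), and the absorption by colored dissolved organic matter at 440 nm (a_CDOM(440)). The spectral resolutions considered correspond to the Hyperspectral Imager for the Coastal Ocean (HICO), PRecursore IperSpettrale della Missione Applicativa (PRISMA), Ocean and Land Colour Instrument (OLCI), and MultiSpectral Instrument (MSI). Model performance is tested in terms of both predictive residuals and the quality of the predictive uncertainty metric. We also compare the simultaneous WQI retrievals against a single-parameter retrieval framework (for Chl a). Finally, the models' real-world applicability is investigated using an MSI satellite-matchup dataset (N = 3,053) of Chl a and TSS. Experiments show that both models exhibit comparable estimation performance. Specifically, the median symmetric accuracy (MdSA) on the test set for the different parameters ranges from 30% to 60% in both algorithms. The uncertainty estimates, on the other hand, differ strongly. MDN's average uncertainty estimate is ∼50%, encompassing the estimation residuals for 75% of test samples, whereas BNN-MCD's average uncertainty estimate is ∼25%, encompassing the residuals for 50% of samples. Our analysis also reveals that simultaneous estimation improves both predictive performance and uncertainty metric quality. Interestingly, the trends mentioned above hold across different sensor resolutions as well as experimental regimes. This disparity calls for additional research to determine whether such trends in model uncertainty are inherent to specific models or can be more broadly generalized across different algorithms and sensor setups.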
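The two quantities the abstract reports, MdSA and the share of residuals covered by the uncertainty estimate, can be illustrated with a minimal Python sketch. The MdSA formula below is the standard log-ratio definition, 100 · (exp(median |ln(pred/obs)|) − 1). The coverage function is only an assumption about how "encompassing the residuals" might be computed: the abstract does not give the exact definition, and the function names and the `half_width` parameter are hypothetical.

```python
import numpy as np


def mdsa(y_true, y_pred):
    """Median symmetric accuracy in percent.

    Standard definition: 100 * (exp(median(|ln(pred / true)|)) - 1).
    Inputs must be strictly positive (e.g. concentrations).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    log_ratio = np.abs(np.log(y_pred / y_true))
    return 100.0 * (np.exp(np.median(log_ratio)) - 1.0)


def coverage(y_true, y_pred, half_width):
    """Percent of samples whose absolute residual lies within the
    reported uncertainty half-width (an assumed, illustrative
    definition; the paper's exact metric may differ)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    half_width = np.asarray(half_width, dtype=float)
    inside = np.abs(y_pred - y_true) <= half_width
    return 100.0 * np.mean(inside)


if __name__ == "__main__":
    # Toy values for illustration only, not data from the study.
    obs = [1.0, 2.0, 4.0]
    est = [2.0, 2.0, 2.0]
    unc = [1.5, 0.5, 1.0]
    print(f"MdSA: {mdsa(obs, est):.1f}%")
    print(f"Coverage: {coverage(obs, est, unc):.1f}%")
```

Under this reading, a model with wider intervals (such as the ∼50% reported for MDN) would be expected to cover more residuals than one with narrower intervals (the ∼25% reported for BNN-MCD), which matches the direction of the 75% versus 50% figures in the abstract.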