Comparative analysis of k-nearest neighbors distance metrics for retrieving coastal water quality based on concurrent in situ and satellite observations

IF 4.9; Zone 3, Environmental Science and Ecology; JCR Q1, ENVIRONMENTAL SCIENCES; Marine Pollution Bulletin; Pub Date: 2025-05-01; Epub Date: 2025-03-13; DOI: 10.1016/j.marpolbul.2025.117816
Bonyad Ahmadi, Mehdi Gholamalifard, Seyed Mahmoud Ghasempouri, Tiit Kutser
Marine Pollution Bulletin, volume 214, Article 117816. Journal Article. https://www.sciencedirect.com/science/article/pii/S0025326X25002917
Citation count: 0

Abstract

Monitoring extensive areas of coastal waters with sufficient frequency using in situ (ship-based) methods is time-consuming and expensive. Satellite remote sensing is much more cost-effective. Satellites can detect Optically Active Constituents (OACs) in water. It is therefore crucial to know the concentrations of OACs in a study area in order to develop and validate remote sensing methods suitable for assessing water quality there. The Pars Special Economic Energy Zone (PSEEZ), a major hub of natural gas extraction in the Persian Gulf, has undergone rapid industrial expansion since 1998, intensifying environmental pressures and necessitating high-resolution, frequent water quality assessments. Despite the significance of this region, it lacks a structured, long-term monitoring framework. To develop satellite-based remote sensing methods for this region, we measured several OACs (chlorophyll-a, coloured dissolved organic matter (CDOM), and turbidity) and tested the performance of Landsat 8, Sentinel-2, and Sentinel-3 in retrieving them, using the k-Nearest Neighbors (k-NN) machine learning algorithm. The choice of distance metric significantly influenced the accuracy of OAC retrieval. In turbidity retrieval, the Euclidean Distance (ED) raised the regression slope to 0.90 (a 55.17 % improvement over Fuzzy Mahalanobis Distance (FD)) and reduced the RMSLE to 0.51, corresponding to an approximately 160 % enhancement in precision. For CDOM, RMSLE values for ED and FD were 0.39 and 0.48, respectively, an 18.75 % improvement favoring ED. Furthermore, bias analysis revealed deviations of 1–6 % from reference data, with the lowest values observed for Mahalanobis Distance (MD) with MSI (the Sentinel-2 sensor) and FD with OLCI (the Sentinel-3 sensor).
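The percentages quoted above are simple ratios of the reported metrics, and the arithmetic can be checked with a short sketch. The function names here are hypothetical, and the `log(1 + x)` form of RMSLE is an assumption, since the abstract does not define the metric. The FD slope is likewise inferred from the quoted 55.17 % figure, not stated in the abstract.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error, assuming the log(1 + x) form."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def percent_reduction(baseline, improved):
    """Percent by which an error metric drops relative to the baseline."""
    return 100.0 * (baseline - improved) / baseline


def percent_increase(baseline, improved):
    """Percent by which a score rises relative to the baseline."""
    return 100.0 * (improved - baseline) / baseline


# CDOM: RMSLE of 0.48 (FD) versus 0.39 (ED), both taken from the abstract.
cdom_gain = percent_reduction(0.48, 0.39)

# Turbidity: the FD slope is NOT given in the abstract; it is back-calculated
# here from the ED slope (0.90) and the quoted 55.17 % improvement.
implied_fd_slope = 0.90 / (1 + 55.17 / 100)

print(f"CDOM RMSLE reduction, FD -> ED: {cdom_gain:.2f} %")  # 18.75 %
print(f"Implied FD slope for turbidity: {implied_fd_slope:.2f}")  # 0.58
```

Under these assumptions the CDOM figure reproduces exactly. The 160 % turbidity figure cannot be checked this way, because the abstract gives no baseline RMSLE for it.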
In chlorophyll-a retrieval, the choice of distance metric directly affected the accuracy of the OLCI sensor, inducing bias in both directions (overestimation or underestimation, depending on the selected metric). These results underscore the importance of optimizing distance metric selection to improve prediction accuracy and mitigate systematic errors in remote sensing applications. The results also showed that k-NN performed substantially better than the other algorithms evaluated within the study area, achieving significantly higher accuracy and thereby establishing k-NN as the optimal framework for satellite-based water quality monitoring in environmentally sensitive regions like PSEEZ.
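To illustrate why the distance metric matters in k-NN retrieval, here is a minimal sketch in plain Python, not the authors' implementation. It compares Euclidean and ordinary Mahalanobis distance on toy two-feature data; the fuzzy variant (FD) named in the abstract is not reproduced. All function names and data values are hypothetical and have no physical meaning.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def inverse_covariance_2d(rows):
    """Inverse of the 2x2 sample covariance matrix of two-feature rows."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    return ((syy / det, -sxy / det), (-sxy / det, sxx / det))


def make_mahalanobis(inv_cov):
    """Build a Mahalanobis distance function for two-feature vectors."""
    (a, b), (c, d) = inv_cov

    def distance(p, q):
        dx, dy = p[0] - q[0], p[1] - q[1]
        return math.sqrt(a * dx * dx + (b + c) * dx * dy + d * dy * dy)

    return distance


def knn_predict(train_x, train_y, query, k, distance):
    """Average the targets of the k training samples closest to the query."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Toy data: feature 1 varies far more than feature 2, and the target
# depends mostly on feature 2.
train_x = [(0.0, 0.0), (4.0, 0.0), (0.0, 1.0), (4.0, 1.0)]
train_y = [1.0, 2.0, 5.0, 6.0]
query = (1.5, 0.9)

mahalanobis = make_mahalanobis(inverse_covariance_2d(train_x))
pred_ed = knn_predict(train_x, train_y, query, k=2, distance=euclidean)
pred_md = knn_predict(train_x, train_y, query, k=2, distance=mahalanobis)
print(pred_ed, pred_md)  # 3.0 5.5
```

Because Mahalanobis distance rescales each feature by its variance, the two metrics select different neighbors for the same query and return different estimates. This is the kind of metric-dependent shift in prediction that the abstract reports.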