基于数据挖掘的机器学习方法用于改进水文数据 北冰洋西部盐度场案例研究

IF 11.2 1区 地球科学 Q1 GEOSCIENCES, MULTIDISCIPLINARY Earth System Science Data Pub Date : 2024-05-03 DOI:10.5194/essd-2024-138
Shuhao Tao, Ling Du, Jiahao Li
{"title":"基于数据挖掘的机器学习方法用于改进水文数据 北冰洋西部盐度场案例研究","authors":"Shuhao Tao, Ling Du, Jiahao Li","doi":"10.5194/essd-2024-138","DOIUrl":null,"url":null,"abstract":"<strong>Abstract.</strong> In the Western Arctic Ocean lies the largest freshwater reservoir in the Arctic Ocean, the Beaufort Gyre. Long-term changes in freshwater reservoirs are critical for understanding the Arctic Ocean, and data from various sources, particularly measured or reanalyzed data, must be used to the greatest extent possible. Over the past two decades, a large number of intensive field observations and ship surveys have been conducted in the western Arctic Ocean to obtain a large amount of CTD data. Multiple machine learning methods were evaluated and merged to reconstruct annual salinity product in the western Arctic Ocean over the period 2003–2022. Data mining-based machine learning methods make use of variables determined by physical processes, such as sea level pressure, sea ice concentration, and drift. Our objective is to effectively manage the mean root mean square error (RMSE) of sea surface salinity, which exhibits greater susceptibility to atmospheric, sea ice, and oceanic changes. Considering the higher susceptibility of sea surface salinity to atmospheric, sea ice, and oceanic changes, which leads to greater variability, we ensured that the average root mean square error of CTD and EN4 sea surface salinity field during the machine learning training process was constrained within 0.25 psu. The machine learning process reveals that the uncertainty in predicting sea surface salinity, as constrained by CTD data, is 0.24 %, whereas when constrained by EN4 data it reduces to 0.02 %. During data merging and post-calibrating, the weight coefficients are constrained by imposing limitations on the uncertainty value. Compared with commonly used EN4 and ORAS5 salinity in the Arctic Ocean, our salinity product provide more accurate descriptions of freshwater content in the Beaufort Gyre and depth variations at its halocline base. The application potential of this multi-machine learning results approach for evaluating and integrating extends beyond the salinity field, encompassing hydrometeorology, sea ice thickness, polar biogeochemistry, and other related fields. The datasets are available at https://zenodo.org/records/10990138 (Tao and Du, 2024).","PeriodicalId":48747,"journal":{"name":"Earth System Science Data","volume":"84 1","pages":""},"PeriodicalIF":11.2000,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Data mining-based machine learning methods for improving hydrological data a case study of salinity field in the Western Arctic Ocean\",\"authors\":\"Shuhao Tao, Ling Du, Jiahao Li\",\"doi\":\"10.5194/essd-2024-138\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<strong>Abstract.</strong> In the Western Arctic Ocean lies the largest freshwater reservoir in the Arctic Ocean, the Beaufort Gyre. Long-term changes in freshwater reservoirs are critical for understanding the Arctic Ocean, and data from various sources, particularly measured or reanalyzed data, must be used to the greatest extent possible. Over the past two decades, a large number of intensive field observations and ship surveys have been conducted in the western Arctic Ocean to obtain a large amount of CTD data. Multiple machine learning methods were evaluated and merged to reconstruct annual salinity product in the western Arctic Ocean over the period 2003–2022. Data mining-based machine learning methods make use of variables determined by physical processes, such as sea level pressure, sea ice concentration, and drift. Our objective is to effectively manage the mean root mean square error (RMSE) of sea surface salinity, which exhibits greater susceptibility to atmospheric, sea ice, and oceanic changes. Considering the higher susceptibility of sea surface salinity to atmospheric, sea ice, and oceanic changes, which leads to greater variability, we ensured that the average root mean square error of CTD and EN4 sea surface salinity field during the machine learning training process was constrained within 0.25 psu. The machine learning process reveals that the uncertainty in predicting sea surface salinity, as constrained by CTD data, is 0.24 %, whereas when constrained by EN4 data it reduces to 0.02 %. During data merging and post-calibrating, the weight coefficients are constrained by imposing limitations on the uncertainty value. Compared with commonly used EN4 and ORAS5 salinity in the Arctic Ocean, our salinity product provide more accurate descriptions of freshwater content in the Beaufort Gyre and depth variations at its halocline base. The application potential of this multi-machine learning results approach for evaluating and integrating extends beyond the salinity field, encompassing hydrometeorology, sea ice thickness, polar biogeochemistry, and other related fields. The datasets are available at https://zenodo.org/records/10990138 (Tao and Du, 2024).\",\"PeriodicalId\":48747,\"journal\":{\"name\":\"Earth System Science Data\",\"volume\":\"84 1\",\"pages\":\"\"},\"PeriodicalIF\":11.2000,\"publicationDate\":\"2024-05-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Earth System Science Data\",\"FirstCategoryId\":\"89\",\"ListUrlMain\":\"https://doi.org/10.5194/essd-2024-138\",\"RegionNum\":1,\"RegionCategory\":\"地球科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"GEOSCIENCES, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Earth System Science Data","FirstCategoryId":"89","ListUrlMain":"https://doi.org/10.5194/essd-2024-138","RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"GEOSCIENCES, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0

摘要

摘要北冰洋西部有北冰洋最大的淡水库--波弗特环流。淡水库的长期变化对了解北冰洋至关重要,必须最大限度地利用各种来源的数据,特别是测量或重新分析的数据。在过去二十年中,在北冰洋西部进行了大量密集的实地观测和船舶调查,获得了大量 CTD 数据。对多种机器学习方法进行了评估和合并,以重建 2003-2022 年期间北冰洋西部的年盐度乘积。基于数据挖掘的机器学习方法利用了由物理过程决定的变量,如海平面压力、海冰浓度和漂移。我们的目标是有效管理海面盐度的均方根误差(RMSE),因为海面盐度更容易受到大气、海冰和海洋变化的影响。考虑到海表盐度更容易受到大气、海冰和海洋变化的影响,从而导致更大的变异性,我们确保在机器学习训练过程中将 CTD 和 EN4 海表盐度场的平均均方根误差控制在 0.25 psu 以内。机器学习过程显示,在 CTD 数据的约束下,预测海面盐度的不确定性为 0.24%,而在 EN4 数据的约束下,不确定性降低到 0.02%。在数据合并和后校准过程中,通过对不确定性值施加限制来约束加权系数。与北冰洋常用的 EN4 和 ORAS5 盐度相比,我们的盐度产品能更准确地描述波弗特环流的淡水含量及其卤线基底的深度变化。这种多机器学习结果评估和整合方法的应用潜力超出了盐度领域,涵盖了水文气象、海冰厚度、极地生物地球化学和其他相关领域。数据集可在 https://zenodo.org/records/10990138 网站上查阅(陶和杜,2024 年)。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Data mining-based machine learning methods for improving hydrological data a case study of salinity field in the Western Arctic Ocean
Abstract. In the Western Arctic Ocean lies the largest freshwater reservoir in the Arctic Ocean, the Beaufort Gyre. Long-term changes in freshwater reservoirs are critical for understanding the Arctic Ocean, and data from various sources, particularly measured or reanalyzed data, must be used to the greatest extent possible. Over the past two decades, a large number of intensive field observations and ship surveys have been conducted in the western Arctic Ocean to obtain a large amount of CTD data. Multiple machine learning methods were evaluated and merged to reconstruct annual salinity product in the western Arctic Ocean over the period 2003–2022. Data mining-based machine learning methods make use of variables determined by physical processes, such as sea level pressure, sea ice concentration, and drift. Our objective is to effectively manage the mean root mean square error (RMSE) of sea surface salinity, which exhibits greater susceptibility to atmospheric, sea ice, and oceanic changes. Considering the higher susceptibility of sea surface salinity to atmospheric, sea ice, and oceanic changes, which leads to greater variability, we ensured that the average root mean square error of CTD and EN4 sea surface salinity field during the machine learning training process was constrained within 0.25 psu. The machine learning process reveals that the uncertainty in predicting sea surface salinity, as constrained by CTD data, is 0.24 %, whereas when constrained by EN4 data it reduces to 0.02 %. During data merging and post-calibrating, the weight coefficients are constrained by imposing limitations on the uncertainty value. Compared with commonly used EN4 and ORAS5 salinity in the Arctic Ocean, our salinity product provide more accurate descriptions of freshwater content in the Beaufort Gyre and depth variations at its halocline base. The application potential of this multi-machine learning results approach for evaluating and integrating extends beyond the salinity field, encompassing hydrometeorology, sea ice thickness, polar biogeochemistry, and other related fields. The datasets are available at https://zenodo.org/records/10990138 (Tao and Du, 2024).
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Earth System Science Data
Earth System Science Data GEOSCIENCES, MULTIDISCIPLINARYMETEOROLOGY-METEOROLOGY & ATMOSPHERIC SCIENCES
CiteScore
18.00
自引率
5.30%
发文量
231
审稿时长
35 weeks
期刊介绍: Earth System Science Data (ESSD) is an international, interdisciplinary journal that publishes articles on original research data in order to promote the reuse of high-quality data in the field of Earth system sciences. The journal welcomes submissions of original data or data collections that meet the required quality standards and have the potential to contribute to the goals of the journal. It includes sections dedicated to regular-length articles, brief communications (such as updates to existing data sets), commentaries, review articles, and special issues. ESSD is abstracted and indexed in several databases, including Science Citation Index Expanded, Current Contents/PCE, Scopus, ADS, CLOCKSS, CNKI, DOAJ, EBSCO, Gale/Cengage, GoOA (CAS), and Google Scholar, among others.
期刊最新文献
A new repository of electrical resistivity tomography and ground-penetrating radar data from summer 2022 near Ny-Ålesund, Svalbard CCD-Rice: A long-term paddy rice distribution dataset in China at 30 m resolution SMOS-derived Antarctic thin sea ice thickness: data description and validation in the Weddell Sea Global Greenhouse Gas Reconciliation 2022 A daily reconstructed chlorophyll-a dataset in the South China Sea from MODIS using OI-SwinUnet
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1