Machine learning shows a limit to rain-snow partitioning accuracy when using near-surface meteorology

IF 15.7 | Tier 1, multidisciplinary journals | Q1 MULTIDISCIPLINARY SCIENCES | Nature Communications | Pub Date: 2025-03-25 | DOI: 10.1038/s41467-025-58234-2

Keith S. Jennings, Meghan Collins, Benjamin J. Hatchett, Anne Heggli, Nayoung Hur, Sonia Tonino, Anne W. Nolin, Guo Yu, Wei Zhang, Monica M. Arienzo

Citations: 0

Abstract

Partitioning precipitation into rain and snow with near-surface meteorology is a well-known challenge. However, whether a limit exists to its potential performance remains unknown. Here, we evaluate this possibility by applying a set of benchmark precipitation phase partitioning methods plus three machine learning (ML) models (an artificial neural network, random forest, and XGBoost) to two independent datasets: 38.5 thousand crowdsourced observations and 17.8 million synoptic meteorology reports. The ML methods provide negligible improvements over the best benchmarks, increasing accuracy by at most 0.6% and reducing rain and snow biases by up to -4.7%. ML methods fail to identify mixed precipitation and sub-freezing rainfall events, and their accuracy is lowest between 1.0 °C and 2.5 °C. A potential cause of these shortcomings is the air temperature overlap in rain and snow distributions (peaking between 1.0 °C and 1.6 °C), which has a significant negative relationship (p < 0.0005) with partitioning accuracy. Thus, the meteorological characteristics of rain and snow are similar at air temperatures slightly above freezing, with increasing overlap associated with decreasing performance. We suggest researchers switch their focus from marginally improving inherently limited precipitation phase partitioning methods using near-surface meteorology to creating new methods that assimilate novel data sources, e.g., crowdsourced precipitation phase observations.
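As a rough illustration of what a benchmark partitioning method and its evaluation can look like, the Python sketch below applies a single air temperature threshold to a handful of made-up observations, then computes accuracy, snow bias, and a per-bin rain-snow overlap. This is not the authors' code. The 1.0 °C threshold, the toy observations, the overlap definition (minority-phase share within each temperature bin), and all function names are assumptions chosen for illustration only.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Observation = Tuple[float, str]  # (air temperature in °C, observed phase)


def partition_by_threshold(air_temp_c: float, threshold_c: float = 1.0) -> str:
    """Label precipitation as snow at or below the threshold, rain above it."""
    return "snow" if air_temp_c <= threshold_c else "rain"


def accuracy_and_snow_bias(
    obs: List[Observation], threshold_c: float = 1.0
) -> Tuple[float, float]:
    """Return (accuracy %, snow bias %) of the threshold method on obs.

    Snow bias here is the relative difference between predicted and
    observed snow counts (an assumed definition for this sketch).
    """
    predicted = [partition_by_threshold(t, threshold_c) for t, _ in obs]
    observed = [phase for _, phase in obs]
    n_correct = sum(p == o for p, o in zip(predicted, observed))
    accuracy = 100.0 * n_correct / len(obs)
    n_obs_snow = observed.count("snow")
    n_pred_snow = predicted.count("snow")
    snow_bias = 100.0 * (n_pred_snow - n_obs_snow) / n_obs_snow
    return accuracy, snow_bias


def overlap_by_bin(
    obs: List[Observation], bin_width_c: float = 1.0
) -> Dict[float, float]:
    """Minority-phase share per temperature bin, keyed by the bin's lower edge.

    A share of 0.0 means only one phase was seen in the bin; 0.5 means rain
    and snow were equally common, so temperature alone cannot separate them.
    """
    counts: Dict[float, Dict[str, int]] = defaultdict(lambda: {"rain": 0, "snow": 0})
    for temp_c, phase in obs:
        lower_edge = math.floor(temp_c / bin_width_c) * bin_width_c
        counts[lower_edge][phase] += 1
    return {
        edge: min(c["rain"], c["snow"]) / (c["rain"] + c["snow"])
        for edge, c in sorted(counts.items())
    }


# Made-up observations, not data from the paper.
toy_obs: List[Observation] = [
    (-3.0, "snow"), (-1.0, "snow"), (0.5, "snow"), (0.8, "rain"),
    (1.2, "snow"), (1.4, "rain"), (1.8, "snow"), (2.2, "rain"),
    (3.5, "rain"), (5.0, "rain"),
]

toy_accuracy, toy_snow_bias = accuracy_and_snow_bias(toy_obs)
toy_overlap = overlap_by_bin(toy_obs)
print(f"accuracy: {toy_accuracy:.1f}%  snow bias: {toy_snow_bias:.1f}%")
for edge, share in toy_overlap.items():
    print(f"{edge:5.1f} °C bin: minority-phase share {share:.2f}")
```

In this toy setting, a bin's minority-phase share caps what any method using air temperature alone can achieve there, since the best such method can only predict the majority phase. The paper's methods also draw on other near-surface variables, so this shows the idea of overlap limiting accuracy rather than reproducing the reported analysis.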

Source journal: Nature Communications (Biological Science Disciplines)

CiteScore: 24.90

Self-citation rate: 2.40%

Articles published: 6928

Review time: 3.7 months

About the journal: Nature Communications, an open-access journal, publishes high-quality research spanning all areas of the natural sciences. Papers featured in the journal showcase significant advances relevant to specialists in each respective field. With a 2-year impact factor of 16.6 (2022) and a median time of 8 days from submission to the first editorial decision, Nature Communications is committed to rapid dissemination of research findings. As a multidisciplinary journal, it welcomes contributions from biological, health, physical, chemical, Earth, social, mathematical, applied, and engineering sciences, aiming to highlight important breakthroughs within each domain.