Machine learning shows a limit to rain-snow partitioning accuracy when using near-surface meteorology
Keith S. Jennings, Meghan Collins, Benjamin J. Hatchett, Anne Heggli, Nayoung Hur, Sonia Tonino, Anne W. Nolin, Guo Yu, Wei Zhang, Monica M. Arienzo
Nature Communications, published 2025-03-25. DOI: 10.1038/s41467-025-58234-2 (https://doi.org/10.1038/s41467-025-58234-2)
Abstract
Partitioning precipitation into rain and snow with near-surface meteorology is a well-known challenge. However, whether a limit exists to its potential performance remains unknown. Here, we evaluate this possibility by applying a set of benchmark precipitation phase partitioning methods plus three machine learning (ML) models (an artificial neural network, random forest, and XGBoost) to two independent datasets: 38.5 thousand crowdsourced observations and 17.8 million synoptic meteorology reports. The ML methods provide negligible improvements over the best benchmarks, increasing accuracy only by up to 0.6% and reducing rain and snow biases by up to -4.7%. ML methods fail to identify mixed precipitation and sub-freezing rainfall events, and their accuracy is worst between 1.0 °C and 2.5 °C. A potential cause of these shortcomings is the air temperature overlap in rain and snow distributions (peaking between 1.0 °C and 1.6 °C), which has a significant negative relationship (p < 0.0005) with partitioning accuracy. Thus, the meteorological characteristics of rain and snow are similar at air temperatures slightly above freezing, with increasing overlap associated with decreasing performance. We suggest researchers switch their focus from marginally improving inherently limited precipitation phase partitioning methods using near-surface meteorology to creating new methods that assimilate novel data sources, e.g., crowdsourced precipitation phase observations.
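To make the idea of a benchmark partitioning method concrete, the following is a minimal sketch of a single air-temperature threshold classifier scored by accuracy and snow bias. It is an illustration only, not the authors' implementation: the 1.0 °C threshold, the function names, the relative-bias definition, and the toy observations are all assumptions made for this example.

```python
from typing import Sequence


def partition_by_threshold(air_temp_c: Sequence[float],
                           threshold_c: float = 1.0) -> list[str]:
    """Label each observation 'snow' at or below the threshold, else 'rain'.

    The default threshold of 1.0 degrees C is an assumed value for illustration.
    """
    return ["snow" if t <= threshold_c else "rain" for t in air_temp_c]


def accuracy_pct(predicted: Sequence[str], observed: Sequence[str]) -> float:
    """Percentage of observations whose predicted phase matches the observed phase."""
    hits = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * hits / len(observed)


def snow_bias_pct(predicted: Sequence[str], observed: Sequence[str]) -> float:
    """Relative difference between predicted and observed snow counts, in percent.

    This bias definition is an assumption; the paper may define bias differently.
    """
    obs_snow = sum(o == "snow" for o in observed)
    pred_snow = sum(p == "snow" for p in predicted)
    return 100.0 * (pred_snow - obs_snow) / obs_snow


# Hypothetical toy data: snow observed at 1.5 degrees C sits in the overlap range
# where rain and snow share similar near-surface air temperatures.
temps = [-3.0, -0.5, 0.8, 1.2, 1.5, 2.0, 3.5, 5.0]
observed = ["snow", "snow", "snow", "rain", "snow", "rain", "rain", "rain"]

predicted = partition_by_threshold(temps, threshold_c=1.0)
print(accuracy_pct(predicted, observed))   # 87.5
print(snow_bias_pct(predicted, observed))  # -25.0
```

In this toy case the only error is the snow observation at 1.5 °C, which no single threshold can separate from the rain observation at 1.2 °C. That is the kind of distribution overlap the abstract associates with reduced accuracy.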
About the journal:
Nature Communications, an open-access journal, publishes high-quality research spanning all areas of the natural sciences. Papers featured in the journal showcase significant advances relevant to specialists in each respective field. With a 2-year impact factor of 16.6 (2022) and a median time of 8 days from submission to the first editorial decision, Nature Communications is committed to rapid dissemination of research findings. As a multidisciplinary journal, it welcomes contributions from biological, health, physical, chemical, Earth, social, mathematical, applied, and engineering sciences, aiming to highlight important breakthroughs within each domain.