Authors: David S. Connelly, Edwin P. Gerber
Journal: Journal of Advances in Modeling Earth Systems, volume 16, issue 7 (impact factor 4.4; JCR Q1, Meteorology & Atmospheric Sciences)
Published: 2024-07-25
DOI: 10.1029/2023MS004184
Full text: https://onlinelibrary.wiley.com/doi/10.1029/2023MS004184
Citations: 0
Regression Forest Approaches to Gravity Wave Parameterization for Climate Projection
We train random and boosted forests, two machine learning architectures based on regression trees, to emulate a physics-based parameterization of atmospheric gravity wave momentum transport. We compare the forests to a neural network benchmark, evaluating both offline errors and online performance when coupled to an atmospheric model under the present day climate and in 800 and 1,200 ppm CO2 global warming scenarios. Offline, the boosted forest exhibits similar skill to the neural network, while the random forest scores significantly lower. Both forest models couple stably to the atmospheric model, and control climate integrations with the boosted forest exhibit lower biases than those with the neural network. Integrations with all three data-driven emulators successfully capture the Quasi-Biennial Oscillation (QBO) and sudden stratospheric warmings, key modes of stratospheric variability, with the boosted forest more accurate than the random forest in replicating their statistics across our range of carbon dioxide perturbations. The boosted forest and neural network capture the sign of the QBO period response to increased CO2, though both struggle with the magnitude of this response under the more extreme 1,200 ppm scenario. To investigate the connection between performance in the control climate and the ability to generalize, we use techniques from interpretable machine learning to understand how the data-driven methods use physical information. We leverage this understanding to develop a retraining procedure that improves the coupled performance of the boosted forest in the control climate and under the 800 ppm CO2 scenario.
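As a rough illustration of the two forest architectures the abstract contrasts, the sketch below fits a bagged ("random") forest and a gradient-boosted forest of small regression trees to a noisy one-dimensional sine curve. It is not the authors' code: the toy data, the function names (`fit_tree`, `fit_random_forest`, `fit_boosted_forest`), and all hyperparameters are assumptions chosen for illustration, and the gravity wave parameterization itself is not modeled.

```python
import math
import random


def fit_tree(xs, ys, depth):
    """Fit a 1-D regression tree by greedy squared-error splits; return a predictor."""
    mean = sum(ys) / len(ys)
    if depth == 0 or len(xs) < 4:
        return lambda x: mean
    pairs = sorted(zip(xs, ys))
    n, total = len(pairs), sum(ys)
    best_gain, best_k, left_sum = 0.0, None, 0.0
    for k in range(1, n):
        left_sum += pairs[k - 1][1]
        if pairs[k][0] == pairs[k - 1][0]:
            continue  # cannot split between identical inputs
        # Reduction in summed squared error from splitting before index k.
        gain = left_sum ** 2 / k + (total - left_sum) ** 2 / (n - k) - total ** 2 / n
        if gain > best_gain:
            best_gain, best_k = gain, k
    if best_k is None:
        return lambda x: mean
    thr = 0.5 * (pairs[best_k - 1][0] + pairs[best_k][0])
    left = fit_tree([p[0] for p in pairs[:best_k]], [p[1] for p in pairs[:best_k]], depth - 1)
    right = fit_tree([p[0] for p in pairs[best_k:]], [p[1] for p in pairs[best_k:]], depth - 1)
    return lambda x: left(x) if x < thr else right(x)


def fit_random_forest(xs, ys, n_trees, depth, rng):
    """Average trees fit independently on bootstrap resamples (bagging).

    With a single input feature there is no feature subsampling to do.
    """
    n, trees = len(xs), []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_tree([xs[i] for i in idx], [ys[i] for i in idx], depth))
    return lambda x: sum(t(x) for t in trees) / n_trees


def fit_boosted_forest(xs, ys, n_trees, depth, lr):
    """Fit trees sequentially, each to the residual left by the previous ones."""
    base = sum(ys) / len(ys)
    resid = [y - base for y in ys]
    trees = []
    for _ in range(n_trees):
        tree = fit_tree(xs, resid, depth)
        trees.append(tree)
        resid = [r - lr * tree(x) for x, r in zip(xs, resid)]
    return lambda x: base + lr * sum(t(x) for t in trees)


# Toy data (an assumption for illustration): noisy samples of sin(x).
rng = random.Random(0)
train_x = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(200)]
train_y = [math.sin(x) + rng.gauss(0.0, 0.1) for x in train_x]
test_x = [2.0 * math.pi * (i + 0.5) / 100 for i in range(100)]


def mse(model):
    """Mean squared error against the noise-free target on the test grid."""
    return sum((model(x) - math.sin(x)) ** 2 for x in test_x) / len(test_x)


rf = fit_random_forest(train_x, train_y, n_trees=20, depth=4, rng=rng)
gb = fit_boosted_forest(train_x, train_y, n_trees=50, depth=2, lr=0.3)
rf_mse, gb_mse = mse(rf), mse(gb)
print(f"random forest MSE:  {rf_mse:.4f}")
print(f"boosted forest MSE: {gb_mse:.4f}")
```

The structural difference is the point of the sketch: the random forest averages trees trained independently on resampled data, while the boosted forest adds trees one at a time, each correcting the errors of the ensemble so far. Which of the two scores better on this toy problem depends on the chosen settings and says nothing about the offline or online results reported in the paper.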
About the journal:
The Journal of Advances in Modeling Earth Systems (JAMES) is committed to advancing the science of Earth systems modeling by offering high-quality scientific research through online availability and open access licensing. JAMES invites authors and readers from the international Earth systems modeling community.
Open access. Articles are available free of charge for everyone with Internet access to view and download.
Formal peer review.
Supplemental material, such as code samples, images, and visualizations, is published at no additional charge.
No additional charge for color figures.
Modest page charges to cover production costs.
Articles published in high-quality full text PDF, HTML, and XML.
Internal and external reference linking, DOI registration, and forward linking via CrossRef.