Explaining heatwaves with machine learning
Sebastian Buschow, Jan Keller, Sabrina Wahl
Quarterly Journal of the Royal Meteorological Society, published 2023-12-20
DOI: https://doi.org/10.1002/qj.4642
Heatwaves are known to arise from the interplay between large-scale climate variability, synoptic weather patterns and regional to local scale surface processes. While recent research has made important progress on each individual contributing factor, ways to properly incorporate several or all of them in a unified analysis are still lacking. In this study, we consider a wide range of possible predictor variables from the ERA5 reanalysis and ask how much information on heatwave occurrence in Europe can be learned from each of them. To simplify the problem, we first adapt the recently developed logistic principal component analysis to the task of compressing large binary heatwave fields to a small number of interpretable principal components. The relationships between heatwaves and various climate variables can then be learned by a neural network. Starting from the simple notion that the importance of a variable is given by its impact on the performance of our statistical model, we arrive naturally at the definition of Shapley values. Classic results of game theory show that this is the only fair way of distributing the overall success of a model among its inputs. We find a nonlinear model that explains 70 % of reduced heatwave variability. The biggest individual contribution (27 % of the 70 %) comes from upper-level geopotential; top-level soil moisture is in second place (15 %). Beyond this decomposition, Shapley interaction values enable us to quantify overlapping information and positive synergies between all pairs of predictors.
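To make the Shapley idea concrete, the following is a minimal sketch that is not taken from the paper. It assumes three hypothetical predictors named `A`, `B` and `C` and an invented table `toy_score` giving the skill of a model trained on each subset of them; none of these numbers are results from the study. The pairwise function uses one common definition, the Shapley interaction index, and the normalization used by the authors may differ.

```python
from itertools import combinations
from math import factorial

# Invented skill scores for every subset of three hypothetical predictors.
# These are illustrative numbers only, not values from the paper.
toy_score = {
    frozenset(): 0.0,
    frozenset("A"): 0.5,
    frozenset("B"): 0.3,
    frozenset("C"): 0.1,
    frozenset("AB"): 0.6,
    frozenset("AC"): 0.6,
    frozenset("BC"): 0.5,
    frozenset("ABC"): 0.8,
}


def value(subset):
    """Skill of the model that sees only the predictors in `subset`."""
    return toy_score[frozenset(subset)]


def shapley_values(players, value):
    """Exact Shapley values: weighted mean marginal gain of each player."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of coalitions of size k that do not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def shapley_interaction(players, value, i, j):
    """Shapley interaction index for the pair (i, j).

    Negative values indicate overlapping information,
    positive values indicate synergy.
    """
    n = len(players)
    others = [q for q in players if q not in (i, j)]
    total = 0.0
    for k in range(n - 1):
        weight = factorial(k) * factorial(n - k - 2) / factorial(n - 1)
        for coalition in combinations(others, k):
            s = frozenset(coalition)
            joint_gain = (
                value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            )
            total += weight * joint_gain
    return total


players = ["A", "B", "C"]
phi = shapley_values(players, value)
interaction_ab = shapley_interaction(players, value, "A", "B")
print(phi, interaction_ab)
```

In this toy setting the three Shapley values add up to the score of the full model, which is the sense in which the total skill is distributed among the inputs. Exact enumeration grows exponentially with the number of predictors, so it is only practical for a small set of inputs like this one.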
About the journal:
The Quarterly Journal of the Royal Meteorological Society is a journal published by the Royal Meteorological Society. It aims to communicate and document new research in the atmospheric sciences and related fields. The journal is considered one of the leading publications in meteorology worldwide. It accepts articles, comprehensive review articles, and comments on published papers. It is published eight times a year, with additional special issues.
The Quarterly Journal has a wide readership of scientists in the atmospheric and related fields. It is indexed and abstracted in various databases, including Advanced Polymers Abstracts, Agricultural Engineering Abstracts, CAB Abstracts, CABDirect, COMPENDEX, CSA Civil Engineering Abstracts, Earthquake Engineering Abstracts, Engineered Materials Abstracts, Science Citation Index, SCOPUS, Web of Science, and more.