Author: Francesco Serinaldi
Journal: Hydrology and Earth System Sciences
Published: 2024-07-23
DOI: 10.5194/hess-28-3191-2024 (https://doi.org/10.5194/hess-28-3191-2024)
Scientific logic and spatio-temporal dependence in analyzing extreme-precipitation frequency: negligible or neglected?
Abstract. Statistics is often misused in hydro-climatology, thus causing research to get stuck on unscientific concepts that hinder scientific advances. In particular, neglecting the scientific rationale of statistical inference results in logical and operational fallacies that prevent the discernment of facts, assumptions, and models, thus leading to systematic misinterpretations of the output of data analysis. This study discusses how epistemological principles are not just philosophical concepts but also have very practical effects. To this end, we focus on the iterated underestimation and misinterpretation of the role of spatio-temporal dependence in statistical analysis of hydro-climatic processes by analyzing the occurrence process of extreme precipitation (P) derived from 100-year daily time series recorded at 1106 worldwide gauges of the Global Historical Climatology Network. The analysis contrasts a model-based approach that is compliant with the well-devised but often neglected logic of statistical inference and a widespread but theoretically problematic test-based approach relying on statistical hypothesis tests applied to unrepeatable hydro-climatic records. The model-based approach highlights the actual impact of spatio-temporal dependence and a finite sample size on statistical inference, resulting in over-dispersed marginal distributions and biased estimates of dependence properties, such as autocorrelation and power spectral density. These issues also affect the outcome and interpretation of statistical tests for trend detection. Overall, the model-based approach results in a theoretically coherent modeling framework where stationary stochastic processes incorporating the empirical spatio-temporal correlation and its effects provide a faithful description of the occurrence process of extreme P at various spatio-temporal scales.
On the other hand, the test-based approach leads to theoretically unsubstantiated results and interpretations, along with logically contradictory conclusions such as the simultaneous equi-dispersion and over-dispersion of extreme P. Therefore, accounting for the effect of dependence in the analysis of the frequency of extreme P has a huge impact that cannot be ignored, and, more importantly, any data analysis can be scientifically meaningful only if it considers the epistemological principles of statistical inference such as the asymmetry between confirmatory and disconfirmatory empiricism, the inverse-probability problem affecting statistical tests, and the difference between assumptions and models.
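The abstract's point that dependence in a stationary process can produce over-dispersed event counts can be illustrated with a small simulation. The sketch below is not the paper's model: it assumes a hypothetical two-state Markov chain for daily exceedance occurrences, with an illustrative marginal probability p = 0.05 and lag-1 autocorrelation rho = 0.3, and compares the variance-to-mean ratio of yearly counts with the independent case (rho = 0), for which that ratio is 1 - p.

```python
import random
from statistics import mean, pvariance


def simulate_counts(p, rho, block_len, n_blocks, seed=0):
    """Event counts per block from a stationary two-state Markov chain.

    p   : marginal probability of an event on any day (assumed value)
    rho : lag-1 autocorrelation of the 0/1 occurrence series;
          rho = 0 reduces to independent Bernoulli trials
    """
    rng = random.Random(seed)
    # Transition probabilities chosen so that the stationary
    # event probability is p and p11 - p01 = rho.
    p11 = p + rho * (1.0 - p)  # P(event today | event yesterday)
    p01 = p * (1.0 - rho)      # P(event today | no event yesterday)
    state = rng.random() < p   # start from the stationary distribution
    counts = []
    for _ in range(n_blocks):
        c = 0
        for _ in range(block_len):
            state = rng.random() < (p11 if state else p01)
            c += state
        counts.append(c)
    return counts


def dispersion_index(counts):
    """Variance-to-mean ratio of the block counts."""
    return pvariance(counts) / mean(counts)


if __name__ == "__main__":
    # Hypothetical settings: 365-day blocks, 2000 simulated "years".
    for rho in (0.0, 0.3):
        counts = simulate_counts(p=0.05, rho=rho, block_len=365,
                                 n_blocks=2000, seed=1)
        print(f"rho={rho}: dispersion index = {dispersion_index(counts):.2f}")
```

Both series are stationary by construction, so any excess variance in the dependent case comes from autocorrelation alone rather than from a trend, which is the kind of effect the abstract argues should not be neglected.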