
Asta-Advances in Statistical Analysis: Latest Articles

Editorial special issue: Bridging the gap between AI and Statistics
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-06-21, DOI: 10.1007/s10182-024-00503-4
Benjamin Säfken, David Rügamer
Volume 108, Issue 2, Pages 225-229
Citations: 0
Markov-switching decision trees
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-05-29, DOI: 10.1007/s10182-024-00501-6
Timo Adam, Marius Ötting, Rouven Michels

Decision trees constitute a simple yet powerful and interpretable machine learning tool. While tree-based methods are designed only for cross-sectional data, we propose an approach that combines decision trees with time series modeling and thereby bridges the gap between machine learning and statistics. In particular, we combine decision trees with hidden Markov models where, for any time point, an underlying (hidden) Markov chain selects the tree that generates the corresponding observation. We propose an estimation approach that is based on the expectation-maximisation algorithm and assess its feasibility in simulation experiments. In our real-data application, we use eight seasons of National Football League (NFL) data to predict play calls conditional on covariates, such as the current quarter and the score, where the model’s states can be linked to the teams’ strategies. R code that implements the proposed method is available on GitHub.
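As a rough illustration of the model structure described in this abstract, the following Python sketch evaluates the likelihood of a play sequence with the forward algorithm, where each hidden state owns its own tree. The two hand-written trees, the transition matrix `GAMMA` and the initial distribution `DELTA` are hypothetical values chosen for illustration; they are not estimates from the paper, and this is not the authors' R implementation.

```python
# Hypothetical two-state example: each state has its own tiny "tree"
# mapping a covariate (score difference) to P(play = pass).
def tree_conservative(score_diff):
    return 0.4 if score_diff >= 0 else 0.6

def tree_aggressive(score_diff):
    return 0.6 if score_diff >= 0 else 0.9

TREES = [tree_conservative, tree_aggressive]
GAMMA = [[0.9, 0.1], [0.2, 0.8]]  # assumed state transition probabilities
DELTA = [0.5, 0.5]                # assumed initial state distribution

def likelihood(plays, score_diffs):
    """Forward algorithm; plays[t] is 1 for a pass and 0 for a run."""
    def emission(state, t):
        p_pass = TREES[state](score_diffs[t])
        return p_pass if plays[t] == 1 else 1.0 - p_pass

    n_states = len(TREES)
    alpha = [DELTA[s] * emission(s, 0) for s in range(n_states)]
    for t in range(1, len(plays)):
        alpha = [
            sum(alpha[r] * GAMMA[r][s] for r in range(n_states)) * emission(s, t)
            for s in range(n_states)
        ]
    return sum(alpha)
```

In an EM fit of a hidden Markov model, forward quantities of this kind are typically combined with backward quantities in the E-step, while the M-step would refit the state-specific trees.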

Volume 108, Issue 2, Pages 461-476
Citations: 0
Markov switching stereotype logit models for longitudinal ordinal data affected by unobserved heterogeneity in responding behavior
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-05-15, DOI: 10.1007/s10182-024-00500-7
Roberto Colombi, Sabrina Giordano

When asked to rate their attitudes or perceptions on a Likert scale, respondents often endorse the midpoint or extremes of the scale and agree or disagree regardless of the content. These responding behaviors are known in the psychometric literature as middle, extreme, acquiescence and disacquiescence response styles, and they generally introduce bias in the results. One of the key motivations behind our approach is to account for these attitudes and how they evolve over time. The novelty of our proposal, in the context of longitudinal ordered categorical data, lies in considering simultaneously the temporal dynamics of the responses (observable ordinal variables) and of the unobservable answering behaviors, possibly influenced by response styles, through a Markov switching logit model with two latent components. One component accommodates serial dependence and respondents’ unobserved heterogeneity; the other determines the responding attitude (due to response styles or not). The dependence of the responses on covariates is modelled by a stereotype logit model with parameters varying according to the two latent components. The stereotype logit model is adopted because it is a flexible extension of the proportional odds logit model that retains the advantage of using a single parameter to describe a regressor effect. In the paper, a new interpretation of the parameters of the stereotype model is given by defining the allocation sets as intervals of values of the linear predictor that identify the most probable response. Unobserved heterogeneity, serial dependence and tendency to response style are modelled through our approach on longitudinal data collected by the Bank of Italy.
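The stereotype logit model is commonly written as P(Y = j | x) proportional to exp(alpha_j + phi_j * eta), with a single linear predictor eta = beta'x rescaled by category scores phi_j. The sketch below assumes this standard parameterisation and uses hypothetical intercepts and scores; the latent-state dependence of the parameters described in the abstract is not shown. It illustrates how the most probable category changes as the linear predictor moves along the real line, which is the idea behind allocation sets.

```python
from math import exp

def stereotype_probs(eta, alphas, phis):
    """Category probabilities proportional to exp(alpha_j + phi_j * eta)."""
    scores = [a + p * eta for a, p in zip(alphas, phis)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def most_probable(eta, alphas, phis):
    """Index (starting at 1) of the category with the highest probability."""
    probs = stereotype_probs(eta, alphas, phis)
    return probs.index(max(probs)) + 1

# Hypothetical three-category example with ordered scores 0 < 0.5 < 1.
ALPHAS = [0.0, 0.0, 0.0]
PHIS = [0.0, 0.5, 1.0]
low_end = most_probable(-2.0, ALPHAS, PHIS)   # small predictor values
high_end = most_probable(2.0, ALPHAS, PHIS)   # large predictor values
```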

Volume 109, Issue 1, Pages 117-147
Citations: 0
Deducing neighborhoods of classes from a fitted model
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-05-08, DOI: 10.1007/s10182-024-00502-5
Alexander Gerharz, Andreas Groll, Gunther Schauberger

In this article, a new kind of interpretable machine learning method is presented. It uses quantile shifts to help understand how a classification model partitions the feature space into predicted classes, and in this way makes the underlying statistical or machine learning model more trustworthy. Real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or lowering specific features are observed. By comparing the predictions before and after the shifts, the observed changes in the predictions can, under certain conditions, be interpreted as neighborhoods of the classes with regard to the shifted features. Chord diagrams are used to visualize the observed changes. For illustration, this quantile shift method (QSM) is applied to an artificial example with medical labels and a real data example.
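The idea of shifting a feature on its empirical-quantile scale and tabulating how predicted classes change can be sketched as follows. This is a simplified reading of the abstract, not the authors' exact QSM definition: the rank-based shift rule, the classifier `toy_predict` and the `bmi` data are all hypothetical.

```python
from collections import Counter

def quantile_shift(values, x, shift):
    """Move x by `shift` on the empirical-quantile scale of `values`."""
    ordered = sorted(values)
    n = len(ordered)
    rank = sum(v <= x for v in ordered)   # empirical rank of x
    target = rank + round(shift * n)      # shift by a share of the sample
    target = min(max(target, 1), n)       # stay inside the observed range
    return ordered[target - 1]

def class_transitions(predict, rows, feature, shift):
    """Count (class before, class after) pairs when `feature` is shifted."""
    values = [row[feature] for row in rows]
    counts = Counter()
    for row in rows:
        shifted = dict(row)
        shifted[feature] = quantile_shift(values, row[feature], shift)
        counts[(predict(row), predict(shifted))] += 1
    return counts

# Hypothetical classifier and data, for illustration only.
def toy_predict(row):
    if row["bmi"] < 25:
        return "low"
    return "medium" if row["bmi"] < 30 else "high"

rows = [{"bmi": b} for b in range(18, 38, 2)]
transitions = class_transitions(toy_predict, rows, "bmi", shift=0.2)
```

In this toy setup no observation moves directly from "low" to "high", so with respect to the shifted feature only "low"/"medium" and "medium"/"high" would be read as neighboring classes. Counts of this kind are what a chord diagram would display.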

Volume 108, Issue 2, Pages 395-425
Citations: 0
Testing distributional assumptions in CUB models for the analysis of rating data
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-13, DOI: 10.1007/s10182-024-00498-y
Francesca Di Iorio, Riccardo Lucchetti, Rosaria Simone

In this paper, we propose a portmanteau test for misspecification in Combination of Uniform and Binomial (CUB) models for the analysis of ordered rating data. Specifically, the test we build belongs to the class of information matrix (IM) tests, which are based on the information matrix equality. Monte Carlo evidence indicates that the test has excellent finite-sample properties in terms of actual size and power against several alternatives. Unlike other tests of the IM family, finite-sample adjustments based on the bootstrap seem to be unnecessary. An empirical application is also provided to illustrate how the IM test can be used to supplement model validation and selection.
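For readers unfamiliar with CUB models, the sketch below evaluates the usual CUB probability mass function for ratings 1, ..., m: a mixture, with weight `pi`, of a shifted binomial with parameter `xi` and a discrete uniform. This is the standard textbook form, assumed here; the parameter values are hypothetical, and the IM test itself is not implemented.

```python
from math import comb

def cub_pmf(m, pi, xi):
    """P(R = r) for r = 1..m under a CUB model with parameters (pi, xi)."""
    return [
        pi * comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        + (1 - pi) / m
        for r in range(1, m + 1)
    ]

# Hypothetical 5-point scale: strong "feeling" towards high ratings
# (small xi) with moderate uncertainty (pi below 1).
probs = cub_pmf(5, pi=0.8, xi=0.1)
```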

Volume 108, Issue 3, Pages 669-701
Citations: 0
Testing for periodicity at an unknown frequency under cyclic long memory, with applications to respiratory muscle training
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-12, DOI: 10.1007/s10182-024-00499-x
Jan Beran, Jeremy Näscher, Fabian Pietsch, Stephan Walterspacher

A frequent problem in applied time series analysis is the identification of dominating periodic components. A particularly difficult task is to distinguish deterministic periodic signals from periodic long memory. In this paper, a family of test statistics based on Whittle’s Gaussian log-likelihood approximation is proposed. Asymptotic critical regions and bounds for the asymptotic power are derived. In cases where a deterministic periodic signal and periodic long memory share the same frequency, consistency and rates of type II error probabilities depend on the long-memory parameter. Simulations and an application to respiratory muscle training data illustrate the results.
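Whittle's approximation is built on the periodogram, so as background the following sketch computes the periodogram at the Fourier frequencies and locates its peak for a synthetic series. It only illustrates how a dominating periodic component shows up in the frequency domain; it is not the test statistic proposed in the paper, and the series is made up.

```python
import cmath
import math

def periodogram(x):
    """Return (frequency, ordinate) pairs at the Fourier frequencies 2*pi*j/n."""
    n = len(x)
    mean = sum(x) / n
    pairs = []
    for j in range(1, n // 2 + 1):
        freq = 2 * math.pi * j / n
        dft = sum((x[t] - mean) * cmath.exp(-1j * freq * t) for t in range(n))
        pairs.append((freq, abs(dft) ** 2 / (2 * math.pi * n)))
    return pairs

# Synthetic series: a dominant period-8 cycle plus a weaker period-4 cycle.
series = [
    math.cos(2 * math.pi * t / 8) + 0.3 * math.cos(2 * math.pi * t / 4)
    for t in range(64)
]
peak_freq, peak_value = max(periodogram(series), key=lambda pair: pair[1])
```

A periodogram peak alone cannot tell a deterministic cycle from cyclic long memory, since both concentrate spectral mass near one frequency; that distinction is the problem the abstract addresses.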

Volume 108, Issue 4, Pages 705-731
Citations: 0
Bernstein flows for flexible posteriors in variational Bayes
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-03, DOI: 10.1007/s10182-024-00497-z
Oliver Dürr, Stefan Hörtling, Danil Dold, Ivonne Kovylov, Beate Sick

Black-box variational inference (BBVI) is a technique to approximate the posterior of Bayesian models by optimization. Similar to MCMC, the user only needs to specify the model; then, the inference procedure is done automatically. In contrast to MCMC, BBVI scales to many observations, is faster for some applications, and can take advantage of highly optimized deep learning frameworks since it can be formulated as a minimization task. In the case of complex posteriors, however, other state-of-the-art BBVI approaches often yield unsatisfactory posterior approximations. This paper presents Bernstein flow variational inference (BF-VI), a robust and easy-to-use method flexible enough to approximate complex multivariate posteriors. BF-VI combines ideas from normalizing flows and Bernstein polynomial-based transformation models. In benchmark experiments, we compare BF-VI solutions with exact posteriors, MCMC solutions, and state-of-the-art BBVI methods, including normalizing flow-based BBVI. We show for low-dimensional models that BF-VI accurately approximates the true posterior; in higher-dimensional models, BF-VI compares favorably against other BBVI methods. Further, using BF-VI, we develop a Bayesian model for the semi-structured melanoma challenge data, combining a CNN model part for image data with an interpretable model part for tabular data, and demonstrate, for the first time, the use of BBVI in semi-structured models.
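The generic building block behind Bernstein polynomial-based transformations can be sketched in a few lines: a Bernstein polynomial on [0, 1] with increasing coefficients is a monotone map, which is the property a flow layer needs. The softplus reparameterisation used below to enforce increasing coefficients is a common device and an assumption here, not necessarily the authors' construction; the raw parameter values are hypothetical.

```python
from math import comb, exp, log

def increasing_coefficients(raw):
    """Map unconstrained numbers to strictly increasing coefficients."""
    coeffs = [raw[0]]
    for r in raw[1:]:
        coeffs.append(coeffs[-1] + log(1.0 + exp(r)))  # softplus increment > 0
    return coeffs

def bernstein(u, coeffs):
    """Evaluate the Bernstein polynomial with the given coefficients at u in [0, 1]."""
    m = len(coeffs) - 1
    return sum(
        c * comb(m, k) * u ** k * (1 - u) ** (m - k)
        for k, c in enumerate(coeffs)
    )

# Hypothetical unconstrained parameters of a degree-3 transformation.
coeffs = increasing_coefficients([-2.0, 0.5, -1.0, 2.0])
values = [bernstein(i / 50, coeffs) for i in range(51)]
```

Because the map is strictly increasing, it is invertible, so a simple base distribution pushed through it still has a tractable density via the change-of-variables formula.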

Volume 108, Issue 2, Pages 375-394
Citations: 0
Variational inference: uncertainty quantification in additive models
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-03, DOI: 10.1007/s10182-024-00492-4
Jens Lichter, Paul F V Wiemann, Thomas Kneib

Markov chain Monte Carlo (MCMC)-based simulation approaches are by far the most common method in Bayesian inference to access the posterior distribution. Recently, motivated by successes in machine learning, variational inference (VI) has gained interest in statistics since it promises a computationally efficient alternative to MCMC enabling approximate access to the posterior. Classical approaches such as mean-field VI (MFVI), however, are based on the strong mean-field assumption for the approximate posterior, where parameters or parameter blocks are assumed to be mutually independent. As a consequence, parameter uncertainties are often underestimated, and alternatives such as semi-implicit VI (SIVI) have been suggested to avoid the mean-field assumption and to improve uncertainty estimates. SIVI uses a hierarchical construction of the variational parameters to restore parameter dependencies and relies on a highly flexible implicit mixing distribution whose probability density function is not analytic but from which samples can be drawn via a stochastic procedure. With this paper, we investigate how different forms of VI perform in semiparametric additive regression models as one of the most important fields of application of Bayesian inference in statistics. A particular focus is on the ability of the rivalling approaches to quantify uncertainty, especially with correlated covariates that are likely to aggravate the difficulties of simplifying VI assumptions. Moreover, we propose a method that combines the advantages of both MFVI and SIVI, and compare its performance. The different VI approaches are studied in comparison with MCMC in simulations and an application to tree height models of Douglas fir based on a large-scale forestry data set.
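Why the mean-field assumption tends to understate uncertainty can be seen in a small worked case. For a Gaussian target, the factorised Gaussian that minimises KL(q || p) has variances equal to the reciprocals of the diagonal entries of the target's precision matrix. The sketch below computes these for a bivariate Gaussian; the covariance values are hypothetical and the example is a textbook illustration, not a result from the paper.

```python
def mean_field_variances(cov):
    """Variances of the KL(q || p)-optimal factorised Gaussian for a 2x2 covariance."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    # Diagonal of the precision matrix (inverse covariance).
    lam11, lam22 = s22 / det, s11 / det
    return 1.0 / lam11, 1.0 / lam22

# Hypothetical posterior with unit marginal variances and correlation 0.9.
v1, v2 = mean_field_variances([[1.0, 0.9], [0.9, 1.0]])
```

With correlation 0.9 the mean-field variances are about 0.19 instead of the true marginal variance 1, and the gap widens as the correlation grows. This is the mechanism that correlated covariates can aggravate.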

Volume 108, Issue 2, Pages 279-331
Citations: 0
Ridge regularization for spatial autoregressive models with multicollinearity issues
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-01, DOI: 10.1007/s10182-024-00496-0
Cristina O. Chavez-Chong, Cécile Hardouin, Ana-Karina Fermin

This work proposes a new method for building an explanatory spatial autoregressive model in a multicollinearity context. We use Ridge regularization to bypass the collinearity issue. We present new estimation algorithms that allow for the estimation of the regression coefficients as well as the spatial dependence parameter. A spatial cross-validation procedure is used to tune the regularization parameter. In fact, ordinary cross-validation techniques are not applicable to spatially dependent observations. Variable importance is assessed by permutation tests since classical tests are not valid after Ridge regularization. We assess the performance of our methodology through numerical experiments conducted on simulated synthetic data. Finally, we apply our method to a real data set and evaluate the impact of some socioeconomic variables on the COVID-19 intensity in France.
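One natural building block for such a procedure is the ridge estimate of the regression coefficients for a fixed spatial parameter. Assuming the usual spatial autoregressive form y = rho * W y + X beta + error, the sketch below filters the response and solves the ridge normal equations for two predictors in closed form. The function name, the two-predictor restriction and the data are hypothetical, the estimation of rho and the spatial cross-validation are not shown, and the authors' algorithms may differ.

```python
def ridge_sar_beta(y, X, W, rho, lam):
    """Ridge estimate of beta (two predictors) for fixed rho in y = rho*W y + X beta + e."""
    n = len(y)
    Wy = [sum(W[i][j] * y[j] for j in range(n)) for i in range(n)]
    z = [y[i] - rho * Wy[i] for i in range(n)]  # spatially filtered response
    # Entries of X'X + lam * I and of X'z.
    a = sum(row[0] * row[0] for row in X) + lam
    b = sum(row[0] * row[1] for row in X)
    d = sum(row[1] * row[1] for row in X) + lam
    g0 = sum(X[i][0] * z[i] for i in range(n))
    g1 = sum(X[i][1] * z[i] for i in range(n))
    det = a * d - b * b  # zero when lam = 0 and the columns are collinear
    return (d * g0 - b * g1) / det, (a * g1 - b * g0) / det

# Hypothetical data with two perfectly collinear predictors and no spatial lag.
X_collinear = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
W_zero = [[0.0] * 4 for _ in range(4)]
beta = ridge_sar_beta([2.0, 4.0, 6.0, 8.0], X_collinear, W_zero, rho=0.0, lam=0.1)
```

With `lam=0` the determinant vanishes for these collinear columns and the system has no unique solution; any positive penalty makes it solvable and, here, splits the effect evenly between the two predictors.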

Volume 109, Issue 1, Pages 25-52
Citations: 0
Using sequential statistical tests for efficient hyperparameter tuning
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-03-14, DOI: 10.1007/s10182-024-00495-1
Philip Buczak, Andreas Groll, Markus Pauly, Jakob Rehof, Daniel Horn

Hyperparameter tuning is one of the most time-consuming parts of machine learning. Despite the existence of modern optimization algorithms that minimize the number of evaluations needed, evaluating a single setting may still be expensive. Usually, a resampling technique is used, where the machine learning method has to be fitted a fixed number of k times on different training datasets. The mean performance of the k fits is then used as the performance estimator. Many hyperparameter settings could be discarded after fewer than k resampling iterations if they are clearly inferior to high-performing settings. However, resampling is often performed until the very end, wasting a lot of computational effort. To avoid this, we propose the sequential random search (SQRS), which extends the regular random search algorithm by a sequential testing procedure aimed at detecting and eliminating inferior parameter configurations early. We compared our SQRS with regular random search using multiple publicly available regression and classification datasets. Our simulation study showed that the SQRS is able to find similarly well-performing parameter settings while requiring noticeably fewer evaluations. Our results underscore the potential for integrating sequential tests into hyperparameter tuning.
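The control flow of a random search with early stopping can be sketched as follows. The stopping rule used here (mean plus `z` standard errors below the best score so far) is a deliberately simple stand-in, not a proper sequential test with error control and not the procedure from the paper; the function and parameter names are hypothetical, and higher scores are assumed to be better.

```python
import random
from statistics import mean, stdev

def sequential_random_search(evaluate, sample_config, n_configs=20, k=10,
                             min_folds=3, z=2.0, seed=0):
    """Random search that stops resampling clearly inferior configurations early."""
    rng = random.Random(seed)
    best_config, best_score, total_evals = None, float("-inf"), 0
    for _ in range(n_configs):
        config = sample_config(rng)
        scores = []
        for fold in range(k):
            scores.append(evaluate(config, fold))
            total_evals += 1
            if len(scores) >= min_folds:
                stderr = stdev(scores) / len(scores) ** 0.5
                if mean(scores) + z * stderr < best_score:
                    break  # stand-in rule: configuration looks clearly inferior
        else:
            # All k folds were evaluated; keep the configuration if it is the best.
            if mean(scores) > best_score:
                best_config, best_score = config, mean(scores)
    return best_config, best_score, total_evals
```

A regular random search would always spend `n_configs * k` evaluations; the early exit saves the remaining folds for configurations that fall well behind the incumbent.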

Asta-Advances in Statistical Analysis, vol. 108 (2), pp. 441-460. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10182-024-00495-1.pdf
Citations: 0