Asta-Advances in Statistical Analysis: Latest Articles

Variational inference: uncertainty quantification in additive models
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-03, DOI: 10.1007/s10182-024-00492-4
Jens Lichter, Paul F V Wiemann, Thomas Kneib

Markov chain Monte Carlo (MCMC)-based simulation approaches are by far the most common method in Bayesian inference for accessing the posterior distribution. Recently, motivated by successes in machine learning, variational inference (VI) has gained interest in statistics since it promises a computationally efficient alternative to MCMC that enables approximate access to the posterior. Classical approaches such as mean-field VI (MFVI), however, are based on the strong mean-field assumption for the approximate posterior, under which parameters or parameter blocks are assumed to be mutually independent. As a consequence, parameter uncertainties are often underestimated, and alternatives such as semi-implicit VI (SIVI) have been suggested to avoid the mean-field assumption and to improve uncertainty estimates. SIVI uses a hierarchical construction of the variational parameters to restore parameter dependencies and relies on a highly flexible implicit mixing distribution whose probability density function is not available analytically but from which samples can be drawn via a stochastic procedure. In this paper, we investigate how different forms of VI perform in semiparametric additive regression models, one of the most important fields of application of Bayesian inference in statistics. A particular focus is on the ability of the rival approaches to quantify uncertainty, especially with correlated covariates that are likely to aggravate the difficulties caused by simplifying VI assumptions. Moreover, we propose a method that combines the advantages of both MFVI and SIVI and evaluate its performance. The different VI approaches are compared with MCMC in simulations and in an application to tree height models of Douglas fir based on a large-scale forestry data set.
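
The claim that the mean-field assumption tends to underestimate uncertainty can be illustrated in a toy setting that is not the paper's model. The sketch below assumes the target posterior is exactly Gaussian with a known covariance matrix; in that case the fully factorized Gaussian that minimizes KL(q || p) has factor variances equal to the reciprocals of the diagonal of the precision matrix. The function name `mean_field_variances` and the correlation value are hypothetical choices for illustration.

```python
import numpy as np

def mean_field_variances(cov):
    """Marginal variances of the best fully factorized Gaussian
    approximation (minimizing KL(q || p)) to a Gaussian N(mu, cov)."""
    precision = np.linalg.inv(cov)
    # Each optimal factor is Gaussian with variance 1 / precision[i, i].
    return 1.0 / np.diag(precision)

rho = 0.9  # hypothetical posterior correlation between two coefficients
cov = np.array([[1.0, rho], [rho, 1.0]])

true_var = np.diag(cov)             # exact marginal variances
mf_var = mean_field_variances(cov)  # variances reported by mean-field VI

print("true marginal variances:", true_var)
print("mean-field variances:   ", mf_var)
```

For two unit-variance parameters with correlation rho, the mean-field variance equals 1 - rho**2, so the stronger the correlation (for example, between correlated covariates), the more the marginal uncertainty shrinks relative to the truth.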

Asta-Advances in Statistical Analysis, vol. 108, no. 2, pp. 279-331. PDF: https://link.springer.com/content/pdf/10.1007/s10182-024-00492-4.pdf
Citations: 0
Ridge regularization for spatial autoregressive models with multicollinearity issues
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-04-01, DOI: 10.1007/s10182-024-00496-0
Cristina O. Chavez-Chong, Cécile Hardouin, Ana-Karina Fermin

This work proposes a new method for building an explanatory spatial autoregressive model in the presence of multicollinearity. We use Ridge regularization to bypass the collinearity issue. We present new estimation algorithms that estimate the regression coefficients as well as the spatial dependence parameter. A spatial cross-validation procedure is used to tune the regularization parameter, since ordinary cross-validation techniques are not applicable to spatially dependent observations. Variable importance is assessed by permutation tests since classical tests are not valid after Ridge regularization. We assess the performance of our methodology through numerical experiments on simulated data. Finally, we apply our method to a real data set and evaluate the impact of some socioeconomic variables on the COVID-19 intensity in France.

Asta-Advances in Statistical Analysis, vol. 109, no. 1, pp. 25-52.
Citations: 0
Using sequential statistical tests for efficient hyperparameter tuning
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-03-14, DOI: 10.1007/s10182-024-00495-1
Philip Buczak, Andreas Groll, Markus Pauly, Jakob Rehof, Daniel Horn

Hyperparameter tuning is one of the most time-consuming parts of machine learning. Despite the existence of modern optimization algorithms that minimize the number of evaluations needed, evaluating a single setting may still be expensive. Usually a resampling technique is used, in which the machine learning method has to be fitted a fixed number of k times on different training datasets. The mean performance of the k fits is then used as the performance estimator. Many hyperparameter settings could be discarded after fewer than k resampling iterations if they are clearly inferior to high-performing settings. However, resampling is often performed until the very end, wasting a lot of computational effort. To address this, we propose the sequential random search (SQRS), which extends the regular random search algorithm by a sequential testing procedure aimed at detecting and eliminating inferior parameter configurations early. We compared our SQRS with regular random search using multiple publicly available regression and classification datasets. Our simulation study showed that the SQRS is able to find similarly well-performing parameter settings while requiring noticeably fewer evaluations. Our results underscore the potential for integrating sequential tests into hyperparameter tuning.
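
The idea of abandoning a configuration before all k resampling iterations are finished can be sketched as follows. This is a simplified illustration and not the authors' SQRS procedure: it compares each candidate with the current best configuration using a Welch t-statistic and a fixed cutoff, which is not a properly calibrated sequential test. All names (`welch_t`, `sequential_random_search`, `sample_config`, `evaluate_fold`), the cutoff `t_crit` and the toy loss are assumptions made for this example.

```python
import math
import random
import statistics

def welch_t(a, b):
    """Welch t-statistic for mean(a) - mean(b); 0.0 if both samples are constant."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    if se == 0.0:
        return 0.0
    return (statistics.mean(a) - statistics.mean(b)) / se

def sequential_random_search(sample_config, evaluate_fold, n_configs=20, k=10,
                             min_folds=3, t_crit=3.0, seed=0):
    """Random search that stops resampling a configuration early when its
    losses look clearly worse than those of the best configuration so far."""
    rng = random.Random(seed)
    best_cfg, best_losses = None, None
    n_evals = 0
    for _ in range(n_configs):
        cfg = sample_config(rng)
        losses, dropped = [], False
        for fold in range(k):
            losses.append(evaluate_fold(cfg, fold))
            n_evals += 1
            # Interim check (illustrative fixed cutoff, not a calibrated test).
            if best_losses is not None and min_folds <= len(losses) < k:
                if welch_t(losses, best_losses) > t_crit:
                    dropped = True
                    break
        if not dropped and (best_losses is None
                            or statistics.mean(losses) < statistics.mean(best_losses)):
            best_cfg, best_losses = cfg, losses
    return best_cfg, n_evals

# Toy problem (hypothetical): one hyperparameter in [0, 1], loss minimized near 0.3.
def sample_config(rng):
    return rng.uniform(0.0, 1.0)

def evaluate_fold(cfg, fold):
    # Deterministic pseudo-noise stands in for fold-to-fold variation.
    return (cfg - 0.3) ** 2 + 0.01 * math.sin(1.7 * fold + 13.0 * cfg)

best_cfg, n_evals = sequential_random_search(sample_config, evaluate_fold)
print("best configuration:", best_cfg)
print("fold evaluations used:", n_evals, "of a possible", 20 * 10)
```

Regular random search would always spend `n_configs * k` fold evaluations; the early-stopping variant can only use that many or fewer, at the risk of occasionally discarding a good configuration when the interim test is too aggressive.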

Asta-Advances in Statistical Analysis, vol. 108, no. 2, pp. 441-460. PDF: https://link.springer.com/content/pdf/10.1007/s10182-024-00495-1.pdf
Citations: 0
Weighted likelihood methods for robust fitting of wrapped models for p-torus data
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-03-11, DOI: 10.1007/s10182-024-00494-2
Claudio Agostinelli, Luca Greco, Giovanni Saraceno

We consider robust fitting of wrapped models to multivariate circular data, that is, points on the surface of a p-torus, based on the weighted likelihood methodology. Robust model fitting is achieved by a set of weighted likelihood estimating equations, based on the computation of data-dependent weights that aim to down-weight anomalous values, such as unexpected directions that do not share the main pattern of the bulk of the data. Weighted likelihood estimating equations with weights evaluated on the torus or obtained after unwrapping the data onto the Euclidean space are proposed and compared. Asymptotic properties and robustness features of the proposed estimators are studied, whereas their finite-sample behavior is investigated through Monte Carlo numerical experiments and real data examples.

Asta-Advances in Statistical Analysis, vol. 108, no. 4, pp. 853-888.
Citations: 0
Robust Bayesian small area estimation using the sub-Gaussian $\alpha$-stable distribution for measurement error in covariates
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-03-06, DOI: 10.1007/s10182-024-00493-3
Serena Arima, Shaho Zarei

In small area estimation, the sample size is so small that direct estimators seldom have adequate precision. Therefore, it is common to use auxiliary data via covariates and produce estimators that combine them with direct data. Nevertheless, it is not uncommon for covariates to be measured with error, leading to inconsistent estimators. Area-level models accounting for measurement error (ME) in covariates have been proposed, and they usually assume that the errors are i.i.d. Gaussian. However, there might be situations in which this assumption is violated, especially when covariates present severe outlying values that cannot be captured by the Gaussian distribution. To overcome this problem, we propose to model the ME through the sub-Gaussian $\alpha$-stable (SG$\alpha$S) distribution, a flexible distribution that accommodates different types of outlying observations and includes Gaussian data as a special case when $\alpha = 2$. The SG$\alpha$S distribution is a generalization of the Gaussian distribution that allows for skewness and heavy tails by adding an extra parameter, $\alpha \in (0,2]$, to control tail behaviour. The model parameters are estimated in a fully Bayesian framework. The performance of the proposal is illustrated through an application to real data and some simulation studies.

Asta-Advances in Statistical Analysis, vol. 108, no. 4, pp. 777-799.
Citations: 0
Post-processing for Bayesian analysis of reduced rank regression models with orthonormality restrictions
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2023-12-20, DOI: 10.1007/s10182-023-00489-5
Christian Aßmann, Jens Boysen-Hogrefe, Markus Pape

Orthonormality constraints are common in reduced rank models. They imply that matrix-variate parameters are given as orthonormal column vectors. However, these orthonormality restrictions do not provide identification for all parameters. For this setup, we show how the remaining identification issue can be handled in a Bayesian analysis via post-processing the sampling output according to an appropriately specified loss function. This extends the possibilities for Bayesian inference in reduced rank regression models with a part of the parameter space restricted to the Stiefel manifold. Besides inference, we also discuss model selection in terms of posterior predictive assessment. We illustrate the proposed approach with a simulation study and an empirical application.

Asta-Advances in Statistical Analysis, vol. 108, no. 3, pp. 577-609. PDF: https://link.springer.com/content/pdf/10.1007/s10182-023-00489-5.pdf
Citations: 0
Bayesian generalized additive model selection including a fast variational option
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2023-12-15, DOI: 10.1007/s10182-023-00490-y
Virginia X. He, Matt P. Wand

We use Bayesian model selection paradigms, such as group least absolute shrinkage and selection operator priors, to facilitate generalized additive model selection. Our approach allows for the effects of continuous predictors to be categorized as either zero, linear or non-linear. Employment of carefully tailored auxiliary variables results in Gibbsian Markov chain Monte Carlo schemes for practical implementation of the approach. In addition, mean field variational algorithms with closed form updates are obtained. Whilst not as accurate, this fast variational option enhances scalability to very large data sets. A package in the R language aids use in practice.

Asta-Advances in Statistical Analysis, vol. 108, no. 3, pp. 639-668.
Citations: 0
A note on sufficient dimension reduction with post dimension reduction statistical inference
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2023-12-13, DOI: 10.1007/s10182-023-00491-x
Kyongwon Kim

Sufficient dimension reduction is a widely used tool to extract core information hidden in high-dimensional data for classifying, clustering, and predicting response variables. Various dimension reduction methods and their applications have been introduced in the past decades. Data analysis using sufficient dimension reduction involves two steps: dimension reduction and model estimation. However, when we implement the two-step modeling process, we consider the estimated sufficient predictor as a true predictor variable and proceed to the model development step, which includes statistical inference such as estimating confidence intervals and performing hypothesis tests. However, the outcome obtained using this method is by no means complete because it contains errors only from the model estimation step. Therefore, post dimension reduction inference is an important topic because it is essential to consider errors from sufficient dimension reduction. In this paper, we review the fundamentals of sufficient dimension reduction methods. Then, we introduce an intuitive and heuristic approach for the recently developed post dimension reduction statistical inference.

Asta-Advances in Statistical Analysis, vol. 108, no. 4, pp. 733-753.
Citations: 0
Zero-modified count time series modeling with an application to influenza cases
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2023-11-27, DOI: 10.1007/s10182-023-00488-6
Marinho G. Andrade, Katiane S. Conceição, Nalini Ravishanker

The past few decades have seen considerable interest in modeling time series of counts, with applications in many domains. Classical and Bayesian modeling have primarily focused on conditional Poisson sampling distributions at each time. There is very little research on modeling time series involving Zero-Modified (i.e., Zero Deflated or Inflated) distributions. This paper aims to fill this gap and develop models for count time series involving Zero-Modified distributions, which belong to the Power Series family and are suitable for time series exhibiting both zero-inflation and zero-deflation. A full Bayesian approach via the Hamiltonian Monte Carlo (HMC) technique enables accurate modeling and inference. The paper illustrates our approach using time series on the number of deaths from the influenza virus in the city of São Paulo, Brazil.
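
To make the notion of a zero-modified distribution concrete, the sketch below uses the Poisson distribution, one member of the power series family, with a common zero-modified parameterization: a point mass of weight 1 - p at zero plus p times the Poisson probability. The paper's own parameterization may differ, and the names `poisson_pmf` and `zmp_pmf` are hypothetical. Under this form, p < 1 inflates the zero probability, p = 1 recovers the Poisson, and 1 < p <= 1 / (1 - exp(-mu)) deflates it.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability P(Y = y) with mean mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def zmp_pmf(y, mu, p):
    """Zero-modified Poisson probability under the assumed parameterization
    P(Y = y) = (1 - p) * 1[y == 0] + p * Poisson(y; mu)."""
    p_max = 1.0 / (1.0 - math.exp(-mu))  # largest p keeping P(Y = 0) >= 0
    if not 0.0 <= p <= p_max:
        raise ValueError("p must lie in [0, 1 / (1 - exp(-mu))]")
    point_mass = 1.0 - p if y == 0 else 0.0
    return point_mass + p * poisson_pmf(y, mu)

mu = 2.0
for p in (0.6, 1.0, 1.1):  # inflated, plain Poisson, deflated
    print(f"p = {p}: P(Y = 0) = {zmp_pmf(0, mu, p):.4f}")
```

A single parameter p therefore covers both too many and too few zeros relative to the baseline count distribution, which is why one model can serve time series exhibiting either behaviour.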

{"title":"Zero-modified count time series modeling with an application to influenza cases","authors":"Marinho G. Andrade,&nbsp;Katiane S. Conceição,&nbsp;Nalini Ravishanker","doi":"10.1007/s10182-023-00488-6","DOIUrl":"10.1007/s10182-023-00488-6","url":null,"abstract":"<div><p>The past few decades have seen considerable interest in modeling time series of counts, with applications in many domains. Classical and Bayesian modeling have primarily focused on conditional Poisson sampling distributions at each time. There is very little research on modeling time series involving Zero-Modified (i.e., Zero Deflated or Inflated) distributions. This paper aims to fill this gap and develop models for count time series involving Zero-Modified distributions, which belong to the Power Series family and are suitable for time series exhibiting both zero-inflation and zero-deflation. A full Bayesian approach via the Hamiltonian Monte Carlo (HMC) technique enables accurate modeling and inference. The paper illustrates our approach using time series on the number of deaths from the influenza virus in the city of São Paulo, Brazil.</p></div>","PeriodicalId":55446,"journal":{"name":"Asta-Advances in Statistical Analysis","volume":"108 3","pages":"611 - 637"},"PeriodicalIF":1.4,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138506562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Mixtures of generalized normal distributions and EGARCH models to analyse returns and volatility of ESG and traditional investments
IF 1.4 Zone 4 Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2023-11-18 DOI: 10.1007/s10182-023-00487-7
Pierdomenico Duttilo, Stefano Antonio Gattone, Barbara Iannone

Environmental, social and governance (ESG) criteria are increasingly integrated into the investment process to help overcome global sustainability challenges. Focusing on the reaction to turmoil periods, this work analyses the returns and volatility of several ESG indices and compares them with their traditional counterparts from 2016 to 2022. These indices cover the following markets: Global, the US, Europe and emerging markets. First, a two-component mixture of generalized normal distributions was used to objectively detect financial market turmoil periods with the naïve Bayes classifier. Second, the EGARCH-in-mean model with exogenous dummy variables was applied to capture the impact of turmoil periods. Results show that returns and volatility are both affected by turmoil periods. The return–risk performance differs by index type and market: the European ESG index is less volatile than its traditional market benchmark, while in the other markets the estimated volatility is approximately the same. Moreover, ESG and non-ESG indices differ in terms of turmoil-period impact, risk premium and leverage effect.
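To make the second modeling step more concrete, here is a minimal simulation sketch of an EGARCH(1,1)-in-mean recursion with a turmoil dummy. It assumes standard normal innovations (the paper works with generalized normal mixtures), a risk premium proportional to the conditional variance, and a dummy that enters both the mean and the log-variance equations. The function name, the parameter names and these modeling choices are hypothetical and are not taken from the paper.

```python
import math
import random


def simulate_egarch_m(n, omega, alpha, gamma, beta, mu, risk_premium,
                      dummy, delta_mean=0.0, delta_var=0.0, seed=0):
    """Simulate returns and conditional variances (illustrative sketch).

    log(var_t) = omega + beta * log(var_{t-1})
                 + alpha * (|z_{t-1}| - E|z|) + gamma * z_{t-1}
                 + delta_var * dummy[t]
    r_t = mu + risk_premium * var_t + delta_mean * dummy[t] + sqrt(var_t) * z_t
    A negative gamma gives the leverage effect: negative shocks raise
    volatility more than positive shocks of the same size.
    """
    if abs(beta) >= 1.0:
        raise ValueError("beta must satisfy |beta| < 1")
    rng = random.Random(seed)
    e_abs_z = math.sqrt(2.0 / math.pi)  # E|z| for a standard normal z
    log_var = omega / (1.0 - beta)      # start at the long-run level
    z_prev = None
    returns, variances = [], []
    for t in range(n):
        if z_prev is not None:
            log_var = (omega + beta * log_var
                       + alpha * (abs(z_prev) - e_abs_z)
                       + gamma * z_prev
                       + delta_var * dummy[t])
        var_t = math.exp(log_var)       # positive by construction
        z = rng.gauss(0.0, 1.0)
        returns.append(mu + risk_premium * var_t
                       + delta_mean * dummy[t] + math.sqrt(var_t) * z)
        variances.append(var_t)
        z_prev = z
    return returns, variances


if __name__ == "__main__":
    turmoil = [1 if 100 <= t < 150 else 0 for t in range(300)]
    r, v = simulate_egarch_m(300, -0.1, 0.1, -0.05, 0.9, 0.0, 0.05,
                             turmoil, delta_mean=-0.2, delta_var=0.1)
    print(len(r), min(v) > 0.0)
```

Because the recursion is written for the logarithm of the variance, no positivity constraints on the coefficients are needed, which is one practical reason the exponential form is often chosen over a plain GARCH specification.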

Pierdomenico Duttilo, Stefano Antonio Gattone, Barbara Iannone. Mixtures of generalized normal distributions and EGARCH models to analyse returns and volatility of ESG and traditional investments. Asta-Advances in Statistical Analysis, vol. 108, no. 4, pp. 755-775. Published 2023-11-18. DOI: 10.1007/s10182-023-00487-7. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10182-023-00487-7.pdf
Citations: 0