Jittering and clustering: strategies for the construction of robust designs
Pub Date : 2024-06-04 | DOI: 10.1007/s11222-024-10436-2
Douglas P. Wiens
We discuss, and give examples of, methods for randomly implementing some minimax robust designs from the literature. Compared with their deterministic counterparts, these designs have the advantage of bounded maximum loss in large and very rich neighbourhoods of the (almost certainly inexact) response model fitted by the experimenter. Their maximum loss rivals that of the theoretically best possible, but not implementable, minimax designs. The procedures are then extended to more general robust designs. For two-dimensional designs we sample from contractions of Voronoi tessellations, generated by selected basis points, which partition the design space. These ideas are then extended to k-dimensional designs for general k.
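The sketch below illustrates the sampling idea in the abstract: candidate points are assigned to the Voronoi cell of their nearest basis point, and a draw from each cell is contracted toward the generating basis point. It is a minimal illustration under assumed names (`jittered_design`, `contraction`), not Wiens' exact construction.

```python
# Illustrative sketch: draw a "jittered" design by sampling from contractions
# of the Voronoi cells of selected basis points. Cell membership is decided by
# nearest-neighbour lookup; this is not the paper's exact procedure.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)

def jittered_design(basis, contraction=0.5, n_candidates=20000):
    """One design point per basis point, drawn from its contracted Voronoi cell.

    basis       : (m, k) array of basis points in the unit cube [0, 1]^k
    contraction : factor in (0, 1]; draws are pulled toward the cell's basis point
    """
    tree = cKDTree(basis)
    m, k = basis.shape
    design = np.empty_like(basis)
    for j in range(m):
        while True:
            # rejection sampling: uniform candidates, keep those landing in cell j
            cand = rng.uniform(0.0, 1.0, size=(n_candidates, k))
            _, owner = tree.query(cand)
            in_cell = cand[owner == j]
            if len(in_cell) > 0:
                x = in_cell[rng.integers(len(in_cell))]
                break
        # contract the sampled point toward the generating basis point
        design[j] = basis[j] + contraction * (x - basis[j])
    return design

basis = rng.uniform(size=(8, 2))          # selected basis points in [0, 1]^2
print(jittered_design(basis, contraction=0.7))
```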
{"title":"Jittering and clustering: strategies for the construction of robust designs","authors":"Douglas P. Wiens","doi":"10.1007/s11222-024-10436-2","DOIUrl":"https://doi.org/10.1007/s11222-024-10436-2","url":null,"abstract":"<p>We discuss, and give examples of, methods for randomly implementing some minimax robust designs from the literature. These have the advantage, over their deterministic counterparts, of having bounded maximum loss in large and very rich neighbourhoods of the, almost certainly inexact, response model fitted by the experimenter. Their maximum loss rivals that of the theoretically best possible, but not implementable, minimax designs. The procedures are then extended to more general robust designs. For two-dimensional designs we sample from contractions of Voronoi tessellations, generated by selected basis points, which partition the design space. These ideas are then extended to <i>k</i>-dimensional designs for general <i>k</i>.</p>","PeriodicalId":22058,"journal":{"name":"Statistics and Computing","volume":"418 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141259028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Testing the goodness-of-fit of the stable distributions with applications to German stock index data and Bitcoin cryptocurrency data
Pub Date : 2024-06-03 | DOI: 10.1007/s11222-024-10441-5
Ruhul Ali Khan, Ayan Pal, Debasis Kundu
Outlier-prone data sets are of immense interest in diverse areas including economics, finance, statistical physics, signal processing, telecommunications and so on. Stable laws (also known as $\alpha$-stable laws) are often found to be useful in modeling outlier-prone data containing important information and exhibiting heavy-tailed phenomena. In this article, an asymptotic distribution of an unbiased and consistent estimator of the stability index $\alpha$ is proposed based on the jackknife empirical likelihood (JEL) and adjusted JEL methods. Next, using the sum-preserving property of stable random variables and exploiting U-statistic theory, we develop a goodness-of-fit test procedure for $\alpha$-stable distributions where the stability index $\alpha$ is specified. Extensive simulation studies are performed in order to assess the finite sample performance of the proposed test. Finally, two appealing real life data examples related to the daily closing prices of the German Stock Index and the Bitcoin cryptocurrency are analysed in detail for illustration purposes.
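The sum-preserving property the test exploits can be checked empirically: if the data are $\alpha$-stable, block sums rescaled by $m^{-1/\alpha}$ follow the same law as the original observations. The snippet below demonstrates that property with a naive two-sample Kolmogorov-Smirnov comparison; it is not the authors' JEL/U-statistic procedure, only the idea behind it.

```python
# Naive check of the stability property: rescaled block sums of alpha-stable
# draws should be indistinguishable from the original sample. This is NOT the
# paper's JEL/U-statistic goodness-of-fit test, just the underlying identity.
import numpy as np
from scipy.stats import levy_stable, ks_2samp

rng = np.random.default_rng(1)
alpha0 = 1.5                                  # hypothesised stability index
x = levy_stable.rvs(alpha0, beta=0.0, size=20000, random_state=rng)

m = 4
sums = x[: (len(x) // m) * m].reshape(-1, m).sum(axis=1)
rescaled = sums / m ** (1.0 / alpha0)         # stability: same law as X_1

stat, pval = ks_2samp(x, rescaled)
print(f"KS statistic = {stat:.4f}, p-value = {pval:.3f}")  # large p: consistent
```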
{"title":"Testing the goodness-of-fit of the stable distributions with applications to German stock index data and Bitcoin cryptocurrency data","authors":"Ruhul Ali Khan, Ayan Pal, Debasis Kundu","doi":"10.1007/s11222-024-10441-5","DOIUrl":"https://doi.org/10.1007/s11222-024-10441-5","url":null,"abstract":"<p>Outlier-prone data sets are of immense interest in diverse areas including economics, finance, statistical physics, signal processing, telecommunications and so on. Stable laws (also known as <span>(alpha )</span>- stable laws) are often found to be useful in modeling outlier-prone data containing important information and exhibiting heavy tailed phenomenon. In this article, an asymptotic distribution of a unbiased and consistent estimator of the stability index <span>(alpha )</span> is proposed based on jackknife empirical likelihood (JEL) and adjusted JEL method. Next, using the sum-preserving property of stable random variables and exploiting <i>U</i>-statistic theory, we have developed a goodness-of-fit test procedure for <span>(alpha )</span>-stable distributions where the stability index <span>(alpha )</span> is specified. Extensive simulation studies are performed in order to assess the finite sample performance of the proposed test. Finally, two appealing real life data examples related to the daily closing price of German Stock Index and Bitcoin cryptocurrency are analysed in detail for illustration purposes.</p>","PeriodicalId":22058,"journal":{"name":"Statistics and Computing","volume":"75 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141259103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Insufficient Gibbs sampling
Pub Date : 2024-05-31 | DOI: 10.1007/s11222-024-10423-7
Antoine Luciano, Christian P. Robert, Robin J. Ryder
In some applied scenarios, the availability of complete data is restricted, often due to privacy concerns; only aggregated, robust and inefficient statistics derived from the data are made accessible. These robust statistics are not sufficient, but they demonstrate reduced sensitivity to outliers and offer enhanced data protection due to their higher breakdown point. We consider a parametric framework and propose a method to sample from the posterior distribution of parameters conditioned on various robust and inefficient statistics: specifically, the pairs (median, MAD) or (median, IQR), or a collection of quantiles. Our approach leverages a Gibbs sampler and simulates latent augmented data, which facilitates simulation from the posterior distribution of parameters belonging to specific families of distributions. A by-product of these samples from the joint posterior distribution of parameters and data given the observed statistics is that we can estimate Bayes factors based on observed statistics via bridge sampling. We validate and outline the limitations of the proposed methods through toy examples and an application to real-world income data.
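As a rough illustration of the data-augmentation idea, the toy sketch below targets a normal model given only an observed (median, MAD) pair: latent data are resampled and then shifted/rescaled to match the observed statistics exactly, after which $(\mu, \sigma^2)$ are drawn from the usual conjugate full conditionals. This is a crude approximation for intuition only, not the exact Gibbs kernel of Luciano, Robert and Ryder.

```python
# Crude sketch of augmented-data sampling for a normal model given only
# (median, MAD). The shift/rescale step is a simplification; the paper's
# kernel conditions on the statistics exactly via tailored latent updates.
import numpy as np

rng = np.random.default_rng(2)
n, med_obs, mad_obs = 200, 1.3, 0.8        # observed summaries (made up)
mu, sigma = 0.0, 1.0                       # initial parameter values

samples = []
for it in range(5000):
    # 1. simulate latent data, then force it to match (median, MAD)
    z = rng.normal(mu, sigma, size=n)
    med = np.median(z)
    z = med_obs + mad_obs * (z - med) / np.median(np.abs(z - med))
    # 2. conjugate-style draws given the augmented data (flat-ish priors)
    sigma2 = 1.0 / rng.gamma((n - 1) / 2.0, 2.0 / np.sum((z - z.mean()) ** 2))
    mu = rng.normal(z.mean(), np.sqrt(sigma2 / n))
    sigma = np.sqrt(sigma2)
    samples.append((mu, sigma))

print(np.mean(samples, axis=0))            # posterior-ish means of (mu, sigma)
```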
{"title":"Insufficient Gibbs sampling","authors":"Antoine Luciano, Christian P. Robert, Robin J. Ryder","doi":"10.1007/s11222-024-10423-7","DOIUrl":"https://doi.org/10.1007/s11222-024-10423-7","url":null,"abstract":"<p>In some applied scenarios, the availability of complete data is restricted, often due to privacy concerns; only aggregated, robust and inefficient statistics derived from the data are made accessible. These robust statistics are not sufficient, but they demonstrate reduced sensitivity to outliers and offer enhanced data protection due to their higher breakdown point. We consider a parametric framework and propose a method to sample from the posterior distribution of parameters conditioned on various robust and inefficient statistics: specifically, the pairs (median, MAD) or (median, IQR), or a collection of quantiles. Our approach leverages a Gibbs sampler and simulates latent augmented data, which facilitates simulation from the posterior distribution of parameters belonging to specific families of distributions. A by-product of these samples from the joint posterior distribution of parameters and data given the observed statistics is that we can estimate Bayes factors based on observed statistics via bridge sampling. We validate and outline the limitations of the proposed methods through toy examples and an application to real-world income data.</p>","PeriodicalId":22058,"journal":{"name":"Statistics and Computing","volume":"94 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141190162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimization of the generalized covariance estimator in noncausal processes
Pub Date : 2024-05-31 | DOI: 10.1007/s11222-024-10437-1
Gianluca Cubadda, Francesco Giancaterini, Alain Hecq, Joann Jasiak
This paper investigates the performance of routinely used optimization algorithms in application to the Generalized Covariance estimator (GCov) for univariate and multivariate mixed causal and noncausal models. The GCov is a semi-parametric estimator with an objective function based on nonlinear autocovariances to identify causal and noncausal orders. When the number and type of nonlinear autocovariances included in the objective function are insufficient/inadequate, or the error density is too close to the Gaussian, identification issues can arise. These issues result in local minima in the objective function, which correspond to parameter values associated with incorrect causal and noncausal orders. Then, depending on the starting point and the optimization algorithm employed, the algorithm can converge to a local minimum. The paper proposes the Simulated Annealing (SA) optimization algorithm as an alternative to conventional numerical optimization methods. The results demonstrate that SA performs well in its application to mixed causal and noncausal models, successfully eliminating the effects of local minima. The proposed approach is illustrated by an empirical study of a bivariate series of commodity prices.
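The paper's point is easy to reproduce on a toy multimodal objective (standing in for the GCov criterion): a gradient-based local optimizer started at an unlucky point stops at a local minimum, while simulated annealing reaches the global one. The sketch uses scipy's `dual_annealing`; the authors' own SA implementation may differ.

```python
# Toy illustration: local gradient descent gets trapped in a local minimum of
# a multimodal objective, while simulated annealing escapes it.
import numpy as np
from scipy.optimize import minimize, dual_annealing

def objective(theta):
    # Rastrigin-type surface: many local minima, global minimum at theta = 0
    return np.sum(theta ** 2) + 10.0 * np.sum(1.0 - np.cos(2.0 * np.pi * theta))

bounds = [(-5.0, 5.0)] * 2
x0 = np.array([3.7, -2.9])                       # unlucky starting point

local = minimize(objective, x0, method="BFGS")   # gradient-based: gets trapped
annealed = dual_annealing(objective, bounds, seed=3)

print("BFGS from x0:  ", local.x, objective(local.x))
print("dual_annealing:", annealed.x, objective(annealed.x))
```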
{"title":"Optimization of the generalized covariance estimator in noncausal processes","authors":"Gianluca Cubadda, Francesco Giancaterini, Alain Hecq, Joann Jasiak","doi":"10.1007/s11222-024-10437-1","DOIUrl":"https://doi.org/10.1007/s11222-024-10437-1","url":null,"abstract":"<p>This paper investigates the performance of routinely used optimization algorithms in application to the Generalized Covariance estimator (<i>GCov</i>) for univariate and multivariate mixed causal and noncausal models. The <i>GCov</i> is a semi-parametric estimator with an objective function based on nonlinear autocovariances to identify causal and noncausal orders. When the number and type of nonlinear autocovariances included in the objective function are insufficient/inadequate, or the error density is too close to the Gaussian, identification issues can arise. These issues result in local minima in the objective function, which correspond to parameter values associated with incorrect causal and noncausal orders. Then, depending on the starting point and the optimization algorithm employed, the algorithm can converge to a local minimum. The paper proposes the Simulated Annealing (SA) optimization algorithm as an alternative to conventional numerical optimization methods. The results demonstrate that SA performs well in its application to mixed causal and noncausal models, successfully eliminating the effects of local minima. The proposed approach is illustrated by an empirical study of a bivariate series of commodity prices.</p>","PeriodicalId":22058,"journal":{"name":"Statistics and Computing","volume":"2010 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141190197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A modified EM-type algorithm to estimate semi-parametric mixtures of non-parametric regressions
Pub Date : 2024-05-29 | DOI: 10.1007/s11222-024-10435-3
Sphiwe B. Skhosana, Salomon M. Millard, Frans H. J. Kanfer
Semi-parametric Gaussian mixtures of non-parametric regressions (SPGMNRs) are a flexible extension of Gaussian mixtures of linear regressions (GMLRs). The model assumes that the component regression functions (CRFs) are non-parametric functions of the covariate(s), whereas the component mixing proportions and variances are constants. Unfortunately, the model cannot be reliably estimated using traditional methods. A local-likelihood approach for estimating the CRFs requires that we maximize a set of local-likelihood functions. Using the Expectation-Maximization (EM) algorithm to separately maximize each local-likelihood function may lead to label-switching, because the posterior probabilities calculated at the local E-step are not guaranteed to be aligned. The consequence of this label-switching is wiggly and non-smooth estimates of the CRFs. In this paper, we propose a unified two-stage approach to address label-switching and obtain sensible estimates. In the first stage, we propose a model-based approach to the label-switching problem. We first note that each local-likelihood function is the likelihood function of a Gaussian mixture model (GMM). Next, we reformulate the SPGMNRs model as a mixture of these GMMs. Lastly, using a modified version of the Expectation Conditional Maximization (ECM) algorithm, we estimate the mixture of GMMs. In addition, using the mixing weights of the local GMMs, we can automatically choose the local points at which local-likelihood estimation takes place. In the second stage, we propose one-step backfitting estimates of the parametric and non-parametric terms. The effectiveness of the proposed approach is demonstrated through analyses of simulated and real data.
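The building block the paper reuses, each local likelihood being an ordinary GMM likelihood, is just EM for a Gaussian mixture. A minimal two-component univariate version is sketched below for orientation; the full two-stage SPGMNR procedure (local likelihoods, ECM over the mixture of GMMs, backfitting) is not reproduced.

```python
# Minimal EM for a two-component univariate Gaussian mixture: the GMM
# likelihood that each local-likelihood function in the paper reduces to.
import numpy as np

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])

pi, mu, sd = 0.5, np.array([-1.0, 1.0]), np.array([1.0, 1.0])
for it in range(200):
    # E-step: responsibilities for component 1 (normal constants cancel)
    d0 = (1 - pi) * np.exp(-0.5 * ((x - mu[0]) / sd[0]) ** 2) / sd[0]
    d1 = pi * np.exp(-0.5 * ((x - mu[1]) / sd[1]) ** 2) / sd[1]
    r = d1 / (d0 + d1)
    # M-step: weighted updates of mixing proportion, means, and variances
    pi = r.mean()
    mu = np.array([np.average(x, weights=1 - r), np.average(x, weights=r)])
    sd = np.sqrt([np.average((x - mu[0]) ** 2, weights=1 - r),
                  np.average((x - mu[1]) ** 2, weights=r)])

print(pi, mu, sd)
```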
{"title":"A modified EM-type algorithm to estimate semi-parametric mixtures of non-parametric regressions","authors":"Sphiwe B. Skhosana, Salomon M. Millard, Frans H. J. Kanfer","doi":"10.1007/s11222-024-10435-3","DOIUrl":"https://doi.org/10.1007/s11222-024-10435-3","url":null,"abstract":"<p>Semi-parametric Gaussian mixtures of non-parametric regressions (SPGMNRs) are a flexible extension of Gaussian mixtures of linear regressions (GMLRs). The model assumes that the component regression functions (CRFs) are non-parametric functions of the covariate(s) whereas the component mixing proportions and variances are constants. Unfortunately, the model cannot be reliably estimated using traditional methods. A local-likelihood approach for estimating the CRFs requires that we maximize a set of local-likelihood functions. Using the Expectation-Maximization (EM) algorithm to separately maximize each local-likelihood function may lead to label-switching. This is because the posterior probabilities calculated at the local E-step are not guaranteed to be aligned. The consequence of this label-switching is wiggly and non-smooth estimates of the CRFs. In this paper, we propose a unified approach to address label-switching and obtain sensible estimates. The proposed approach has two stages. In the first stage, we propose a model-based approach to address the label-switching problem. We first note that each local-likelihood function is a likelihood function of a Gaussian mixture model (GMM). Next, we reformulate the SPGMNRs model as a mixture of these GMMs. Lastly, using a modified version of the Expectation Conditional Maximization (ECM) algorithm, we estimate the mixture of GMMs. In addition, using the mixing weights of the local GMMs, we can automatically choose the local points where local-likelihood estimation takes place. In the second stage, we propose one-step backfitting estimates of the parametric and non-parametric terms. The effectiveness of the proposed approach is demonstrated on simulated data and real data analysis.</p>","PeriodicalId":22058,"journal":{"name":"Statistics and Computing","volume":"62 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141166408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Generalized fused Lasso for grouped data in generalized linear models
Pub Date : 2024-05-25 | DOI: 10.1007/s11222-024-10433-5
Mineaki Ohishi
Generalized fused Lasso (GFL) is a powerful method based on adjacent relationships or the network structure of data. It is used in a number of research areas, including clustering, discrete smoothing, and spatio-temporal analysis. When applying GFL, the specific optimization method used is an important issue. In generalized linear models, efficient algorithms based on the coordinate descent method have been developed for trend filtering under the binomial and Poisson distributions. However, to apply GFL to other distributions, such as the negative binomial distribution, which is used to deal with overdispersion in the Poisson distribution, or the gamma and inverse Gaussian distributions, which are used for positive continuous data, an algorithm for each individual distribution must be developed. To unify GFL for distributions in the exponential family, this paper proposes a coordinate descent algorithm for generalized linear models. To illustrate the method, a real data example of spatio-temporal analysis is provided.
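For orientation, the sketch below shows what a coordinate descent update looks like in the simplest setting: the Gaussian fused lasso on a chain, where each coordinate objective is piecewise quadratic and its minimizer can be found by checking the kinks at the neighbours plus each piece's stationary point. This is the Gaussian special case only, not the paper's exponential-family algorithm.

```python
# Coordinate descent for the Gaussian chain fused lasso:
#   minimize 0.5 * sum_i (y_i - t_i)^2 + lam * sum_i |t_i - t_{i+1}|.
# Each coordinate objective is convex piecewise quadratic, so the exact
# coordinate minimizer is among the kinks (neighbour values) and the
# stationary points of the quadratic pieces.
# Caveat: plain coordinate descent can stall on this non-separable penalty;
# practical algorithms add fusion/merge moves (Friedman et al. 2007).
import numpy as np

def fused_lasso_chain(y, lam, n_sweeps=200):
    t = y.astype(float)
    n = len(t)
    for _ in range(n_sweeps):
        for i in range(n):
            kinks = [t[j] for j in (i - 1, i + 1) if 0 <= j < n]
            cands = set(kinks)
            # stationary point of each piece: v = y_i - lam * s, where s is
            # the subgradient sign sum on that piece
            for s in ([-1, 1] if len(kinks) == 1 else [-2, 0, 2]):
                cands.add(y[i] - lam * s)
            def f(v):
                return 0.5 * (y[i] - v) ** 2 + lam * sum(abs(v - k) for k in kinks)
            t[i] = min(cands, key=f)
    return t

rng = np.random.default_rng(5)
y = np.concatenate([np.full(20, 0.0), np.full(20, 2.0)]) + rng.normal(0, 0.3, 40)
print(fused_lasso_chain(y, lam=1.0).round(2))   # recovers the two blocks
```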
{"title":"Generalized fused Lasso for grouped data in generalized linear models","authors":"Mineaki Ohishi","doi":"10.1007/s11222-024-10433-5","DOIUrl":"https://doi.org/10.1007/s11222-024-10433-5","url":null,"abstract":"<p>Generalized fused Lasso (GFL) is a powerful method based on adjacent relationships or the network structure of data. It is used in a number of research areas, including clustering, discrete smoothing, and spatio-temporal analysis. When applying GFL, the specific optimization method used is an important issue. In generalized linear models, efficient algorithms based on the coordinate descent method have been developed for trend filtering under the binomial and Poisson distributions. However, to apply GFL to other distributions, such as the negative binomial distribution, which is used to deal with overdispersion in the Poisson distribution, or the gamma and inverse Gaussian distributions, which are used for positive continuous data, an algorithm for each individual distribution must be developed. To unify GFL for distributions in the exponential family, this paper proposes a coordinate descent algorithm for generalized linear models. To illustrate the method, a real data example of spatio-temporal analysis is provided.</p>","PeriodicalId":22058,"journal":{"name":"Statistics and Computing","volume":"17 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2024-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141153778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Type I Tobit Bayesian Additive Regression Trees for censored outcome regression
Pub Date : 2024-05-24 | DOI: 10.1007/s11222-024-10434-4
Eoghan O’Neill
Censoring occurs when an outcome is unobserved beyond some threshold value. Methods that do not account for censoring produce biased predictions of the unobserved outcome. This paper introduces Type I Tobit Bayesian Additive Regression Tree (TOBART-1) models for censored outcomes. Simulation results and real data applications demonstrate that TOBART-1 produces accurate predictions of censored outcomes. TOBART-1 provides posterior intervals for the conditional expectation and other quantities of interest. The error term distribution can have a large impact on the expectation of the censored outcome. Therefore, the error is flexibly modeled as a Dirichlet process mixture of normal distributions. An R package is available at https://github.com/EoghanONeill/TobitBART.
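The step at the heart of Type I Tobit samplers is data augmentation: each censored observation's latent outcome is drawn from a normal distribution truncated to the censored region, and the model is then refit on the augmented data. The toy below shows one such augmentation step with a fixed normal error, purely to show the mechanism; TOBART-1 embeds this inside BART with a Dirichlet-process-mixture error instead.

```python
# One Tobit-I data-augmentation step: replace left-censored outcomes with
# draws from a normal truncated above at the censoring threshold. TOBART-1
# uses this device within BART and a DP-mixture error; this is a toy version.
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(6)
c = 0.0                                   # left-censoring threshold
y_star = rng.normal(0.5, 1.0, size=500)   # latent outcomes
y = np.maximum(y_star, c)                 # observed: values below c recorded as c
censored = y_star < c

mu, sigma = y.mean(), y.std()             # crude current fit (illustrative)
# latent draws for censored cases: N(mu, sigma^2) truncated to (-inf, c]
b = (c - mu) / sigma
y_aug = y.copy()
y_aug[censored] = truncnorm.rvs(-np.inf, b, loc=mu, scale=sigma,
                                size=censored.sum(), random_state=rng)
print(y.mean(), y_aug.mean())             # augmentation restores the left tail
```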
{"title":"Type I Tobit Bayesian Additive Regression Trees for censored outcome regression","authors":"Eoghan O’Neill","doi":"10.1007/s11222-024-10434-4","DOIUrl":"https://doi.org/10.1007/s11222-024-10434-4","url":null,"abstract":"<p>Censoring occurs when an outcome is unobserved beyond some threshold value. Methods that do not account for censoring produce biased predictions of the unobserved outcome. This paper introduces Type I Tobit Bayesian Additive Regression Tree (TOBART-1) models for censored outcomes. Simulation results and real data applications demonstrate that TOBART-1 produces accurate predictions of censored outcomes. TOBART-1 provides posterior intervals for the conditional expectation and other quantities of interest. The error term distribution can have a large impact on the expectation of the censored outcome. Therefore, the error is flexibly modeled as a Dirichlet process mixture of normal distributions. An R package is available at https://github.com/EoghanONeill/TobitBART.\u0000</p>","PeriodicalId":22058,"journal":{"name":"Statistics and Computing","volume":"46 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2024-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141150197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fused lasso nearly-isotonic signal approximation in general dimensions
Pub Date : 2024-05-22 | DOI: 10.1007/s11222-024-10432-6
Vladimir Pastukhov
In this paper, we introduce and study fused lasso nearly-isotonic signal approximation, which is a combination of fused lasso and generalized nearly-isotonic regression. We show how these three estimators relate to each other and derive the solution to the general problem. Our estimator is computationally feasible and provides a trade-off between monotonicity, block sparsity, and goodness-of-fit. Next, we prove that fusion and near-isotonisation in the one-dimensional case can be applied interchangeably, and that this step-wise procedure gives the solution to the original optimization problem. This property of the estimator is very important, because it provides a direct way to construct a path solution when one of the penalization parameters is fixed. We also derive an unbiased estimator of the degrees of freedom of the estimator.
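In the one-dimensional case the combined criterion the abstract describes can be written down and solved directly as a convex program, which is useful as a reference when checking the step-wise fusion/near-isotonisation property. A sketch with cvxpy (assumed penalty weights `lam_f`, `lam_i`):

```python
# Direct cvxpy formulation of the 1-D fused lasso nearly-isotonic criterion:
#   min_b 0.5 * ||y - b||^2 + lam_f * sum |b_{i+1} - b_i|
#                          + lam_i * sum max(b_i - b_{i+1}, 0)
# (the second penalty charges only for violations of monotonicity).
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(7)
y = np.sort(rng.normal(size=50)) + rng.normal(0, 0.2, 50)  # noisy increasing signal

b = cp.Variable(50)
lam_f, lam_i = 0.1, 1.0
obj = (0.5 * cp.sum_squares(y - b)
       + lam_f * cp.norm1(cp.diff(b))          # fusion penalty
       + lam_i * cp.sum(cp.pos(-cp.diff(b))))  # nearly-isotonic penalty
cp.Problem(cp.Minimize(obj)).solve()
print(np.round(b.value, 2))
```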
{"title":"Fused lasso nearly-isotonic signal approximation in general dimensions","authors":"Vladimir Pastukhov","doi":"10.1007/s11222-024-10432-6","DOIUrl":"https://doi.org/10.1007/s11222-024-10432-6","url":null,"abstract":"<p>In this paper, we introduce and study fused lasso nearly-isotonic signal approximation, which is a combination of fused lasso and generalized nearly-isotonic regression. We show how these three estimators relate to each other and derive solution to a general problem. Our estimator is computationally feasible and provides a trade-off between monotonicity, block sparsity, and goodness-of-fit. Next, we prove that fusion and near-isotonisation in a one-dimensional case can be applied interchangably, and this step-wise procedure gives the solution to the original optimization problem. This property of the estimator is very important, because it provides a direct way to construct a path solution when one of the penalization parameters is fixed. Also, we derive an unbiased estimator of degrees of freedom of the estimator.</p>","PeriodicalId":22058,"journal":{"name":"Statistics and Computing","volume":"48 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2024-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141150213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bayesian cross-validation by parallel Markov chain Monte Carlo
Pub Date : 2024-05-21 | DOI: 10.1007/s11222-024-10404-w
Alex Cooper, Aki Vehtari, Catherine Forbes, Dan Simpson, Lauren Kennedy
Cross-validation (CV) is a general method for predictive assessment and model selection, applicable to a wide range of Bayesian models. Naive or 'brute force' CV approaches are often too computationally costly for interactive modeling workflows, especially when inference relies on Markov chain Monte Carlo (MCMC). We propose overcoming this limitation using massively parallel MCMC. Using accelerator hardware such as graphics processing units (GPUs), our approach can be about as fast (in wall clock time) as a single full-data model fit. Parallel CV is flexible because it can easily exploit a wide range of data partitioning schemes, such as those designed for non-exchangeable data. It can also accommodate a range of scoring rules. We propose MCMC diagnostics, including a summary of MCMC mixing based on the popular potential scale reduction factor ($\widehat{R}$) and MCMC effective sample size ($\widehat{\mathrm{ESS}}$) measures. We also describe a method for determining whether an $\widehat{R}$ diagnostic indicates approximate stationarity of the chains, which may be of more general interest for applications beyond parallel CV. Finally, we show that parallel CV and its diagnostics can be implemented with online algorithms, allowing parallel CV to scale up to very large blocking designs on memory-constrained computing accelerators.
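The $\widehat{R}$ diagnostic mentioned above has a standard between/within-chain variance form, computable in a few lines; the paper's online variant for massively parallel CV is more elaborate, but the basic quantity is this:

```python
# Standard potential scale reduction factor R-hat from parallel chains
# (between/within-variance form; not the paper's online implementation).
import numpy as np

def rhat(chains):
    """chains: (n_chains, n_draws) array of one scalar quantity."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)            # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()      # within-chain variance
    var_plus = (n - 1) / n * W + B / n         # pooled variance estimate
    return np.sqrt(var_plus / W)

rng = np.random.default_rng(8)
good = rng.normal(size=(4, 1000))              # four well-mixed chains
bad = good + np.array([[0.0], [0.0], [0.0], [3.0]])  # one chain off-target
print(rhat(good), rhat(bad))                   # ~1.00 vs. clearly > 1.01
```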
{"title":"Bayesian cross-validation by parallel Markov chain Monte Carlo","authors":"Alex Cooper, Aki Vehtari, Catherine Forbes, Dan Simpson, Lauren Kennedy","doi":"10.1007/s11222-024-10404-w","DOIUrl":"https://doi.org/10.1007/s11222-024-10404-w","url":null,"abstract":"<p>Brute force cross-validation (CV) is a method for predictive assessment and model selection that is general and applicable to a wide range of Bayesian models. Naive or ‘brute force’ CV approaches are often too computationally costly for interactive modeling workflows, especially when inference relies on Markov chain Monte Carlo (MCMC). We propose overcoming this limitation using massively parallel MCMC. Using accelerator hardware such as graphics processor units, our approach can be about as fast (in wall clock time) as a single full-data model fit. Parallel CV is flexible because it can easily exploit a wide range data partitioning schemes, such as those designed for non-exchangeable data. It can also accommodate a range of scoring rules. We propose MCMC diagnostics, including a summary of MCMC mixing based on the popular potential scale reduction factor (<span>(widehat{textrm{R}})</span>) and MCMC effective sample size (<span>(widehat{textrm{ESS}})</span>) measures. We also describe a method for determining whether an <span>(widehat{textrm{R}})</span> diagnostic indicates approximate stationarity of the chains, that may be of more general interest for applications beyond parallel CV. Finally, we show that parallel CV and its diagnostics can be implemented with online algorithms, allowing parallel CV to scale up to very large blocking designs on memory-constrained computing accelerators.</p>","PeriodicalId":22058,"journal":{"name":"Statistics and Computing","volume":"60 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2024-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141150196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spike and slab Bayesian sparse principal component analysis
Pub Date : 2024-05-13 | DOI: 10.1007/s11222-024-10430-8
Yu-Chien Bo Ning, Ning Ning
Sparse principal component analysis (SPCA) is a popular tool for dimensionality reduction in high-dimensional data. However, there is still a lack of theoretically justified Bayesian SPCA methods that scale well computationally. One of the major challenges in Bayesian SPCA is selecting an appropriate prior for the loadings matrix, considering that principal components are mutually orthogonal. We propose a novel parameter-expanded coordinate ascent variational inference (PX-CAVI) algorithm. This algorithm utilizes a spike and slab prior, which incorporates parameter expansion to cope with the orthogonality constraint. Besides comparing with two popular SPCA approaches, we introduce the PX-EM algorithm as an EM analogue of the PX-CAVI algorithm. Through extensive numerical simulations, we demonstrate that the PX-CAVI algorithm outperforms these SPCA approaches. We also study the posterior contraction rate of the variational posterior, providing a novel contribution to the existing literature. The PX-CAVI algorithm is then applied to a lung cancer gene expression dataset. The R package VBsparsePCA, with an implementation of the algorithm, is available on the Comprehensive R Archive Network (CRAN).
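To make concrete what SPCA estimates, here is a bare-bones thresholded power iteration for a single sparse loading vector. It is emphatically not the paper's PX-CAVI: Bayesian approaches like the one above replace the hard threshold with a spike-and-slab prior and handle the orthogonality constraint via parameter expansion.

```python
# Not PX-CAVI: a simple thresholded power iteration recovering one sparse
# loading, shown only to illustrate the SPCA estimation target.
import numpy as np

rng = np.random.default_rng(9)
true_v = np.zeros(20)
true_v[:4] = 0.5                                         # sparse true loading
X = rng.normal(size=(200, 20)) + rng.normal(size=(200, 1)) * 3 * true_v
S = X.T @ X / len(X)                                     # sample covariance

v = rng.normal(size=20)
for _ in range(100):
    v = S @ v                                            # power step
    v[np.abs(v) < 0.3 * np.abs(v).max()] = 0.0           # hard threshold
    v /= np.linalg.norm(v)                               # renormalize

print(np.round(v, 2))                                    # support ~ first 4 coords
```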
{"title":"Spike and slab Bayesian sparse principal component analysis","authors":"Yu-Chien Bo Ning, Ning Ning","doi":"10.1007/s11222-024-10430-8","DOIUrl":"https://doi.org/10.1007/s11222-024-10430-8","url":null,"abstract":"<p>Sparse principal component analysis (SPCA) is a popular tool for dimensionality reduction in high-dimensional data. However, there is still a lack of theoretically justified Bayesian SPCA methods that can scale well computationally. One of the major challenges in Bayesian SPCA is selecting an appropriate prior for the loadings matrix, considering that principal components are mutually orthogonal. We propose a novel parameter-expanded coordinate ascent variational inference (PX-CAVI) algorithm. This algorithm utilizes a spike and slab prior, which incorporates parameter expansion to cope with the orthogonality constraint. Besides comparing to two popular SPCA approaches, we introduce the PX-EM algorithm as an EM analogue to the PX-CAVI algorithm for comparison. Through extensive numerical simulations, we demonstrate that the PX-CAVI algorithm outperforms these SPCA approaches, showcasing its superiority in terms of performance. We study the posterior contraction rate of the variational posterior, providing a novel contribution to the existing literature. The PX-CAVI algorithm is then applied to study a lung cancer gene expression dataset. The <span>(textsf{R})</span> package <span>(textsf{VBsparsePCA})</span> with an implementation of the algorithm is available on the Comprehensive R Archive Network (CRAN).</p>","PeriodicalId":22058,"journal":{"name":"Statistics and Computing","volume":"47 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140941321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}