
Computational Statistics: Latest Articles

Asymptotic properties of kernel density and hazard rate function estimators with censored widely orthant dependent data
IF 1.3 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-06-03 | DOI: 10.1007/s00180-024-01509-x
Yi Wu, Wei Wang, Wei Yu, Xuejun Wang

Kernel estimators of density function and hazard rate function are very important in nonparametric statistics. The paper aims to investigate the uniformly strong representations and the rates of uniformly strong consistency for kernel smoothing density and hazard rate function estimation with censored widely orthant dependent data based on the Kaplan–Meier estimator. Under some mild conditions, the rates of the remainder term and strong consistency are shown to be \(O\big(\sqrt{\log(ng(n))/\big(nb_{n}^{2}\big)}\big)\) a.s. and \(O\big(\sqrt{\log(ng(n))/\big(nb_{n}^{2}\big)}\big)+O\big(b_{n}^{2}\big)\) a.s., respectively, where g(n) are the dominating coefficients of widely orthant dependent random variables. Some numerical simulations and a real data analysis are also presented to confirm the theoretical results based on finite sample performances.
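A minimal sketch of a Kaplan–Meier-weighted kernel density estimator for right-censored data may help make the object of study concrete. It is not taken from the paper: the Gaussian kernel, the assumption of no tied observations, and the function names `km_jumps` and `km_kernel_density` are illustrative choices, and the sketch does nothing to model widely orthant dependence.

```python
import numpy as np

def km_jumps(z, delta):
    """Kaplan-Meier jump sizes at the sorted observations (no ties assumed).

    z     : observed times min(X_i, C_i)
    delta : 1 if the observation is uncensored, 0 if censored
    """
    order = np.argsort(z, kind="stable")
    z_sorted = np.asarray(z, dtype=float)[order]
    d_sorted = np.asarray(delta, dtype=float)[order]
    n = len(z_sorted)
    at_risk = n - np.arange(n)                  # n, n-1, ..., 1
    factors = 1.0 - d_sorted / at_risk          # survival factor at each time
    surv_before = np.concatenate(([1.0], np.cumprod(factors)[:-1]))
    jumps = surv_before * d_sorted / at_risk    # mass the KM estimator puts at each time
    return z_sorted, jumps

def km_kernel_density(x, z, delta, bandwidth):
    """Kernel density estimate: sum of KM jumps times a scaled Gaussian kernel."""
    z_sorted, jumps = km_jumps(z, delta)
    u = (np.asarray(x, dtype=float)[:, None] - z_sorted[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return (kernel * jumps[None, :]).sum(axis=1) / bandwidth
```

With no censoring every jump equals 1/n, so the estimate reduces to the ordinary kernel density estimator; the bandwidth argument plays the role of \(b_{n}\) in the abstract.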

Citations: 0
Expectile regression averaging method for probabilistic forecasting of electricity prices
IF 1.3 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-05-29 | DOI: 10.1007/s00180-024-01508-y
Joanna Janczura

In this paper we propose a new method for probabilistic forecasting of electricity prices. It is based on averaging point forecasts from different models combined with expectile regression. We show that deriving the predicted distribution in terms of expectiles might in some cases be advantageous compared with the commonly used quantiles. We apply the proposed method to the day-ahead electricity prices from the German market and compare its accuracy with the Quantile Regression Averaging method and with quantile-based as well as expectile-based historical simulation. The obtained results indicate that using expectile regression improves the accuracy of the probabilistic forecasts of electricity prices, but a variance stabilizing transformation should be applied prior to modelling.
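The abstract does not define expectiles, so the following sketch uses the standard definition: the τ-expectile of a sample minimizes an asymmetrically weighted sum of squared deviations, with weight τ above the expectile and 1 − τ below it. The function name `sample_expectile` and the fixed-point iteration are illustrative assumptions, not the paper's regression averaging procedure.

```python
import numpy as np

def sample_expectile(y, tau, tol=1e-10, max_iter=100):
    """tau-expectile of a sample via iteratively reweighted averaging."""
    y = np.asarray(y, dtype=float)
    e = y.mean()                               # the 0.5-expectile is the mean
    for _ in range(max_iter):
        # weight tau for points above the current value, 1 - tau otherwise
        w = np.where(y > e, tau, 1.0 - tau)
        e_new = np.sum(w * y) / np.sum(w)      # weighted mean solves the first-order condition
        if abs(e_new - e) < tol:
            return e_new
        e = e_new
    return e
```

Unlike a quantile, the expectile depends on the magnitudes of the deviations and not only on their signs, which is one reason the two can behave differently on heavy-tailed price data.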

Citations: 0
Projection predictive variable selection for discrete response families with finite support
IF 1.3 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-05-29 | DOI: 10.1007/s00180-024-01506-0
Frank Weber, Änne Glass, Aki Vehtari

The projection predictive variable selection is a decision-theoretically justified Bayesian variable selection approach achieving an outstanding trade-off between predictive performance and sparsity. Its projection problem is not easy to solve in general because it is based on the Kullback–Leibler divergence from a restricted posterior predictive distribution of the so-called reference model to the parameter-conditional predictive distribution of a candidate model. Previous work showed how this projection problem can be solved for response families employed in generalized linear models and how an approximate latent-space approach can be used for many other response families. Here, we present an exact projection method for all response families with discrete and finite support, called the augmented-data projection. A simulation study for an ordinal response family shows that the proposed method performs better than or similarly to the previously proposed approximate latent-space projection. The cost of the slightly better performance of the augmented-data projection is a substantial increase in runtime. Thus, if the augmented-data projection’s runtime is too high, we recommend the latent projection in the early phase of the model-building workflow and the augmented-data projection for final results. The ordinal response family from our simulation study is supported by both projection methods, but we also include a real-world cancer subtyping example with a nominal response family, a case that is not supported by the latent projection.
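For a response with finitely many categories, the Kullback–Leibler divergence between two predictive distributions is a finite sum, which is what makes an exact treatment conceivable. The sketch below only evaluates that divergence for given probability tables; it is not the augmented-data projection itself, and the name `kl_categorical` and the clipping constant `eps` are assumptions made for illustration.

```python
import numpy as np

def kl_categorical(p_ref, q_cand, eps=1e-12):
    """Mean KL(p_ref || q_cand) over observations.

    Each row of p_ref and q_cand is a probability vector over the same
    finite set of response categories.
    """
    p = np.asarray(p_ref, dtype=float)
    q = np.clip(np.asarray(q_cand, dtype=float), eps, 1.0)
    # categories with zero reference probability contribute nothing
    terms = np.where(p > 0, p * (np.log(np.clip(p, eps, 1.0)) - np.log(q)), 0.0)
    return terms.sum(axis=1).mean()
```

Because the reference probabilities are fixed, minimizing this quantity over the candidate model's parameters is equivalent to maximizing a log-likelihood in which every category of every observation is weighted by its reference probability.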

Citations: 0
Efficient regression analyses with zero-augmented models based on ranking
IF 1.3 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-05-14 | DOI: 10.1007/s00180-024-01503-3
Deborah Kanda, Jingjing Yin, Xinyan Zhang, Hani Samawi

Several zero-augmented models exist for estimation involving outcomes with a large number of zeros. Two such models for handling count endpoints are zero-inflated and hurdle regression models. In this article, we apply the extreme ranked set sampling (ERSS) scheme in estimation using zero-inflated and hurdle regression models. We provide theoretical derivations showing the superiority of ERSS over simple random sampling (SRS) under these zero-augmented models. A simulation study is also conducted to compare the efficiency of ERSS with that of SRS, and lastly we illustrate applications with real data sets.
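To make the distinction between the two model classes concrete, the sketch below writes out the probability mass functions of a zero-inflated Poisson and a Poisson hurdle model without covariates. The Poisson count distribution and the function names are assumptions chosen for illustration; the abstract does not say which count distributions the paper uses, and the sampling scheme is not shown.

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson probability of count k."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: a structural zero with probability pi,
    otherwise an ordinary Poisson count (which may itself be zero)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def hurdle_pmf(k, lam, p_zero):
    """Poisson hurdle: zero with probability p_zero, otherwise a
    zero-truncated Poisson count."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * poisson_pmf(k, lam) / (1.0 - math.exp(-lam))
```

In the zero-inflated model zeros arise from two sources, whereas in the hurdle model all zeros come from a single binary component and the positive counts are modelled separately.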

Citations: 0
Exact and approximate computation of the scatter halfspace depth
IF 1.3 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-05-09 | DOI: 10.1007/s00180-024-01500-6
Xiaohui Liu, Yuzi Liu, Petra Laketa, Stanislav Nagy, Yuting Chen

The scatter halfspace depth (sHD) is an extension of the location halfspace (also called Tukey) depth that is applicable in the nonparametric analysis of scatter. Using sHD, it is possible to define minimax optimal robust scatter estimators for multivariate data. The problem of exact computation of sHD for data of dimension \(d \ge 2\) has, however, not been addressed in the literature. We develop an exact algorithm for the computation of sHD in any dimension d and implement it efficiently for any dimension \(d \ge 1\). Since the exact computation of sHD is slow especially for higher dimensions, we also propose two fast approximate algorithms. All our programs are freely available in the R package scatterdepth.
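A rough idea of what an approximation can look like is sketched below, assuming the commonly used definition of sHD for data centred at the origin: the infimum over unit directions u of the smaller of the two empirical fractions of points with (u'x)² at most, or at least, u'Σu. Replacing the infimum by a minimum over random directions gives an upper bound on the depth. This is not one of the paper's algorithms and does not use the scatterdepth package; the function name and its parameters are hypothetical.

```python
import numpy as np

def scatter_halfspace_depth_approx(sigma, x, n_dir=2000, seed=0):
    """Random-direction upper bound on the scatter halfspace depth of sigma.

    sigma : (d, d) candidate scatter matrix
    x     : (n, d) data, assumed already centred at the origin
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    d = x.shape[1]
    u = rng.standard_normal((n_dir, d))
    u /= np.linalg.norm(u, axis=1, keepdims=True)     # random unit directions
    proj_sq = (x @ u.T) ** 2                          # squared projections, shape (n, n_dir)
    scale = np.einsum("kd,de,ke->k", u, sigma, u)     # u' sigma u for each direction
    frac_le = (proj_sq <= scale).mean(axis=0)
    frac_ge = (proj_sq >= scale).mean(axis=0)
    return float(np.minimum(frac_le, frac_ge).min())
```

More directions tighten the bound at a linear cost in `n_dir`, which hints at why approximate methods are attractive when the exact computation becomes slow in higher dimensions.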

Citations: 0
A Bayesian approach for clustering and exact finite-sample model selection in longitudinal data mixtures
IF 1.3 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-05-08 | DOI: 10.1007/s00180-024-01501-5
M. Corneli, E. Erosheva, X. Qian, M. Lorenzi

We consider mixtures of longitudinal trajectories, where one trajectory contains measurements over time of the variable of interest for one individual and each individual belongs to one cluster. The number of clusters as well as individual cluster memberships are unknown and must be inferred. We propose an original Bayesian clustering framework that allows us to obtain an exact finite-sample model selection criterion for selecting the number of clusters. Our finite-sample approach is more flexible and parsimonious than asymptotic alternatives such as the Bayesian information criterion or the integrated classification likelihood criterion in the choice of the number of clusters. Moreover, our approach has other desirable qualities: (i) it keeps the computational effort of the clustering algorithm under control and (ii) it generalizes to several families of regression mixture models, from linear to purely non-parametric. We test our method on simulated datasets as well as on a real-world dataset from the Alzheimer’s Disease Neuroimaging Initiative database.

Citations: 0
Mixture models for simultaneous classification and reduction of three-way data
IF 1.3 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-05-06 | DOI: 10.1007/s00180-024-01478-1
Roberto Rocci, Maurizio Vichi, Monia Ranalli

Finite mixtures of Gaussians are often used to classify two-way (units and variables) or three-way (units, variables and occasions) data. However, two issues arise: model complexity and capturing the true cluster structure. Indeed, a large number of variables and/or occasions implies a large number of model parameters, while the existence of noise variables (and/or occasions) could mask the true cluster structure. The approach adopted in the present paper is to reduce the number of model parameters by identifying a sub-space containing the information needed to classify the observations. This should also help in identifying noise variables and/or occasions. The maximum likelihood model estimation is carried out through an EM-like algorithm. The effectiveness of the proposal is assessed through a simulation study and an application to real data.
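The complexity issue can be quantified with a short count. The sketch assumes an unrestricted Gaussian mixture fitted to the vectorized data, so that each unit is a vector of length (variables × occasions) with a full covariance matrix per component; this baseline is an assumption for illustration, not the reduced model proposed in the paper.

```python
def gmm_param_count(n_components, n_variables, n_occasions=1):
    """Free parameters of an unrestricted Gaussian mixture on vectorized data."""
    p = n_variables * n_occasions                 # dimension of each observation
    weights = n_components - 1                    # mixing proportions sum to one
    means = n_components * p
    covariances = n_components * p * (p + 1) // 2 # one symmetric p x p matrix per component
    return weights + means + covariances
```

For example, 3 components with 5 variables need 62 parameters for two-way data, but 692 once the same variables are observed on 4 occasions, since the covariance term grows quadratically in the product of variables and occasions.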

Citations: 0
Robust Bayesian cumulative probit linear mixed models for longitudinal ordinal data
IF 1.3 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-05-04 | DOI: 10.1007/s00180-024-01499-w
Kuo-Jung Lee, Ray-Bing Chen, Keunbaik Lee

Longitudinal studies have been conducted in various fields, including medicine, economics and the social sciences. In this paper, we focus on longitudinal ordinal data. Since the longitudinal data are collected over time, repeated outcomes within each subject may be serially correlated. To address both the within-subjects serial correlation and the specific variance between subjects, we propose a Bayesian cumulative probit random effects model for the analysis of longitudinal ordinal data. The hypersphere decomposition approach is employed to overcome the positive definiteness constraint and high-dimensionality of the correlation matrix. Additionally, we present a hybrid Gibbs/Metropolis-Hastings algorithm to efficiently generate cutoff points from truncated normal distributions, thereby expediting the convergence of the Markov Chain Monte Carlo (MCMC) algorithm. The performance and robustness of our proposed methodology under misspecified correlation matrices are demonstrated through simulation studies under complete data, missing completely at random (MCAR), and missing at random (MAR). We apply the proposed approach to analyze two sets of actual ordinal data: the arthritis dataset and the lung cancer dataset. To facilitate the implementation of our method, we have developed BayesRGMM, an open-source R package available on CRAN, accompanied by comprehensive documentation and source code accessible at https://github.com/kuojunglee/BayesRGMM/.
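The hypersphere decomposition mentioned above can be sketched as follows: a correlation matrix is written as the product of a lower-triangular factor and its transpose, with each row of the factor expressed in spherical coordinates, so that any set of angles in (0, π) yields a valid correlation matrix. The code is a generic illustration of that parameterization, not the BayesRGMM implementation, and the function name is hypothetical.

```python
import numpy as np

def correlation_from_angles(angles):
    """Build a correlation matrix from hypersphere angles.

    angles : (T, T) array; only the strictly lower-triangular entries are
             used, and each is assumed to lie in (0, pi).
    """
    angles = np.asarray(angles, dtype=float)
    t = angles.shape[0]
    chol = np.zeros((t, t))
    chol[0, 0] = 1.0
    for i in range(1, t):
        sin_prod = 1.0
        for j in range(i):
            chol[i, j] = np.cos(angles[i, j]) * sin_prod
            sin_prod *= np.sin(angles[i, j])
        chol[i, i] = sin_prod          # positive, so the factor is nonsingular
    return chol @ chol.T               # each row of chol has unit norm, so the diagonal is 1
```

Sampling or optimizing over the unconstrained angles therefore sidesteps the positive definiteness constraint, which would otherwise have to be checked for every proposed matrix.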

Citations: 0
R-estimation in linear models: algorithms, complexity, challenges
IF 1.3 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-05-03 | DOI: 10.1007/s00180-024-01495-0
Jaromír Antoch, Michal Černý, Ryozo Miura

The main objective of this paper is to discuss selected computational aspects of robust estimation in the linear model, with the emphasis on R-estimators. We focus on numerical algorithms and computational efficiency rather than on statistical properties. In addition, we formulate some algorithmic properties that a “good” method for R-estimators is expected to satisfy and show how to satisfy them using the currently available algorithms. We illustrate both good and bad properties of the existing algorithms. We propose two-stage methods to minimize the effect of the bad properties. Finally, we justify a challenge for new approaches based on interior-point methods in optimization.
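As a point of reference for what such algorithms minimize, the sketch below evaluates a rank-based dispersion of the residuals, assuming Wilcoxon scores and no ties among residuals. An R-estimate of the slope vector is a minimizer of this convex, piecewise linear function; the intercept is not identified by it because the scores sum to zero. The score choice and the function name are illustrative assumptions, and the paper's own algorithms are not reproduced here.

```python
import numpy as np

def wilcoxon_dispersion(beta, x, y):
    """Rank-based dispersion of residuals y - x @ beta with Wilcoxon scores.

    x : (n, p) design matrix without an intercept column
    y : (n,) response
    """
    resid = np.asarray(y, dtype=float) - np.asarray(x, dtype=float) @ np.atleast_1d(beta)
    n = resid.size
    ranks = np.argsort(np.argsort(resid)) + 1            # ranks 1..n, no ties assumed
    scores = np.sqrt(12.0) * (ranks / (n + 1.0) - 0.5)   # centred, increasing in the rank
    return float(np.sum(scores * resid))
```

Because the objective is piecewise linear, its minimization can be cast as a linear program, which is one way to see why interior-point methods come up as a candidate approach.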

Citations: 0
Two-stage regression spline modeling based on local polynomial kernel regression
IF 1.3 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-05-01 | DOI: 10.1007/s00180-024-01498-x
Hamid Mraoui, Ahmed El-Alaoui, Souad Bechrouri, Nezha Mohaoui, Abdelilah Monir

This paper introduces a new nonparametric estimator of the regression function based on a local quasi-interpolation spline method. The model combines a B-spline basis with a simple local polynomial regression, via the blossoming approach, to produce a reduced-rank, spline-like smoother. Different coefficient functionals are allowed to have different smoothing parameters (bandwidths) if the function has different smoothness. In addition, the number and location of the knots of this estimator are not fixed. In practice, we may employ a modest number of basis functions and then determine the smoothing parameter as the minimizer of the criterion. In simulations, the approach achieves very competitive performance with P-spline and smoothing spline methods. Simulated data and a real data example are used to illustrate the effectiveness of the method proposed in this paper.
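The local polynomial ingredient can be illustrated with a local linear fit at a single point, assuming a Gaussian kernel. This shows only the standard building block; the B-spline quasi-interpolation stage and the blossoming step are not reproduced, and the function name is hypothetical.

```python
import numpy as np

def local_linear_fit(x0, x, y, bandwidth):
    """Local linear estimate of the regression function at x0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    u = (x - x0) / bandwidth
    w = np.exp(-0.5 * u**2)                              # Gaussian kernel weights
    design = np.column_stack([np.ones_like(x), x - x0])  # intercept and centred slope
    wd = design * w[:, None]
    # weighted least squares: solve (X' W X) coef = X' W y
    coef = np.linalg.solve(design.T @ wd, wd.T @ y)
    return coef[0]                                       # the intercept is the fitted value at x0
```

Allowing a different bandwidth at each evaluation point is the simplest way to mimic the idea of location-dependent smoothing parameters described in the abstract.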

Citations: 0