Computational Statistics: Latest Articles

A Bayesian approach for clustering and exact finite-sample model selection in longitudinal data mixtures
IF 1.3 | Zone 4, Mathematics | Q3 Statistics & Probability | Pub Date: 2024-05-08 | DOI: 10.1007/s00180-024-01501-5
M. Corneli, E. Erosheva, X. Qian, M. Lorenzi

We consider mixtures of longitudinal trajectories, where one trajectory contains measurements over time of the variable of interest for one individual and each individual belongs to one cluster. The number of clusters as well as individual cluster memberships are unknown and must be inferred. We propose an original Bayesian clustering framework that allows us to obtain an exact finite-sample model selection criterion for selecting the number of clusters. Our finite-sample approach is more flexible and parsimonious than asymptotic alternatives, such as the Bayesian information criterion or the integrated classification likelihood criterion, in the choice of the number of clusters. Moreover, our approach has other desirable qualities: (i) it keeps the computational effort of the clustering algorithm under control and (ii) it generalizes to several families of regression mixture models, from linear to purely non-parametric. We test our method on simulated datasets as well as on a real-world dataset from the Alzheimer’s Disease Neuroimaging Initiative database.

Citations: 0
Mixture models for simultaneous classification and reduction of three-way data
IF 1.3 | Zone 4, Mathematics | Q3 Statistics & Probability | Pub Date: 2024-05-06 | DOI: 10.1007/s00180-024-01478-1
Roberto Rocci, Maurizio Vichi, Monia Ranalli

Finite mixtures of Gaussians are often used to classify two- (units and variables) or three- (units, variables and occasions) way data. However, two issues arise: model complexity and capturing the true cluster structure. Indeed, a large number of variables and/or occasions implies a large number of model parameters, while the existence of noise variables (and/or occasions) could mask the true cluster structure. The approach adopted in the present paper is to reduce the number of model parameters by identifying a sub-space containing the information needed to classify the observations. This should also help in identifying noise variables and/or occasions. The maximum likelihood model estimation is carried out through an EM-like algorithm. The effectiveness of the proposal is assessed through a simulation study and an application to real data.

Citations: 0
Robust Bayesian cumulative probit linear mixed models for longitudinal ordinal data
IF 1.3 | Zone 4, Mathematics | Q3 Statistics & Probability | Pub Date: 2024-05-04 | DOI: 10.1007/s00180-024-01499-w
Kuo-Jung Lee, Ray-Bing Chen, Keunbaik Lee

Longitudinal studies have been conducted in various fields, including medicine, economics and the social sciences. In this paper, we focus on longitudinal ordinal data. Since the longitudinal data are collected over time, repeated outcomes within each subject may be serially correlated. To address both the within-subjects serial correlation and the specific variance between subjects, we propose a Bayesian cumulative probit random effects model for the analysis of longitudinal ordinal data. The hypersphere decomposition approach is employed to overcome the positive definiteness constraint and high-dimensionality of the correlation matrix. Additionally, we present a hybrid Gibbs/Metropolis-Hastings algorithm to efficiently generate cutoff points from truncated normal distributions, thereby expediting the convergence of the Markov Chain Monte Carlo (MCMC) algorithm. The performance and robustness of our proposed methodology under misspecified correlation matrices are demonstrated through simulation studies under complete data, missing completely at random (MCAR), and missing at random (MAR). We apply the proposed approach to analyze two sets of actual ordinal data: the arthritis dataset and the lung cancer dataset. To facilitate the implementation of our method, we have developed BayesRGMM, an open-source R package available on CRAN, accompanied by comprehensive documentation and source code accessible at https://github.com/kuojunglee/BayesRGMM/.
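
To make the hypersphere decomposition mentioned above more concrete, the following is a minimal sketch of the standard angle-based parameterization of a correlation matrix. It is an illustration under stated assumptions, not the paper's or the BayesRGMM package's implementation: the function name `correlation_from_angles` and the input layout are hypothetical. The idea is that any set of angles in (0, π) yields a lower-triangular factor with unit-norm rows, so the product of the factor with its transpose is automatically a valid correlation matrix and no positive-definiteness constraint has to be enforced explicitly.

```python
import math


def correlation_from_angles(angles):
    """Build a K x K correlation matrix from hyperspherical angles.

    Hypothetical helper for illustration. `angles[i]` holds the i angles
    (each in (0, pi)) that define row i of the Cholesky-like factor,
    so `angles[0]` is an empty list.
    """
    k = len(angles)
    chol = [[0.0] * k for _ in range(k)]
    for i in range(k):
        sin_prod = 1.0  # running product of sines along row i
        for j in range(i):
            chol[i][j] = math.cos(angles[i][j]) * sin_prod
            sin_prod *= math.sin(angles[i][j])
        chol[i][i] = sin_prod  # remaining mass, so the row has unit norm
    # R = L L^T; unit-norm rows give a unit diagonal.
    return [
        [sum(chol[i][m] * chol[j][m] for m in range(k)) for j in range(k)]
        for i in range(k)
    ]


if __name__ == "__main__":
    example = correlation_from_angles(
        [[], [math.pi / 3], [math.pi / 2, math.pi / 4]]
    )
    for row in example:
        print([round(value, 4) for value in row])
```

Because the angles are unconstrained within (0, π), a sampler can propose new angle values freely and map them back to a correlation matrix with this kind of function.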

Citations: 0
R-estimation in linear models: algorithms, complexity, challenges
IF 1.3 | Zone 4, Mathematics | Q3 Statistics & Probability | Pub Date: 2024-05-03 | DOI: 10.1007/s00180-024-01495-0
Jaromír Antoch, Michal Černý, Ryozo Miura

The main objective of this paper is to discuss selected computational aspects of robust estimation in the linear model, with an emphasis on R-estimators. We focus on numerical algorithms and computational efficiency rather than on statistical properties. In addition, we formulate some algorithmic properties that a “good” method for R-estimators is expected to satisfy and show how to satisfy them using the currently available algorithms. We illustrate both good and bad properties of the existing algorithms. We propose two-stage methods to minimize the effect of the bad properties. Finally, we justify a challenge for new approaches based on interior-point methods in optimization.

Citations: 0
Two-stage regression spline modeling based on local polynomial kernel regression
IF 1.3 | Zone 4, Mathematics | Q3 Statistics & Probability | Pub Date: 2024-05-01 | DOI: 10.1007/s00180-024-01498-x
Hamid Mraoui, Ahmed El-Alaoui, Souad Bechrouri, Nezha Mohaoui, Abdelilah Monir

This paper introduces a new nonparametric regression estimator based on a local quasi-interpolation spline method. The model combines a B-spline basis with a simple local polynomial regression, via a blossoming approach, to produce a reduced-rank spline-like smoother. Different coefficient functionals are allowed to have different smoothing parameters (bandwidths) if the function has different smoothness. In addition, the number and location of the knots of this estimator are not fixed. In practice, we may employ a modest number of basis functions and then determine the smoothing parameter as the minimizer of the criterion. In simulations, the approach achieves performance that is very competitive with P-spline and smoothing spline methods. Simulated data and a real data example are used to illustrate the effectiveness of the proposed method.

Citations: 0
Advancements in reliability estimation for the exponentiated Pareto distribution: a comparison of classical and Bayesian methods with lower record values
IF 1.3 | Zone 4, Mathematics | Q3 Statistics & Probability | Pub Date: 2024-04-29 | DOI: 10.1007/s00180-024-01497-y
Shubham Saini

Estimating the reliability of multicomponent systems is crucial in various engineering and reliability analysis applications. This paper investigates the multicomponent stress strength reliability estimation using lower record values, specifically for the exponentiated Pareto distribution. We compare classical estimation techniques, such as maximum likelihood estimation, with Bayesian estimation methods. Under Bayesian estimation, we employ Markov Chain Monte Carlo techniques and Tierney–Kadane’s approximation to obtain the posterior distribution of the reliability parameter. To evaluate the performance of the proposed estimation approaches, we conduct a comprehensive simulation study, considering various system configurations and sample sizes. Additionally, we analyze real data to illustrate the practical applicability of our methods. The proposed methodologies provide valuable insights for engineers and reliability analysts in accurately assessing the reliability of multicomponent systems using lower record values.

Citations: 0
Maximizing adjusted covariance: new supervised dimension reduction for classification
IF 1.3 | Zone 4, Mathematics | Q3 Statistics & Probability | Pub Date: 2024-04-02 | DOI: 10.1007/s00180-024-01472-7
Hyejoon Park, Hyunjoong Kim, Yung-Seop Lee

This study proposes a new linear dimension reduction technique called Maximizing Adjusted Covariance (MAC), which is suitable for supervised classification. The new approach is to adjust the covariance matrix between input and target variables using the within-class sum of squares, thereby promoting class separation after linear dimension reduction. MAC has a low computational cost and can complement existing linear dimensionality reduction techniques for classification. In this study, the classification performance by MAC was compared with those of the existing linear dimension reduction methods using 44 datasets. In most of the classification models used in the experiment, the MAC dimension reduction method showed better classification accuracy and F1 score than other linear dimension reduction methods.

Citations: 0
A class of transformed joint quantile time series models with applications to health studies
IF 1.3 | Zone 4, Mathematics | Q3 Statistics & Probability | Pub Date: 2024-04-01 | DOI: 10.1007/s00180-024-01484-3
Fahimeh Tourani-Farani, Zeynab Aghabazaz, Iraj Kazemi

Extensions of quantile regression modeling for time series analysis are extensively employed in medical and health studies. This study introduces a specific class of transformed quantile-dispersion regression models for non-stationary time series. These models possess the flexibility to incorporate the time-varying structure into the model specification, enabling precise predictions for future decisions. Our proposed modeling methodology applies to dynamic processes characterized by high variation and possible periodicity, relying on a non-linear framework. Additionally, unlike the transformed time series model, our approach directly interprets the regression parameters concerning the initial response. For computational purposes, we present an iteratively reweighted least squares algorithm. To assess the performance of our model, we conduct simulation experiments. To illustrate the modeling strategy, we analyze time-series measurements of influenza infection and daily COVID-19 deaths.
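
As a rough illustration of how iteratively reweighted least squares can target a quantile, the sketch below estimates a single sample quantile (an intercept-only model). This is a deliberately simplified toy and not the transformed quantile-dispersion model of the paper: the function name `irls_quantile`, the starting value, the iteration count, and the smoothing constant `eps` are all assumptions. Each iteration replaces the check loss by a weighted squared loss whose weights depend on the previous residuals, then solves the resulting weighted least squares problem in closed form.

```python
def irls_quantile(values, tau=0.5, n_iter=200, eps=1e-8):
    """Estimate the tau-quantile of `values` by IRLS (toy sketch).

    The check loss rho_tau(r) = r * (tau - 1[r < 0]) is majorized by a
    weighted squared loss with weight c_i / |r_i|, where c_i is tau for
    non-negative residuals and (1 - tau) otherwise. `eps` guards against
    division by zero and is an assumed tuning constant.
    """
    estimate = sum(values) / len(values)  # start from the sample mean
    for _ in range(n_iter):
        weights = []
        for value in values:
            residual = value - estimate
            side = tau if residual >= 0 else 1.0 - tau
            weights.append(side / max(abs(residual), eps))
        # Closed-form weighted least squares for an intercept-only model.
        estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return estimate


if __name__ == "__main__":
    # The outlier at 100 pulls the mean but barely moves the median.
    print(irls_quantile([1.0, 2.0, 3.0, 4.0, 100.0], tau=0.5))
```

In a regression setting the closed-form weighted mean would be replaced by a weighted least squares solve over the covariates, with the same reweighting step between iterations.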

Citations: 0
A smoothed semiparametric likelihood for estimation of nonparametric finite mixture models with a copula-based dependence structure
IF 1.3 | Zone 4, Mathematics | Q3 Statistics & Probability | Pub Date: 2024-03-27 | DOI: 10.1007/s00180-024-01483-4
Michael Levine, Gildas Mazo

In this manuscript, we consider a finite multivariate nonparametric mixture model where the dependence between the marginal densities is modeled using the copula device. Pseudo expectation–maximization (EM) stochastic algorithms were recently proposed to estimate all of the components of this model under a location-scale constraint on the marginals. Here, we introduce a deterministic algorithm that seeks to maximize a smoothed semiparametric likelihood. No location-scale assumption is made about the marginals. The algorithm is monotonic in one special case, and, in another, leads to “approximate monotonicity”—whereby the difference between successive values of the objective function becomes non-negative up to an additive term that becomes negligible after a sufficiently large number of iterations. The behavior of this algorithm is illustrated on several simulated and real datasets. The results suggest that, under suitable conditions, the proposed algorithm may indeed be monotonic in general. A discussion of the results and some possible future research directions round out our presentation.

Citations: 0
A subspace aggregating algorithm for accurate classification
IF 1.3 | Zone 4, Mathematics | Q3 Statistics & Probability | Pub Date: 2024-03-09 | DOI: 10.1007/s00180-024-01476-3
Saeid Amiri, Reza Modarres

We present a technique for learning via aggregation in supervised classification. The new method improves classification performance, regardless of which classifier is at its core. This approach exploits the information hidden in subspaces by combinations of aggregating variables and is applicable to high-dimensional data sets. We provide algorithms that randomly divide the variables into smaller subsets and permute them before applying a classification method to each subset. We combine the resulting classes to predict the class membership. Theoretical and simulation analyses consistently demonstrate the high accuracy of our classification methods. In comparison to aggregating observations through sampling, our approach proves to be significantly more effective. Through extensive simulations, we evaluate the accuracy of various classification methods. To further illustrate the effectiveness of our techniques, we apply them to five real-world data sets.
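
The general idea of aggregating over variable subsets can be sketched as follows. This is only one plausible reading of the description above, not the authors' algorithm: the nearest-centroid base classifier, the majority-vote combination rule, and the names `fit_centroids`, `predict_centroid` and `subspace_vote` are all assumptions chosen to keep the example self-contained. In each round the variables are shuffled and split into disjoint subsets, a classifier is fitted on each subset, and the predicted labels are pooled.

```python
import random
from collections import Counter


def fit_centroids(X, y, cols):
    """Per-class mean of the selected columns (assumed base classifier)."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for pos, col in enumerate(cols):
            acc[pos] += row[col]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(centroids, row, cols):
    """Label of the closest class centroid in the selected subspace."""
    def sq_dist(center):
        return sum((row[col] - center[pos]) ** 2 for pos, col in enumerate(cols))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def subspace_vote(X, y, x_new, n_subsets=3, n_rounds=5, seed=0):
    """Majority vote over classifiers fitted on random variable subsets."""
    rng = random.Random(seed)
    n_vars = len(X[0])
    votes = Counter()
    for _ in range(n_rounds):
        order = list(range(n_vars))
        rng.shuffle(order)  # random permutation of the variables
        for s in range(n_subsets):
            cols = order[s::n_subsets]  # disjoint subsets of the permutation
            if not cols:
                continue
            centroids = fit_centroids(X, y, cols)
            votes[predict_centroid(centroids, x_new, cols)] += 1
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    X = [[0, 1, 0, 1, 0, 1], [1, 0, 1, 0, 1, 0],
         [10, 9, 10, 9, 10, 9], [9, 10, 9, 10, 9, 10]]
    y = ["a", "a", "b", "b"]
    print(subspace_vote(X, y, [1, 1, 1, 1, 1, 1]))
```

Any classifier could stand in for the nearest-centroid rule here, which mirrors the claim that the aggregation step does not depend on the choice of base classifier.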

Citations: 0