
Statistical Modelling: latest articles

Robust function-on-function interaction regression
Zone 4, Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2023-10-23 DOI: 10.1177/1471082x231198907
Ufuk Beyaztas, Han Lin Shang, Abhijit Mandal
A function-on-function regression model with quadratic and interaction effects of the covariates provides a more flexible model. Despite several attempts to estimate the model’s parameters, almost all existing estimation strategies are non-robust against outliers. Outliers in the quadratic and interaction effects may deteriorate the model structure more severely than their effects in the main effect. We propose a robust estimation strategy based on the robust functional principal component decomposition of the function-valued variables and [Formula: see text]-estimator. The performance of the proposed method relies on the truncation parameters in the robust functional principal component decomposition of the function-valued variables. A robust Bayesian information criterion is used to determine the optimum truncation constants. A forward stepwise variable selection procedure is employed to determine relevant main, quadratic, and interaction effects to address a possible model misspecification. The finite-sample performance of the proposed method is investigated via a series of Monte-Carlo experiments. The proposed method’s asymptotic consistency and influence function are also studied in the supplement, and its empirical performance is further investigated using a U.S. COVID-19 dataset.
Citations: 0
Ordinal compositional data and time series
Zone 4, Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2023-10-05 DOI: 10.1177/1471082x231190971
Christian H. Weiß
There are several real applications where the categories behind compositional data (CoDa) exhibit a natural order, which, however, is not accounted for by existing CoDa methods. For various application areas, it is demonstrated that appropriately developed methods for ordinal CoDa provide valuable additional insights and are, thus, recommended to complement existing CoDa methods. The potential benefits are demonstrated for the (visual) descriptive analysis of ordinal CoDa, for statistical inference based on CoDa samples, for the monitoring of CoDa processes by means of control charts, and for the analysis and modelling of compositional time series. The novel methods are illustrated by a couple of real-world data examples.
Citations: 0
Editorial to the Special Issue “Applications of P-Splines” in Memory of Brian D. Marx
Zone 4, Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2023-10-01 DOI: 10.1177/1471082x231201705
Paul H.C. Eilers, Thomas Kneib
Citations: 0
P-splines and GAMLSS: a powerful combination, with an application to zero-adjusted distributions
Zone 4, Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2023-10-01 DOI: 10.1177/1471082x231176635
Dimitrios M. Stasinopoulos, Robert A. Rigby, Gillian Z. Heller, Fernanda De Bastiani
P-splines are a versatile statistical modelling tool, dealing with nonlinear relationships between the response and explanatory variable(s). GAMLSS is a distributional regression framework which allows modelling of a response variable using any parametric distribution. The combination of the two methodologies provides one of the most powerful tools in modern regression analysis. This article discusses the application of the two techniques when the response variable is zero-adjusted (or semi-continuous), which combines a point probability at zero with a positive continuous distribution.
Citations: 0
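The zero-adjusted (semi-continuous) response mentioned in the abstract above mixes a point probability at zero with a positive continuous distribution. A minimal Python sketch of such a density follows. It assumes a gamma distribution for the positive part, parameterised by its mean `mu` and coefficient of variation `sigma`; the function names and this parameterisation are illustrative choices for this listing, not taken from the article or from any GAMLSS software.

```python
import math


def zero_adjusted_gamma_logpdf(y, pi0, mu, sigma):
    """Log density of a zero-adjusted gamma variable.

    pi0   : probability of an exact zero (assumed 0 < pi0 < 1)
    mu    : mean of the positive (gamma) part
    sigma : coefficient of variation of the positive part
    """
    if y < 0:
        return -math.inf
    if y == 0:
        return math.log(pi0)            # point mass at zero
    shape = 1.0 / sigma ** 2            # gamma shape implied by the CV
    scale = mu * sigma ** 2             # chosen so that shape * scale == mu
    log_gamma_pdf = ((shape - 1.0) * math.log(y) - y / scale
                     - shape * math.log(scale) - math.lgamma(shape))
    return math.log1p(-pi0) + log_gamma_pdf


def zero_adjusted_mean(pi0, mu):
    """Overall mean: zeros contribute nothing, positives contribute mu."""
    return (1.0 - pi0) * mu
```

With `sigma = 1` the positive part reduces to an exponential distribution, which gives a convenient sanity check on the formula.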
Childhood obesity in Singapore: A Bayesian nonparametric approach
Zone 4, Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2023-09-21 DOI: 10.1177/1471082x231185892
Mario Beraha, Alessandra Guglielmi, Fernando Andrés Quintana, Maria De Iorio, Johan Gunnar Eriksson, Fabian Yap
Overweight and obesity in adults are known to be associated with increased risk of metabolic and cardiovascular diseases. Obesity has now reached epidemic proportions, increasingly affecting children. Therefore, it is important to understand if this condition persists from early life to childhood and if different patterns can be detected to inform intervention policies. Our motivating application is a study of temporal patterns of obesity in children from South Eastern Asia. Our main focus is on clustering obesity patterns after adjusting for the effect of baseline information. Specifically, we consider a joint model for height and weight over time. Measurements are taken every six months from birth. To allow for data-driven clustering of trajectories, we assume a vector autoregressive sampling model with a dependent logit stick-breaking prior. Simulation studies show good performance of the proposed model to capture overall growth patterns, as compared to other alternatives. We also fit the model to the motivating dataset, and discuss the results, in particular highlighting cluster differences. We have found four large clusters, corresponding to children sub-groups, though two of them are similar in terms of both height and weight at each time point. We provide interpretation of these clusters in terms of combinations of predictors.
Citations: 0
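The abstract above relies on a logit stick-breaking prior to obtain cluster weights. As a rough illustration of the stick-breaking construction only, the Python sketch below turns a list of real-valued linear predictors into mixture weights through a logistic transform. The function name is hypothetical, the predictors are taken as given, and the covariate dependence and the vector autoregressive sampling model from the article are not reproduced.

```python
import math


def logit_stick_breaking_weights(etas):
    """Map K-1 real-valued predictors to K mixture weights that sum to 1."""
    weights = []
    remaining = 1.0                              # stick length not yet assigned
    for eta in etas:
        frac = 1.0 / (1.0 + math.exp(-eta))      # logistic transform into (0, 1)
        weights.append(remaining * frac)         # break off this fraction
        remaining *= 1.0 - frac
    weights.append(remaining)                    # last component takes the rest
    return weights


print(logit_stick_breaking_weights([0.0, 0.0]))  # [0.5, 0.25, 0.25]
```

In a dependent version of the prior, each predictor would be a function of covariates, so the weights change from subject to subject.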
A spline-based framework for the flexible modelling of continuously observed multistate survival processes
Zone 4, Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2023-09-21 DOI: 10.1177/1471082x231176120
Alessia Eletti, Giampiero Marra, Rosalba Radice
Multistate modelling is becoming increasingly popular due to the availability of richer longitudinal health data. When the times at which the events characterising disease progression are known, the modelling of the multistate process is greatly simplified as it can be broken down in a number of traditional survival models. We propose to flexibly model them through the existing general link-based additive framework implemented in the R package GJRM. The associated transition probabilities can then be obtained through a simulation-based approach implemented in the R package mstate, which is appealing due to its generality. The integration between the two is seamless and efficient since we model a transformation of the survival function, rather than the hazard function, as is commonly found. This is achieved through the use of shape constrained P-splines which elegantly embed the monotonicity required for the survival functions within the construction of the survival functions themselves. The proposed framework allows for the inclusion of virtually any type of covariate effects, including time-dependent ones, while imposing no restriction on the multistate process assumed. We exemplify the usage of this framework through a case study on breast cancer patients.
Citations: 0
A truncated mean-parameterized Conway-Maxwell-Poisson model for the analysis of Test match bowlers
Zone 4, Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2023-09-19 DOI: 10.1177/1471082x231178584
Peter M. Philipson
A truncated, mean-parameterized Conway-Maxwell-Poisson model is developed to handle under- and overdispersed count data owing to individual heterogeneity. The truncated nature of the data allows for a more direct implementation of the model than is utilized in previous work without too much computational burden. The model is applied to a large dataset of Test match cricket bowlers, where the data are in the form of small counts and range in time from 1877 to the modern day, leading to the inclusion of temporal effects to account for fundamental changes to the sport and society. Rankings of sportsmen and women based on a statistical model are often handicapped by the popularity of inappropriate traditional metrics, which are found to be flawed measures in this instance. Inferences are made using a Bayesian approach by deploying a Markov Chain Monte Carlo algorithm to obtain parameter estimates and to extract the innate ability of individual players. The model offers a good fit and indicates that there is merit in a more sophisticated measure for ranking and assessing Test match bowlers.
Citations: 0
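The abstract above notes that truncation permits a more direct implementation of the Conway-Maxwell-Poisson (CMP) model. One way to see this is that on a bounded support the normalising constant is a finite sum. The Python sketch below uses the standard rate parameterisation, p(y) ∝ λ^y / (y!)^ν on {0, ..., upper}, rather than the article's mean parameterisation; the function names and the choice of upper bound are assumptions made for illustration.

```python
import math


def truncated_cmp_pmf(lam, nu, upper):
    """pmf of a CMP distribution truncated to {0, ..., upper}.

    Unnormalised weight: lam**y / (y!)**nu, computed on the log scale.
    """
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1)
             for y in range(upper + 1)]
    m = max(log_w)                           # subtract the max for stability
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)                           # finite sum: no series truncation
    return [v / total for v in w]


def mean_and_variance(pmf):
    """Mean and variance of a pmf on 0, 1, ..., len(pmf) - 1."""
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return mean, var
```

With `nu > 1` the variance falls below the mean (underdispersion), with `nu < 1` it can exceed the mean (overdispersion), and `nu = 1` with a large upper bound is numerically indistinguishable from a Poisson distribution.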
Derivative curve estimation in longitudinal studies using P-splines
Zone 4, Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2023-09-18 DOI: 10.1177/1471082x231178078
María Alejandra Hernández, Dae-Jin Lee, María Xosé Rodríguez-Álvarez, María Durbán
The estimation of curve derivatives is of interest in many disciplines. It allows the extraction of important characteristics to gain insight about the underlying process. In the context of longitudinal data, the derivative allows the description of biological features of the individuals or finding change regions of interest. Although there are several approaches to estimate subject-specific curves and their derivatives, there are still open problems due to the complicated nature of these time course processes. In this article, we illustrate the use of P-spline models to estimate derivatives in the context of longitudinal data. We also propose a new penalty acting at the population and the subject-specific levels to address under-smoothing and boundary problems in derivative estimation. The practical performance of the proposal is evaluated through simulations, and comparisons with an alternative method are reported. Finally, an application to longitudinal height measurements of 125 football players in a youth professional academy is presented, where the goal is to analyse their growth and maturity patterns over time.
Citations: 0
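P-spline models, as used in the abstract above, control smoothness by penalising differences between adjacent B-spline coefficients. The Python sketch below computes that standard difference penalty. It is not the new population-level and subject-specific penalty proposed in the article, and the function names are illustrative.

```python
def differences(coefs, order):
    """Apply the difference operator `order` times to a coefficient list."""
    out = list(coefs)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def difference_penalty(coefs, lam, order=2):
    """Penalty lam * sum of squared order-th differences of the coefficients."""
    return lam * sum(d * d for d in differences(coefs, order))


# A linear coefficient sequence has zero second-order penalty.
print(difference_penalty([1.0, 2.0, 3.0, 4.0], lam=1.0))  # 0.0
```

A larger `lam` pulls the fitted coefficients towards a polynomial of degree `order - 1`, which is why the penalty order matters when the target is a derivative rather than the curve itself.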
Penalty parameter selection and asymmetry corrections to Laplace approximations in Bayesian P-splines models
Zone 4, Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2023-09-10 DOI: 10.1177/1471082x231181173
Philippe Lambert, Oswaldo Gressani
Laplace P-splines (LPS) combine the P-splines smoother and the Laplace approximation in a unifying framework for fast and flexible inference under the Bayesian paradigm. The Gaussian Markov random field prior assumed for penalized parameters and the Bernstein-von Mises theorem typically ensure a razor-sharp accuracy of the Laplace approximation to the posterior distribution of these quantities. This accuracy can be seriously compromised for some unpenalized parameters, especially when the information synthesized by the prior and the likelihood is sparse. Therefore, we propose a refined version of the LPS methodology by splitting the parameter space in two subsets. The first set involves parameters for which the joint posterior distribution is approached from a non-Gaussian perspective with an approximation scheme tailored to capture asymmetric patterns, while the posterior distribution for the penalized parameters in the complementary set undergoes the LPS treatment with Laplace approximations. As such, the dichotomization of the parameter space provides the necessary structure for a separate treatment of model parameters, yielding improved estimation accuracy as compared to a setting where posterior quantities are uniformly handled with Laplace. In addition, the proposed enriched version of LPS remains entirely sampling-free, so that it operates at a computing speed that is far beyond the reach of any existing Markov chain Monte Carlo approach. The methodology is illustrated on the additive proportional odds model with an application to ordinal survey data.
Citations: 0
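The abstract above is motivated by the fact that a Laplace approximation is symmetric and can misrepresent a skewed posterior. The toy Python sketch below shows this for a Gamma(shape, rate) density used as a stand-in posterior: the Gaussian approximation is centred at the mode with a variance given by the curvature there. It illustrates the problem only and is not the authors' asymmetry correction; the function name is hypothetical.

```python
import math


def laplace_approx_gamma(shape, rate):
    """Laplace (Gaussian) approximation to a Gamma(shape, rate) density.

    Requires shape > 1 so that the mode lies in the interior.
    The log density is (shape - 1) * log(x) - rate * x + const.
    """
    mode = (shape - 1.0) / rate                  # maximiser of the log density
    curvature = -(shape - 1.0) / mode ** 2       # second derivative at the mode
    sd = math.sqrt(-1.0 / curvature)
    return mode, sd


mode, sd = laplace_approx_gamma(shape=3.0, rate=2.0)
exact_mean = 3.0 / 2.0
exact_sd = math.sqrt(3.0) / 2.0
print(mode, sd, exact_mean, exact_sd)
```

For these values the approximation is centred at 1.0 while the exact mean is 1.5, and its standard deviation is smaller than the exact one, a gap that comes from the right skew of the density.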
A black box approach to fitting smooth models of mortality
IF 1 Zone 4, Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2023-08-22 DOI: 10.1177/1471082x231181165
I. Currie
Actuaries have long been interested in the forecasting of mortality for the purpose of the pricing and reserving of pensions and annuities. Most models of mortality in age and year of death, and often year of birth, are not identifiable so actuaries worried about what constraints should be used to give sensible estimates of the age and year of death parameters, and, if required, the year of birth parameters. These parameters were then forecast with an ARIMA model to give the required forecasts of mortality. A recent article showed that, while the fitted parameters were not identifiable, both the fitted and forecast mortalities were. This result holds if the age term is smoothed with P-splines. The present article deals with generalized linear models with a rank deficient regression matrix. We have two aims. First, we investigate the effect that different constraints have on the estimated regression coefficients. We show that it is possible to fit the model under different constraints in R without imposing any explicit constraints. R does all the necessary book-keeping ‘under the bonnet’. The estimated regression coefficients under a particular set of constraints can then be recovered from the invariant fitted values. We have a black box approach to fitting the model subject to any set of constraints.
Citations: 0