Robust function-on-function interaction regression
Pub Date : 2023-10-23 DOI: 10.1177/1471082x231198907
Ufuk Beyaztas, Han Lin Shang, Abhijit Mandal
A function-on-function regression model with quadratic and interaction effects of the covariates is more flexible than its main-effects-only counterpart. Despite several attempts to estimate such models, almost all existing estimation strategies are non-robust against outliers. Outliers in the quadratic and interaction terms may deteriorate the model structure more severely than outliers in the main effects. We propose a robust estimation strategy based on the robust functional principal component decomposition of the function-valued variables and a [Formula: see text]-estimator. The performance of the proposed method depends on the truncation parameters used in the robust functional principal component decomposition of the function-valued variables. A robust Bayesian information criterion is used to determine the optimum truncation constants, and a forward stepwise variable selection procedure is employed to identify the relevant main, quadratic and interaction effects, guarding against possible model misspecification. The finite-sample performance of the proposed method is investigated via a series of Monte Carlo experiments. The method's asymptotic consistency and influence function are studied in the supplement, and its empirical performance is further illustrated on a U.S. COVID-19 dataset.
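For orientation, a function-on-function regression with main, quadratic and interaction effects of a functional covariate is commonly written in a form along these lines; the notation is generic and not taken from the paper:

```latex
% Generic function-on-function model with main-effect and
% quadratic/interaction surfaces; illustrative notation only.
Y(t) = \beta_0(t) + \int_{\mathcal{S}} X(s)\,\beta_1(s,t)\,\mathrm{d}s
     + \int_{\mathcal{S}} \int_{\mathcal{S}} X(s)\,X(u)\,\beta_2(s,u,t)\,\mathrm{d}s\,\mathrm{d}u
     + \varepsilon(t)
```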
{"title":"Robust function-on-function interaction regression","authors":"Ufuk Beyaztas, Han Lin Shang, Abhijit Mandal","doi":"10.1177/1471082x231198907","DOIUrl":"https://doi.org/10.1177/1471082x231198907","url":null,"abstract":"A function-on-function regression model with quadratic and interaction effects of the covariates provides a more flexible model. Despite several attempts to estimate the model’s parameters, almost all existing estimation strategies are non-robust against outliers. Outliers in the quadratic and interaction effects may deteriorate the model structure more severely than their effects in the main effect. We propose a robust estimation strategy based on the robust functional principal component decomposition of the function-valued variables and [Formula: see text]-estimator. The performance of the proposed method relies on the truncation parameters in the robust functional principal component decomposition of the function-valued variables. A robust Bayesian information criterion is used to determine the optimum truncation constants. A forward stepwise variable selection procedure is employed to determine relevant main, quadratic, and interaction effects to address a possible model misspecification. The finite-sample performance of the proposed method is investigated via a series of Monte-Carlo experiments. The proposed method’s asymptotic consistency and influence function are also studied in the supplement, and its empirical performance is further investigated using a U.S. COVID-19 dataset.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":"11 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135367709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ordinal compositional data and time series
Pub Date : 2023-10-05 DOI: 10.1177/1471082x231190971
Christian H. Weiß
In several real applications, the categories underlying compositional data (CoDa) exhibit a natural order which, however, is not accounted for by existing CoDa methods. For various application areas, it is demonstrated that appropriately developed methods for ordinal CoDa provide valuable additional insights and are thus recommended as a complement to existing CoDa methods. The potential benefits are demonstrated for the (visual) descriptive analysis of ordinal CoDa, for statistical inference based on CoDa samples, for the monitoring of CoDa processes by means of control charts, and for the analysis and modelling of compositional time series. The novel methods are illustrated with several real-world data examples.
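As a toy illustration of the ordinal aspect (not taken from the article), cumulative proportions over the ordered categories retain exactly the ordering information that standard CoDa summaries ignore:

```r
## Illustrative only: a composition over ordered categories
## (e.g., severity grades "low" < "medium" < "high") and its
## cumulative proportions, which respect the category ordering.
counts <- c(low = 12, medium = 25, high = 8)
composition <- counts / sum(counts)   # closure: proportions sum to 1
cum_comp <- cumsum(composition)       # ordinal summary: P(category <= k)
print(composition)
print(cum_comp)
```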
{"title":"Ordinal compositional data and time series","authors":"Christian H. Weiß","doi":"10.1177/1471082x231190971","DOIUrl":"https://doi.org/10.1177/1471082x231190971","url":null,"abstract":"There are several real applications where the categories behind compositional data (CoDa) exhibit a natural order, which, however, is not accounted for by existing CoDa methods. For various application areas, it is demonstrated that appropriately developed methods for ordinal CoDa provide valuable additional insights and are, thus, recommended to complement existing CoDa methods. The potential benefits are demonstrated for the (visual) descriptive analysis of ordinal CoDa, for statistical inference based on CoDa samples, for the monitoring of CoDa processes by means of control charts, and for the analysis and modelling of compositional time series. The novel methods are illustrated by a couple of real-world data examples.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134977792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Editorial to the Special Issue “Applications of P-Splines” in Memory of Brian D. Marx
Pub Date : 2023-10-01 DOI: 10.1177/1471082x231201705
Paul H.C. Eilers, Thomas Kneib
{"title":"Editorial to the Special Issue “Applications of P-Splines” in Memory of Brian D. Marx","authors":"Paul H.C. Eilers, Thomas Kneib","doi":"10.1177/1471082x231201705","DOIUrl":"https://doi.org/10.1177/1471082x231201705","url":null,"abstract":"","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135965350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P-splines and GAMLSS: a powerful combination, with an application to zero-adjusted distributions
Pub Date : 2023-10-01 DOI: 10.1177/1471082x231176635
Dimitrios M. Stasinopoulos, Robert A. Rigby, Gillian Z. Heller, Fernanda De Bastiani
P-splines are a versatile statistical modelling tool for dealing with nonlinear relationships between the response and the explanatory variable(s). GAMLSS is a distributional regression framework that allows the response variable to be modelled using any parametric distribution. The combination of the two methodologies provides one of the most powerful tools in modern regression analysis. This article discusses the application of the two techniques when the response variable is zero-adjusted (or semi-continuous), that is, when it combines a point probability at zero with a positive continuous distribution.
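A minimal sketch of this combination, assuming the CRAN package gamlss with its pb() P-spline smoother and the zero-adjusted gamma family ZAGA; the simulated data and parameter choices are placeholders, not taken from the article:

```r
## Zero-adjusted response: point mass at zero plus a positive
## continuous part, with P-spline smooths on all distribution
## parameters via gamlss.
library(gamlss)

set.seed(1)
n <- 500
x <- runif(n)
mu <- exp(1 + sin(2 * pi * x))             # nonlinear mean of positive part
y <- rgamma(n, shape = 2, scale = mu / 2)  # positive continuous component
y[rbinom(n, 1, prob = 0.2) == 1] <- 0      # point probability at zero

fit <- gamlss(y ~ pb(x),                   # P-spline for mu
              sigma.formula = ~ pb(x),     # P-spline for sigma
              nu.formula = ~ pb(x),        # P-spline for P(Y = 0)
              family = ZAGA)
summary(fit)
```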
{"title":"P-splines and GAMLSS: a powerful combination, with an application to zero-adjusted distributions","authors":"Dimitrios M. Stasinopoulos, Robert A. Rigby, Gillian Z. Heller, Fernanda De Bastiani","doi":"10.1177/1471082x231176635","DOIUrl":"https://doi.org/10.1177/1471082x231176635","url":null,"abstract":"P-splines are a versatile statistical modelling tool, dealing with nonlinear relationships between the response and explanatory variable(s). GAMLSS is a distributional regression framework which allows modelling of a response variable using any parametric distribution. The combination of the two methodologies provides one of the most powerful tools in modern regression analysis. This article discusses the application of the two techniques when the response variable is zero-adjusted (or semi-continuous), which combines a point probability at zero with a positive continuous distribution.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134934745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Childhood obesity in Singapore: A Bayesian nonparametric approach
Pub Date : 2023-09-21 DOI: 10.1177/1471082x231185892
Mario Beraha, Alessandra Guglielmi, Fernando Andrés Quintana, Maria De Iorio, Johan Gunnar Eriksson, Fabian Yap
Overweight and obesity in adults are known to be associated with increased risk of metabolic and cardiovascular diseases. Obesity has now reached epidemic proportions and increasingly affects children. It is therefore important to understand whether this condition persists from early life into childhood and whether distinct patterns can be detected to inform intervention policies. Our motivating application is a study of temporal patterns of obesity in children from South Eastern Asia, with the main focus on clustering obesity patterns after adjusting for the effect of baseline information. Specifically, we consider a joint model for height and weight over time, with measurements taken every six months from birth. To allow for data-driven clustering of trajectories, we assume a vector autoregressive sampling model with a dependent logit stick-breaking prior. Simulation studies show that the proposed model captures overall growth patterns well compared to alternatives. We also fit the model to the motivating dataset and discuss the results, highlighting cluster differences in particular. We find four large clusters, corresponding to sub-groups of children, although two of them are similar in terms of both height and weight at each time point. We interpret these clusters in terms of combinations of predictors.
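The weights of a (truncated) logit stick-breaking prior can be sketched as follows; this is a generic construction with hypothetical names, not the authors' code:

```r
## Truncated logit stick-breaking weights that depend on a covariate z:
## v_k = logit^{-1}(alpha_k + beta_k * z), w_k = v_k * prod_{j<k} (1 - v_j).
logit_sb_weights <- function(z, alpha, beta) {
  K <- length(alpha)                 # truncation level
  v <- plogis(alpha + beta * z)      # stick proportions via the logit link
  v[K] <- 1                          # close the stick at the truncation
  w <- numeric(K)
  remaining <- 1
  for (k in seq_len(K)) {
    w[k] <- v[k] * remaining
    remaining <- remaining * (1 - v[k])
  }
  w
}

w <- logit_sb_weights(z = 0.5, alpha = rnorm(10), beta = rnorm(10))
sum(w)  # the weights sum to 1
```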
{"title":"Childhood obesity in Singapore: A Bayesian nonparametric approach","authors":"Mario Beraha, Alessandra Guglielmi, Fernando Andrés Quintana, Maria De Iorio, Johan Gunnar Eriksson, Fabian Yap","doi":"10.1177/1471082x231185892","DOIUrl":"https://doi.org/10.1177/1471082x231185892","url":null,"abstract":"Overweight and obesity in adults are known to be associated with increased risk of metabolic and cardiovascular diseases. Obesity has now reached epidemic proportions, increasingly affecting children. Therefore, it is important to understand if this condition persists from early life to childhood and if different patterns can be detected to inform intervention policies. Our motivating application is a study of temporal patterns of obesity in children from South Eastern Asia. Our main focus is on clustering obesity patterns after adjusting for the effect of baseline information. Specifically, we consider a joint model for height and weight over time. Measurements are taken every six months from birth. To allow for data-driven clustering of trajectories, we assume a vector autoregressive sampling model with a dependent logit stick-breaking prior. Simulation studies show good performance of the proposed model to capture overall growth patterns, as compared to other alternatives. We also fit the model to the motivating dataset, and discuss the results, in particular highlighting cluster differences. We have found four large clusters, corresponding to children sub-groups, though two of them are similar in terms of both height and weight at each time point. We provide interpretation of these clusters in terms of combinations of predictors.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136154353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A spline-based framework for the flexible modelling of continuously observed multistate survival processes
Pub Date : 2023-09-21 DOI: 10.1177/1471082x231176120
Alessia Eletti, Giampiero Marra, Rosalba Radice
Multistate modelling is becoming increasingly popular due to the availability of richer longitudinal health data. When the times at which the events characterising disease progression occur are known, modelling the multistate process is greatly simplified, as it can be broken down into a number of traditional survival models. We propose to model these flexibly through the existing general link-based additive framework implemented in the R package GJRM. The associated transition probabilities can then be obtained through a simulation-based approach implemented in the R package mstate, which is appealing for its generality. The integration between the two is seamless and efficient since we model a transformation of the survival function, rather than the hazard function as is more common. This is achieved through shape-constrained P-splines, which elegantly embed the required monotonicity within the construction of the survival functions themselves. The proposed framework allows for the inclusion of virtually any type of covariate effect, including time-dependent ones, while imposing no restriction on the assumed multistate process. We exemplify the usage of this framework through a case study on breast cancer patients.
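On the mstate side of the pipeline, the multistate structure is declared with transMat(); the illness-death layout below is a generic example, not the breast cancer application:

```r
## Declare a three-state illness-death transition structure with mstate.
library(mstate)

tmat <- transMat(x = list(c(2, 3),   # state 1 (healthy) -> illness, death
                          c(3),      # state 2 (illness) -> death
                          c()),      # state 3 (death) is absorbing
                 names = c("healthy", "illness", "death"))
tmat
## Transition probabilities would then follow from the fitted
## transition-specific survival models via mstate's simulation-based
## machinery (e.g., mssample()).
```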
{"title":"A spline-based framework for the flexible modelling of continuously observed multistate survival processes","authors":"Alessia Eletti, Giampiero Marra, Rosalba Radice","doi":"10.1177/1471082x231176120","DOIUrl":"https://doi.org/10.1177/1471082x231176120","url":null,"abstract":"Multistate modelling is becoming increasingly popular due to the availability of richer longitudinal health data. When the times at which the events characterising disease progression are known, the modelling of the multistate process is greatly simplified as it can be broken down in a number of traditional survival models. We propose to flexibly model them through the existing general link-based additive framework implemented in the R package GJRM. The associated transition probabilities can then be obtained through a simulation-based approach implemented in the R package mstate, which is appealing due to its generality. The integration between the two is seamless and efficient since we model a transformation of the survival function, rather than the hazard function, as is commonly found. This is achieved through the use of shape constrained P-splines which elegantly embed the monotonicity required for the survival functions within the construction of the survival functions themselves. The proposed framework allows for the inclusion of virtually any type of covariate effects, including time-dependent ones, while imposing no restriction on the multistate process assumed. We exemplify the usage of this framework through a case study on breast cancer patients.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136236955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A truncated mean-parameterized Conway-Maxwell-Poisson model for the analysis of Test match bowlers
Pub Date : 2023-09-19 DOI: 10.1177/1471082x231178584
Peter M. Philipson
A truncated, mean-parameterized Conway-Maxwell-Poisson model is developed to handle under- and overdispersed count data arising from individual heterogeneity. The truncated nature of the data allows for a more direct implementation of the model than in previous work, without excessive computational burden. The model is applied to a large dataset of Test match cricket bowlers, where the data take the form of small counts spanning 1877 to the modern day, motivating the inclusion of temporal effects to account for fundamental changes to the sport and to society. Rankings of sportsmen and women based on a statistical model are often handicapped by the popularity of inappropriate traditional metrics, which are shown to be flawed measures in this instance. Inference is Bayesian, deploying a Markov chain Monte Carlo algorithm to obtain parameter estimates and to extract the innate ability of individual players. The model offers a good fit and indicates that there is merit in a more sophisticated measure for ranking and assessing Test match bowlers.
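A generic sketch of a truncated Conway-Maxwell-Poisson pmf and its mean parameterization (solving for the rate given a target mean); this illustrates the distributional idea only and is not the paper's implementation:

```r
## CMP pmf on a truncated support 0..ymax: P(Y = y) prop. to lambda^y / (y!)^nu.
dcmp_trunc <- function(y, lambda, nu, ymax = 100) {
  support <- 0:ymax
  logw <- support * log(lambda) - nu * lgamma(support + 1)  # log weights
  p <- exp(logw - max(logw))
  p <- p / sum(p)                                           # normalize on 0..ymax
  p[y + 1]
}

cmp_mean <- function(lambda, nu, ymax = 100) {
  support <- 0:ymax
  sum(support * dcmp_trunc(support, lambda, nu, ymax))
}

## Mean parameterization: find lambda so that E[Y] = mu.
lambda_for_mu <- function(mu, nu, ymax = 100) {
  uniroot(function(l) cmp_mean(l, nu, ymax) - mu,
          interval = c(1e-8, 1e4))$root
}

lam <- lambda_for_mu(mu = 2, nu = 0.7)   # nu < 1 gives overdispersion
dcmp_trunc(0:5, lam, nu = 0.7)
```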
{"title":"A truncated mean-parameterized Conway-Maxwell-Poisson model for the analysis of Test match bowlers","authors":"Peter M. Philipson","doi":"10.1177/1471082x231178584","DOIUrl":"https://doi.org/10.1177/1471082x231178584","url":null,"abstract":"A truncated, mean-parameterized Conway-Maxwell-Poisson model is developed to handle under- and overdispersed count data owing to individual heterogeneity. The truncated nature of the data allows for a more direct implementation of the model than is utilized in previous work without too much computational burden. The model is applied to a large dataset of Test match cricket bowlers, where the data are in the form of small counts and range in time from 1877 to the modern day, leading to the inclusion of temporal effects to account for fundamental changes to the sport and society. Rankings of sportsmen and women based on a statistical model are often handicapped by the popularity of inappropriate traditional metrics, which are found to be flawed measures in this instance. Inferences are made using a Bayesian approach by deploying a Markov Chain Monte Carlo algorithm to obtain parameter estimates and to extract the innate ability of individual players. The model offers a good fit and indicates that there is merit in a more sophisticated measure for ranking and assessing Test match bowlers.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":"169 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135014868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Derivative curve estimation in longitudinal studies using P-splines
Pub Date : 2023-09-18 DOI: 10.1177/1471082x231178078
María Alejandra Hernández, Dae-Jin Lee, María Xosé Rodríguez-Álvarez, María Durbán
The estimation of curve derivatives is of interest in many disciplines, as it allows the extraction of important characteristics that give insight into the underlying process. In the context of longitudinal data, the derivative allows the description of biological features of the individuals or the identification of regions of change. Although there are several approaches to estimating subject-specific curves and their derivatives, open problems remain due to the complicated nature of these time-course processes. In this article, we illustrate the use of P-spline models to estimate derivatives in the context of longitudinal data. We also propose a new penalty acting at both the population and the subject-specific levels to address under-smoothing and boundary problems in derivative estimation. The practical performance of the proposal is evaluated through simulations, with comparisons against an alternative method. Finally, an application to longitudinal height measurements of 125 football players in a youth professional academy is presented, where the goal is to analyse their growth and maturity patterns over time.
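A bare-bones P-spline fit with an analytic first derivative, using only base R and the splines package; the smoothing parameter is fixed for brevity and the article's subject-specific penalty is not reproduced:

```r
library(splines)

set.seed(2)
n <- 200
x <- seq(0, 1, length.out = n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.15)

## Equally spaced B-spline basis with ndx interior segments.
ndx <- 20; deg <- 3
dx <- (max(x) - min(x)) / ndx
knots <- seq(min(x) - deg * dx, max(x) + deg * dx, by = dx)
B <- splineDesign(knots, x, ord = deg + 1, outer.ok = TRUE)

## Second-order difference penalty: the P-spline ingredient.
D <- diff(diag(ncol(B)), differences = 2)
lambda <- 1
a <- solve(t(B) %*% B + lambda * crossprod(D), t(B) %*% y)

## Analytic first derivative of the fitted curve via derivs = 1.
B1 <- splineDesign(knots, x, ord = deg + 1, derivs = rep(1, n),
                   outer.ok = TRUE)
plot(x, B1 %*% a, type = "l", ylab = "estimated f'(x)")
lines(x, 2 * pi * cos(2 * pi * x), lty = 2)  # true derivative
```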
{"title":"Derivative curve estimation in longitudinal studies using P-splines","authors":"María Alejandra Hernández, Dae-Jin Lee, María Xosé Rodríguez-álvarez, María Durbán","doi":"10.1177/1471082x231178078","DOIUrl":"https://doi.org/10.1177/1471082x231178078","url":null,"abstract":"The estimation of curve derivatives is of interest in many disciplines. It allows the extraction of important characteristics to gain insight about the underlying process. In the context of longitudinal data, the derivative allows the description of biological features of the individuals or finding change regions of interest. Although there are several approaches to estimate subject-specific curves and their derivatives, there are still open problems due to the complicated nature of these time course processes. In this article, we illustrate the use of P-spline models to estimate derivatives in the context of longitudinal data. We also propose a new penalty acting at the population and the subject-specific levels to address under-smoothing and boundary problems in derivative estimation. The practical performance of the proposal is evaluated through simulations, and comparisons with an alternative method are reported. Finally, an application to longitudinal height measurements of 125 football players in a youth professional academy is presented, where the goal is to analyse their growth and maturity patterns over time.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135153927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Penalty parameter selection and asymmetry corrections to Laplace approximations in Bayesian P-splines models
Pub Date : 2023-09-10 DOI: 10.1177/1471082x231181173
Philippe Lambert, Oswaldo Gressani
Laplace P-splines (LPS) combine the P-splines smoother and the Laplace approximation in a unifying framework for fast and flexible inference under the Bayesian paradigm. The Gaussian Markov random field prior assumed for the penalized parameters and the Bernstein-von Mises theorem typically ensure a razor-sharp accuracy of the Laplace approximation to the posterior distribution of these quantities. This accuracy can be seriously compromised for some unpenalized parameters, especially when the information synthesized by the prior and the likelihood is sparse. We therefore propose a refined version of the LPS methodology that splits the parameter space into two subsets. The first set involves parameters for which the joint posterior distribution is approached from a non-Gaussian perspective, with an approximation scheme tailored to capture asymmetric patterns, while the posterior distribution of the penalized parameters in the complementary set undergoes the LPS treatment with Laplace approximations. This dichotomization of the parameter space provides the necessary structure for a separate treatment of model parameters, yielding improved estimation accuracy compared to a setting where posterior quantities are uniformly handled with Laplace approximations. In addition, the proposed enriched version of LPS remains entirely sampling-free, so it operates at a computing speed far beyond the reach of any existing Markov chain Monte Carlo approach. The methodology is illustrated on the additive proportional odds model with an application to ordinal survey data.
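The Laplace step itself can be sketched generically: locate the posterior mode and match a Gaussian to the curvature there. The toy posterior below is hypothetical and not the LPS implementation:

```r
## Laplace approximation: Gaussian centred at the posterior mode with
## variance equal to the inverse negative Hessian at the mode.
log_post <- function(theta) {
  y <- c(3, 5, 2, 4, 6)  # toy Poisson data with a N(0, 10^2) prior on theta
  sum(dpois(y, lambda = exp(theta), log = TRUE)) +
    dnorm(theta, mean = 0, sd = 10, log = TRUE)
}

opt <- optim(par = 0, fn = function(th) -log_post(th),
             method = "BFGS", hessian = TRUE)
mode_hat <- opt$par
sd_hat   <- sqrt(1 / opt$hessian[1, 1])  # sd of the Gaussian approximation
c(mode = mode_hat, sd = sd_hat)
```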
{"title":"Penalty parameter selection and asymmetry corrections to Laplace approximations in Bayesian P-splines models","authors":"Philippe Lambert, Oswaldo Gressani","doi":"10.1177/1471082x231181173","DOIUrl":"https://doi.org/10.1177/1471082x231181173","url":null,"abstract":"Laplace P-splines (LPS) combine the P-splines smoother and the Laplace approximation in a unifying framework for fast and flexible inference under the Bayesian paradigm. The Gaussian Markov random field prior assumed for penalized parameters and the Bernstein-von Mises theorem typically ensure a razor-sharp accuracy of the Laplace approximation to the posterior distribution of these quantities. This accuracy can be seriously compromised for some unpenalized parameters, especially when the information synthesized by the prior and the likelihood is sparse. Therefore, we propose a refined version of the LPS methodology by splitting the parameter space in two subsets. The first set involves parameters for which the joint posterior distribution is approached from a non-Gaussian perspective with an approximation scheme tailored to capture asymmetric patterns, while the posterior distribution for the penalized parameters in the complementary set undergoes the LPS treatment with Laplace approximations. As such, the dichotomization of the parameter space provides the necessary structure for a separate treatment of model parameters, yielding improved estimation accuracy as compared to a setting where posterior quantities are uniformly handled with Laplace. In addition, the proposed enriched version of LPS remains entirely sampling-free, so that it operates at a computing speed that is far from reach to any existing Markov chain Monte Carlo approach. The methodology is illustrated on the additive proportional odds model with an application on ordinal survey data.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136073309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A black box approach to fitting smooth models of mortality
Pub Date : 2023-08-22 DOI: 10.1177/1471082x231181165
I. Currie
Actuaries have long been interested in forecasting mortality for the pricing and reserving of pensions and annuities. Most models of mortality in age and year of death, and often year of birth, are not identifiable, so actuaries have worried about which constraints should be used to give sensible estimates of the age and year-of-death parameters and, if required, the year-of-birth parameters. These parameters are then forecast with an ARIMA model to give the required forecasts of mortality. A recent article showed that, while the fitted parameters are not identifiable, both the fitted and forecast mortalities are; this result also holds if the age term is smoothed with P-splines. The present article deals with generalized linear models with a rank-deficient regression matrix and has two aims. First, we investigate the effect that different constraints have on the estimated regression coefficients. Second, we show that it is possible to fit the model under different constraints in R without imposing any explicit constraints: R does all the necessary book-keeping ‘under the bonnet’, and the estimated regression coefficients under a particular set of constraints can then be recovered from the invariant fitted values. This gives a black box approach to fitting the model subject to any set of constraints.
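The R behaviour the article exploits is easy to demonstrate: with a rank-deficient design, glm() pivots away a redundant column (its coefficient is reported as NA), and the fitted values are invariant to which column is dropped. The toy model below is illustrative, not the mortality application:

```r
## Rank-deficient GLM: different aliased coefficients, identical
## (invariant) fitted values.
set.seed(3)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2                 # exact collinearity: rank-deficient design
y  <- rpois(n, lambda = exp(0.2 + 0.5 * x1 - 0.3 * x2))

f1 <- glm(y ~ x1 + x2 + x3, family = poisson)  # x3 aliased -> NA coefficient
f2 <- glm(y ~ x3 + x1 + x2, family = poisson)  # a different column is dropped

coef(f1); coef(f2)                   # different reported coefficients...
all.equal(fitted(f1), fitted(f2))    # ...identical fitted values
```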
{"title":"A black box approach to fitting smooth models of mortality","authors":"I. Currie","doi":"10.1177/1471082x231181165","DOIUrl":"https://doi.org/10.1177/1471082x231181165","url":null,"abstract":"Actuaries have long been interested in the forecasting of mortality for the purpose of the pricing and reserving of pensions and annuities. Most models of mortality in age and year of death, and often year of birth, are not identifiable so actuaries worried about what constraints should be used to give sensible estimates of the age and year of death parameters, and, if required, the year of birth parameters. These parameters were then forecast with an ARIMA model to give the required forecasts of mortality. A recent article showed that, while the fitted parameters were not identifiable, both the fitted and forecast mortalities were. This result holds if the age term is smoothed with P-splines. The present article deals with generalized linear models with a rank deficient regression matrix. We have two aims. First, we investigate the effect that different constraints have on the estimated regression coefficients. We show that it is possible to fit the model under different constraints in R without imposing any explicit constraints. R does all the necessary booking-keeping ‘under the bonnet’. The estimated regression coefficients under a particular set of constraints can then be recovered from the invariant fitted values. We have a black box approach to fitting the model subject to any set of constraints.","PeriodicalId":49476,"journal":{"name":"Statistical Modelling","volume":" ","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42415077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}