When analyzing data combined from multiple sources (e.g., hospitals, studies), the heterogeneity across different sources must be accounted for. In this article, we consider high-dimensional linear regression models for integrative data analysis. We propose a new adaptive clustering penalty (ACP) method to simultaneously select variables and cluster source-specific regression coefficients with subhomogeneity. We show that the estimator based on the ACP method enjoys a strong oracle property under certain regularity conditions. We also develop an efficient algorithm based on the alternating direction method of multipliers (ADMM) for parameter estimation. We conduct simulation studies to compare the performance of the proposed method to three existing methods (a fused LASSO with adjacent fusion, a pairwise fused LASSO and a multidirectional shrinkage penalty method). Finally, we apply the proposed method to the multicentre Childhood Adenotonsillectomy Trial to identify subhomogeneity in the treatment effects across different study sites.
{"title":"High-dimensional variable selection accounting for heterogeneity in regression coefficients across multiple data sources","authors":"Tingting Yu, Shangyuan Ye, Rui Wang","doi":"10.1002/cjs.11793","journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique"},"publicationDate":"2023-08-19","publicationTypes":"Journal Article"}
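As a rough illustration of one of the comparison methods named in the abstract above, the following sketch evaluates a generic pairwise fused LASSO penalty on source-specific coefficients. It is not the authors' ACP: the abstract does not give the ACP formula, so the penalty form, the function name and the tuning-parameter names `lam_sparse` and `lam_fuse` are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_fused_lasso_penalty(beta, lam_sparse, lam_fuse):
    """Generic pairwise fused LASSO penalty (illustrative; not the ACP).

    beta[k][j] is the coefficient of covariate j in data source k.
    lam_sparse and lam_fuse are hypothetical tuning-parameter names.
    """
    n_cov = len(beta[0])
    # L1 term: shrinks individual coefficients towards zero (variable selection).
    sparsity = sum(abs(b) for row in beta for b in row)
    # Fusion term: shrinks every pair of sources towards a common value,
    # covariate by covariate, which encourages clusters of sources.
    fusion = sum(
        abs(row_a[j] - row_b[j])
        for row_a, row_b in combinations(beta, 2)
        for j in range(n_cov)
    )
    return lam_sparse * sparsity + lam_fuse * fusion


# Three sources, two covariates: sources 0 and 1 share the first coefficient.
beta = [[1.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
penalty = pairwise_fused_lasso_penalty(beta, lam_sparse=0.5, lam_fuse=0.25)
print(penalty)
```

In this toy input the pair of identical sources contributes nothing to the fusion term, so only the third source is penalized for differing from the other two.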
Functional analysis of variance (ANOVA) models are often used to compare groups of functional data. Similar to the traditional ANOVA model, a common follow-up procedure to the rejection of the functional ANOVA null hypothesis is to perform functional linear contrast tests to identify which groups have different mean functions. Most existing functional contrast tests assume independent functional observations within each group. In this article, we introduce a new functional linear contrast test procedure that accounts for possible time dependency among functional group members. The test statistic and its normalized version, based on the Karhunen–Loève decomposition of the covariance function and a weak convergence result of the error processes, follow respectively a mixture chi-squared and a chi-squared distribution. An extensive simulation study is conducted to compare the empirical performance of the existing and new contrast tests. We also present two applications of these contrast tests to a weather study and a battery-life study. We provide software implementation and example data in the Supplementary Material.
{"title":"Contrast tests for groups of functional data","authors":"Quyen Do, Pang Du","doi":"10.1002/cjs.11794","journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique"},"publicationDate":"2023-08-19","publicationTypes":"Journal Article","openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11794"}
A reduced-rank mixed-effects model is developed for robust modelling of sparsely observed paired functional data. In this model, the curves for each functional variable are summarized using a few functional principal components, and the association of the two functional variables is modelled through the association of the principal component scores. A multivariate-scale mixture of normal distributions is used to model the principal component scores and the measurement errors in order to handle outlying observations and achieve robust inference. The mean functions and principal component functions are modelled using splines, and roughness penalties are applied to avoid overfitting. An EM algorithm is developed for model fitting and prediction. A simulation study shows that the proposed method outperforms an existing method that is not designed for robust estimation. The effectiveness of the proposed method is illustrated through an application to fitting multiband light curves of Type Ia supernovae.
{"title":"Robust joint modelling of sparsely observed paired functional data","authors":"Huiya Zhou, Xiaomeng Yan, Lan Zhou","doi":"10.1002/cjs.11796","journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique"},"publicationDate":"2023-08-19","publicationTypes":"Journal Article","openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11796"}
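To make the robustness mechanism of a scale mixture of normals more concrete, here is a minimal sketch for one common special case, the multivariate t distribution. The abstract does not say which mixing distribution the authors use, so the choice of the t distribution and the names `t_em_weight`, `mahalanobis_sq`, `dim` and `df` are assumptions. In the t case the E-step of an EM algorithm assigns each observation the weight (df + dim) / (df + squared Mahalanobis distance), so outlying observations are downweighted.

```python
def t_em_weight(mahalanobis_sq, dim, df):
    """E-step weight for a multivariate t model viewed as a normal scale mixture.

    mahalanobis_sq: squared Mahalanobis distance of the observation from the mean.
    dim: dimension of the observation.
    df: degrees of freedom of the t distribution (assumed known here).
    """
    return (df + dim) / (df + mahalanobis_sq)


# A typical observation versus an outlying one, in two dimensions with df = 4.
typical = t_em_weight(mahalanobis_sq=2.0, dim=2, df=4.0)
outlier = t_em_weight(mahalanobis_sq=20.0, dim=2, df=4.0)
print(typical, outlier)
```

The outlying observation receives a quarter of the weight of the typical one, which is the sense in which such models limit the influence of outliers on the fitted mean and covariance.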
We are delighted to present a special issue of The Canadian Journal of Statistics (CJS) in honour of Professor Nancy Reid. The articles in this collection have been contributed by a group of participants who attended a workshop entitled “Statistics at its Best” in Toronto on 5 May 2022. The workshop was organized by the Department of Statistical Sciences at the University of Toronto to celebrate Professor Reid’s 70th birthday. It highlighted her remarkable contributions to Statistical Science and her dedication to the profession, exemplified in research, leadership, service and education of the next generation of statisticians. Professor Reid’s impactful career has played a crucial role in fostering the growth of the Canadian statistical community. This workshop was part of a series of celebratory activities coordinated by the Statistical Society of Canada, marking the 50th anniversary of the statistical community in this country. This collection of articles encompasses a wide range of topics. First, the engaging dialogue “A conversation with Nancy Reid” by Craiu and Yi sheds light on Professor Reid’s intellectual journey and perspectives on statistical science and data science. In “The inducement of population sparsity”, Battey presents the pioneering work on parameter orthogonalization by Cox and Reid as an inducement of abstract population-level sparsity. The article focuses on three important examples related to sparsity-inducing parameterizations or data transformations: covariance models, nuisance parameter elimination and high-dimensional regression. Strategies for inducing sparsity vary depending on the context and may involve solving partial differential equations or specifying parameterized paths. Battey concludes by presenting some open problems. McCullagh then highlights, in “A tale of two variances”, the ambiguity and potential misinterpretation of the standard repeated-sampling concept of the variance in a finite-dimensional parametric model. He presents three operational interpretations, all numerically distinct and compatible with repeated sampling from a fixed parameter population. These interpretations help resolve contradictions between Fisherian variance and inverse-information variance. We next turn to hypothesis testing for parameters on the boundary of their domain. In “Improved inference for a boundary parameter”, Elkantassi, Bellio, Brazzale and Davison review theoretical work on the problem, including hard and soft boundaries, and iceberg estimators. They highlight how the limiting results significantly underestimate the probability that the parameter lies on its boundary, propose remedies based on the normal approximation for the profile score function, and outline the success of higher order approximations. Using these approaches, the authors develop an accurate test to assess the need for a spline component in a linear mixed model. In “Sparse estimation within Pearson’s system, with an application to financial market risk”, Carey, Genest and Ramsay tackle t
{"title":"Special issue in honour of Nancy Reid: Guest Editors' introduction","authors":"","doi":"10.1002/cjs.11792","DOIUrl":"https://doi.org/10.1002/cjs.11792","journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique"},"publicationDate":"2023-08-17","publicationTypes":"Journal Article"}
Finite mixture models have been used for unsupervised learning for some time, and their use within the semisupervised paradigm is becoming more commonplace. Clickstream data are one of the various emerging data types that demand particular attention because there is a notable paucity of statistical learning approaches currently available. A mixture of first-order continuous-time Markov models is introduced for unsupervised and semisupervised learning of clickstream data. This approach assumes continuous time, which distinguishes it from existing mixture model-based approaches; practically, this allows account to be taken of the amount of time each user spends on each webpage. The approach is evaluated and compared with the discrete-time approach, using simulated and real data.
{"title":"Clustering and semi-supervised classification for clickstream data via mixture models","authors":"Michael P. B. Gallaugher, Paul D. McNicholas","doi":"10.1002/cjs.11795","journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique"},"publicationDate":"2023-08-17","publicationTypes":"Journal Article"}
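The role of time spent on each page can be illustrated with the log-likelihood of a single browsing session under one first-order continuous-time Markov chain. This is only a sketch of a single mixture component under textbook assumptions (exponential holding times, initial-state probability omitted); the authors' exact parameterization is not given in the abstract, and the page names, the rate values and the function name `ctmc_path_loglik` are hypothetical.

```python
import math


def ctmc_path_loglik(states, hold_times, rates):
    """Log-likelihood of one observed path of a continuous-time Markov chain.

    states: visited pages in order, e.g. ["home", "cart", "exit"].
    hold_times: time spent on each page before the next jump
                (one entry per transition).
    rates: rates[i][j] is the transition rate from page i to page j (i != j).
    """
    loglik = 0.0
    for cur, nxt, t in zip(states, states[1:], hold_times):
        # Total rate of leaving the current page.
        exit_rate = sum(r for j, r in rates[cur].items() if j != cur)
        # Density of staying for time t and then jumping to nxt:
        # rates[cur][nxt] * exp(-exit_rate * t).
        loglik += math.log(rates[cur][nxt]) - exit_rate * t
    return loglik


# Hypothetical two-page site with an absorbing "exit" state.
rates = {
    "home": {"cart": 2.0, "exit": 1.0},
    "cart": {"home": 0.5, "exit": 0.5},
}
session_loglik = ctmc_path_loglik(["home", "cart", "exit"], [0.5, 2.0], rates)
print(session_loglik)
```

A discrete-time model would use only the sequence of pages; here the holding times enter through the `exit_rate * t` term, so two sessions with the same page sequence but different dwell times get different likelihoods.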
Identifiability constraints are necessary for parameter estimation when fitting models with nonlinear covariate associations. The choice of constraint affects the standard errors of the estimated curve. Centring constraints are often applied by default because they are thought to yield the lowest standard errors of any constraint, but this claim has not been investigated. We show that whether centring constraints are optimal depends on the response distribution and parameterization, and that for natural exponential family responses under the canonical parameterization, centring constraints are optimal only for a Gaussian response.
{"title":"Identifiability constraints in generalized additive models","authors":"Alex Stringer","doi":"10.1002/cjs.11786","journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique"},"publicationDate":"2023-08-08","publicationTypes":"Journal Article","openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11786"}
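For readers unfamiliar with centring constraints, the sketch below shows one simple way to impose the constraint that a fitted smooth sums to zero over the observed covariate values: subtract each basis column's mean. This is an illustration only, not the construction analysed in the article; the polynomial basis, the coefficient values and the helper names `centre_columns` and `smooth_values` are assumptions.

```python
def centre_columns(basis):
    """Subtract each column's mean so every column sums to zero."""
    n = len(basis)
    means = [sum(col) / n for col in zip(*basis)]
    return [[v - m for v, m in zip(row, means)] for row in basis]


def smooth_values(basis, coef):
    """Evaluate the smooth f(x_i) = sum_j basis[i][j] * coef[j]."""
    return [sum(b * c for b, c in zip(row, coef)) for row in basis]


# Toy basis (x, x^2) evaluated at four covariate values.
basis = [[x, x * x] for x in (0.0, 1.0, 2.0, 3.0)]
centred = centre_columns(basis)

# Any coefficient vector now gives a smooth that sums to zero over the data,
# so the smooth is no longer confounded with the model intercept.
f_hat = smooth_values(centred, [0.5, -0.2])
print(abs(sum(f_hat)) < 1e-12)
```

If the original basis can represent a constant function, the centred columns become linearly dependent, so practical implementations usually reparameterize or drop a column as well.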
Jinhan Xie, Xianwen Ding, Bei Jiang, Xiaodong Yan, Linglong Kong
This article considers robust prediction issues in ultrahigh-dimensional (UHD) datasets and proposes combining quantile regression with sequential model averaging to arrive at a quantile sequential model averaging (QSMA) procedure. The QSMA method is made computationally feasible by employing a sequential screening process and a Bayesian information criterion (BIC) model averaging method for UHD quantile regression and provides a more accurate and stable prediction of the conditional quantile of a response variable. Meanwhile, the proposed method shows effective behaviour in dealing with prediction in UHD datasets and saves a great deal of computational cost with the help of the sequential technique. Under some suitable conditions, we show that the proposed QSMA method can mitigate overfitting and yields reliable predictions. Numerical studies, including extensive simulations and a real data example, are presented to confirm that the proposed method performs well.
{"title":"High-dimensional model averaging for quantile regression","authors":"Jinhan Xie, Xianwen Ding, Bei Jiang, Xiaodong Yan, Linglong Kong","doi":"10.1002/cjs.11789","journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique"},"publicationDate":"2023-08-08","publicationTypes":"Journal Article","openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11789"}
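Two generic ingredients mentioned in the abstract above can be sketched briefly: the quantile (check) loss that defines quantile regression, and model-averaging weights derived from BIC values. The weight formula used here, proportional to exp(-BIC/2), is a common convention and an assumption on our part; the abstract does not state the exact weights used in QSMA, and the function names are hypothetical.

```python
import math


def check_loss(residual, tau):
    """Quantile check loss: residual * (tau - 1{residual < 0})."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def bic_weights(bics):
    """Model-averaging weights proportional to exp(-BIC / 2).

    Subtracting the smallest BIC first avoids numerical underflow.
    """
    best = min(bics)
    raw = [math.exp(-(b - best) / 2.0) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


# At tau = 0.25, under-prediction (positive residual) costs less than
# over-prediction of the same size.
print(check_loss(2.0, 0.25), check_loss(-2.0, 0.25))

# The candidate model with the smaller BIC receives the larger weight.
weights = bic_weights([10.0, 12.0])
print(weights)
```

Averaging predictions from several candidate models with such weights, rather than committing to a single selected model, is the general idea behind model averaging for prediction.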
Soumaya Elkantassi, Ruggero Bellio, Alessandra R. Brazzale, Anthony C. Davison
The limiting distributions of statistics used to test hypotheses about parameters on the boundary of their domains may provide very poor approximations to the finite-sample behaviour of these statistics, even for very large samples. We review theoretical work on this problem, describe hard and soft boundaries and iceberg estimators, and give examples highlighting how the limiting results greatly underestimate the probability that the parameter lies on its boundary even in very large samples. We propose and evaluate some simple remedies for this difficulty based on normal approximation for the profile score function, and then outline how higher order approximations yield excellent results in a range of hard and soft boundary examples. We use the approach to develop an accurate test for the need for a spline component in a linear mixed model.
{"title":"Improved inference for a boundary parameter","authors":"Soumaya Elkantassi, Ruggero Bellio, Alessandra R. Brazzale, Anthony C. Davison","doi":"10.1002/cjs.11791","journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique"},"publicationDate":"2023-08-04","publicationTypes":"Journal Article","openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11791"}
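For context, the classical limiting result for a likelihood ratio test of a single parameter on a hard boundary (for example, a variance component equal to zero) is an equal mixture of a point mass at zero and a chi-squared distribution with one degree of freedom. The sketch below computes the corresponding p-value. It illustrates the kind of limiting approximation that the abstract says can be poor in finite samples, not the authors' improved method, and the function names are hypothetical.

```python
import math


def chi2_1_sf(w):
    """Survival function of a chi-squared variable with 1 degree of freedom."""
    # P(Z^2 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)) for standard normal Z.
    return math.erfc(math.sqrt(w / 2.0))


def boundary_lr_pvalue(w):
    """P-value under the 50:50 mixture of a point mass at 0 and chi-squared(1).

    w is the observed likelihood ratio statistic.
    """
    if w <= 0.0:
        # The point mass at zero means a statistic of zero is never significant.
        return 1.0
    return 0.5 * chi2_1_sf(w)


# The usual chi-squared(1) 5% critical value gives a p-value of about 0.025
# under the mixture, because half of the null mass sits at zero.
p_mixture = boundary_lr_pvalue(3.841458820694124)
print(round(p_mixture, 4))
```

The mixture assumes that the estimate lies exactly on the boundary with probability one half under the null hypothesis; the abstract's point is that this limiting probability can be far from the finite-sample one.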
Longitudinal data arise frequently in biomedical follow-up observation studies. Conditional mean regression and conditional quantile regression are two popular approaches to modelling longitudinal data. Many existing results are derived for the case where the response variables are independent of the observation times. In this article, we propose a quantile regression model for the analysis of longitudinal data, in which the longitudinal responses are allowed not only to depend on the past observation history but also to be associated with a terminal event (e.g., death). Non-smoothing estimating equation approaches are developed to estimate the parameters, and the consistency and asymptotic normality of the proposed estimators are established. The asymptotic variance is estimated by a resampling method. A majorize-minimize algorithm is proposed to compute the estimators. Simulation studies show that the proposed estimators perform well, and an HIV-RNA dataset is used to illustrate the proposed method.
{"title":"Joint modelling of quantile regression for longitudinal data with information observation times and a terminal event","authors":"Weicai Pang, Yutao Liu, Xingqiu Zhao, Yong Zhou","doi":"10.1002/cjs.11782","journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique"},"publicationDate":"2023-07-31","publicationTypes":"Journal Article"}