This article discusses nonparametric estimation of a survival function when observation of the failure time of interest is subject to measurement error. Such issues arise, for example, in clinical studies of chronic diseases where the time to the failure event of interest, such as the onset of the disease, is determined from patient recall or from chart review of electronic medical records; both sources are prone to measurement error. To address this problem, we propose a simulation extrapolation (SIMEX) approach to correct the bias induced by the measurement error. To overcome potential computational difficulties, we use spline regression to approximate the unspecified extrapolated coefficient function of time, and establish the asymptotic properties of our proposed estimator. The proposed method is applied to nonparametric estimation based on interval-censored data. Extensive numerical experiments involving both simulated and real study datasets demonstrate the feasibility of the proposed estimation procedure.
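The simulation-extrapolation idea behind the correction can be illustrated on a deliberately simple problem that is not the authors' estimator: recovering the variance of a true variable X from error-contaminated observations W = X + U. All variable names and settings below are illustrative assumptions; a minimal sketch:

```python
import numpy as np

def simex_variance(w, sigma_u, lambdas=(0.5, 1.0, 1.5, 2.0), n_sim=200, seed=0):
    """Toy SIMEX for Var(X) when only W = X + U is observed,
    with U ~ N(0, sigma_u^2) independent of X."""
    rng = np.random.default_rng(seed)
    lam_grid = [0.0] + list(lambdas)
    naive = []
    for lam in lam_grid:
        # add extra noise with variance lam * sigma_u^2, average the naive estimate
        ests = [np.var(w + rng.normal(0.0, np.sqrt(lam) * sigma_u, size=w.shape), ddof=1)
                for _ in range(n_sim)]
        naive.append(np.mean(ests))
    # E[naive(lam)] = Var(X) + (1 + lam) * Var(U) is linear in lam,
    # so a degree-1 fit extrapolated back to lam = -1 removes the error variance
    slope, intercept = np.polyfit(lam_grid, naive, deg=1)
    return intercept - slope  # fitted value at lam = -1

rng = np.random.default_rng(1)
x = rng.normal(0.0, 2.0, size=5000)        # true values, Var(X) = 4
w = x + rng.normal(0.0, 1.0, size=5000)    # mismeasured, E[Var(W)] = 5
naive_var = np.var(w, ddof=1)
corrected = simex_variance(w, sigma_u=1.0)
```

The naive variance overshoots by the measurement-error variance, while the extrapolated estimate lands near the truth; the paper's contribution lies in carrying out this extrapolation for a survival function, with a spline approximation to the extrapolated coefficient function.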
{"title":"Nonparametric estimation of a survival function in the presence of measurement errors on the failure time of interest","authors":"Shaojia Jin, Yanyan Liu, Guangcai Mao, Jianguo Sun, Yuanshan Wu","doi":"10.1002/cjs.11799","DOIUrl":"10.1002/cjs.11799","url":null,"abstract":"<p>This article discusses nonparametric estimation of a survival function in the presence of measurement errors on the observation of the failure time of interest. One situation where such issues arise would be clinical studies of chronic diseases where the observation on the time to the failure event of interest such as the onset of the disease relies on patient recall or chart review of electronic medical records. It is easy to see that both situations can be subject to measurement errors. To resolve this problem, we propose a simulation extrapolation approach to correct the bias induced by the measurement error. To overcome potential computational difficulties, we use spline regression to approximate the unspecified extrapolated coefficient function of time, and establish the asymptotic properties of our proposed estimator. The proposed method is applied to nonparametric estimation based on interval-censored data. Extensive numerical experiments involving both simulated and actual study datasets demonstrate the feasibility of this proposed estimation procedure.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"52 3","pages":"783-803"},"PeriodicalIF":0.8,"publicationDate":"2023-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135141149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Motivated by image-on-scalar regression with data aggregated across multiple sites, we consider a setting in which multiple independent studies each collect multiple dependent vector outcomes, with potential mean model parameter homogeneity between studies and outcome vectors. To determine the validity of a joint analysis of these data sources, we must learn which of them share mean model parameters. We propose a new model fusion approach that delivers improved flexibility and statistical performance over existing methods. Our proposed approach specifies a quadratic inference function within each data source and fuses mean model parameter vectors in their entirety based on a new formulation of a pairwise fusion penalty. We establish theoretical properties of our estimator and propose an asymptotically equivalent weighted oracle meta-estimator that is more computationally efficient. Simulations and an application to the ABIDE neuroimaging consortium highlight the flexibility of the proposed approach. An R package is provided for ease of implementation.
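As a toy illustration of what a pairwise fusion penalty does (the paper's penalty acts on entire mean model parameter vectors across dependent outcomes; the scalar case below is only an analogue with assumed notation), consider fusing the means of two data sources, which has a closed-form solution:

```python
import numpy as np

def fuse_two_means(y1, y2, lam):
    """Closed-form minimiser of
        0.5*||y1 - m1||^2 + 0.5*||y2 - m2||^2 + lam*|m1 - m2|,
    a scalar analogue of a pairwise fusion penalty on mean parameters."""
    n1, n2 = len(y1), len(y2)
    b1, b2 = float(np.mean(y1)), float(np.mean(y2))
    d = b1 - b2
    if abs(d) <= lam * (1.0 / n1 + 1.0 / n2):
        # sources declared homogeneous: fully fused to the pooled mean
        m = (n1 * b1 + n2 * b2) / (n1 + n2)
        return m, m
    # otherwise each mean moves lam/n toward the other, shrinking the gap
    s = np.sign(d) * lam
    return b1 - s / n1, b2 + s / n2
```

For small `lam` the source means are barely shrunk; past the threshold they collapse to a single shared parameter, which is the mechanism by which fusion penalties "learn" which sources share mean model parameters.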
{"title":"Fused mean structure learning in data integration with dependence","authors":"Emily C. Hector","doi":"10.1002/cjs.11797","DOIUrl":"10.1002/cjs.11797","url":null,"abstract":"<p>Motivated by image-on-scalar regression with data aggregated across multiple sites, we consider a setting in which multiple independent studies each collect multiple dependent vector outcomes, with potential mean model parameter homogeneity between studies and outcome vectors. To determine the validity of a joint analysis of these data sources, we must learn which of them share mean model parameters. We propose a new model fusion approach that delivers improved flexibility and statistical performance over existing methods. Our proposed approach specifies a quadratic inference function within each data source and fuses mean model parameter vectors in their entirety based on a new formulation of a pairwise fusion penalty. We establish theoretical properties of our estimator and propose an asymptotically equivalent weighted oracle meta-estimator that is more computationally efficient. Simulations and an application to the ABIDE neuroimaging consortium highlight the flexibility of the proposed approach. An <span>R</span> package is provided for ease of implementation.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"52 3","pages":"939-961"},"PeriodicalIF":0.8,"publicationDate":"2023-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11797","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136317032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
When analyzing data combined from multiple sources (e.g., hospitals, studies), the heterogeneity across different sources must be accounted for. In this article, we consider high-dimensional linear regression models for integrative data analysis. We propose a new adaptive clustering penalty (ACP) method to simultaneously select variables and cluster source-specific regression coefficients with subhomogeneity. We show that the estimator based on the ACP method enjoys a strong oracle property under certain regularity conditions. We also develop an efficient algorithm based on the alternating direction method of multipliers (ADMM) for parameter estimation. We conduct simulation studies to compare the performance of the proposed method to three existing methods (a fused LASSO with adjacent fusion, a pairwise fused LASSO and a multidirectional shrinkage penalty method). Finally, we apply the proposed method to the multicentre Childhood Adenotonsillectomy Trial to identify subhomogeneity in the treatment effects across different study sites.
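The ACP penalty and its exact ADMM updates are specific to the paper, but the ADMM machinery can be sketched on the closely related lasso problem; the soft-thresholding proximal step below is the same building block that appears in fusion-type updates. A minimal sketch with made-up data (not the authors' algorithm):

```python
import numpy as np

def soft_threshold(z, t):
    """Prox of t*||.||_1: shrink toward zero, exact zeros past the threshold."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def admm_lasso(X, y, lam, rho=1.0, n_iter=300):
    """ADMM for min_beta 0.5*||X beta - y||^2 + lam*||beta||_1,
    using the splitting beta = z with scaled dual variable u."""
    n, p = X.shape
    A = np.linalg.inv(X.T @ X + rho * np.eye(p))
    Xty = X.T @ y
    beta, z, u = np.zeros(p), np.zeros(p), np.zeros(p)
    for _ in range(n_iter):
        beta = A @ (Xty + rho * (z - u))          # ridge-type solve
        z = soft_threshold(beta + u, lam / rho)   # prox of the l1 penalty
        u = u + beta - z                          # dual ascent on beta = z
    return z

rng = np.random.default_rng(0)
n, p = 200, 8
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)
beta_hat = admm_lasso(X, y, lam=20.0)
```

The z-update yields exact zeros, which is what makes ADMM attractive for simultaneous variable selection and coefficient clustering.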
{"title":"High-dimensional variable selection accounting for heterogeneity in regression coefficients across multiple data sources","authors":"Tingting Yu, Shangyuan Ye, Rui Wang","doi":"10.1002/cjs.11793","DOIUrl":"10.1002/cjs.11793","url":null,"abstract":"<p>When analyzing data combined from multiple sources (e.g., hospitals, studies), the heterogeneity across different sources must be accounted for. In this article, we consider high-dimensional linear regression models for integrative data analysis. We propose a new adaptive clustering penalty (ACP) method to simultaneously select variables and cluster source-specific regression coefficients with subhomogeneity. We show that the estimator based on the ACP method enjoys a strong oracle property under certain regularity conditions. We also develop an efficient algorithm based on the alternating direction method of multipliers (ADMM) for parameter estimation. We conduct simulation studies to compare the performance of the proposed method to three existing methods (a fused LASSO with adjacent fusion, a pairwise fused LASSO and a multidirectional shrinkage penalty method). Finally, we apply the proposed method to the multicentre Childhood Adenotonsillectomy Trial to identify subhomogeneity in the treatment effects across different study sites.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"52 3","pages":"900-923"},"PeriodicalIF":0.8,"publicationDate":"2023-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42707966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Functional analysis of variance (ANOVA) models are often used to compare groups of functional data. Similar to the traditional ANOVA model, a common follow-up procedure to the rejection of the functional ANOVA null hypothesis is to perform functional linear contrast tests to identify which groups have different mean functions. Most existing functional contrast tests assume independent functional observations within each group. In this article, we introduce a new functional linear contrast test procedure that accounts for possible time dependency among functional group members. The test statistic and its normalized version, based on the Karhunen–Loève decomposition of the covariance function and a weak convergence result of the error processes, follow respectively a mixture chi-squared and a chi-squared distribution. An extensive simulation study is conducted to compare the empirical performance of the existing and new contrast tests. We also present two applications of these contrast tests to a weather study and a battery-life study. We provide software implementation and example data in the Supplementary Material.
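The Karhunen–Loève decomposition underlying the test statistic can be approximated empirically when curves are observed on a common grid: eigendecompose the sample covariance matrix and project centred curves onto the leading eigenvectors. The sketch below is not the authors' test, and the simulated components are illustrative:

```python
import numpy as np

def kl_decompose(curves, n_comp=2):
    """Empirical Karhunen-Loeve decomposition of curves on a common grid:
    returns the mean curve, leading eigenvalues, discretised eigenfunctions,
    and per-curve component scores."""
    mean = curves.mean(axis=0)
    centred = curves - mean
    cov = centred.T @ centred / (curves.shape[0] - 1)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_comp]
    phi = vecs[:, order]          # orthonormal columns
    scores = centred @ phi
    return mean, vals[order], phi, scores

# simulate curves built from two known orthonormal components
rng = np.random.default_rng(0)
grid_len, n_curves = 50, 2000
phi_true, _ = np.linalg.qr(rng.normal(size=(grid_len, 2)))
xi = rng.normal(size=(n_curves, 2)) * np.array([2.0, 1.0])  # sds 2 and 1
curves = xi @ phi_true.T
mean, vals, phi, scores = kl_decompose(curves)
```

The recovered eigenvalues approximate the score variances (4 and 1 here), and the top eigenvector aligns with the true component up to sign; a contrast test then works with the distribution of such scores.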
{"title":"Contrast tests for groups of functional data","authors":"Quyen Do, Pang Du","doi":"10.1002/cjs.11794","DOIUrl":"10.1002/cjs.11794","url":null,"abstract":"<p>Functional analysis of variance (ANOVA) models are often used to compare groups of functional data. Similar to the traditional ANOVA model, a common follow-up procedure to the rejection of the functional ANOVA null hypothesis is to perform functional linear contrast tests to identify which groups have different mean functions. Most existing functional contrast tests assume independent functional observations within each group. In this article, we introduce a new functional linear contrast test procedure that accounts for possible time dependency among functional group members. The test statistic and its normalized version, based on the Karhunen–Loève decomposition of the covariance function and a weak convergence result of the error processes, follow respectively a mixture chi-squared and a chi-squared distribution. An extensive simulation study is conducted to compare the empirical performance of the existing and new contrast tests. We also present two applications of these contrast tests to a weather study and a battery-life study. We provide software implementation and example data in the Supplementary Material.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"52 3","pages":"713-733"},"PeriodicalIF":0.8,"publicationDate":"2023-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11794","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48159209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A reduced-rank mixed-effects model is developed for robust modelling of sparsely observed paired functional data. In this model, the curves for each functional variable are summarized using a few functional principal components, and the association of the two functional variables is modelled through the association of the principal component scores. A multivariate-scale mixture of normal distributions is used to model the principal component scores and the measurement errors in order to handle outlying observations and achieve robust inference. The mean functions and principal component functions are modelled using splines, and roughness penalties are applied to avoid overfitting. An EM algorithm is developed for computation of model fitting and prediction. A simulation study shows that the proposed method outperforms an existing method, which is not designed for robust estimation. The effectiveness of the proposed method is illustrated through an application of fitting multiband light curves of Type Ia supernovae.
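The role of a scale mixture of normals in robustifying inference can be seen in a one-dimensional caricature: EM for the location of a t distribution, whose E-step weights automatically downweight outlying observations. This is only a sketch of the mechanism with made-up data, not the paper's reduced-rank model:

```python
import numpy as np

def t_location(x, df=4.0, scale=1.0, n_iter=50):
    """EM for the location of a t model (a normal scale mixture):
    each E-step weight (df + 1)/(df + r^2) shrinks the influence
    of points with large standardised residuals r."""
    mu = float(np.median(x))
    for _ in range(n_iter):
        r2 = ((x - mu) / scale) ** 2
        w = (df + 1.0) / (df + r2)        # E-step: latent scale weights
        mu = float(np.sum(w * x) / np.sum(w))  # M-step: weighted mean
    return mu

rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=500)
x = np.concatenate([clean, np.full(20, 50.0)])   # 20 gross outliers
```

The sample mean is dragged toward the outliers while the EM estimate stays near zero; the paper applies the same weighting idea multivariately to principal component scores and measurement errors.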
{"title":"Robust joint modelling of sparsely observed paired functional data","authors":"Huiya Zhou, Xiaomeng Yan, Lan Zhou","doi":"10.1002/cjs.11796","DOIUrl":"10.1002/cjs.11796","url":null,"abstract":"<p>A reduced-rank mixed-effects model is developed for robust modelling of sparsely observed paired functional data. In this model, the curves for each functional variable are summarized using a few functional principal components, and the association of the two functional variables is modelled through the association of the principal component scores. A multivariate-scale mixture of normal distributions is used to model the principal component scores and the measurement errors in order to handle outlying observations and achieve robust inference. The mean functions and principal component functions are modelled using splines, and roughness penalties are applied to avoid overfitting. An EM algorithm is developed for computation of model fitting and prediction. A simulation study shows that the proposed method outperforms an existing method, which is not designed for robust estimation. The effectiveness of the proposed method is illustrated through an application of fitting multiband light curves of Type Ia supernovae.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"52 3","pages":"734-754"},"PeriodicalIF":0.8,"publicationDate":"2023-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11796","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45602827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We are delighted to present a special issue of The Canadian Journal of Statistics (CJS) in honour of Professor Nancy Reid. The articles in this collection have been contributed by a group of participants who attended a workshop entitled “Statistics at its Best” in Toronto on 5 May 2022. The workshop was organized by the Department of Statistical Sciences at the University of Toronto to celebrate Professor Reid’s 70th birthday. It highlighted her remarkable contributions to Statistical Science and her dedication to the profession, exemplified in research, leadership, service and education of the next generation of statisticians. Professor Reid’s impactful career has played a crucial role in fostering the growth of the Canadian statistical community. This workshop was part of a series of celebratory activities coordinated by the Statistical Society of Canada, marking the 50th anniversary of the statistical community in this country. This collection of articles encompasses a wide range of topics. First, the engaging dialogue A conversation with Nancy Reid by Craiu and Yi sheds light on Professor Reid’s intellectual journey and perspectives on statistical science and data science. In The inducement of population sparsity, Battey presents the pioneering work on parameter orthogonalization by Cox and Reid as an inducement of abstract population-level sparsity. The article focuses on three important examples related to sparsity-inducing parameterizations or data transformations: covariance models, nuisance parameter elimination and high-dimensional regression. Strategies for inducing sparsity vary depending on the context and may involve solving partial differential equations or specifying parameterized paths. Battey concludes by presenting some open problems. McCullagh then highlights, in A tale of two variances, the ambiguity and potential misinterpretation of the standard repeated-sampling concept of the variance in a finite-dimensional parametric model. 
He presents three operational interpretations, all numerically distinct and compatible with repeated sampling from a fixed parameter population. These interpretations help resolve contradictions between Fisherian variance and inverse-information variance. We next turn to hypothesis testing for parameters on the boundary of their domain. In Improved inference for a boundary parameter, Elkantassi, Bellio, Brazzale and Davison review theoretical work on the problem, including hard and soft boundaries, and iceberg estimators. They highlight the significant underestimation of the probability due to the limiting results, propose remedies based on the normal approximation for the profile score function, and outline the success of higher order approximations. Using these approaches, the authors develop an accurate test to assess the need for a spline component in a linear mixed model. In Sparse estimation within Pearson’s system, with an application to financial market risk, Carey, Genest and Ramsay tackle t
{"title":"Special issue in honour of Nancy Reid: Guest Editors' introduction","authors":"","doi":"10.1002/cjs.11792","DOIUrl":"https://doi.org/10.1002/cjs.11792","url":null,"abstract":"We are delighted to present a special issue of The Canadian Journal of Statistics (CJS) in honour of Professor Nancy Reid. The articles in this collection have been contributed by a group of participants who attended a workshop entitled “Statistics at its Best” in Toronto on 5 May 2022. The workshop was organized by the Department of Statistical Sciences at the University of Toronto to celebrate Professor Reid’s 70th birthday. It highlighted her remarkable contributions to Statistical Science and her dedication to the profession, exemplified in research, leadership, service and education of the next generation of statisticians. Professor Reid’s impactful career has played a crucial role in fostering the growth of the Canadian statistical community. This workshop was part of a series of celebratory activities coordinated by the Statistical Society of Canada, marking the 50th anniversary of the statistical community in this country. This collection of articles encompasses a wide range of topics. First, the engaging dialogue A conversation with Nancy Reid by Craiu and Yi sheds light on Professor Reid’s intellectual journey and perspectives on statistical science and data science. In The inducement of population sparsity, Battey presents the pioneering work on parameter orthogonalization by Cox and Reid as an inducement of abstract population-level sparsity. The article focuses on three important examples related to sparsity-inducing parameterizations or data transformations: covariance models, nuisance parameter elimination and high-dimensional regression. Strategies for inducing sparsity vary depending on the context and may involve solving partial differential equations or specifying parameterized paths. Battey concludes by presenting some open problems. 
McCullagh then highlights, in A tale of two variances, the ambiguity and potential misinterpretation of the standard repeated-sampling concept of the variance in a finite-dimensional parametric model. He presents three operational interpretations, all numerically distinct and compatible with repeated sampling from a fixed parameter population. These interpretations help resolve contradictions between Fisherian variance and inverse-information variance. We next turn to hypothesis testing for parameters on the boundary of their domain. In Improved inference for a boundary parameter, Elkantassi, Bellio, Brazzale and Davison review theoretical work on the problem, including hard and soft boundaries, and iceberg estimators. They highlight the significant underestimation of the probability due to the limiting results, propose remedies based on the normal approximation for the profile score function, and outline the success of higher order approximations. Using these approaches, the authors develop an accurate test to assess the need for a spline component in a linear mixed model. In Sparse estimation within Pearson’s system, with an application to financial market risk, Carey, Genest and Ramsay tackle t","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"51 1","pages":""},"PeriodicalIF":0.6,"publicationDate":"2023-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"51300145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Finite mixture models have been used for unsupervised learning for some time, and their use within the semisupervised paradigm is becoming more commonplace. Clickstream data are one of the various emerging data types that demand particular attention because there is a notable paucity of statistical learning approaches currently available. A mixture of first-order continuous-time Markov models is introduced for unsupervised and semisupervised learning of clickstream data. This approach assumes continuous time, which distinguishes it from existing mixture model-based approaches; practically, this allows account to be taken of the amount of time each user spends on each webpage. The approach is evaluated and compared with the discrete-time approach, using simulated and real data.
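Under a first-order continuous-time Markov model, a clickstream contributes a likelihood built from exponential sojourn times and jump rates; a mixture model would evaluate such a log-likelihood under each component's generator when computing EM responsibilities. A minimal sketch (the generator and stream below are made up):

```python
import numpy as np

def ctmc_loglik(states, dwell, Q):
    """Log-likelihood of one clickstream under a continuous-time Markov
    model with generator Q (rows sum to zero, off-diagonals >= 0).
    A sojourn of length t in page a followed by a jump to page b has
    density q_ab * exp(-q_a * t), i.e. contributes Q[a,a]*t + log(Q[a,b])."""
    ll = 0.0
    for i in range(len(states) - 1):
        a, b = states[i], states[i + 1]
        ll += Q[a, a] * dwell[i] + np.log(Q[a, b])
    return ll

Q = np.array([[-1.0, 1.0],
              [2.0, -2.0]])
# visit page 0 for 0.5 time units, page 1 for 0.25, then return to page 0
ll = ctmc_loglik([0, 1, 0], [0.5, 0.25], Q)
```

The dwell times enter the likelihood directly, which is precisely the information a discrete-time mixture of Markov chains discards.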
{"title":"Clustering and semi-supervised classification for clickstream data via mixture models","authors":"Michael P. B. Gallaugher, Paul D. McNicholas","doi":"10.1002/cjs.11795","DOIUrl":"10.1002/cjs.11795","url":null,"abstract":"<p>Finite mixture models have been used for unsupervised learning for some time, and their use within the semisupervised paradigm is becoming more commonplace. Clickstream data are one of the various emerging data types that demand particular attention because there is a notable paucity of statistical learning approaches currently available. A mixture of first-order continuous-time Markov models is introduced for unsupervised and semisupervised learning of clickstream data. This approach assumes continuous time, which distinguishes it from existing mixture model-based approaches; practically, this allows account to be taken of the amount of time each user spends on each webpage. The approach is evaluated and compared with the discrete-time approach, using simulated and real data.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"52 3","pages":"678-695"},"PeriodicalIF":0.8,"publicationDate":"2023-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49122235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Identifiability constraints are necessary for parameter estimation when fitting models with nonlinear covariate associations. The choice of constraint affects standard errors of the estimated curve. Centring constraints are often applied by default because they are thought to yield lowest standard errors out of any constraint, but this claim has not been investigated. We show that whether centring constraints are optimal depends on the response distribution and parameterization, and that for natural exponential family responses under the canonical parametrization, centring constraints are optimal only for Gaussian response.
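One standard way to impose the centring (sum-to-zero) constraint discussed here is to absorb it into the basis by a QR reparameterization. The sketch below assumes a generic basis matrix `B` and is not tied to any particular GAM software:

```python
import numpy as np

def centre_basis(B):
    """Impose the centring constraint sum_i f(x_i) = 0, i.e. 1'B beta = 0,
    by reparameterising: take a null-space basis Z of the constraint row
    (via a complete QR decomposition) and use the reduced basis B @ Z."""
    C = B.sum(axis=0, keepdims=True)              # 1 x k constraint row 1'B
    Qmat, _ = np.linalg.qr(C.T, mode="complete")  # k x k orthogonal matrix
    Z = Qmat[:, 1:]                               # k x (k-1), orthogonal to C
    return B @ Z                                  # every column sums to zero

rng = np.random.default_rng(0)
B = rng.normal(size=(40, 6))   # stand-in for a spline basis on 40 points
Bc = centre_basis(B)
```

Any coefficient vector in the reduced basis automatically satisfies the constraint, so the intercept and the smooth become separately identifiable; the article's point is that this default choice minimizes pointwise standard errors only in special cases.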
{"title":"Identifiability constraints in generalized additive models","authors":"Alex Stringer","doi":"10.1002/cjs.11786","DOIUrl":"10.1002/cjs.11786","url":null,"abstract":"<p>Identifiability constraints are necessary for parameter estimation when fitting models with nonlinear covariate associations. The choice of constraint affects the standard errors of the estimated curve. Centring constraints are often applied by default because they are thought to yield the lowest standard errors of any constraint, but this claim has not been investigated. We show that whether centring constraints are optimal depends on the response distribution and parameterization, and that for natural exponential family responses under the canonical parameterization, centring constraints are optimal only for Gaussian responses.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"52 2","pages":"461-476"},"PeriodicalIF":0.6,"publicationDate":"2023-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11786","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45183591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-dimensional model averaging for quantile regression","authors":"Jinhan Xie, Xianwen Ding, Bei Jiang, Xiaodong Yan, Linglong Kong","doi":"10.1002/cjs.11789","DOIUrl":"10.1002/cjs.11789","url":null,"abstract":"<p>This article considers robust prediction issues in ultrahigh-dimensional (UHD) datasets and proposes combining quantile regression with sequential model averaging to arrive at a quantile sequential model averaging (QSMA) procedure. The QSMA method is made computationally feasible by employing a sequential screening process and a Bayesian information criterion (BIC) model averaging method for UHD quantile regression, and it provides a more accurate and stable prediction of the conditional quantile of a response variable. Meanwhile, the proposed method is effective for prediction in UHD datasets and, thanks to the sequential technique, saves a great deal of computational cost. Under some suitable conditions, we show that the proposed QSMA method can mitigate overfitting and yield reliable predictions. Numerical studies, including extensive simulations and a real data example, are presented to confirm that the proposed method performs well.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"52 2","pages":"618-635"},"PeriodicalIF":0.6,"publicationDate":"2023-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11789","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48251981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}