
Canadian Journal of Statistics-Revue Canadienne De Statistique: Latest Articles

A new copula regression model for hierarchical data
IF 0.8 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-08-30 | DOI: 10.1002/cjs.11830
Talagbe Gabin Akpo, Louis-Paul Rivest

This article proposes multivariate copula models for hierarchical data. They account for two types of correlation: one is between variables measured on the same unit, and the other is a correlation between units in the same cluster. This model is used to carry out copula regression for hierarchical data that gives cluster-specific prediction curves. In the simple case where a cluster contains two units and where two variables are measured on each one, the new model is constructed with a D-vine. The proposed copula density is expressed in terms of three copula families. When the copula families and the marginal distributions are normal, the model is equivalent to a normal linear mixed model with random cluster-specific intercepts. Methods to select the three copula families and to estimate their parameters are proposed. We perform Monte Carlo studies of the sampling properties of these estimators and of out-of-sample predictions. The new model is applied to a dataset on the marks of students in several schools.

Citations: 0
A framework for incorporating behavioural change into individual-level spatial epidemic models
IF 0.8 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-08-21 | DOI: 10.1002/cjs.11828
Madeline A. Ward, Rob Deardon, Lorna E. Deeth

Epidemic trajectories can be substantially impacted by people modifying their behaviours in response to changes in their perceived risk of spreading or contracting the disease. However, most infectious disease models assume a stable population behaviour. We present a flexible new class of models, called behavioural change individual-level models (BC-ILMs), that incorporate both individual-level covariate information and a data-driven behavioural change effect. Focusing on spatial BC-ILMs, we consider four “alarm” functions to model the effect of behavioural change as a function of infection prevalence over time. Through simulation studies, we find that if behavioural change is present, using an alarm function, even if specified incorrectly, will result in an improvement in posterior predictive performance over a model that assumes stable population behaviour. The methods are applied to data from the 2001 U.K. foot and mouth disease epidemic. The results show some evidence of a behavioural change effect, although it may not meaningfully impact model fit compared to a simpler spatial ILM in this dataset.

Citations: 0
Fast and scalable inference for spatial extreme value models
IF 0.8 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-08-21 | DOI: 10.1002/cjs.11829
Meixi Chen, Reza Ramezan, Martin Lysy

The generalized extreme value (GEV) distribution is a popular model for analyzing and forecasting extreme weather data. To increase prediction accuracy, spatial information is often pooled via a latent Gaussian process (GP) on the GEV parameters. Inference for GEV-GP models is typically carried out using Markov Chain Monte Carlo (MCMC) methods, or using approximate inference methods such as the integrated nested Laplace approximation (INLA). However, MCMC becomes prohibitively slow as the number of spatial locations increases, whereas INLA is applicable in practice only to a limited subset of GEV-GP models. In this article, we revisit the original Laplace approximation for fitting spatial GEV models. In combination with a popular sparsity-inducing spatial covariance approximation technique, we show through simulations that our approach accurately estimates the Bayesian predictive distribution of extreme weather events, is scalable to several thousand spatial locations, and is several orders of magnitude faster than MCMC. A case study in forecasting extreme snowfall across Canada is presented.
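For readers less familiar with the GEV distribution named in this abstract, the following is a minimal sketch of its distribution function and quantile function in the standard location/scale/shape parameterization. The function names `gev_cdf` and `gev_quantile` are illustrative and do not come from the article, and the sketch does not attempt the GEV-GP model or the Laplace approximation the authors describe.

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """CDF of the GEV distribution with location mu, scale sigma > 0, shape xi."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel limit as xi -> 0
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def gev_quantile(p, mu, sigma, xi):
    """Quantile function of the GEV distribution for p strictly in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    y = -math.log(p)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu + sigma * (y ** (-xi) - 1.0) / xi
```

Under these assumptions, an m-period return level (for example, a 100-year snowfall level when the data are annual maxima) would be `gev_quantile(1 - 1/m, mu, sigma, xi)`.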

Citations: 0
Debiased lasso after sample splitting for estimation and inference in high-dimensional generalized linear models
IF 0.8 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-08-21 | DOI: 10.1002/cjs.11827
Omar Vazquez, Bin Nan

We consider random sample splitting for estimation and inference in high-dimensional generalized linear models (GLMs), where we first apply the lasso to select a submodel using one subsample and then apply the debiased lasso to fit the selected model using the remaining subsample. We show that a sample splitting procedure based on the debiased lasso yields asymptotically normal estimates under mild conditions and that multiple splitting can address the loss of efficiency. Our simulation results indicate that using the debiased lasso instead of the standard maximum likelihood method in the estimation stage can vastly reduce the bias and variance of the resulting estimates. Furthermore, our multiple splitting debiased lasso method has better numerical performance than some existing methods for high-dimensional GLMs proposed in the recent literature. We illustrate the proposed multiple splitting method with an analysis of the smoking data of the Mid-South Tobacco Case–Control Study.

Citations: 0
Variable selection in modelling clustered data via within-cluster resampling
IF 0.8 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-08-01 | DOI: 10.1002/cjs.11824
Shangyuan Ye, Tingting Yu, Daniel A. Caroff, Susan S. Huang, Bo Zhang, Rui Wang

In many biomedical applications, there is a need to build risk-adjustment models based on clustered data. However, methods for variable selection that are applicable to clustered discrete data settings with a large number of candidate variables and potentially large cluster sizes are lacking. We develop a new variable selection approach that combines within-cluster resampling techniques with penalized likelihood methods to select variables for high-dimensional clustered data. We derive an upper bound on the expected number of falsely selected variables, demonstrate the oracle properties of the proposed method and evaluate the finite sample performance of the method through extensive simulations. We illustrate the proposed approach using a colon surgical site infection data set consisting of 39,468 individuals from 149 hospitals to build risk-adjustment models that account for both the main effects of various risk factors and their two-way interactions.
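The within-cluster resampling step mentioned in this abstract can be illustrated with a short sketch: each resampled data set keeps exactly one randomly chosen record per cluster, so its records can be treated as independent. The function name, the `cluster_of` callback, and the toy hospital labels are assumptions made for illustration. How the authors combine the resamples with penalized likelihood is not stated in the abstract and is not attempted here.

```python
import random
from collections import defaultdict

def within_cluster_resample(records, cluster_of, n_resamples, seed=0):
    """Draw n_resamples data sets, each with one randomly chosen record per cluster."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for rec in records:
        clusters[cluster_of(rec)].append(rec)
    # One draw per cluster yields a data set without within-cluster dependence
    return [
        [rng.choice(members) for members in clusters.values()]
        for _ in range(n_resamples)
    ]

# Toy usage: (hospital, patient id) pairs, clustered by hospital
toy = [("h1", 1), ("h1", 2), ("h2", 3), ("h3", 4), ("h3", 5), ("h3", 6)]
resamples = within_cluster_resample(toy, lambda rec: rec[0], n_resamples=5)
```

In a full analysis, a model would be fitted to each resampled data set and the results aggregated across resamples.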

Citations: 0
Joint analysis of longitudinal count and binary response data in the presence of outliers
IF 0.8 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-08-01 | DOI: 10.1002/cjs.11819
Sanjoy Sinha

In this article, we develop an innovative, robust method for jointly analyzing longitudinal count and binary responses. The method is useful for bounding the influence of potential outliers in the data when estimating the model parameters. We use a log-linear model for the count response and a logistic regression model for the binary response, where the two response processes are linked through a set of association parameters. The asymptotic properties of the robust estimators are briefly studied. The empirical properties of the estimators are studied based on simulations. The study shows that the proposed estimators are approximately unbiased and also efficient when fitting a joint model to data contaminated with outliers. We also apply the proposed method to some real longitudinal survey data obtained from a health study.

Citations: 0
Robust change point detection for high-dimensional linear models with tolerance for outliers and heavy tails
IF 0.8 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-08-01 | DOI: 10.1002/cjs.11826
Zhi Yang, Liwen Zhang, Siyu Sun, Bin Liu

This article focuses on detecting change points in high-dimensional linear regression models with piecewise constant regression coefficients, moving beyond the conventional reliance on strict Gaussian or sub-Gaussian noise assumptions. In the face of real-world complexities, where noise often deviates into uncertain or heavy-tailed distributions, we propose two tailored algorithms: a dynamic programming algorithm (DPA) for improved localization accuracy, and a binary segmentation algorithm (BSA) optimized for computational efficiency. These solutions are designed to be flexible, catering to increasing sample sizes and data dimensions, and offer a robust estimation of change points without requiring specific moments of the noise distribution. The efficacy of DPA and BSA is thoroughly evaluated through extensive simulation studies and application to real datasets, showing their competitive edge in adaptability and performance.
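As background for the binary segmentation idea named in this abstract, here is a sketch of the classical version for a mean shift in a univariate sequence. It is deliberately not the article's BSA, which targets high-dimensional regression coefficients under heavy-tailed noise. The squared-error criterion, the function names, and the `threshold` parameter are all assumptions chosen for illustration.

```python
def best_split(x, lo, hi):
    """Return (gain, k): the largest drop in squared error from splitting x[lo:hi] at k."""
    n = hi - lo
    total = sum(x[lo:hi])
    best_gain, best_k = 0.0, None
    left = 0.0
    for k in range(lo + 1, hi):
        left += x[k - 1]
        n_left = k - lo
        n_right = n - n_left
        # Drop in SSE equals n_left * n_right / n * (mean_left - mean_right)^2
        diff = left / n_left - (total - left) / n_right
        gain = n_left * n_right / n * diff * diff
        if gain > best_gain:
            best_gain, best_k = gain, k
    return best_gain, best_k

def binary_segmentation(x, threshold, lo=0, hi=None):
    """Split recursively while the best gain exceeds threshold; returns sorted change points."""
    if hi is None:
        hi = len(x)
    if hi - lo < 2:
        return []
    gain, k = best_split(x, lo, hi)
    if k is None or gain <= threshold:
        return []
    left_points = binary_segmentation(x, threshold, lo, k)
    right_points = binary_segmentation(x, threshold, k, hi)
    return left_points + [k] + right_points
```

Each returned index `k` marks the first observation of a new segment. A greedy search of this kind is cheap, which matches the abstract's description of BSA as the computationally efficient option, whereas a dynamic programming search trades extra computation for localization accuracy.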

Citations: 0
Bayesian jackknife empirical likelihood-based inference for missing data and causal inference
IF 0.8 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-08-01 | DOI: 10.1002/cjs.11825
Sixia Chen, Yuke Wang, Yichuan Zhao

Missing data reduce the representativeness of the sample and can lead to inference problems. In this article, we apply the Bayesian jackknife empirical likelihood (BJEL) method for inference on data that are missing at random, as well as for causal inference. The semiparametric fractional imputation estimator, propensity score-weighted estimator, and doubly robust estimator are used for constructing the jackknife pseudo values, which are needed for conducting BJEL-based inference with missing data. Existing methods, such as normal approximation and JEL, are compared with the BJEL approach in a simulation study. The proposed approach shows better performance in many scenarios in terms of credible intervals. Furthermore, we demonstrate the application of the proposed approach for causal inference problems in a study of risk factors for impaired kidney function.
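The jackknife pseudo-values mentioned in this abstract follow a standard construction: for an estimator T and n observations, the i-th pseudo-value is n·T(all data) − (n − 1)·T(data without observation i). The sketch below applies this to a generic estimator; the function name is illustrative. The imputation, propensity-score-weighted, and doubly robust estimators that the authors plug in are not reproduced here.

```python
def jackknife_pseudo_values(data, estimator):
    """Return the n pseudo-values n * T(data) - (n - 1) * T(data without item i)."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    full = estimator(data)
    return [
        n * full - (n - 1) * estimator(data[:i] + data[i + 1:])
        for i in range(n)
    ]

def sample_mean(values):
    return sum(values) / len(values)

# For the sample mean, each pseudo-value equals the corresponding observation
pseudo = jackknife_pseudo_values([1.0, 2.0, 4.0, 7.0], sample_mean)
```

Jackknife empirical likelihood then treats the pseudo-values as approximately independent observations and applies empirical likelihood to their mean.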

Citations: 0
Multiple change-point detection for regression curves
IF 0.8 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-07-25 | DOI: 10.1002/cjs.11816
Yunlong Wang

Nonparametric estimation of a regression curve becomes crucial when the underlying dependence structure between covariates and responses is not explicit. While existing literature has addressed single change-point estimation for regression curves, the problem of multiple change points remains unresolved. In an effort to bridge this gap, this article introduces a nonparametric estimator for multiple change points by minimizing a penalized weighted sum of squared residuals, presenting consistent results under mild conditions. Additionally, we propose a cross-validation-based procedure that possesses the advantage of being tuning-free. Our simulation results showcase the competitive performance of these new procedures when compared with state-of-the-art methods. As an illustration of their utility, we apply these procedures to a real dataset.

Citations: 0
Estimation of the additive hazards model based on case-cohort interval-censored data with dependent censoring
IF 0.8 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2024-07-12 | DOI: 10.1002/cjs.11818
Yuqing Ma, Peijie Wang, Yichen Lou, Jianguo Sun, Alzheimer's Disease Neuroimaging Initiative

The additive hazards model is one of the most commonly used models for regression analysis of failure time data, and many methods have been developed for its estimation. In this article, we consider the situation where one observes informatively interval-censored data arising from case-cohort studies where covariate information is collected only for a small subcohort of study subjects. By informative or dependent censoring, we mean that the failure time of interest and the censoring mechanism may be correlated. For estimation, we will develop a sieve inverse probability weighting estimation procedure with the use of Bernstein polynomials. The resulting estimators of regression parameters are shown to be consistent and asymptotically normal. An extensive simulation study is conducted and suggests that the proposed method works well in practical situations. An example is also provided.
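The Bernstein polynomials used in the sieve procedure can be sketched briefly. The k-th basis polynomial of degree n on [0, 1] is C(n, k)·t^k·(1 − t)^(n − k), and a sieve approximates an unknown function by a linear combination of these. The function names below are illustrative. Which function the authors approximate and which constraints they impose on the coefficients are not stated in the abstract.

```python
from math import comb

def bernstein_basis(k, n, t):
    """Value of the k-th Bernstein basis polynomial of degree n at t in [0, 1]."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return comb(n, k) * t ** k * (1.0 - t) ** (n - k)

def bernstein_poly(coefs, t):
    """Evaluate sum_k coefs[k] * B_{k,n}(t), where n = len(coefs) - 1."""
    n = len(coefs) - 1
    return sum(c * bernstein_basis(k, n, t) for k, c in enumerate(coefs))
```

Two standard properties make this basis convenient: the basis functions sum to one at every t, and nondecreasing coefficients give a nondecreasing polynomial, which is useful when the target is monotone, as a cumulative baseline hazard is.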

Citations: 0