Canadian Journal of Statistics-Revue Canadienne De Statistique: Latest Articles

Predicting rare events using training data from stratified sampling designs, with application to human-caused wildfire prediction
IF 1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-04-22 DOI: 10.1002/cjs.70008
Johanna de Haan-Ward, Douglas G. Woolford, Simon J. Bonner

Response-based sampling is often used in modelling rare events from large, imbalanced data for efficiency. When modelling the event with logistic regression, the sampling design may be adjusted for using sampling weights or an offset. We propose a stratified sampling design for modelling rare events with large data which improves on previous methods by providing unbiased estimates of the standard errors of the coefficients in a multiple logistic regression scenario. We use multiple intercepts to model the incidence in the sampled data, then adjust each intercept via a stratum-specific offset. Our simulations provide no evidence of bias in the estimated logistic regression coefficients or their standard errors. We apply this method to spatio-temporal, fine-scale human-caused fire occurrence modelling for a region in northwestern Ontario, Canada, illustrating how the stratified sampling approach results in more locally precise estimates of fire occurrence.
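
The offset adjustment mentioned in the abstract can be illustrated with a small arithmetic sketch. It assumes, purely for illustration, that every event is kept and that non-events in each stratum are kept with a known probability; the stratum names, rates, and helper functions below are hypothetical and are not taken from the paper. Under that assumption the log-odds in the subsample exceed the population log-odds by exactly -log(keep probability), so adding log(keep probability) to a stratum intercept maps it back to the population scale.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def expit(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))


def sampled_event_rate(pop_rate, keep_prob):
    """Event rate in the subsample when all events are kept and each
    non-event is kept with probability keep_prob (an assumed design)."""
    events = pop_rate
    controls = (1.0 - pop_rate) * keep_prob
    return events / (events + controls)


def population_intercept(sample_intercept, keep_prob):
    """Apply the stratum-specific offset log(keep_prob) to a sampled intercept."""
    return sample_intercept + math.log(keep_prob)


# Hypothetical strata: (population event rate, non-event keep probability).
strata = {"north": (0.002, 0.05), "south": (0.0005, 0.01)}

recovered = {}
for name, (pop_rate, keep_prob) in strata.items():
    # Intercept an intercept-only logistic model would estimate on the subsample.
    sample_intercept = logit(sampled_event_rate(pop_rate, keep_prob))
    # Offset-adjusted intercept, converted back to a probability.
    recovered[name] = expit(population_intercept(sample_intercept, keep_prob))
    print(name, sampled_event_rate(pop_rate, keep_prob), recovered[name])
```

Each stratum gets its own intercept and its own offset here, which mirrors the "multiple intercepts" idea only at the level of intercept-only arithmetic; the paper's treatment of covariates and standard errors is not reproduced.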

Citations: 0
Unified inference for longitudinal/functional data quantile dynamic additive models
IF 1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-04-11 DOI: 10.1002/cjs.70006
Qian Huang, Tao Li, Jinhong You, Liwen Zhang

We investigate the unified inference of a time-varying additive model under the quantile regression framework, considering both sparse and dense longitudinal or functional data. For convolution-type smoothed objective functions, we propose a two-step method for estimating both the trend and the component functions. Theoretical analysis shows that the two-step estimators share the same asymptotic distribution as the oracle estimators, while the convergence rates and limiting variance functions differ between sparse and dense situations. However, making a subjective choice between these two cases can lead to incorrect statistical inferences. To address this issue, we develop sandwich formulas for variance estimations. This allows us to establish a unified inference without the need to decide whether the data are sparse or dense. Via simulation studies, we assess the finite-sample performance of the proposed methods. Finally, analyses of two different types of real data illustrate our proposed methods.

Citations: 0
Efficient and model-agnostic parameter estimation under privacy-preserving post-randomization data
IF 1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-04-07 DOI: 10.1002/cjs.70003
Qinglong Tian, Jiwei Zhao

Balancing data privacy with public access is critical for sensitive datasets. However, even after de-identification, the data are still vulnerable to, for example, inference attacks (by matching some keywords with external datasets). Statistical disclosure control (SDC) methods offer additional protection, and the post-randomization method (PRAM) adds noise to data to achieve this goal. However, PRAM-perturbed data pose challenges for analysis, as directly using the perturbed data leads to biased parameter estimates. This article addresses parameter estimation when data are perturbed using PRAM for privacy. While existing methods suffer from limitations like being parameter-specific, model-dependent and lacking optimality guarantees, our proposed method overcomes these limitations. Our approach applies to general parameters defined through estimating equations and makes no assumptions about the underlying data model. Furthermore, we prove that the proposed estimator achieves the semiparametric efficiency bound, making it asymptotically optimal in terms of estimation efficiency.
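
Why naive use of PRAM-perturbed data is biased can be seen in a two-category sketch. The transition probabilities and the true proportions below are made-up values, and the correction shown is the classical moment-based inversion of the transition matrix, not the semiparametrically efficient estimator proposed in the article.

```python
def perturbed_distribution(true_probs, transition):
    """Distribution of released values: q_j = sum_i p_i * P[i][j],
    where P[i][j] is the probability that true category i is released as j."""
    k = len(true_probs)
    return [sum(true_probs[i] * transition[i][j] for i in range(k))
            for j in range(k)]


def correct_binary(observed_probs, transition):
    """Solve P^T p = q for a two-category variable via the 2x2 inverse."""
    (a, b), (c, d) = transition
    det = a * d - b * c
    q0, q1 = observed_probs
    p0 = (d * q0 - c * q1) / det
    p1 = (-b * q0 + a * q1) / det
    return [p0, p1]


# Hypothetical setting: 10% of records truly fall in category 1.
true_probs = [0.9, 0.1]
# Hypothetical PRAM matrix: rows are true categories, columns released ones.
transition = [[0.8, 0.2],
              [0.3, 0.7]]

observed = perturbed_distribution(true_probs, transition)
corrected = correct_binary(observed, transition)

print("naive estimate of category 1:", observed[1])
print("corrected estimate of category 1:", corrected[1])
```

The naive proportion is pulled away from the truth because category 0 is much larger and leaks into category 1; inverting the known transition matrix undoes this at the population level.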

Citations: 0
Correction to “Matching distributions for survival data”
IF 0.8 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-04-04 DOI: 10.1002/cjs.70007

Jiang, Q., Xia, Y., and Liang, B. (2022) Matching distributions for survival data. The Canadian Journal of Statistics, 50:751–775.

The name of the first author “Qiang JIANG” was incorrect. This should have been: “Qing JIANG”.

We apologize for this error.

Citations: 0
Optimal relevant subset designs in nonlinear models
IF 1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-04-04 DOI: 10.1002/cjs.70004
Adam Lane

It is well known that certain ancillary statistics form a relevant subset, a subset of the sample space on which inference should be restricted, and that conditioning on such ancillary statistics reduces the dimension of the data without a loss of information. The use of ancillary statistics in post-data inference has received significant attention; however, their role in the design of experiments has not been well characterized. Ancillary statistics are not known prior to data collection and as a result cannot be incorporated into the design a priori. Conversely, in sequential experiments the ancillary statistics based on the data from the preceding observations are known and can be used to determine the design assignment of the current observation. The main results of this work describe the benefits of incorporating ancillary statistics, specifically the ancillary statistic that constitutes a relevant subset, into adaptive designs.

Citations: 0
Reweighted penalized regression for convenience samples
IF 1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-04-03 DOI: 10.1002/cjs.70005
Zhuoran Zhang, Olivia Bernstein Morgan, Daniel L. Gillen, for the Alzheimer's Disease Neuroimaging Initiative

Modern epidemiological studies are often characterized by extensive data collection, which facilitates building high-dimensional predictive models. With large samples often conveniently sampled, weighted penalized regression models are commonly applied to provide improved prediction. In this article, we empirically show that weighted ridge regression models may yield suboptimal results because of the lack of flexibility in the penalty structure. We propose a generalized weighted ridge regression (GWRR) estimation procedure that allows for the adjustment of sampling weights in the penalty structure. We derive the asymptotic properties of the proposed GWRR estimator and provide a computationally efficient closed-form solution. We demonstrate the performance of the proposed GWRR estimator and justify the asymptotic variance via simulation studies. Finally, we illustrate the utility of our proposed estimator through an application to the prediction of mini-mental state examination (MMSE) scores.
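
For orientation, the baseline that the abstract contrasts against, ordinary weighted ridge regression, has a simple closed form. The sketch below covers only the one-predictor, no-intercept case with made-up data and weights; it does not implement GWRR, whose weight-adjusted penalty is not specified in the abstract.

```python
def weighted_ridge_slope(x, y, w, lam):
    """Minimise sum_i w_i * (y_i - beta * x_i)**2 + lam * beta**2.

    Setting the derivative to zero gives
    beta = sum(w * x * y) / (sum(w * x * x) + lam).
    """
    sxy = sum(wi * xi * yi for xi, yi, wi in zip(x, y, w))
    sxx = sum(wi * xi * xi for xi, wi in zip(x, w))
    return sxy / (sxx + lam)


# Hypothetical data lying exactly on y = 2x, with sampling weights w.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
w = [1.0, 1.0, 2.0]

unpenalised = weighted_ridge_slope(x, y, w, lam=0.0)
shrunk = weighted_ridge_slope(x, y, w, lam=23.0)

print("slope with lam = 0:", unpenalised)
print("slope with lam = 23:", shrunk)
```

Here the sampling weights enter only the loss and the penalty is a fixed multiple of beta squared; that separation is the inflexibility the abstract refers to.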

Citations: 0
Noisy matrix completion for longitudinal data with subject- and time-specific covariates
IF 1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-03-12 DOI: 10.1002/cjs.70002
Zhaohan Sun, Yeying Zhu, Joel A. Dubin

In this article, we consider the imputation of missing responses in a longitudinal dataset via matrix completion. We propose a fixed-effect, longitudinal, low-rank model that incorporates both subject-specific and time-specific covariates. To solve the optimization problem, a two-step optimization algorithm is proposed, which provides good statistical properties for the estimation of the fixed effects and the low-rank term. In a theoretical investigation, the non-asymptotic error bounds on the fixed effects and low-rank term are presented. We illustrate the finite-sample performance of the proposed algorithm via simulation studies, and apply our method to a power plant SO$_2$ emissions dataset in which the monthly recorded amounts of emissions data on monitors are subject to missingness.

Citations: 0
Sample empirical likelihood methods for causal inference
IF 1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-03-06 DOI: 10.1002/cjs.70000
Jingyue Huang, Changbao Wu, Leilei Zeng

Causal inference plays a crucial role in understanding the true impact of interventions, medical treatments, policies, or actions, enabling informed decision making and providing insights into the underlying mechanisms that shape our world. In this article, we establish a framework for the estimation of and inference concerning average treatment effects using a two-sample empirical likelihood function. Two different approaches to incorporating propensity scores are developed. The first approach introduces propensity-score-calibrated constraints in addition to the standard model-calibration constraints; the second approach uses the propensity scores to form weighted versions of the model-calibration constraints. The resulting estimators from both approaches are doubly robust. The limiting distributions of the two-sample empirical likelihood ratio statistics are derived, facilitating the construction of confidence intervals and hypothesis tests for the average treatment effect. Bootstrap methods for constructing sample empirical likelihood ratio confidence intervals are also discussed for both approaches. The finite-sample performance of each method is investigated via simulation studies.
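
The "doubly robust" property claimed in the abstract can be illustrated with the standard augmented inverse-probability-weighted (AIPW) estimator of the average treatment effect. This is a swapped-in textbook estimator, not the two-sample empirical likelihood method of the article, and the toy data, propensity functions, and outcome models below are all hypothetical.

```python
def aipw_ate(rows, propensity, outcome_model):
    """AIPW estimate of the average treatment effect.

    rows: iterable of (x, t, y) with binary treatment t.
    propensity(x): working model for P(T = 1 | x).
    outcome_model(x, t): working model for E[Y | x, T = t].
    """
    total = 0.0
    for x, t, y in rows:
        e = propensity(x)
        m1 = outcome_model(x, 1)
        m0 = outcome_model(x, 0)
        total += (m1 - m0) + t * (y - m1) / e - (1 - t) * (y - m0) / (1.0 - e)
    return total / len(rows)


# Toy data with Y = 1 + 2*T + x, so the true effect is 2. Within each x the
# treated fraction equals the true propensity (0.25 for x=0, 0.75 for x=1).
rows = ([(0, 1, 3.0)] + [(0, 0, 1.0)] * 3
        + [(1, 1, 4.0)] * 3 + [(1, 0, 2.0)])


def true_propensity(x):
    return 0.25 if x == 0 else 0.75


def wrong_propensity(x):
    return 0.5


def true_outcome(x, t):
    return 1.0 + 2.0 * t + x


def wrong_outcome(x, t):
    return 0.0


outcome_right = aipw_ate(rows, wrong_propensity, true_outcome)
propensity_right = aipw_ate(rows, true_propensity, wrong_outcome)
both_wrong = aipw_ate(rows, wrong_propensity, wrong_outcome)

print(outcome_right, propensity_right, both_wrong)
```

In this toy data set the estimate matches the true effect whenever at least one of the two working models is correct, and drifts away from it only when both are wrong.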

Citations: 0
Doubly robust criterion for causal inference
IF 1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-03-04 DOI: 10.1002/cjs.70001
Takamichi Baba, Yoshiyuki Ninomiya

In causal inference, semiparametric estimation using propensity scores has rapidly developed in various directions. At the same time, although model selection is indispensable in statistical analysis, an information criterion for selecting the regression structure between the potential outcome and explanatory variables has not been well developed. Here, based on the original definition of AIC, we derive an AIC-type criterion for propensity score analysis. A risk based on the Kullback–Leibler divergence is defined as the cornerstone, and general causal inference models and general causal effects are treated. Considering the high importance of doubly robust estimation, we make the information criterion itself doubly robust so that it is an asymptotically unbiased estimator of the risk even under some model misspecification. In simulation studies, we compare the derived criterion with an existing weighted quasi-likelihood information criterion and confirm that the former outperforms the latter. Real data analyses indicate that results using the two criteria can differ significantly.

Citations: 0
Acknowledgement of Referees' Services Remerciements aux membres des jurys
IF 0.8 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-02-26 DOI: 10.1002/cjs.11840
Citations: 0