How estimating nuisance parameters can reduce the variance (with consistent variance estimation).

Statistics in Medicine (IF 1.8; CAS Zone 4, Medicine; JCR Q3, Mathematical & Computational Biology). Pub Date: 2024-10-15. Epub Date: 2024-07-30. DOI: 10.1002/sim.10164

Judith J Lok

Pages 4456-4480. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11570876/pdf/


Abstract

We often estimate a parameter of interest $\psi$ when the identifying conditions involve a finite-dimensional nuisance parameter $\theta \in \mathbb{R}^{d}$. Examples from causal inference are inverse probability weighting, marginal structural models, and structural nested models, which all lead to unbiased estimating equations. This article presents a consistent sandwich estimator for the variance of estimators $\hat{\psi}$ that solve unbiased estimating equations involving $\theta$, where $\theta$ is itself estimated by solving unbiased estimating equations. It also presents four additional results for settings where $\hat{\theta}$ solves (partial) score equations and $\psi$ does not depend on $\theta$. These settings include many causal inference problems where $\theta$ describes the treatment probabilities, missing data problems where $\theta$ describes the missingness probabilities, and measurement error problems where $\theta$ describes the error distribution. The four additional results are: (1) Counter-intuitively, the asymptotic variance of $\hat{\psi}$ is typically smaller when $\theta$ is estimated than when the true $\theta$ is plugged in. (2) If the estimation of $\theta$ is ignored, the sandwich estimator for the variance of $\hat{\psi}$ is conservative. (3) A consistent sandwich estimator for the variance of $\hat{\psi}$ is provided. (4) If $\hat{\psi}$ with the true $\theta$ plugged in is efficient, the asymptotic variance of $\hat{\psi}$ does not depend on whether $\theta$ is estimated. To illustrate, we use observational data to calculate confidence intervals for (1) the effect of cazavi versus colistin on bacterial infections and (2) how the effect of antiretroviral treatment depends on its initiation time in HIV-infected patients.
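Result (1) can be made concrete with a small simulation. The sketch below is not taken from the article: the data-generating process (a binary covariate, treatment probabilities 0.3 and 0.7, a normal outcome with $\psi = E[Y^{1}] = 2.5$), the sample sizes, and all function names are assumptions chosen for illustration. It compares an inverse probability weighted estimator that uses the true treatment probabilities with one that uses the maximum likelihood estimate from a saturated treatment model, which solves the score equations for $\theta$.

```python
import random
import statistics


def simulate_once(rng, n, e_true):
    """Draw one sample and return (IPW with true e, IPW with estimated e)."""
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0          # binary covariate
        a = 1 if rng.random() < e_true[x] else 0    # treatment indicator
        y = 2.0 + x + rng.gauss(0.0, 1.0)           # outcome under treatment
        data.append((x, a, y))                      # y is only used when a == 1

    # MLE of the saturated treatment model: treated fraction within each stratum.
    e_hat = {}
    for x in (0, 1):
        n_x = sum(1 for xi, _, _ in data if xi == x)
        n_x_treated = sum(ai for xi, ai, _ in data if xi == x)
        e_hat[x] = n_x_treated / n_x if n_x else 0.0

    # IPW estimators of psi = E[Y^1]; untreated subjects contribute zero.
    psi_known = sum(y / e_true[x] for x, a, y in data if a == 1) / n
    psi_est = sum(y / e_hat[x] for x, a, y in data if a == 1) / n
    return psi_known, psi_est


def compare_variances(n=400, reps=500, seed=2024):
    """Monte Carlo variance and mean of both estimators over repeated samples."""
    rng = random.Random(seed)
    e_true = {0: 0.3, 1: 0.7}  # assumed true treatment probabilities
    known, est = [], []
    for _ in range(reps):
        k, e = simulate_once(rng, n, e_true)
        known.append(k)
        est.append(e)
    return (statistics.variance(known), statistics.variance(est),
            statistics.mean(known), statistics.mean(est))


var_known, var_est, mean_known, mean_est = compare_variances()
print(f"true theta:      mean={mean_known:.3f}  variance={var_known:.5f}")
print(f"estimated theta: mean={mean_est:.3f}  variance={var_est:.5f}")
```

In this toy setting both estimators center near 2.5, while the version with estimated treatment probabilities should show the smaller Monte Carlo variance. This is only an illustration of the direction of the effect under the stated assumptions, not a reproduction of the article's analyses or of its sandwich estimator.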


Source journal

Statistics in Medicine (Medicine: Public, Environmental & Occupational Health)

CiteScore: 3.40

Self-citation rate: 10.00%

Articles published: 334

Review time: 2-4 weeks

Journal description: The journal aims to influence practice in medicine and its associated sciences through the publication of papers on statistical and other quantitative methods. Papers will explain new methods and demonstrate their application, preferably through a substantive, real, motivating example or a comprehensive evaluation based on an illustrative example. Alternatively, papers will report on case-studies where creative use or technical generalizations of established methodology is directed towards a substantive application. Reviews of, and tutorials on, general topics relevant to the application of statistics to medicine will also be published. The main criteria for publication are appropriateness of the statistical methods to a particular medical problem and clarity of exposition. Papers with primarily mathematical content will be excluded. The journal aims to enhance communication between statisticians, clinicians and medical researchers.