Journal of Causal Inference: Latest Articles
On the bias of adjusting for a non-differentially mismeasured discrete confounder
IF 1.4 · Zone 4 (Medicine) · Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS · Pub Date: 2021-01-01 · DOI: 10.1515/jci-2021-0033
J. Peña, Sourabh Vivek Balgi, A. Sjölander, E. Gabriel
Abstract Biological and epidemiological phenomena are often measured with error or imperfectly captured in data. When the true state of this imperfect measure is a confounder of an outcome-exposure relationship of interest, it was previously widely believed that adjustment for the mismeasured observed variables provides a less biased estimate of the true average causal effect than not adjusting. However, this is not always the case and depends on both the nature of the measurement and confounding. We describe two sets of conditions under which adjusting for a non-differentially mismeasured proxy comes closer to the unidentifiable true average causal effect than the unadjusted or crude estimate. The first set of conditions applies when the exposure is discrete or continuous and the confounder is ordinal, and the expectation of the outcome is monotonic in the confounder for both treatment levels contrasted. The second set of conditions applies when the exposure and the confounder are categorical (nominal). In all settings, the mismeasurement must be non-differential, as differential mismeasurement, particularly with an unknown pattern, can cause unpredictable results.
Pages: 229-249
Citations: 4
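The setting of the abstract above (a confounder C observed only through a non-differentially mismeasured proxy W) can be illustrated with an exact calculation on a small binary model. This is a minimal sketch, not the authors' derivation: every probability and the outcome model `mean_y` are hypothetical values chosen for illustration, with the outcome mean monotone in C for both exposure levels.

```python
from itertools import product

# Hypothetical parameters, chosen only for illustration.
P_C1 = 0.5                        # P(C = 1), true binary confounder
P_A1_GIVEN_C = {0: 0.2, 1: 0.8}   # P(A = 1 | C), exposure depends on C
P_W1_GIVEN_C = {0: 0.2, 1: 0.8}   # P(W = 1 | C), proxy depends on C only (non-differential)


def mean_y(a, c):
    """E[Y | A=a, C=c]; monotone in c for both exposure levels."""
    return 0.1 + 0.2 * a + 0.4 * c


def joint(a, c, w):
    """P(A=a, C=c, W=w) under the assumed model."""
    pc = P_C1 if c else 1 - P_C1
    pa = P_A1_GIVEN_C[c] if a else 1 - P_A1_GIVEN_C[c]
    pw = P_W1_GIVEN_C[c] if w else 1 - P_W1_GIVEN_C[c]
    return pc * pa * pw


def cond_mean(a, w=None):
    """E[Y | A=a] (w is None) or E[Y | A=a, W=w], marginalising over C."""
    ws = (0, 1) if w is None else (w,)
    num = sum(joint(a, c, v) * mean_y(a, c) for c, v in product((0, 1), ws))
    den = sum(joint(a, c, v) for c, v in product((0, 1), ws))
    return num / den


def p_w(w):
    """Marginal P(W=w)."""
    return sum(joint(a, c, w) for a, c in product((0, 1), repeat=2))


# True average causal effect: standardise over the true confounder C.
true_ace = sum((P_C1 if c else 1 - P_C1) * (mean_y(1, c) - mean_y(0, c)) for c in (0, 1))
# Crude contrast: no adjustment at all.
crude = cond_mean(1) - cond_mean(0)
# Proxy-adjusted contrast: standardise over the mismeasured W instead of C.
adjusted = sum(p_w(w) * (cond_mean(1, w) - cond_mean(0, w)) for w in (0, 1))

print(f"true ACE = {true_ace:.4f}, adjusted for W = {adjusted:.4f}, crude = {crude:.4f}")
```

In this particular configuration the proxy-adjusted contrast falls between the true effect and the crude contrast; the abstract's point is that such an ordering requires conditions and is not guaranteed in general.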
Causal versions of maximum entropy and principle of insufficient reason
IF 1.4 · Zone 4 (Medicine) · Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS · Pub Date: 2021-01-01 · DOI: 10.1515/jci-2021-0022
D. Janzing
Abstract The principle of insufficient reason (PIR) assigns equal probabilities to each alternative of a random experiment whenever there is no reason to prefer one over the other. The maximum entropy principle (MaxEnt) generalizes PIR to the case where statistical information like expectations are given. It is known that both principles result in paradoxical probability updates for joint distributions of cause and effect. This is because constraints on the conditional $P(\mathrm{effect} \mid \mathrm{cause})$ result in changes of $P(\mathrm{cause})$ that assign higher probability to those values of the cause that offer more options for the effect, suggesting “intentional behavior.” Earlier work therefore suggested sequentially maximizing (conditional) entropy according to the causal order, but without further justification apart from plausibility on toy examples. We justify causal modifications of PIR and MaxEnt by separating constraints into restrictions for the cause and restrictions for the mechanism that generates the effect from the cause. We further sketch why causal PIR also entails “Information Geometric Causal Inference.” We briefly discuss problems of generalizing the causal version of MaxEnt to arbitrary causal DAGs.
Pages: 285-301
Citations: 6
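The paradox described in the Janzing abstract can be made concrete with a toy example. The sketch below is an illustration under assumed inputs, not a reproduction of the paper's construction: the dictionary `admissible_effects` is a hypothetical constraint stating which effect values each cause value admits. Applying PIR to the joint space shifts probability toward the cause with more effect options, while applying it in causal order (cause first, then effect given cause) does not.

```python
from fractions import Fraction

# Hypothetical toy model: which effect values each cause value admits.
admissible_effects = {"c0": ["e0"], "c1": ["e0", "e1", "e2"]}

# Plain PIR on the joint: every admissible (cause, effect) pair is equally likely.
pairs = [(c, e) for c, effects in admissible_effects.items() for e in effects]
joint_pir = {pair: Fraction(1, len(pairs)) for pair in pairs}
p_cause_joint = {
    c: sum(p for (cause, _), p in joint_pir.items() if cause == c)
    for c in admissible_effects
}

# Causal PIR: uniform over causes first, then uniform over effects given the cause.
p_cause_causal = {c: Fraction(1, len(admissible_effects)) for c in admissible_effects}
joint_causal = {
    (c, e): p_cause_causal[c] * Fraction(1, len(effects))
    for c, effects in admissible_effects.items()
    for e in effects
}

print("P(cause) under joint PIR: ", p_cause_joint)
print("P(cause) under causal PIR:", p_cause_causal)
```

Under the joint version the cause `c1` receives probability 3/4 simply because it allows three effect values; under the causal version both causes keep probability 1/2.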
Designing experiments informed by observational studies
IF 1.4 · Zone 4 (Medicine) · Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS · Pub Date: 2021-01-01 · DOI: 10.1515/jci-2021-0010
Evan T. R. Rosenman, A. Owen
Abstract The increasing availability of passively observed data has yielded a growing interest in “data fusion” methods, which involve merging data from observational and experimental sources to draw causal conclusions. Such methods often require a precarious tradeoff between the unknown bias in the observational dataset and the often-large variance in the experimental dataset. We propose an alternative approach, which avoids this tradeoff: rather than using observational data for inference, we use it to design a more efficient experiment. We consider the case of a stratified experiment with a binary outcome and suppose pilot estimates for the stratum potential outcome variances can be obtained from the observational study. We extend existing results to generate confidence sets for these variances, while accounting for the possibility of unmeasured confounding. Then, we pose the experimental design problem as a regret minimization problem subject to the constraints imposed by our confidence sets. We show that this problem can be converted into a concave maximization and solved using conventional methods. Finally, we demonstrate the practical utility of our methods using data from the Women’s Health Initiative.
Pages: 147-171
Citations: 11
Novel bounds for causal effects based on sensitivity parameters on the risk difference scale
IF 1.4 · Zone 4 (Medicine) · Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS · Pub Date: 2021-01-01 · DOI: 10.1515/jci-2021-0024
A. Sjölander, O. Hössjer
Abstract Unmeasured confounding is an important threat to the validity of observational studies. A common way to deal with unmeasured confounding is to compute bounds for the causal effect of interest, that is, a range of values that is guaranteed to include the true effect, given the observed data. Recently, bounds have been proposed that are based on sensitivity parameters, which quantify the degree of unmeasured confounding on the risk ratio scale. These bounds can be used to compute an E-value, that is, the degree of confounding required to explain away an observed association, on the risk ratio scale. We complement and extend this previous work by deriving analogous bounds, based on sensitivity parameters on the risk difference scale. We show that our bounds can also be used to compute an E-value, on the risk difference scale. We compare our novel bounds with previous bounds through a real data example and a simulation study.
Pages: 190-210
Citations: 4
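For orientation, the E-value on the risk ratio scale that the Sjölander and Hössjer abstract refers to as previous work has a well-known closed form, RR + sqrt(RR (RR - 1)) for an observed risk ratio RR of at least 1. The sketch below implements only that earlier ratio-scale formula; it is not the paper's new risk-difference bound, and the function name `e_value_risk_ratio` is a label introduced here for illustration.

```python
import math


def e_value_risk_ratio(rr: float) -> float:
    """E-value on the risk ratio scale for an observed risk ratio `rr`.

    Ratios below 1 are inverted first, so protective and harmful
    associations of the same strength give the same E-value.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


for observed in (1.0, 1.5, 2.0, 0.5):
    print(f"RR = {observed}: E-value = {e_value_risk_ratio(observed):.3f}")
```

An observed risk ratio of 1 gives an E-value of 1, meaning no unmeasured confounding is needed to explain a null association.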
Instrumental variable regression via kernel maximum moment loss
IF 1.4 · Zone 4 (Medicine) · Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS · Pub Date: 2020-10-15 · DOI: 10.1515/jci-2022-0073
Rui Zhang, M. Imaizumi, B. Schölkopf, Krikamol Muandet
Abstract We investigate a simple objective for nonlinear instrumental variable (IV) regression based on a kernelized conditional moment restriction known as a maximum moment restriction (MMR). The MMR objective is formulated by maximizing the interaction between the residual and the instruments belonging to a unit ball in a reproducing kernel Hilbert space. First, it allows us to simplify the IV regression as an empirical risk minimization problem, where the risk function depends on the reproducing kernel on the instrument and can be estimated by a U-statistic or V-statistic. Second, on the basis of this simplification, we are able to provide consistency and asymptotic normality results in both parametric and nonparametric settings. Finally, we provide easy-to-use IV regression algorithms with an efficient hyperparameter selection procedure. We demonstrate the effectiveness of our algorithms using experiments on both synthetic and real-world data.
Citations: 1
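The MMR abstract states that the risk depends on a kernel on the instrument and can be estimated by a U-statistic or V-statistic. A minimal sketch of what such estimates look like for scalar data follows. The Gaussian kernel, its bandwidth, the linear candidate model and the data values are all assumptions made here for illustration; the paper's own kernel choice and algorithms are not reproduced.

```python
import math


def rbf_kernel(z1: float, z2: float, bandwidth: float = 1.0) -> float:
    """Gaussian kernel on scalar instruments (an assumed choice of kernel)."""
    return math.exp(-((z1 - z2) ** 2) / (2.0 * bandwidth ** 2))


def mmr_v_statistic(residuals, instruments, kernel=rbf_kernel) -> float:
    """V-statistic: average of r_i * k(z_i, z_j) * r_j over all pairs, including i == j."""
    n = len(residuals)
    total = sum(
        residuals[i] * kernel(instruments[i], instruments[j]) * residuals[j]
        for i in range(n)
        for j in range(n)
    )
    return total / n ** 2


def mmr_u_statistic(residuals, instruments, kernel=rbf_kernel) -> float:
    """U-statistic: the same average restricted to pairs with i != j."""
    n = len(residuals)
    total = sum(
        residuals[i] * kernel(instruments[i], instruments[j]) * residuals[j]
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return total / (n * (n - 1))


# Hypothetical data and a hypothetical linear candidate model f(x) = theta * x.
x = [0.5, 1.0, 1.5, 2.0]
y = [1.1, 1.9, 3.2, 3.9]
z = [0.2, 0.9, 1.4, 2.1]
theta = 2.0
res = [yi - theta * xi for xi, yi in zip(x, y)]
print("V-statistic risk:", mmr_v_statistic(res, z))
print("U-statistic risk:", mmr_u_statistic(res, z))
```

Minimising either quantity over the model parameters is the empirical risk minimization view mentioned in the abstract; with a positive semi-definite kernel the V-statistic is never negative, whereas the U-statistic can be.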
Learning linear non-Gaussian graphical models with multidirected edges
IF 1.4 · Zone 4 (Medicine) · Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS · Pub Date: 2020-10-11 · DOI: 10.1515/jci-2020-0027
Yiheng Liu, Elina Robeva, Huanqing Wang
Abstract In this article, we propose a new method to learn the underlying acyclic mixed graph of a linear non-Gaussian structural equation model with given observational data. We build on an algorithm proposed by Wang and Drton, and we show that one can augment the hidden variable structure of the recovered model by learning multidirected edges rather than only directed and bidirected ones. Multidirected edges appear when more than two of the observed variables have a hidden common cause. We detect the presence of such hidden causes by looking at higher order cumulants and exploiting the multi-trek rule. Our method recovers the correct structure when the underlying graph is a bow-free acyclic mixed graph with potential multidirected edges.
Pages: 250-263
Citations: 0
Randomized graph cluster randomization
IF 1.4 · Zone 4 (Medicine) · Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS · Pub Date: 2020-09-04 · DOI: 10.1515/jci-2022-0014
J. Ugander, Hao Yin
Abstract The global average treatment effect (GATE) is a primary quantity of interest in the study of causal inference under network interference. With a correctly specified exposure model of the interference, the Horvitz–Thompson (HT) and Hájek estimators of the GATE are unbiased and consistent, respectively, yet known to exhibit extreme variance under many designs and in many settings of interest. With a fixed clustering of the interference graph, graph cluster randomization (GCR) designs have been shown to greatly reduce variance compared to node-level random assignment, but even so the variance is still often prohibitively large. In this work, we propose a randomized version of the GCR design, descriptively named randomized graph cluster randomization (RGCR), which uses a random clustering rather than a single fixed clustering. By considering an ensemble of many different clustering assignments, this design avoids a key problem with GCR where the network exposure probability of a given node can be exponentially small in a single clustering. We propose two inherently randomized graph decomposition algorithms for use with RGCR designs, randomized 3-net and 1-hop-max, adapted from the prior work on multiway graph cut problems and the probabilistic approximation of (graph) metrics. We also propose weighted extensions of these two algorithms with slight additional advantages. All these algorithms result in network exposure probabilities that can be estimated efficiently. We derive structure-dependent upper bounds on the variance of the HT estimator of the GATE, depending on the metric structure of the graph driving the interference. Where the best-known such upper bound for the HT estimator under a GCR design is exponential in the parameters of the metric structure, we give a comparable upper bound under RGCR that is instead polynomial in the same parameters. 
We provide extensive simulations comparing RGCR and GCR designs, observing substantial improvements in GATE estimation in a variety of settings.
Citations: 20
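The Horvitz-Thompson and Hájek estimators of the GATE named in the Ugander and Yin abstract can be sketched in a few lines once each node's exposure indicators and exposure probabilities are available. The sketch below assumes those probabilities are already known (in the paper they come from the clustering design) and uses made-up outcomes; the function and argument names are introduced here for illustration only.

```python
def horvitz_thompson_gate(y, full_treat, full_ctrl, p_treat, p_ctrl):
    """Horvitz-Thompson estimate of the global average treatment effect.

    y          : observed outcomes
    full_treat : 1 if the node is exposed to the full-treatment condition, else 0
    full_ctrl  : 1 if the node is exposed to the full-control condition, else 0
    p_treat    : probability of full-treatment exposure for each node
    p_ctrl     : probability of full-control exposure for each node
    """
    n = len(y)
    treated = sum(e * yi / p for e, yi, p in zip(full_treat, y, p_treat))
    control = sum(e * yi / p for e, yi, p in zip(full_ctrl, y, p_ctrl))
    return (treated - control) / n


def hajek_gate(y, full_treat, full_ctrl, p_treat, p_ctrl):
    """Hajek estimate: normalises by the sum of inverse-probability weights instead of n."""
    t_num = sum(e * yi / p for e, yi, p in zip(full_treat, y, p_treat))
    t_den = sum(e / p for e, p in zip(full_treat, p_treat))
    c_num = sum(e * yi / p for e, yi, p in zip(full_ctrl, y, p_ctrl))
    c_den = sum(e / p for e, p in zip(full_ctrl, p_ctrl))
    return t_num / t_den - c_num / c_den


# Hypothetical four-node example; the last node meets neither exposure condition.
y = [1.0, 2.0, 3.0, 4.0]
full_treat = [1, 0, 1, 0]
full_ctrl = [0, 1, 0, 0]
p_treat = [0.5, 0.5, 0.5, 0.5]
p_ctrl = [0.5, 0.5, 0.5, 0.5]
print("HT   :", horvitz_thompson_gate(y, full_treat, full_ctrl, p_treat, p_ctrl))
print("Hajek:", hajek_gate(y, full_treat, full_ctrl, p_treat, p_ctrl))
```

Because each outcome is divided by its exposure probability, a node whose probability is exponentially small can dominate the sum, which is the variance problem the abstract attributes to a single fixed clustering.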
Estimating causal effects with the neural autoregressive density estimator
IF 1.4 · Zone 4 (Medicine) · Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS · Pub Date: 2020-08-17 · DOI: 10.1515/jci-2020-0007
Sergio Garrido, S. Borysov, Jeppe Rich, F. Pereira
Abstract The estimation of causal effects is fundamental in situations where the underlying system will be subject to active interventions. Part of building a causal inference engine is defining how variables relate to each other, that is, defining the functional relationship between variables entailed by the graph conditional dependencies. In this article, we deviate from the common assumption of linear relationships in causal models by making use of neural autoregressive density estimators and use them to estimate causal effects within Pearl’s do-calculus framework. Using synthetic data, we show that the approach can retrieve causal effects from non-linear systems without explicitly modeling the interactions between the variables and include confidence bands using the non-parametric bootstrap. We also explore scenarios that deviate from the ideal causal effect estimation setting such as poor data support or unobserved confounders.
Pages: 211-228
Citations: 4
Conditional as-if analyses in randomized experiments
IF 1.4 · Zone 4 (Medicine) · Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS · Pub Date: 2020-08-03 · DOI: 10.1515/jci-2021-0012
Nicole E. Pashley, Guillaume W. Basse, Luke W. Miratrix
Abstract The injunction to “analyze the way you randomize” is well known to statisticians since Fisher advocated for randomization as the basis of inference. Yet even those convinced by the merits of randomization-based inference seldom follow this injunction to the letter. Bernoulli randomized experiments are often analyzed as completely randomized experiments, and completely randomized experiments are analyzed as if they had been stratified; more generally, it is not uncommon to analyze an experiment as if it had been randomized differently. This article examines the theoretical foundation behind this practice within a randomization-based framework. Specifically, we ask when it is legitimate to analyze an experiment randomized according to one design as if it had been randomized according to some other design. We show that a sufficient condition for this type of analysis to be valid is that the design used for analysis should be derived from the original design by an appropriate form of conditioning. We use our theory to justify certain existing methods, question others, and finally suggest new methodological insights such as conditioning on approximate covariate balance.
自从费雪主张将随机化作为推理的基础以来,统计学家就熟知“以随机化的方式进行分析”这条戒律。然而,即使是那些相信基于随机的推理的优点的人也很少严格遵守这一禁令。伯努利随机实验通常被分析为完全随机实验,完全随机实验被分析为分层;更一般地说,分析一个实验,就好像它是随机的一样,这并不罕见。本文在基于随机化的框架中研究了这一实践背后的理论基础。具体来说,我们要问的是,在什么情况下,根据一种随机设计来分析一个实验,就像它是根据另一种随机设计来分析一样,是合理的。我们证明,这种分析有效的充分条件是,用于分析的设计应通过适当形式的条件作用从原始设计中推导出来。我们用我们的理论来证明某些现有方法,质疑其他方法,并最终提出新的方法见解,如近似协变量平衡的条件。
Nicole E. Pashley, Guillaume W. Basse, Luke W. Miratrix. Conditional as-if analyses in randomized experiments. Journal of Causal Inference, pp. 264 - 284. Publication date: 2020-08-03. DOI: 10.1515/jci-2021-0012 (https://doi.org/10.1515/jci-2021-0012).
Citations: 8
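The abstract above mentions analyzing a Bernoulli randomized experiment as if it were completely randomized. A minimal sketch of why conditioning makes this possible, assuming a toy experiment with 4 units and treatment probability 1/3 (the function names `bernoulli_design` and `condition_on` are illustrative and do not come from the paper): conditioning the Bernoulli design on the observed number of treated units leaves every remaining assignment equally likely, which is exactly a completely randomized design.

```python
from fractions import Fraction
from itertools import product


def bernoulli_design(n, p):
    """Map each 0/1 assignment of n units to its probability when
    every unit is treated independently with probability p."""
    design = {}
    for z in product((0, 1), repeat=n):
        k = sum(z)  # number of treated units in this assignment
        design[z] = p**k * (1 - p) ** (n - k)
    return design


def condition_on(design, event):
    """Keep only assignments satisfying `event` and renormalize."""
    kept = {z: pr for z, pr in design.items() if event(z)}
    total = sum(kept.values())
    return {z: pr / total for z, pr in kept.items()}


# Hypothetical toy setup: 4 units, P(treated) = 1/3, 2 units observed treated.
n, p, n_treated = 4, Fraction(1, 3), 2
original = bernoulli_design(n, p)
conditional = condition_on(original, lambda z: sum(z) == n_treated)

# All assignments with exactly 2 treated units now share one probability.
for z, pr in sorted(conditional.items()):
    print(z, pr)
```

Exact fractions are used so that the equality of the conditional probabilities can be checked without floating-point tolerance. This illustrates only the simplest case; the paper's general sufficient condition is not reproduced here.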
Properties of restricted randomization with implications for experimental design
IF 1.4 Zone 4, Medicine Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date : 2020-06-26 DOI: 10.1515/jci-2021-0057
Mattias Nordin, M. Schultzberg
Abstract Recently, there has been increasing interest in heavily restricted randomization designs that enforce balance on observed covariates in randomized controlled trials. However, when restrictions are strict, there is a risk that the treatment effect estimator will have a very high mean squared error (MSE). In this article, we formalize this risk and propose a novel combinatorics-based approach to describe and address this issue. First, we validate our new approach by re-proving some known properties of complete randomization and restricted randomization. Second, we propose a novel diagnostic measure for restricted designs that uses only the information embedded in the combinatorics of the design. Third, we show that the variance of the MSE of the difference-in-means estimator in a randomized experiment is a linear function of this diagnostic measure. Finally, we identify situations in which restricted designs can lead to an increased risk of a high MSE and discuss how our diagnostic measure can be used to detect such designs. Our results have implications for any restricted randomization design and can be used to evaluate the trade-off between enforcing balance on observed covariates and avoiding overly restrictive designs.
Mattias Nordin, M. Schultzberg. Properties of restricted randomization with implications for experimental design. Journal of Causal Inference, pp. 227 - 245. Publication date: 2020-06-26. DOI: 10.1515/jci-2021-0057 (https://doi.org/10.1515/jci-2021-0057).
Citations: 6
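The trade-off described in the abstract above can be made concrete with a small enumeration. The following is a rough sketch under invented assumptions: six units, an integer covariate `x`, and made-up potential outcomes `y0` and `y1`. It compares the MSE of the difference-in-means estimator under complete randomization with its MSE under a restricted design that keeps only the most balanced assignments. It does not implement the paper's diagnostic measure, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def mean_diff(values_t, values_c, treated):
    """Mean of values_t over treated units minus mean of values_c over
    control units (integer inputs assumed, so results stay exact)."""
    control = [i for i in range(len(values_t)) if i not in treated]
    mean_t = Fraction(sum(values_t[i] for i in treated), len(treated))
    mean_c = Fraction(sum(values_c[i] for i in control), len(control))
    return mean_t - mean_c


def design_mse(assignments, y1, y0):
    """MSE of the difference-in-means estimator when each assignment
    in the design is equally likely."""
    tau = Fraction(sum(a - b for a, b in zip(y1, y0)), len(y1))
    errors = [(mean_diff(y1, y0, t) - tau) ** 2 for t in assignments]
    return sum(errors) / len(errors)


def restrict(assignments, x, keep):
    """Keep the `keep` assignments with the smallest covariate imbalance."""
    return sorted(assignments, key=lambda t: abs(mean_diff(x, x, t)))[:keep]


# Invented toy data: the outcome equals the covariate, constant effect of 1.
x = [1, 2, 3, 4, 5, 6]
y0 = list(x)
y1 = [v + 1 for v in y0]

complete = list(combinations(range(len(x)), 3))  # all 20 ways to treat 3 of 6
restricted = restrict(complete, x, keep=6)       # the 6 most balanced ones

print("complete  :", design_mse(complete, y1, y0))
print("restricted:", design_mse(restricted, y1, y0))
```

In this toy case the outcome is fully determined by the covariate, so restricting to balanced assignments lowers the MSE. Replacing `y0` with values unrelated to `x` is one way to explore the risk the abstract raises, since a design with very few admissible assignments can then perform poorly for some outcome vectors.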