{"title":"随机图聚类随机化","authors":"J. Ugander, Hao Yin","doi":"10.1515/jci-2022-0014","DOIUrl":null,"url":null,"abstract":"Abstract The global average treatment effect (GATE) is a primary quantity of interest in the study of causal inference under network interference. With a correctly specified exposure model of the interference, the Horvitz–Thompson (HT) and Hájek estimators of the GATE are unbiased and consistent, respectively, yet known to exhibit extreme variance under many designs and in many settings of interest. With a fixed clustering of the interference graph, graph cluster randomization (GCR) designs have been shown to greatly reduce variance compared to node-level random assignment, but even so the variance is still often prohibitively large. In this work, we propose a randomized version of the GCR design, descriptively named randomized graph cluster randomization (RGCR), which uses a random clustering rather than a single fixed clustering. By considering an ensemble of many different clustering assignments, this design avoids a key problem with GCR where the network exposure probability of a given node can be exponentially small in a single clustering. We propose two inherently randomized graph decomposition algorithms for use with RGCR designs, randomized 3-net and 1-hop-max, adapted from the prior work on multiway graph cut problems and the probabilistic approximation of (graph) metrics. We also propose weighted extensions of these two algorithms with slight additional advantages. All these algorithms result in network exposure probabilities that can be estimated efficiently. We derive structure-dependent upper bounds on the variance of the HT estimator of the GATE, depending on the metric structure of the graph driving the interference. Where the best-known such upper bound for the HT estimator under a GCR design is exponential in the parameters of the metric structure, we give a comparable upper bound under RGCR that is instead polynomial in the same parameters. We provide extensive simulations comparing RGCR and GCR designs, observing substantial improvements in GATE estimation in a variety of settings.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"77 2-3 1","pages":""},"PeriodicalIF":1.7000,"publicationDate":"2020-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":"{\"title\":\"Randomized graph cluster randomization\",\"authors\":\"J. Ugander, Hao Yin\",\"doi\":\"10.1515/jci-2022-0014\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract The global average treatment effect (GATE) is a primary quantity of interest in the study of causal inference under network interference. With a correctly specified exposure model of the interference, the Horvitz–Thompson (HT) and Hájek estimators of the GATE are unbiased and consistent, respectively, yet known to exhibit extreme variance under many designs and in many settings of interest. With a fixed clustering of the interference graph, graph cluster randomization (GCR) designs have been shown to greatly reduce variance compared to node-level random assignment, but even so the variance is still often prohibitively large. In this work, we propose a randomized version of the GCR design, descriptively named randomized graph cluster randomization (RGCR), which uses a random clustering rather than a single fixed clustering. By considering an ensemble of many different clustering assignments, this design avoids a key problem with GCR where the network exposure probability of a given node can be exponentially small in a single clustering. We propose two inherently randomized graph decomposition algorithms for use with RGCR designs, randomized 3-net and 1-hop-max, adapted from the prior work on multiway graph cut problems and the probabilistic approximation of (graph) metrics. We also propose weighted extensions of these two algorithms with slight additional advantages. All these algorithms result in network exposure probabilities that can be estimated efficiently. We derive structure-dependent upper bounds on the variance of the HT estimator of the GATE, depending on the metric structure of the graph driving the interference. Where the best-known such upper bound for the HT estimator under a GCR design is exponential in the parameters of the metric structure, we give a comparable upper bound under RGCR that is instead polynomial in the same parameters. We provide extensive simulations comparing RGCR and GCR designs, observing substantial improvements in GATE estimation in a variety of settings.\",\"PeriodicalId\":48576,\"journal\":{\"name\":\"Journal of Causal Inference\",\"volume\":\"77 2-3 1\",\"pages\":\"\"},\"PeriodicalIF\":1.7000,\"publicationDate\":\"2020-09-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"20\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Causal Inference\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1515/jci-2022-0014\",\"RegionNum\":4,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Causal Inference","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1515/jci-2022-0014","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 20
摘要
摘要全局平均处理效应(global average treatment effect, GATE)是研究网络干扰下因果推理的一个重要参数。有了正确指定的干扰暴露模型,GATE的Horvitz-Thompson (HT)和Hájek估计器分别是无偏和一致的,但已知在许多设计和许多感兴趣的设置下表现出极端的差异。对于干扰图的固定聚类,与节点级随机分配相比,图簇随机化(GCR)设计已被证明可以大大减少方差,但即使如此,方差仍然经常大得令人难以置信。在这项工作中,我们提出了一个随机版本的GCR设计,描述性地命名为随机图聚类随机化(RGCR),它使用随机聚类而不是单个固定聚类。通过考虑许多不同聚类分配的集成,该设计避免了GCR的一个关键问题,即给定节点的网络暴露概率在单个聚类中可能呈指数级小。我们提出了两种用于RGCR设计的固有随机图分解算法,随机3-net和1-hop-max,它们改编自先前关于多路图切问题和(图)度量的概率逼近的工作。我们还提出了这两种算法的加权扩展,并增加了一些额外的优点。所有这些算法都可以有效地估计网络暴露概率。我们根据驱动干涉的图的度量结构,推导出GATE的HT估计量方差的结构相关的上界。其中最著名的在GCR设计下的HT估计量的上界在度量结构的参数中是指数的,我们给出了RGCR下的一个类似的上界,它在相同的参数中是多项式的。我们提供了比较RGCR和GCR设计的广泛模拟,观察到在各种设置下GATE估计的实质性改进。
Abstract The global average treatment effect (GATE) is a primary quantity of interest in the study of causal inference under network interference. With a correctly specified exposure model of the interference, the Horvitz–Thompson (HT) and Hájek estimators of the GATE are unbiased and consistent, respectively, yet known to exhibit extreme variance under many designs and in many settings of interest. With a fixed clustering of the interference graph, graph cluster randomization (GCR) designs have been shown to greatly reduce variance compared to node-level random assignment, but even so the variance is still often prohibitively large. In this work, we propose a randomized version of the GCR design, descriptively named randomized graph cluster randomization (RGCR), which uses a random clustering rather than a single fixed clustering. By considering an ensemble of many different clustering assignments, this design avoids a key problem with GCR where the network exposure probability of a given node can be exponentially small in a single clustering. We propose two inherently randomized graph decomposition algorithms for use with RGCR designs, randomized 3-net and 1-hop-max, adapted from the prior work on multiway graph cut problems and the probabilistic approximation of (graph) metrics. We also propose weighted extensions of these two algorithms with slight additional advantages. All these algorithms result in network exposure probabilities that can be estimated efficiently. We derive structure-dependent upper bounds on the variance of the HT estimator of the GATE, depending on the metric structure of the graph driving the interference. Where the best-known such upper bound for the HT estimator under a GCR design is exponential in the parameters of the metric structure, we give a comparable upper bound under RGCR that is instead polynomial in the same parameters. We provide extensive simulations comparing RGCR and GCR designs, observing substantial improvements in GATE estimation in a variety of settings.
期刊介绍:
Journal of Causal Inference (JCI) publishes papers on theoretical and applied causal research across the range of academic disciplines that use quantitative tools to study causality.