Discovering Causal Models with Optimization: Confounders, Cycles, and Feature Selection
F. Eberhardt, Nur Kaynar, Auyon Siddiq
Econometrics: Applied Econometrics & Modeling eJournal, published 2021-06-24
DOI: 10.2139/ssrn.3873034 (https://doi.org/10.2139/ssrn.3873034)
Citations: 1
Abstract
We propose a new method for learning causal structures from observational data, a process known as causal discovery. Our method takes as input observational data over a set of variables and returns a graph in which causal relations are specified by directed edges. We consider a highly general search space that accommodates latent confounders and feedback cycles, which few extant methods do. We formulate the discovery problem as an integer program, and propose a solution technique that leverages the conditional independence structure in the data to identify promising edges for inclusion in the output graph. In the large-sample limit, our method recovers a graph that is equivalent to the true data-generating graph. Computationally, our method is competitive with the state-of-the-art, and can solve in minutes instances that are intractable for alternative causal discovery methods. We demonstrate our approach by showing how it can be used to examine the validity of instrumental variables, which are widely used for causal inference. In particular, we analyze US Census data from the seminal paper on the returns to education by Angrist and Krueger (1991), and find that the causal structures uncovered by our method are consistent with the literature.
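To make the phrase "conditional independence structure in the data" concrete, here is a minimal sketch of one common way to probe conditional independence in linear-Gaussian data: partial correlation. It is not the authors' integer-programming method. The function names, the fixed `threshold`, and the simulated chain X -> Y -> Z are assumptions made for illustration only.

```python
import numpy as np

def partial_corr(data, i, j, cond=()):
    """Partial correlation of columns i and j given the columns in cond."""
    idx = [i, j, *cond]
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.inv(corr)  # precision matrix of the selected variables
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])

def is_independent(data, i, j, cond=(), threshold=0.05):
    """Hypothetical decision rule: a small |partial correlation| is read as independence.

    A real analysis would use a proper statistical test instead of a fixed cutoff.
    """
    return abs(partial_corr(data, i, j, cond)) < threshold

# Simulated causal chain X -> Y -> Z (illustrative data, not from the paper).
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)
z = 0.8 * y + rng.normal(size=n)
data = np.column_stack([x, y, z])

marginal = is_independent(data, 0, 2)                # X vs Z, no conditioning
conditional = is_independent(data, 0, 2, cond=(1,))  # X vs Z given Y
print("X independent of Z:", marginal)
print("X independent of Z given Y:", conditional)
```

In this simulated chain, X and Z are marginally dependent but become independent once Y is conditioned on. Patterns of this kind are what a discovery method can use to decide which edges are plausible candidates for the output graph.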