Discovering Causal Models with Optimization: Confounders, Cycles, and Feature Selection

F. Eberhardt, Nur Kaynar, Auyon Siddiq
Journal: Econometrics: Applied Econometrics & Modeling eJournal, volume 4 1
Published: 2021-06-24
DOI: 10.2139/ssrn.3873034 (https://doi.org/10.2139/ssrn.3873034)
Citations: 1

Abstract

We propose a new method for learning causal structures from observational data, a process known as causal discovery. Our method takes as input observational data over a set of variables and returns a graph in which causal relations are specified by directed edges. We consider a highly general search space that accommodates latent confounders and feedback cycles, which few extant methods do. We formulate the discovery problem as an integer program, and propose a solution technique that leverages the conditional independence structure in the data to identify promising edges for inclusion in the output graph. In the large-sample limit, our method recovers a graph that is equivalent to the true data-generating graph. Computationally, our method is competitive with the state-of-the-art, and can solve in minutes instances that are intractable for alternative causal discovery methods. We demonstrate our approach by showing how it can be used to examine the validity of instrumental variables, which are widely used for causal inference. In particular, we analyze US Census data from the seminal paper on the returns to education by Angrist and Krueger (1991), and find that the causal structures uncovered by our method are consistent with the literature.
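The abstract refers to the conditional independence structure in the data without saying how it is measured. As a minimal sketch, and not the authors' procedure, the block below tests one conditional independence relation with a partial correlation and a Fisher z-test, assuming linear relationships with Gaussian noise. The chain X -> Z -> Y, the sample size, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given a single variable z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def fisher_z_pvalue(r, n, num_conditioned):
    """Two-sided p-value for H0: the (partial) correlation is zero."""
    z = math.atanh(r) * math.sqrt(n - num_conditioned - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_chain(n, seed=0):
    """Draw n samples from the hypothetical linear chain X -> Z -> Y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [xi + rng.gauss(0, 1) for xi in x]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return x, y, z


x, y, z = simulate_chain(5000)
# Marginal test: X and Y are dependent through Z.
p_marginal = fisher_z_pvalue(pearson(x, y), len(x), 0)
# Conditional test: conditioning on Z should remove the dependence.
p_given_z = fisher_z_pvalue(partial_corr(x, y, z), len(x), 1)
print(f"X vs Y:         p = {p_marginal:.3g}")
print(f"X vs Y given Z: p = {p_given_z:.3g}")
```

In a chain like this one, the marginal test should reject independence of X and Y, while the partial correlation given Z should be close to zero. A discovery method can treat patterns of this kind as evidence about which edges the output graph should contain; how the paper's integer program uses them is not specified in the abstract.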