Double/debiased machine learning for treatment and structural parameters

IF 2.9 | Zone 4 (Economics) | Q1 (ECONOMICS) | Econometrics Journal | Pub Date: 2017-06-24 | DOI: 10.1111/ectj.12097
Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, James Robins
Citations: 1512

Abstract

We revisit the classic semi-parametric problem of inference on a low-dimensional parameter θ0 in the presence of high-dimensional nuisance parameters η0. We depart from the classical setting by allowing for η0 to be so high-dimensional that the traditional assumptions (e.g. Donsker properties) that limit complexity of the parameter space for this object break down. To estimate η0, we consider the use of statistical or machine learning (ML) methods, which are particularly well suited to estimation in modern, very high-dimensional cases. ML methods perform well by employing regularization to reduce variance and trading off regularization bias with overfitting in practice. However, both regularization bias and overfitting in estimating η0 cause a heavy bias in estimators of θ0 that are obtained by naively plugging ML estimators of η0 into estimating equations for θ0. This bias results in the naive estimator failing to be N^{-1/2}-consistent, where N is the sample size. We show that the impact of regularization bias and overfitting on estimation of the parameter of interest θ0 can be removed by using two simple, yet critical, ingredients: (1) using Neyman-orthogonal moments/scores that have reduced sensitivity with respect to nuisance parameters to estimate θ0; (2) making use of cross-fitting, which provides an efficient form of data-splitting. We call the resulting set of methods double or debiased ML (DML). We verify that DML delivers point estimators that concentrate in an N^{-1/2}-neighbourhood of the true parameter values and are approximately unbiased and normally distributed, which allows construction of valid confidence statements. The generic statistical theory of DML is elementary and simultaneously relies on only weak theoretical requirements, which will admit the use of a broad array of modern ML methods for estimating the nuisance parameters, such as random forests, lasso, ridge, deep neural nets, boosted trees, and various hybrids and ensembles of these methods.
We illustrate the general theory by applying it to provide theoretical properties of the following: DML applied to learn the main regression parameter in a partially linear regression model; DML applied to learn the coefficient on an endogenous variable in a partially linear instrumental variables model; DML applied to learn the average treatment effect and the average treatment effect on the treated under unconfoundedness; DML applied to learn the local average treatment effect in an instrumental variables setting. In addition to these theoretical applications, we also illustrate the use of DML in three empirical examples.
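
The two ingredients named in the abstract can be sketched for the partially linear regression model Y = θ0·D + g0(X) + U, D = m0(X) + V. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it uses the Robinson-style partialling-out score ψ = (Y − l(X) − θ(D − m(X)))·(D − m(X)) with l(X) = E[Y|X], a scalar covariate X on [0, 1), and a simple binned-mean regressor standing in for the ML learner. The names `fit_binned_mean`, `dml_plr`, and `simulate`, and the simulated design itself, are hypothetical and chosen only for this example.

```python
import math
import random


def fit_binned_mean(xs, ys, n_bins=20):
    """Stand-in nuisance learner (assumption): piecewise-constant
    estimate of E[y | x] for scalar x in [0, 1)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    overall = sum(ys) / len(ys)  # fallback for empty bins
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[min(int(x * n_bins), n_bins - 1)]


def dml_plr(y, d, x, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta and its standard error."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    y_res = [0.0] * n
    d_res = [0.0] * n
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Nuisance functions are fitted on the other folds only (cross-fitting).
        l_hat = fit_binned_mean([x[i] for i in train], [y[i] for i in train])
        m_hat = fit_binned_mean([x[i] for i in train], [d[i] for i in train])
        for i in fold:
            y_res[i] = y[i] - l_hat(x[i])
            d_res[i] = d[i] - m_hat(x[i])
    # Solve the empirical orthogonal moment condition for theta.
    theta = sum(v * u for v, u in zip(d_res, y_res)) / sum(v * v for v in d_res)
    # Plug-in sandwich variance for the same score.
    jac = sum(v * v for v in d_res) / n
    psi_sq = sum((v * (u - theta * v)) ** 2 for v, u in zip(d_res, y_res)) / n
    se = math.sqrt(psi_sq / (jac ** 2) / n)
    return theta, se


def simulate(n, theta0, seed=0):
    """Hypothetical data-generating process used only for this demo."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    d = [math.sin(2 * math.pi * xi) + rng.gauss(0, 1) for xi in x]
    y = [theta0 * di + math.cos(2 * math.pi * xi) + xi ** 2 + rng.gauss(0, 1)
         for di, xi in zip(d, x)]
    return y, d, x


if __name__ == "__main__":
    y, d, x = simulate(2000, theta0=0.5, seed=1)
    theta_hat, se = dml_plr(y, d, x)
    print(f"theta_hat = {theta_hat:.3f}, "
          f"95% CI = [{theta_hat - 1.96 * se:.3f}, {theta_hat + 1.96 * se:.3f}]")
```

In practice the binned-mean learner would be replaced by one of the methods the abstract lists (lasso, random forests, boosted trees, and so on); the cross-fitting loop and the final moment equation would stay the same.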

Source journal: Econometrics Journal
Subject category: Management Science; Mathematics, Interdisciplinary Applications
CiteScore: 4.20
Self-citation rate: 5.30%
Articles published: 25
Review time: >12 weeks
About the journal: The Econometrics Journal was established in 1998 by the Royal Economic Society with the aim of creating a top international field journal for the publication of econometric research with a standard of intellectual rigour and academic standing similar to those of the pre-existing top field journals in econometrics. The Econometrics Journal is committed to publishing first-class papers in macro-, micro- and financial econometrics. It is a general journal for econometric research open to all areas of econometrics, whether applied, computational, methodological or theoretical contributions.