Calibration of Heterogeneous Treatment Effects in Randomized Experiments

Information Systems Research (IF 5.0; JCR Q1, Information Science & Library Science; Region 3, Management). Pub Date: 2024-01-12. DOI: 10.1287/isre.2021.0343
Yan Leng, Drew Dimmery

Abstract

Machine learning is commonly used to estimate the heterogeneous treatment effects (HTEs) in randomized experiments. Using large-scale randomized experiments on Facebook and Criteo platforms, we observe substantial discrepancies between machine learning-based treatment effect estimates and difference-in-means estimates directly from the randomized experiment. This paper provides a two-step framework for practitioners and researchers to diagnose and rectify this discrepancy. We first introduce a diagnostic tool to assess whether bias exists in the model-based estimates from machine learning. If bias exists, we then offer a model-agnostic method to calibrate any HTE estimates to known, unbiased, subgroup difference-in-means estimates, ensuring that the sign and magnitude of the subgroup estimates approximate the model-free benchmarks. This calibration method requires no additional data and can be scaled for large data sets. To highlight potential sources of bias, we theoretically show that this bias can result from regularization, and further use synthetic simulation to show biases result from misspecification and high-dimensional features. We demonstrate the efficacy of our calibration method using extensive synthetic simulations and two real-world randomized experiments. We further demonstrate the practical value of this calibration in three typical policy-making settings: a prescriptive, budget-constrained optimization framework; a setting seeking to maximize multiple performance indicators; and a multitreatment uplift modeling setting.
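The two steps described in the abstract (diagnose, then calibrate to subgroup difference-in-means) can be illustrated with a small simulation. The sketch below is not the authors' procedure: the data-generating process, the quantile-bin subgroups, the deliberately shrunken "model" estimates, and the linear recalibration are all assumptions chosen to make the idea concrete, and the function names `subgroup_table` and `fit_linear_calibration` are hypothetical.

```python
import numpy as np


def subgroup_table(y, w, tau_hat, n_bins=5):
    """Return one row per subgroup: (mean model estimate, difference-in-means, size).

    Subgroups are quantile bins of the model estimates (an illustrative choice).
    """
    edges = np.quantile(tau_hat, np.linspace(0, 1, n_bins + 1))
    bins = np.clip(np.searchsorted(edges, tau_hat, side="right") - 1, 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        m = bins == b
        # Model-free benchmark: treated mean minus control mean inside the bin.
        dim = y[m & (w == 1)].mean() - y[m & (w == 0)].mean()
        rows.append((tau_hat[m].mean(), dim, m.sum()))
    return np.array(rows)


def fit_linear_calibration(table):
    """Size-weighted least squares of subgroup difference-in-means on mean estimate."""
    mean_hat, dim, size = table.T
    # np.polyfit weights multiply residuals, so sqrt(size) weights squared errors by size.
    slope, intercept = np.polyfit(mean_hat, dim, 1, w=np.sqrt(size))
    return slope, intercept


# Simulated randomized experiment (all values are assumptions for illustration).
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(0, 1, size=n)
w = rng.integers(0, 2, size=n)          # random assignment with probability 0.5
tau = 1.0 + 2.0 * x                     # true heterogeneous effect
y = x + w * tau + rng.normal(0, 1, size=n)

# Stand-in for a regularized model: estimates shrunk toward zero.
tau_hat = 0.5 * tau

# Step 1 (diagnose): compare model estimates with subgroup difference-in-means.
table = subgroup_table(y, w, tau_hat)
gap_before = np.abs(table[:, 0] - table[:, 1]).mean()

# Step 2 (calibrate): rescale estimates so subgroup averages track the benchmark.
slope, intercept = fit_linear_calibration(table)
tau_cal = intercept + slope * tau_hat
table_cal = subgroup_table(y, w, tau_cal)
gap_after = np.abs(table_cal[:, 0] - table_cal[:, 1]).mean()

print(f"mean |model - benchmark| before calibration: {gap_before:.3f}")
print(f"mean |model - benchmark| after calibration:  {gap_after:.3f}")
print(f"fitted slope: {slope:.2f}, intercept: {intercept:.2f}")
```

Because the simulated estimates are exactly half the true effects, the fitted slope should come out near 2 and the subgroup gap should shrink after calibration. Like the method in the abstract, this sketch uses no data beyond the experiment itself, but it fits and evaluates the calibration on the same sample, so it says nothing about out-of-sample behavior.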
Source journal
CiteScore: 9.10
Self-citation rate: 8.20%
Articles published: 120
Journal description: ISR (Information Systems Research) is a journal of INFORMS, the Institute for Operations Research and the Management Sciences. Information Systems Research is a leading international journal of theory, research, and intellectual development, focused on information systems in organizations, institutions, the economy, and society.