Heteroscedasticity-aware stratified sampling to improve uplift modeling
Authors: Björn Bokelmann, Stefan Lessmann
Journal: European Journal of Operational Research (JCR Q1, Operations Research & Management Science; impact factor 6.0)
DOI: 10.1016/j.ejor.2025.02.030 (https://doi.org/10.1016/j.ejor.2025.02.030)
Publication date: 2025-03-10
Publication type: Journal Article
Citations: 0
Abstract
Randomized controlled trials (RCTs) are conducted in many business applications, such as online marketing and customer churn prevention, to investigate the effect of specific treatments (coupons, retention offers, mailings, etc.). RCTs allow for the estimation of average treatment effects (ATEs) and the training of uplift models that capture the heterogeneity of treatment effects across individuals. The problem with RCTs is that they are costly, and this cost increases with the number of individuals included. These costs have inspired research on how to conduct experiments with a small number of individuals while still obtaining precise treatment effect estimates. We contribute to this literature a heteroskedasticity-aware stratified sampling (HS) scheme. We leverage the fact that different individuals have different noise levels in their outcomes and that precise treatment effect estimation requires more observations from the “high-noise” individuals than from the “low-noise” individuals. We show theoretically and empirically that, compared to RCT data sampled completely at random, HS sampling yields significantly more precise estimates of the ATE, improves uplift models, and makes their evaluation more reliable. Due to these benefits and the simplicity of our approach, we expect HS sampling to be valuable in many real-world applications in business and beyond.
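The intuition that high-noise individuals deserve more observations resembles classical Neyman allocation from survey sampling. The sketch below is not the authors' HS scheme. It is a minimal illustration under strong assumptions: the strata shares and noise levels are known in advance, each stratum is split 50/50 between treatment and control, and both arms share the same noise level. All function names and the example numbers are hypothetical.

```python
from typing import List


def proportional_allocation(shares: List[float], n: int) -> List[float]:
    """Allocate n observations to strata in proportion to population share."""
    return [n * w for w in shares]


def neyman_allocation(shares: List[float], sigmas: List[float], n: int) -> List[float]:
    """Allocate n observations in proportion to share * noise level (Neyman allocation).

    Allocations are left fractional; a real design would round them.
    """
    weights = [w * s for w, s in zip(shares, sigmas)]
    total = sum(weights)
    return [n * w / total for w in weights]


def ate_variance(shares: List[float], sigmas: List[float], alloc: List[float]) -> float:
    """Variance of the stratified difference-in-means ATE estimator.

    Assumes each stratum's n_k observations are split evenly between arms,
    so each arm mean has variance sigma_k^2 / (n_k / 2), and the stratum-level
    difference has variance 4 * sigma_k^2 / n_k.
    """
    return sum(w ** 2 * 4 * s ** 2 / n_k for w, s, n_k in zip(shares, sigmas, alloc))


# Hypothetical population: 80% low-noise customers, 20% high-noise customers.
shares = [0.8, 0.2]
sigmas = [1.0, 5.0]
budget = 1000

var_prop = ate_variance(shares, sigmas, proportional_allocation(shares, budget))
var_neyman = ate_variance(shares, sigmas, neyman_allocation(shares, sigmas, budget))

print(f"proportional allocation variance: {var_prop:.5f}")
print(f"noise-aware allocation variance:  {var_neyman:.5f}")
```

In this toy setting, shifting observations toward the high-noise stratum lowers the variance of the ATE estimate for the same total budget. In practice the noise levels are unknown and would have to be estimated, which this sketch does not address.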
About the journal:
The European Journal of Operational Research (EJOR) publishes high-quality, original papers that contribute to the methodology of operational research (OR) and to the practice of decision making.