Double Machine Learning at Scale to Predict Causal Impact of Customer Actions

Sushant More, Priya Kotwal, Sujith Chappidi, Dinesh Mandalapu, Chris Khawand
arXiv - ECON - Econometrics. Published 2024-09-03. DOI: arxiv-2409.02332 (https://doi.org/arxiv-2409.02332)

Abstract

Causal impact (CI) estimates of customer actions are broadly used across the industry to inform both short- and long-term investment decisions of various types. In this paper, we apply the double machine learning (DML) methodology to estimate CI values across hundreds of customer actions of business interest and hundreds of millions of customers. We operationalize DML through a causal ML library based on Spark, with a flexible, JSON-driven model configuration approach, to estimate CI at scale (i.e., across hundreds of actions and millions of customers). We outline the DML methodology and implementation, and the associated benefits over the traditional potential-outcomes-based CI model. We show population-level as well as customer-level CI values along with confidence intervals. The validation metrics show a 2.2% gain over the baseline methods and a 2.5X gain in computation time. Our contribution is to advance the scalable application of CI, while also providing an interface that allows faster experimentation, offers cross-platform support, eases onboarding of new use cases, and improves accessibility of the underlying code for partner teams.
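For readers unfamiliar with DML, the following is a minimal sketch of the general technique (cross-fitted partialling-out for a partially linear model), not the authors' Spark library or their model configuration. It assumes a single confounder, a continuous treatment, simulated data, and plain least-squares fits standing in for the machine learning nuisance models; every name in it (`fit_line`, `dml_effect`, the simulated coefficients) is hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of ys ~ a + b * xs; returns a prediction function."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def dml_effect(x, d, y, n_folds=2):
    """Cross-fitted estimate of theta in y = theta * d + g(x) + noise."""
    n = len(x)
    res_d = [0.0] * n  # treatment residuals d - E[d | x]
    res_y = [0.0] * n  # outcome residuals y - E[y | x]
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        test = [i for i in range(n) if i % n_folds == k]
        # Nuisance models are fitted on the other folds only (cross-fitting).
        m_hat = fit_line([x[i] for i in train], [d[i] for i in train])
        l_hat = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test:
            res_d[i] = d[i] - m_hat(x[i])
            res_y[i] = y[i] - l_hat(x[i])
    # Final stage: regress outcome residuals on treatment residuals.
    den = sum(rd * rd for rd in res_d)
    theta = sum(rd * ry for rd, ry in zip(res_d, res_y)) / den
    # Standard error from the orthogonal score psi = res_d * (res_y - theta * res_d).
    psi_sq = sum((rd * (ry - theta * rd)) ** 2 for rd, ry in zip(res_d, res_y))
    se = psi_sq ** 0.5 / den
    return theta, se


# Simulated data (an assumption for illustration): x confounds both d and y,
# and the true effect of d on y is 2.0.
random.seed(0)
n_obs = 2000
x = [random.gauss(0, 1) for _ in range(n_obs)]
d = [0.5 * xi + random.gauss(0, 1) for xi in x]
y = [2.0 * di + 1.5 * xi + random.gauss(0, 1) for di, xi in zip(d, x)]

theta, se = dml_effect(x, d, y)
ci_low, ci_high = theta - 1.96 * se, theta + 1.96 * se
print(f"theta = {theta:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

In practice the two nuisance fits would be flexible learners over many customer features, and a binary customer action would typically replace the continuous treatment used here; the residualize-then-regress structure stays the same.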