Anytime-Valid Inference for Double/Debiased Machine Learning of Causal Parameters

arXiv - ECON - Econometrics | Pub Date: 2024-08-18 | DOI: arxiv-2408.09598

Abhinandan Dalal, Patrick Blöbaum, Shiva Kasiviswanathan, Aaditya Ramdas


Citations: 0


Anytime-Valid Inference for Double/Debiased Machine Learning of Causal Parameters

Double (debiased) machine learning (DML) has seen widespread use in recent years for learning causal/structural parameters, in part due to its flexibility and adaptability to high-dimensional nuisance functions, as well as its ability to avoid bias from regularization or overfitting. However, the classic double-debiased framework is only valid asymptotically for a predetermined sample size, and so lacks the flexibility to collect more data if sharper inference is needed, or to stop data collection early if useful inferences can be made sooner than expected. This is of particular concern in large-scale experimental studies with huge financial costs or human lives at stake, as well as in observational studies where the length of the confidence intervals does not shrink to zero even with increasing sample size, due to partial identifiability of a structural parameter. In this paper, we present time-uniform counterparts to the asymptotic DML results, enabling valid inference and confidence intervals for structural parameters to be constructed at any arbitrary (possibly data-dependent) stopping time. We provide conditions that are only slightly stronger than the standard DML conditions, but that offer the stronger guarantee of anytime-valid inference. This facilitates the transformation of any existing DML method to provide anytime-valid guarantees with minimal modifications, making it highly adaptable and easy to use. We illustrate our procedure using two instances: a) the local average treatment effect in online experiments with non-compliance, and b) partial identification of the average treatment effect in observational studies with potential unmeasured confounding.
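To make the idea of a time-uniform interval concrete, here is a minimal Python sketch. It is not taken from the paper. It assumes that a DML-style estimate can be written as a running average of per-observation scores, and it wraps that average in a Robbins-type normal-mixture boundary, a common choice in the anytime-valid literature. The function names `aipw_scores` and `asymptotic_confidence_sequence`, the tuning parameter `rho`, and the simulated experiment are all illustrative assumptions. Cross-fitting of the nuisance functions is omitted for brevity.

```python
import numpy as np


def aipw_scores(y, a, e, mu1, mu0):
    """Doubly robust (AIPW) scores for the average treatment effect.

    y: outcomes, a: binary treatment, e: propensity scores,
    mu1 / mu0: outcome-model predictions under treatment / control.
    In a real DML pipeline the nuisances would be cross-fitted (assumed here).
    """
    y, a = np.asarray(y, dtype=float), np.asarray(a, dtype=float)
    return mu1 - mu0 + a * (y - mu1) / e - (1.0 - a) * (y - mu0) / (1.0 - e)


def asymptotic_confidence_sequence(scores, alpha=0.05, rho=1.0):
    """Running mean of the scores with a normal-mixture time-uniform radius.

    Returns (lower, upper) arrays, one entry per sample size t = 1..n.
    rho > 0 is a tuning parameter (hypothetical default) that controls
    at which sample sizes the boundary is tightest.
    """
    scores = np.asarray(scores, dtype=float)
    t = np.arange(1, scores.size + 1)
    running_mean = np.cumsum(scores) / t
    # Running (population) variance from cumulative first and second moments.
    running_var = np.maximum(np.cumsum(scores**2) / t - running_mean**2, 0.0)
    scaled = t * running_var * rho**2 + 1.0
    radius = np.sqrt(
        2.0 * scaled / (t**2 * rho**2) * np.log(np.sqrt(scaled) / alpha)
    )
    return running_mean - radius, running_mean + radius


# Illustrative simulation: randomized experiment with known propensity 0.5
# and a true average treatment effect of 1.0 (all values are made up).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
a = rng.binomial(1, 0.5, size=n)
y = 1.0 * a + x + rng.normal(size=n)

# Crude stand-ins for fitted nuisance functions (assumption, not fitted models).
scores = aipw_scores(y, a, e=0.5, mu1=x + 1.0, mu0=x)
lower, upper = asymptotic_confidence_sequence(scores, alpha=0.05, rho=0.5)
```

Because the interval is built to hold simultaneously over all sample sizes (asymptotically, under suitable conditions), an analyst could inspect `lower[t]` and `upper[t]` after every new observation and stop whenever the interval is narrow enough. A fixed-sample interval of the form mean ± 1.96·se does not allow this kind of repeated checking.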
