Pseudo-random Number Generator Influences on Average Treatment Effect Estimates Obtained with Machine Learning.

IF 4.7 | Zone 2 (Medicine) | Q1 Public, Environmental & Occupational Health | Epidemiology | Pub Date: 2024-11-01 | Epub Date: 2024-08-16 | DOI: 10.1097/EDE.0000000000001785

Ashley I Naimi, Ya-Hui Yu, Lisa M Bodnar

Epidemiology, pp. 779-786. Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11560583/pdf/

Citations: 0

Abstract

Background: The use of machine learning to estimate exposure effects introduces a dependence between the results of an empirical study and the value of the seed used to fix the pseudo-random number generator.

Methods: We used data from 10,038 pregnant women and a 10% subsample (N = 1004) to examine the extent to which the risk difference for the relation between fruit and vegetable consumption and preeclampsia risk changes under different seed values. We fit an augmented inverse probability weighted estimator with two Super Learner algorithms: a simple algorithm including random forests and single-layer neural networks and a more complex algorithm with a mix of tree-based, regression-based, penalized, and simple algorithms. We evaluated the distributions of risk differences, standard errors, and P values that result from 5000 different seed value selections.
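The seed-sweep design described in the Methods can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis: the cohort is synthetic, the nuisance models are smoothed stratified means rather than Super Learner, only 200 seeds are tried rather than 5000, and the seed affects only the cross-fitting fold split. All function names (`simulate_data`, `fit_nuisance`, `aipw_crossfit`) are hypothetical.

```python
import random
import statistics


def simulate_data(n, rng):
    """Synthetic cohort (hypothetical): binary confounder x, exposure a, outcome y."""
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.4 else 0
        a = 1 if rng.random() < (0.3 + 0.3 * x) else 0
        y = 1 if rng.random() < (0.10 + 0.05 * x - 0.02 * a) else 0
        rows.append((x, a, y))
    return rows


def fit_nuisance(train):
    """Smoothed stratified means for P(A=1 | x) and P(Y=1 | a, x).

    Stand-in for the machine learning step; add-one smoothing keeps
    every estimated probability strictly between 0 and 1.
    """
    ps, mu = {}, {}
    for x in (0, 1):
        in_x = [r for r in train if r[0] == x]
        ps[x] = (sum(r[1] for r in in_x) + 1) / (len(in_x) + 2)
        for a in (0, 1):
            cell = [r[2] for r in in_x if r[1] == a]
            mu[(a, x)] = (sum(cell) + 1) / (len(cell) + 2)
    return ps, mu


def aipw_crossfit(rows, seed, k=2):
    """Cross-fit AIPW risk difference and its standard error.

    The seed controls only how observations are split into k folds.
    """
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [rows[i] for i in idx if i not in held_out]
        ps, mu = fit_nuisance(train)
        for i in fold:
            x, a, y = rows[i]
            m1, m0, g = mu[(1, x)], mu[(0, x)], ps[x]
            # AIPW score: outcome-model contrast plus weighted residual corrections.
            scores.append(m1 - m0 + a * (y - m1) / g - (1 - a) * (y - m0) / (1 - g))
    rd = statistics.fmean(scores)
    se = statistics.stdev(scores) / len(scores) ** 0.5
    return rd, se


# One fixed data set, many seeds: only the fold split changes between runs.
data = simulate_data(1000, random.Random(2024))
estimates = [aipw_crossfit(data, seed)[0] for seed in range(200)]
print(f"min RD = {min(estimates):.4f}, max RD = {max(estimates):.4f}")
```

Because the data are held fixed, any spread in `estimates` comes from the seed alone. With learners that are themselves stochastic, such as random forests or neural networks, the seed would enter through more channels than the fold split used here.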

Results: Our findings suggest important variability in the risk difference estimates, as well as an important effect of the stacking algorithm used. The interquartile range width of the risk differences in the full sample with the simple algorithm was 13 per 1000. However, all other interquartile ranges were roughly an order of magnitude lower. The medians of the distributions of risk differences differed according to the sample size and the algorithm used.
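The interquartile range width reported above is the distance between the 75th and 25th percentiles of the seed-specific risk differences, rescaled to events per 1000. A small sketch of that summary follows, using synthetic stand-in values rather than the study's estimates; the helper name `iqr_width_per_1000` and the two simulated spreads are assumptions chosen only to show how one distribution can be roughly ten times wider than another.

```python
import random
import statistics


def iqr_width_per_1000(risk_differences):
    """Width of the interquartile range, rescaled from proportions to per 1000."""
    q1, _, q3 = statistics.quantiles(risk_differences, n=4)
    return (q3 - q1) * 1000


# Synthetic seed-specific risk differences (illustrative, not the study's values).
rng = random.Random(0)
narrow = [rng.gauss(-0.020, 0.001) for _ in range(5000)]
wide = [rng.gauss(-0.020, 0.010) for _ in range(5000)]

print(f"narrow IQR width: {iqr_width_per_1000(narrow):.1f} per 1000")
print(f"wide IQR width:   {iqr_width_per_1000(wide):.1f} per 1000")
```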

Conclusions: Our findings add another dimension of concern regarding the potential for "p-hacking," and further warrant the need to move away from simplistic evidentiary thresholds in empirical research. When empirical results depend on pseudo-random number generator seed values, caution is warranted in interpreting these results.




Source journal

Epidemiology (Medicine: Public, Environmental & Occupational Health)

CiteScore: 6.70

Self-citation rate: 3.70%

Articles published: 177

Review time: 6-12 weeks

About the journal: Epidemiology publishes original research from all fields of epidemiology. The journal also welcomes review articles and meta-analyses, novel hypotheses, descriptions and applications of new methods, and discussions of research theory or public health policy. We give special consideration to papers from developing countries.