Predicting IVF live birth probabilities using machine learning, center-specific and national registry-based models

Elizabeth T. Nguyen, Matthew G. Retzloff, Laura April Gago, John E. Nichols, John F. Payne, Barry A. Ripps, Michael Opsahl, Jeremy Groll, Ronald Beesley, Lorie Nowak, Gregory Neal, Jaye Adams, Trevor Swanson, Xiaocong Chen, Mylene W. M. Yao
medRxiv - Obstetrics and Gynecology. Published 2024-06-21. DOI: 10.1101/2024.06.20.24308970 (https://doi.org/10.1101/2024.06.20.24308970)

Abstract

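Objective: To compare the performance of machine learning based, center-specific (MLCS) models and the US national registry-based, multicenter model (SART model) in predicting IVF live birth probabilities (LBPs) for 6 unrelated, geographically diverse US fertility centers.
Design: Retrospective observational design.
Subjects: Test sets comprised first IVF cycle data (2013-2022) extracted from a retrospective cohort of 4,645 patients at 6 fertility centers.
Intervention or Exposure: The initial (MLCS1) and updated (MLCS2) models were each compared against an age control. The MLCS2 and SART models were then compared with each other.
Main Outcome Measures: Model validation metrics, reported as median and interquartile range (IQR), were compared using the Wilcoxon signed-rank test: ROC AUC, posterior log-likelihood of odds ratio compared to age (PLORA), Precision-Recall (PR) AUC, F1 score and continuous net reclassification improvement (NRI).
Results: MLCS1 and MLCS2 models showed improved AUC and PLORA compared to age control; MLCS1 models were validated using out-of-time test data. MLCS2 models showed a higher PLORA of 23.9 (IQR 10.2, 39.4) compared to 7.2 (IQR 3.6, 11.8) for MLCS1, p<0.05. MLCS2 showed a higher median PR AUC of 0.75 (IQR 0.73, 0.77) compared to 0.69 (IQR 0.68, 0.71) for SART, p<0.05. In addition, the median F1 score was higher for MLCS2 than for the SART model across predicted live birth probability (LBP) thresholds sampled at deciles (≥40%, ≥50%, ≥60%, ≥70%). For example, at the 50% LBP threshold, MLCS2 had a median F1 score of 0.74 (IQR 0.72, 0.78) compared to 0.71 (IQR 0.68, 0.73) for SART. At these six centers, using the LBP threshold of ≥50%, MLCS2 models can identify ~84% of patients who would go on to have IVF live births, while the SART model can identify only ~75%. That means that for every 100 patients who will have a first IVF cycle live birth, using LBP ≥50% as the threshold, the MLCS2 model can identify 9 more such patients than the SART model without overcalling or overestimating LBPs.
Conclusion: MLCS models accurately assign higher IVF LBPs to more patients compared to the SART model at 6 US fertility centers. We recommend testing a larger sample of fertility centers to evaluate the generalizability of MLCS model benefits.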
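The ~84% and ~75% figures in the Results read most naturally as recall (sensitivity) at the LBP ≥50% cutoff: the share of patients with a live birth whose predicted LBP was at or above the threshold. The sketch below shows how precision, recall and F1 could be computed at such a cutoff. It is only an illustration under that reading, not the authors' code; the function name `threshold_metrics` and the toy outcomes and probabilities are hypothetical and are not study data.

```python
from typing import Dict, Sequence


def threshold_metrics(
    y_true: Sequence[int], lbp: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Precision, recall and F1 when 'predicted LBP >= threshold' counts as a positive call."""
    if len(y_true) != len(lbp):
        raise ValueError("y_true and lbp must have the same length")
    # True positives: live birth occurred and predicted LBP met the threshold.
    tp = sum(1 for y, p in zip(y_true, lbp) if y == 1 and p >= threshold)
    # False positives: no live birth, but predicted LBP met the threshold.
    fp = sum(1 for y, p in zip(y_true, lbp) if y == 0 and p >= threshold)
    # False negatives: live birth occurred, but predicted LBP fell below the threshold.
    fn = sum(1 for y, p in zip(y_true, lbp) if y == 1 and p < threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data (NOT from the study): 1 = live birth, 0 = no live birth.
outcomes = [1, 1, 1, 1, 0, 0, 1, 0]
predicted_lbp = [0.72, 0.55, 0.48, 0.61, 0.52, 0.30, 0.66, 0.41]
metrics = threshold_metrics(outcomes, predicted_lbp, threshold=0.5)

# The "9 more per 100" statement is the difference of the two reported recalls,
# expressed per 100 patients who go on to have a live birth.
extra_identified_per_100 = 84 - 75
```

Under this reading, F1 is the harmonic mean of precision and recall, so a model can only raise F1 by identifying more live births if it does not also add many false positive calls, which is consistent with the abstract's "without overcalling" wording.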
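The abstract reports that metrics were compared with the Wilcoxon signed-rank test, presumably pairing the two models' values center by center. As a rough sketch of how such a paired comparison works with only six centers, the code below computes an exact two-sided p-value by enumerating sign patterns. The per-center PR AUC values are hypothetical placeholders, since the abstract gives only medians and IQRs, and whether the authors used an exact or approximate p-value is not stated.

```python
from itertools import product
from typing import Sequence, Tuple


def wilcoxon_signed_rank_exact(
    x: Sequence[float], y: Sequence[float]
) -> Tuple[float, float]:
    """Return (W_plus, exact two-sided p-value) for paired samples x and y.

    Zero differences are dropped and tied absolute differences share average ranks.
    All 2**n sign patterns are enumerated, so this is only practical for small n.
    """
    # Round to suppress floating-point noise before checking for zeros and ties.
    diffs = [round(a - b, 12) for a, b in zip(x, y)]
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    expected = sum(ranks) / 2
    observed_deviation = abs(w_plus - expected)
    # Under the null hypothesis each rank is equally likely to carry a + or - sign.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - expected) >= observed_deviation - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n


# Hypothetical per-center PR AUC values for six centers (NOT study data).
mlcs2_pr_auc = [0.75, 0.73, 0.77, 0.76, 0.74, 0.78]
sart_pr_auc = [0.70, 0.69, 0.71, 0.69, 0.66, 0.68]
w_plus, p_value = wilcoxon_signed_rank_exact(mlcs2_pr_auc, sart_pr_auc)
```

With six non-zero paired differences, the smallest exact two-sided p-value is 2/64 ≈ 0.031, reached only when all six differences have the same sign, as in the toy values above.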