Detecting irregularities in randomized controlled trials using machine learning.

Clinical Trials (IF 2.2; Zone 3, Medicine; JCR Q3, MEDICINE, RESEARCH & EXPERIMENTAL) Pub Date: 2024-11-25 DOI: 10.1177/17407745241297947
Walter Nelson, Jeremy Petch, Jonathan Ranisau, Robin Zhao, Kumar Balasubramanian, Shrikant I Bangdiwala
Citations: 0

Abstract


Background: Over the course of a clinical trial, irregularities may arise in the data. Trialists implement human-intensive, expensive central statistical monitoring procedures to identify and correct these irregularities before the results of the trial are analyzed and disseminated. Machine learning algorithms have shown promise for identifying center-level irregularities in multi-center clinical trials with minimal human intervention. We aimed to characterize the form-level data irregularities in several historical clinical trials and evaluate the ability of a machine learning-based outlier detection algorithm to identify them.

Methods: Data irregularities previously identified by humans in historical clinical trials were ascertained by comparing preliminary snapshots of the trial databases to the final, locked databases. We measured the ability of a machine learning-based outlier detection algorithm to identify form-level irregularities using concordance (area under the receiver operating characteristic curve), positive predictive value (precision), and sensitivity (recall).
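The labeling and evaluation steps described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the entry keys, the field names (`sbp`, `weight_kg`), the outlier scores, and the 0.35 threshold are all hypothetical, and the abstract does not say how the outlier scores themselves are produced.

```python
from typing import Dict, Hashable, Sequence, Tuple

Entry = Dict[str, object]


def label_irregular(snapshot: Dict[Hashable, Entry],
                    final: Dict[Hashable, Entry]) -> Dict[Hashable, int]:
    """Label a snapshot entry 1 if it differs from the locked database, else 0."""
    labels = {}
    for key, entry in snapshot.items():
        # An entry that was later corrected (or removed) counts as irregular.
        labels[key] = int(final.get(key) != entry)
    return labels


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Concordance: probability that an irregular entry outscores a regular
    one, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one irregular and one regular entry")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(labels: Sequence[int], scores: Sequence[float],
                     threshold: float) -> Tuple[float, float]:
    """Positive predictive value and sensitivity when flagging score >= threshold."""
    flagged = [s >= threshold for s in scores]
    tp = sum(f and y == 1 for f, y in zip(flagged, labels))
    fp = sum(f and y == 0 for f, y in zip(flagged, labels))
    fn = sum((not f) and y == 1 for f, y in zip(flagged, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical form entries keyed by (participant, visit).
snapshot = {
    ("P001", "baseline"): {"sbp": 120, "weight_kg": 70.0},
    ("P002", "baseline"): {"sbp": 1200, "weight_kg": 82.5},  # corrected later
    ("P003", "baseline"): {"sbp": 135, "weight_kg": 64.0},
    ("P004", "baseline"): {"sbp": 118, "weight_kg": 7.1},    # corrected later
}
final = {
    ("P001", "baseline"): {"sbp": 120, "weight_kg": 70.0},
    ("P002", "baseline"): {"sbp": 120, "weight_kg": 82.5},
    ("P003", "baseline"): {"sbp": 135, "weight_kg": 64.0},
    ("P004", "baseline"): {"sbp": 118, "weight_kg": 71.0},
}

labels = label_irregular(snapshot, final)
y_true = [labels[key] for key in snapshot]
y_score = [0.10, 0.95, 0.40, 0.30]  # hypothetical outlier scores, same order

print(auroc(y_true, y_score))                   # 0.75
print(precision_recall(y_true, y_score, 0.35))  # (0.5, 0.5)
```

The concordance here is computed by direct pairwise comparison, which is quadratic in the number of entries; for data at the scale reported in the study, a rank-based computation would be the more practical choice.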

Results: We examined preliminary snapshots of seven historical clinical trials that randomized a total of 77,001 participants. Across all snapshots and all trials, we extracted 1,267,484 completed entries from 358 case report forms containing irregularities; these entries contained a total of 24,850 form-wide irregularities (median per-form form-level irregularity rate: 1.81%). Our proposed machine learning algorithm detects form-level irregularities with a median concordance of 0.74 (interquartile range = 0.57-0.89), slightly exceeding the performance of a previously proposed machine learning approach, which had a median area under the receiver operating characteristic curve of 0.73 (interquartile range = 0.54-0.88).
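As a rough illustration of how such summary figures are formed, the sketch below computes a pooled irregularity rate from the two totals quoted above and a median with interquartile range over per-form scores. It assumes the 24,850 irregularities are counted among the 1,267,484 entries; the per-form concordance values are hypothetical, since the abstract reports only their median and interquartile range. Note that the pooled rate is a different quantity from the reported median per-form rate of 1.81%.

```python
import statistics
from typing import Sequence, Tuple

# Totals quoted in the Results.
entries = 1_267_484
irregularities = 24_850
pooled_rate = irregularities / entries


def median_iqr(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return (median, first quartile, third quartile) of the values."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), q1, q3


# Hypothetical per-form concordance values, one per case report form.
per_form_auroc = [0.50, 0.60, 0.70, 0.80, 0.90]
med, q1, q3 = median_iqr(per_form_auroc)

print(f"pooled irregularity rate: {pooled_rate:.2%}")    # 1.96%
print(f"median AUROC {med:.2f} (IQR {q1:.2f}-{q3:.2f})")  # 0.70 (IQR 0.60-0.80)
```

Reporting the median across forms rather than a pooled figure keeps a few very large forms from dominating the summary, which may be why the two rates differ.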

Conclusion: Data irregularities in historical clinical trials were ascertained by comparing preliminary snapshots of the trial database to the final database. These irregularities can be categorized according to their scope. Irregularities can be successfully detected by a machine learning algorithm as early as or earlier than a human can, without human intervention. Such an approach may complement existing techniques for central statistical monitoring in large multi-center randomized controlled trials and possibly improve the efficiency of costly data verification processes.

Source journal: Clinical Trials (Medicine: Research & Experimental)
CiteScore: 4.10
Self-citation rate: 3.70%
Articles published: 82
Review time: 6-12 weeks
About the journal: Clinical Trials is dedicated to advancing knowledge on the design and conduct of clinical trials and related research methodologies. Covering the design, conduct, analysis, synthesis and evaluation of key methodologies, the journal remains on the cusp of the latest topics, including ethics, regulation and policy impact.