Pablo M Marti-Castellote, Christopher Reeder, Brian L Claggett, Pulkit Singh, Emily S Lau, Shaan Khurshid, Puneet Batra, Steven A Lubitz, Mahnaz Maddah, Orly Vardeny, Eldrin F Lewis, Marc A Pfeffer, Pardeep S Jhund, Akshay S Desai, John J V McMurray, Patrick T Ellinor, Jennifer E Ho, Scott D Solomon, Jonathan W Cunningham
Journal: Circulation: Heart Failure (Impact Factor 7.8; JCR Q1, Cardiac & Cardiovascular Systems)
Publication date: 2024-11-16
DOI: 10.1161/CIRCHEARTFAILURE.124.012514 (https://doi.org/10.1161/CIRCHEARTFAILURE.124.012514)
Natural Language Processing to Adjudicate Heart Failure Hospitalizations in Global Clinical Trials
Abstract
Background: Medical record review by a physician clinical events committee is the gold standard for identifying cardiovascular outcomes in clinical trials, but is labor-intensive and poorly reproducible. Automated outcome adjudication by artificial intelligence (AI) could enable larger and less expensive clinical trials, but has not been validated in global studies.
Methods: We developed a novel model for automated AI-based heart failure adjudication ("HF-NLP") using hospitalizations from three international clinical outcomes trials. This model was tested on potential heart failure hospitalizations from the DELIVER trial, a cardiovascular outcomes trial comparing dapagliflozin with placebo in 6063 patients with heart failure with mildly reduced or preserved ejection fraction. AI-based adjudications were compared with adjudications from a clinical events committee that followed FDA-based criteria.
Results: AI-based adjudication agreed with the clinical events committee in 83% of events. A strategy of human review for events that the AI model deemed uncertain (16%) would have achieved 91% agreement with the clinical events committee while reducing adjudication workload by 84%. The estimated effect of dapagliflozin on heart failure hospitalization was nearly identical with AI-based adjudication (hazard ratio 0.76 [95% CI 0.66-0.88]) compared to clinical events committee adjudication (hazard ratio 0.77 [95% CI 0.67-0.89]). The AI model extracted symptoms, signs, and treatments of heart failure from each medical record in tabular format and quoted sentences documenting them.
Conclusions: AI-based adjudication of clinical outcomes has the potential to improve the efficiency of global clinical trials while preserving accuracy and interpretability.
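The human-review strategy described in the Results can be illustrated with a small Python sketch. It is not the HF-NLP implementation, which the abstract does not describe. The `Event` record, the probability-style score, the 0.5 decision cutoff, and the 0.3 to 0.7 "uncertain" band are all hypothetical. The sketch also assumes that a human reviewer reproduces the committee's decision on every event routed to review, which is one plausible way to compute a hybrid agreement figure but may not be the authors' method.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Event:
    ai_prob: float   # hypothetical model score that the event is an HF hospitalization
    cec_label: bool  # clinical events committee decision (reference standard)


def summarize(events: List[Event], low: float = 0.3, high: float = 0.7,
              cutoff: float = 0.5) -> Dict[str, float]:
    """Compare AI-only and AI-plus-human-review adjudication against the committee.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(events)
    if n == 0:
        raise ValueError("no events to summarize")

    def ai_agrees(e: Event) -> bool:
        # The AI call is positive when the score reaches the cutoff.
        return (e.ai_prob >= cutoff) == e.cec_label

    # Events whose score falls inside the band are routed to human review.
    uncertain = [e for e in events if low < e.ai_prob < high]
    confident = [e for e in events if not (low < e.ai_prob < high)]

    ai_only_agreement = sum(ai_agrees(e) for e in events) / n
    # Assumption: human review always matches the committee decision.
    hybrid_agreement = (sum(ai_agrees(e) for e in confident) + len(uncertain)) / n
    # Workload reduction is the share of events that never reach a human.
    workload_reduction = len(confident) / n

    return {
        "ai_only_agreement": ai_only_agreement,
        "hybrid_agreement": hybrid_agreement,
        "workload_reduction": workload_reduction,
    }


# Toy data for illustration only; these are not trial results.
toy_events = [
    Event(0.95, True), Event(0.90, True), Event(0.10, False), Event(0.05, False),
    Event(0.80, False), Event(0.60, False), Event(0.40, True), Event(0.55, True),
    Event(0.20, False), Event(0.85, True),
]
toy_summary = summarize(toy_events)
```

The same bookkeeping explains the reported figures: if 16% of events are sent to human review, the remaining 84% are adjudicated automatically, which is the workload reduction stated in the abstract.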
About the journal:
Circulation: Heart Failure focuses on content related to heart failure, mechanical circulatory support, and heart transplant science and medicine. It considers studies conducted in humans or analyses of human data, as well as preclinical studies with direct clinical correlation or relevance. While primarily a clinical journal, it may publish novel basic and preclinical studies that significantly advance the field of heart failure.