Paul Windisch, Fabio Dennstädt, Carole Koechli, Robert Förster, Christina Schröder, Daniel M Aebersold, Daniel R Zwahlen
JCO Clinical Cancer Informatics, volume 8, e2400150. Journal Article. Published 2024-12-01 (Epub 2024-11-27). DOI: 10.1200/CCI-24-00150 (https://doi.org/10.1200/CCI-24-00150)
Metastatic Versus Localized Disease as Inclusion Criteria That Can Be Automatically Extracted From Randomized Controlled Trials Using Natural Language Processing.
Purpose: Extracting inclusion and exclusion criteria in a structured, automated fashion remains a challenge to developing better search functionalities or automating systematic reviews of randomized controlled trials in oncology. The question "Did this trial enroll patients with localized disease, metastatic disease, or both?" could be used to narrow down the number of potentially relevant trials when conducting a search.
Methods: Six hundred trials from high-impact medical journals were classified depending on whether they allowed for the inclusion of patients with localized and/or metastatic disease. Five hundred trials were used to develop and validate three different models, with the remaining 100 trials held out for testing. The test set was also used to evaluate the performance of GPT-4o on the same task.
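The setup described in the Methods can be read as two independent binary labels per trial plus a held-out test set. The following is a minimal sketch of that framing, not the authors' code: the `Trial` record, the `split_trials` helper, the placeholder data, and the choice of a seeded random split are all assumptions made for illustration (the abstract does not say how the 100 test trials were selected).

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """Hypothetical record: one trial with two independent yes/no labels."""
    trial_id: int
    text: str
    allows_localized: bool
    allows_metastatic: bool


def split_trials(trials, n_test=100, seed=0):
    """Shuffle with a fixed seed and hold out n_test trials for testing.

    Assumption: a random split; the abstract only states the 500/100 sizes.
    """
    shuffled = list(trials)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder data standing in for the 600 annotated trials.
trials = [
    Trial(i, f"placeholder abstract {i}", i % 2 == 0, i % 3 == 0)
    for i in range(600)
]
dev_set, test_set = split_trials(trials)
```

Keeping the two labels separate, rather than using one three-way class, matches the abstract's reporting of a separate F1 score for localized and for metastatic disease.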
Results: In the test set, a rule-based system using regular expressions achieved F1 scores of 0.72 for the prediction of whether the trial allowed for the inclusion of patients with localized disease and 0.77 for metastatic disease. A transformer-based machine learning (ML) model achieved F1 scores of 0.97 and 0.88, respectively. A combined approach in which the rule-based system was allowed to overrule the ML model achieved F1 scores of 0.97 and 0.89, respectively. GPT-4o achieved F1 scores of 0.87 and 0.92, respectively.
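To make the rule-based and combined approaches concrete, here is a rough sketch under stated assumptions. The regular expressions, the function names, and the override policy (a rule that fires replaces the ML prediction; otherwise the ML prediction stands) are illustrative guesses; the abstract does not give the actual patterns or the exact override logic. The F1 helper uses the standard definition, F1 = 2·TP / (2·TP + FP + FN).

```python
import re

# Hypothetical patterns; the study's real rules are not given in the abstract.
METASTATIC_RE = re.compile(r"(?<!non-)\bmetasta\w*|\bstage\s+IV\b", re.IGNORECASE)
LOCALIZED_RE = re.compile(
    r"\blocali[sz]ed\b|\bnon-?metastatic\b|\bstage\s+I{1,3}\b", re.IGNORECASE
)


def rule_predict(text, pattern):
    """Return True if the rule fires, or None to abstain (no match)."""
    return True if pattern.search(text) else None


def combined_predict(text, pattern, ml_pred):
    """Assumed override policy: a firing rule overrules the ML prediction."""
    rule = rule_predict(text, pattern)
    return ml_pred if rule is None else rule


def f1_score(y_true, y_pred):
    """Binary F1 from boolean labels: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

For example, `combined_predict("stage IV melanoma", METASTATIC_RE, False)` returns `True` because the rule fires, while a text with no matching phrase falls back to whatever the ML model predicted. The lookbehind that excludes "non-metastatic" hints at why a pure regex system is brittle: each such exception needs its own rule.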
Conclusion: Automatic classification of cancer trials with regard to the inclusion of patients with localized and/or metastatic disease is feasible. Turning the extraction of trial criteria into classification problems could, in selected cases, improve text-mining approaches in evidence-based medicine. Increasingly large language models can reduce or eliminate the need for previous training on the task at the expense of increased computational power and, in turn, cost.