Metastatic Versus Localized Disease as Inclusion Criteria That Can Be Automatically Extracted From Randomized Controlled Trials Using Natural Language Processing.
Paul Windisch, Fabio Dennstädt, Carole Koechli, Robert Förster, Christina Schröder, Daniel M Aebersold, Daniel R Zwahlen
JCO Clinical Cancer Informatics, volume 8, e2400150. Published 2024-12-01 (Epub 2024-11-27). DOI: 10.1200/CCI-24-00150 (https://doi.org/10.1200/CCI-24-00150)
Abstract
Purpose: Extracting inclusion and exclusion criteria in a structured, automated fashion remains a challenge to developing better search functionalities or automating systematic reviews of randomized controlled trials in oncology. The question "Did this trial enroll patients with localized disease, metastatic disease, or both?" could be used to narrow down the number of potentially relevant trials when conducting a search.
Methods: Six hundred trials from high-impact medical journals were labeled according to whether they allowed for the inclusion of patients with localized and/or metastatic disease. Five hundred trials were used to develop and validate three different models, with the remaining 100 trials held out for testing. The test set was also used to evaluate the performance of GPT-4o on the same task.
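The 500/100 split described in the Methods can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say whether the split was random, which seed was used, or how the 500 development trials were divided between training and validation. The function name `split_trials` and the trial identifiers are hypothetical.

```python
import random


def split_trials(trial_ids, n_test=100, seed=0):
    """Shuffle trial ids and hold out n_test of them as a test set.

    Assumption: a simple seeded random split; the study's actual
    procedure is not described in the abstract.
    """
    ids = list(trial_ids)
    if n_test > len(ids):
        raise ValueError("n_test exceeds the number of trials")
    rng = random.Random(seed)  # local RNG so the split is reproducible
    rng.shuffle(ids)
    # First n_test shuffled ids become the test set, the rest are for development.
    return ids[n_test:], ids[:n_test]


# Hypothetical identifiers standing in for the 600 annotated trials.
trial_ids = [f"trial_{i:03d}" for i in range(600)]
dev_ids, test_ids = split_trials(trial_ids)
print(len(dev_ids), len(test_ids))
```

Keeping the test ids fixed and untouched during development is what lets the same 100 trials serve as a common benchmark for all models, including GPT-4o.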
Results: On the test set, a rule-based system using regular expressions achieved F1 scores of 0.72 for predicting whether a trial allowed for the inclusion of patients with localized disease and 0.77 for metastatic disease. A transformer-based machine learning (ML) model achieved F1 scores of 0.97 and 0.88, respectively. A combined approach in which the rule-based system was allowed to overrule the ML model achieved F1 scores of 0.97 and 0.89, respectively. GPT-4o achieved F1 scores of 0.87 and 0.92, respectively.
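To make the three ingredients in the Results concrete (regular-expression rules, a rule that may overrule the ML prediction, and the F1 score), here is a minimal sketch. Everything in it is an assumption for illustration: the patterns are hypothetical and are not the rules used in the study, the abstract does not specify when the rule-based system overrules the ML model (here it does so only when a pattern matches), and the ML predictions are made-up placeholders rather than outputs of a transformer.

```python
import re

# Hypothetical patterns for illustration only; NOT the study's rules.
METASTATIC = re.compile(r"\b(metasta\w*|stage\s+iv|advanced)\b", re.IGNORECASE)
LOCALIZED = re.compile(
    r"\b(locali[sz]ed|early[- ]stage|adjuvant|resect\w*)\b", re.IGNORECASE
)


def rule_predict(text):
    """Return (localized, metastatic); True on a pattern hit, None to abstain."""
    localized = True if LOCALIZED.search(text) else None
    metastatic = True if METASTATIC.search(text) else None
    return localized, metastatic


def combine(rule_label, ml_label):
    """Assumed override logic: the rule wins when it fires, else keep the ML label."""
    return ml_label if rule_label is None else rule_label


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data: two invented abstracts with invented labels for "metastatic allowed".
texts = [
    "Patients with metastatic colorectal cancer were randomly assigned.",
    "Adjuvant chemotherapy after resection of early-stage breast cancer.",
]
y_true_met = [True, False]
ml_met = [False, False]  # placeholder ML predictions, not real model output

combined_met = [combine(rule_predict(t)[1], m) for t, m in zip(texts, ml_met)]
print(f1_score(y_true_met, ml_met), f1_score(y_true_met, combined_met))
```

Because these toy rules only ever output a positive label or abstain, the override can only turn a negative ML prediction into a positive one. A rule set with negative patterns would need a three-valued rule output (True, False, or abstain), which `combine` already supports.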
Conclusion: Automatic classification of cancer trials with regard to the inclusion of patients with localized and/or metastatic disease is feasible. Turning the extraction of trial criteria into classification problems could, in selected cases, improve text-mining approaches in evidence-based medicine. Increasingly large language models can reduce or eliminate the need for previous training on the task at the expense of increased computational power and, in turn, cost.