Large language models outperform traditional natural language processing methods in extracting patient-reported outcomes in IBD
Perseus V. Patel, Conner Davis, Amariel Ralbovsky, Daniel Tinoco, Christopher Y.K. Williams, Shadera Slatter, Behzad Naderalvojoud, Michael J. Rosen, Tina Hernandez-Boussard, Vivek Rudrapatna
medRxiv - Gastroenterology, published 2024-09-06. DOI: 10.1101/2024.09.05.24313139 (https://doi.org/10.1101/2024.09.05.24313139)
Abstract
Background and Aims: Patient-reported outcomes (PROs) are vital for assessing disease activity and treatment outcomes in inflammatory bowel disease (IBD). However, manually extracting these PROs from the free text of clinical notes is burdensome. We aimed to improve data curation from free-text information in the electronic health record, making it more available for research and quality improvement. Specifically, we aimed to compare traditional natural language processing (tNLP) and large language models (LLMs) in extracting three IBD PROs (abdominal pain, diarrhea, fecal blood) from clinical notes across two institutions.