Evaluating the accuracy of Lung-RADS score extraction from radiology reports: Manual entry versus natural language processing

International Journal of Medical Informatics (IF 3.7; Zone 2, Medicine; JCR Q2, Computer Science, Information Systems). Pub Date: 2024-07-31. DOI: 10.1016/j.ijmedinf.2024.105580
Amir Gandomi, Eusha Hasan, Jesse Chusid, Subroto Paul, Matthew Inra, Alex Makhnevich, Suhail Raoof, Gerard Silvestri, Brett C. Bade, Stuart L. Cohen
Citations: 0

Abstract

Introduction

Radiology scoring systems are critical to the success of lung cancer screening (LCS) programs, impacting patient care, adherence to follow-up, data management and reporting, and program evaluation. Lung CT Screening Reporting and Data System (Lung-RADS) is a structured radiology scoring system that provides recommendations for LCS follow-up that are utilized (a) in clinical care and (b) by LCS programs monitoring rates of adherence to follow-up. Thus, accurate reporting and reliable collection of Lung-RADS scores are fundamental components of LCS program evaluation and improvement. Unfortunately, due to variability in radiology reports, extraction of Lung-RADS scores is non-trivial, and best practices do not exist. The purpose of this project is to compare mechanisms to extract Lung-RADS scores from free-text radiology reports.

Methods

We retrospectively analyzed reports of LCS low-dose computed tomography (LDCT) examinations performed at a multihospital integrated healthcare network in New York State between January 2016 and July 2023. We compared three methods of Lung-RADS score extraction: manual physician entry at time of report creation, manual LCS specialist entry after report creation, and an internally developed, rule-based natural language processing (NLP) algorithm. Accuracy, recall, precision, and completeness (i.e., the proportion of LCS exams to which a Lung-RADS score has been assigned) were compared between the three methods.
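The abstract does not describe the rules used by the authors' NLP algorithm. As a rough illustration of what a rule-based extractor can look like, here is a minimal Python sketch built on a single regular expression. The function name `extract_lung_rads`, the pattern, and the sample report strings are all assumptions made for this example; they are not taken from the study, and a production system would need many more rules (versions, modifiers, addenda, negations).

```python
import re
from typing import Optional

# Hypothetical pattern, not the authors' rules. It looks for "Lung-RADS"
# (with or without hyphen/space), an optional label word, an optional
# separator, and then a category: 0-4, or 4A/4B/4X.
_LUNG_RADS = re.compile(
    r"lung[\s-]*rads"                          # "Lung-RADS", "Lung RADS", "LungRADS"
    r"(?:\s+(?:category|score|assessment))?"   # optional label word
    r"\s*[:=]?\s*"                             # optional ":" or "=" separator
    r"(4[abx]|[0-4])"                          # the category itself
    r"(?!\.?\d)",                              # reject "1.1" (version) or "10"
    re.IGNORECASE,
)


def extract_lung_rads(report_text: str) -> Optional[str]:
    """Return the first Lung-RADS category found in the text, or None."""
    match = _LUNG_RADS.search(report_text)
    return match.group(1).upper() if match else None


# Invented report snippets, for illustration only.
print(extract_lung_rads("IMPRESSION: Lung-RADS category 4A. Follow-up in 3 months."))
print(extract_lung_rads("Lung RADS: 2"))
print(extract_lung_rads("Lung-RADS version 1.1 was used."))
```

Returning `None` when no rule fires keeps missing scores explicit, which is what a completeness measure needs to count.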

Results

The dataset includes 24,060 LCS examinations on 14,243 unique patients. The mean patient age was 65 years, and most patients were male (54%) and white (75%). Completeness rate was 65%, 68%, and 99% for radiologists’ manual entry, LCS specialists’ entry, and NLP algorithm, respectively. Accuracy, recall, and precision were high across all extraction methods (>94%), though the NLP-based approach was consistently higher than both manual entries in all metrics.
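The abstract defines completeness but does not say exactly how accuracy, recall, and precision were computed for a multi-category score. The sketch below shows one plausible reading, stated as an assumption: accuracy is taken over exams that received a score, and precision and recall are computed per category. The function names and the toy data are hypothetical and do not reproduce the study's figures.

```python
from typing import Optional, Sequence, Tuple


def completeness(extracted: Sequence[Optional[str]]) -> float:
    """Proportion of exams to which any Lung-RADS score was assigned."""
    return sum(score is not None for score in extracted) / len(extracted)


def accuracy(reference: Sequence[str], extracted: Sequence[Optional[str]]) -> float:
    """Share of scored exams whose extracted score matches the reference.

    Assumption: exams with no extracted score are excluded here and are
    accounted for by completeness instead.
    """
    scored = [(r, e) for r, e in zip(reference, extracted) if e is not None]
    return sum(r == e for r, e in scored) / len(scored)


def precision_recall(
    reference: Sequence[str], extracted: Sequence[Optional[str]], category: str
) -> Tuple[float, float]:
    """Precision and recall for a single Lung-RADS category."""
    pairs = list(zip(reference, extracted))
    tp = sum(r == category and e == category for r, e in pairs)
    fp = sum(r != category and e == category for r, e in pairs)
    fn = sum(r == category and e != category for r, e in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data, invented for illustration (four exams, one left unscored).
reference = ["1", "2", "2", "4A"]
extracted = ["1", "2", None, "3"]

print(completeness(extracted))                  # 0.75
print(precision_recall(reference, extracted, "2"))  # (1.0, 0.5)
```

Separating completeness from accuracy mirrors how the results are reported: a method can be accurate on the exams it scores while still leaving many exams unscored.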

Discussion

An NLP-based method of LCS score determination is an efficient and more accurate means of extracting Lung-RADS scores than manual review and data entry. NLP-based methods should be considered best practice for extracting structured Lung-RADS scores from free-text radiology reports.

Source journal

International Journal of Medical Informatics
Subject category: Medicine, Computer Science: Information Systems
CiteScore: 8.90
Self-citation rate: 4.10%
Articles published: 217
Review time: 42 days

About the journal

International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings. The scope of the journal covers: information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.; computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.; educational computer-based programs pertaining to medical informatics or medicine in general; and organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.
Latest articles in this journal

Application of the openEHR reference model for PGHD: A case study on the DH-Convener initiative
Tracking provenance in clinical data warehouses for quality management
Acute myocardial infarction risk prediction in emergency chest pain patients: An external validation study
Healthcare professionals’ cross-organizational access to electronic health records: A scoping review
Cross-modal similar clinical case retrieval using a modular model based on contrastive learning and k-nearest neighbor search