Chengyi Zheng, Benjamin C Sun, Yi-Lin Wu, Maros Ferencik, Ming-Sum Lee, Rita F Redberg, Aniket A Kawatkar, Visanee V Musigdilok, Adam L Sharp
European Heart Journal. Digital Health, volume 3, issue 4, pages 626-637. Published 2022-12-01. DOI: https://doi.org/10.1093/ehjdh/ztac047
Automated interpretation of stress echocardiography reports using natural language processing.
Aims: Stress echocardiography (SE) findings and interpretations are commonly documented in free-text reports. Reusing SE results requires laborious manual reviews. This study aimed to develop and validate an automated method for abstracting SE reports in a large cohort.
Methods and results: This study included adult patients who had SE within 30 days of their emergency department visit for suspected acute coronary syndrome in a large integrated healthcare system. An automated natural language processing (NLP) algorithm was developed to abstract SE reports and classify overall SE results into normal, non-diagnostic, infarction, and ischaemia categories. Randomly selected reports (n = 140) were reviewed by cardiologists in a double-blind manner to assess the criterion validity of the NLP algorithm. Construct validity was tested on the entire cohort using abstracted SE data and additional clinical variables. The NLP algorithm abstracted 6346 consecutive SE reports. Cardiologists showed good agreement on the overall SE results for the 140 reports: kappa 0.83 and intraclass correlation coefficient 0.89. The NLP algorithm achieved 98.6% specificity and negative predictive value, 95.7% sensitivity, positive predictive value, and F-score on the overall SE results and near-perfect scores on ischaemia findings. The 30-day acute myocardial infarction or death outcomes were highest among patients with ischaemia (5.0%), followed by infarction (1.4%), non-diagnostic (0.8%), and normal (0.3%) results. We found substantial variations in the format and quality of SE reports, even within the same institution.
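The validation measures named above can be illustrated with a minimal Python sketch. It is not the authors' code: the ten labels are made-up toy data, the function names `micro_metrics` and `cohen_kappa` are hypothetical, and the choice of micro-averaging (pooling one-vs-rest counts over the four categories) is an assumption, since the abstract does not state how the per-category results were aggregated. One property worth noting is that when every report receives exactly one label, micro-averaged sensitivity, positive predictive value, and F-score are always equal, which matches the pattern of a single shared value reported for those three measures.

```python
from collections import Counter

# The four overall SE result categories named in the abstract.
CATEGORIES = ["normal", "non-diagnostic", "infarction", "ischaemia"]


def micro_metrics(reference, predicted, categories=CATEGORIES):
    """Pool one-vs-rest counts over all categories, then compute metrics.

    Micro-averaging is an assumption made for this sketch.
    """
    tp = fp = fn = tn = 0
    for cat in categories:
        for ref, pred in zip(reference, predicted):
            if pred == cat and ref == cat:
                tp += 1
            elif pred == cat:
                fp += 1
            elif ref == cat:
                fn += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn)
    ppv = tp / (tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "ppv": ppv,
        "npv": tn / (tn + fn),
        "f_score": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal label frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical toy labels (NOT study data): ten reports, one disagreement.
reference = ["normal"] * 5 + ["ischaemia"] * 2 + ["infarction"] * 2 + ["non-diagnostic"]
predicted = ["normal"] * 5 + ["ischaemia"] * 2 + ["infarction", "normal"] + ["non-diagnostic"]

metrics = micro_metrics(reference, predicted)
kappa = cohen_kappa(reference, predicted)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"kappa: {kappa:.3f}")
```

In the study, kappa was reported for agreement between cardiologists; the sketch reuses the same two toy label lists as the two raters only to keep the example short.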
Conclusions: Natural language processing is an accurate and efficient method for abstracting unstructured SE reports. This approach creates new opportunities for research, public health measures, and care improvement.