Extracting seizure control metrics from clinic notes of patients with epilepsy: A natural language processing approach

Epilepsy Research Pub Date : 2024-09-10 DOI:10.1016/j.eplepsyres.2024.107451
Marta Fernandes, Aidan Cardall, Lidia MVR Moura, Christopher McGraw, Sahar F. Zafar, M. Brandon Westover
Extracting seizure control metrics from clinic notes of patients with epilepsy: A natural language processing approach

Objectives

Monitoring seizure control metrics is key to clinical care of patients with epilepsy. Manually abstracting these metrics from unstructured text in electronic health records (EHR) is laborious. We aimed to abstract the date of last seizure and seizure frequency from clinical notes of patients with epilepsy using natural language processing (NLP).

Methods

We extracted seizure control metrics from notes of patients seen in epilepsy clinics at two hospitals in Boston. Extraction of both the date of last seizure and the seizure frequency was performed with the pretrained model RoBERTa_for_seizureFrequency_QA, combined with regular expressions. We designed the algorithm to categorize the timing of last seizure (“today”, “1–6 days ago”, “1–4 weeks ago”, “more than 1–3 months ago”, “more than 3–6 months ago”, “more than 6–12 months ago”, “more than 1–2 years ago”, “more than 2 years ago”) and seizure frequency (“innumerable”, “multiple”, “daily”, “weekly”, “monthly”, “once per year”, “less than once per year”). Our ground truth consisted of structured questionnaires filled out by physicians. Model performance was measured using the areas under the receiver operating characteristic curve (AUROC) and the precision-recall curve (AUPRC) for categorical labels, and the median absolute error (MAE) for ordinal labels, with 95% confidence intervals (CI) estimated via bootstrapping.
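
The categorization step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the regular expression, the day-based cut-offs, the unit conversions, and the function names `days_since_last_seizure` and `categorize_last_seizure` are all assumptions made for illustration, and the question-answering model is left out entirely (the sketch starts from an answer string the model might return).

```python
import re

# Upper bound in days for each category. These cut-offs are assumptions
# chosen for illustration, not the thresholds used in the study.
LAST_SEIZURE_BINS = [
    (0, "today"),
    (6, "1–6 days ago"),
    (30, "1–4 weeks ago"),
    (90, "more than 1–3 months ago"),
    (180, "more than 3–6 months ago"),
    (365, "more than 6–12 months ago"),
    (730, "more than 1–2 years ago"),
]

# Approximate conversions (assumed): a month is 30 days, a year is 365 days.
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}

# Hypothetical pattern for answers such as "3 weeks ago".
RELATIVE_TIME = re.compile(r"(\d+)\s*(day|week|month|year)s?\s+ago", re.IGNORECASE)


def days_since_last_seizure(answer):
    """Convert a free-text answer to days; return None if nothing matches."""
    if re.search(r"\btoday\b", answer, re.IGNORECASE):
        return 0
    match = RELATIVE_TIME.search(answer)
    if match is None:
        return None
    count, unit = int(match.group(1)), match.group(2).lower()
    return count * DAYS_PER_UNIT[unit]


def categorize_last_seizure(days):
    """Map a day count to one of the eight timing categories."""
    for upper, label in LAST_SEIZURE_BINS:
        if days <= upper:
            return label
    return "more than 2 years ago"
```

For example, `categorize_last_seizure(days_since_last_seizure("last seizure 3 weeks ago"))` falls in the “1–4 weeks ago” category under these assumed cut-offs. Real clinic notes would need far broader patterns (absolute dates, spelled-out numbers, negations).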

Results

Our cohort included 1773 adult patients with a total of 5658 visits with reported seizure control metrics, seen in epilepsy clinics between December 2018 and May 2022. The average age of the cohort was 42 years; the majority of patients were female (57%), White (81%), and non-Hispanic (85%). The models achieved an MAE (95% CI) of 4 (4.00–4.86) weeks for date of last seizure and of 0.02 (0.02–0.02) seizures per day for seizure frequency.
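
A median absolute error with a bootstrapped 95% CI, as reported above, can be computed along the following lines. This is a sketch under stated assumptions, not the study's code: it uses a simple percentile bootstrap that resamples individual observations, whereas the abstract does not say whether resampling was done per visit or per patient, nor how many replicates were drawn. The function names and the defaults `n_boot=1000` and `seed=0` are hypothetical.

```python
import random
import statistics


def median_absolute_error(y_true, y_pred):
    """Median of the absolute differences between reference and prediction."""
    return statistics.median([abs(t - p) for t, p in zip(y_true, y_pred)])


def bootstrap_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the median absolute error.

    Resamples observation indices with replacement (an assumption; the
    study may have resampled at the patient level instead).
    """
    rng = random.Random(seed)
    n = len(y_true)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            median_absolute_error([y_true[i] for i in idx],
                                  [y_pred[i] for i in idx])
        )
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

Because the median of a resample often lands on the same few observed values, a bootstrap interval for a median-based metric can be very narrow or even degenerate, which is consistent with an interval such as 0.02 (0.02–0.02).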

Conclusions

Our NLP approach demonstrates that extracting seizure control metrics from EHR is feasible, allowing for large-scale EHR research.
