Bridging information gaps in menopause status classification through natural language processing

JAMIA Open (IF 2.5, Q2, Health Care Sciences & Services) | Pub Date: 2024-02-09 | DOI: 10.1093/jamiaopen/ooae013
Hannah Eyre, Patrick R. Alba, Carolyn J Gibson, E. Gatsby, Kristine E Lynch, Olga V. Patterson, S. Duvall
{"title":"Bridging information gaps in menopause status classification through natural language processing","authors":"Hannah Eyre, Patrick R. Alba, Carolyn J Gibson, E. Gatsby, Kristine E Lynch, Olga V. Patterson, S. Duvall","doi":"10.1093/jamiaopen/ooae013","DOIUrl":null,"url":null,"abstract":"\n \n \n To use natural language processing (NLP) of clinical notes to augment existing structured electronic health record (EHR) data for classification of a patient’s menopausal status.\n \n \n \n A rule-based NLP system was designed to capture evidence of a patient’s menopause status including dates of a patient’s last menstrual period, reproductive surgeries, and postmenopause diagnosis as well as their use of birth control and menstrual interruptions. nlp-derived output was used in combination with structured EHR data to classify a patient’s menopausal status. NLP processing and patient classification was performed on a cohort of 307,512 female Veterans receiving healthcare at the US Department of Veterans Affairs (VA).\n \n \n \n NLP was validated at 99.6% precision. Including the nlp-derived data into a menopause phenotype increased the number of patients with data relevant to their menopausal status by 118%. Using structured codes alone, 81,173 (27.0%) are able to be classified as postmenopausal or premenopausal. However, with the inclusion of NLP, this number increased 167,804 (54.6%) patients. The premenopausal category grew by 532.7% with the inclusion of NLP data.\n \n \n \n By employing NLP, it became possible to identify documented data elements that predate VA care, originate outside VA networks, or have no corresponding structured field in the VA EHR that would be otherwise inaccessible for further analysis.\n \n \n \n NLP can be used to identify concepts relevant to a patient’s menopausal status in clinical notes. 
Adding nlp-derived data to an algorithm classifying a patient’s menopausal status significantly increases the number of patients classified using EHR data, ultimately enabling more detailed assessments of the impact of menopause on health outcomes.\n","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":null,"pages":null},"PeriodicalIF":2.5000,"publicationDate":"2024-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"JAMIA Open","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/jamiaopen/ooae013","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"HEALTH CARE SCIENCES & SERVICES","Score":null,"Total":0}
Citations: 0

Abstract

The objective of this study was to use natural language processing (NLP) of clinical notes to augment existing structured electronic health record (EHR) data for classification of a patient’s menopausal status.

A rule-based NLP system was designed to capture evidence of a patient’s menopausal status, including the dates of the last menstrual period, reproductive surgeries, and postmenopause diagnosis, as well as the use of birth control and menstrual interruptions. NLP-derived output was used in combination with structured EHR data to classify each patient’s menopausal status. NLP processing and patient classification were performed on a cohort of 307,512 female Veterans receiving healthcare at the US Department of Veterans Affairs (VA).

The NLP system was validated at 99.6% precision. Including the NLP-derived data in a menopause phenotype increased the number of patients with data relevant to their menopausal status by 118%. Using structured codes alone, 81,173 (27.0%) patients could be classified as postmenopausal or premenopausal. With the inclusion of NLP, this number increased to 167,804 (54.6%) patients. The premenopausal category grew by 532.7% with the inclusion of NLP data.

By employing NLP, it became possible to identify documented data elements that predate VA care, originate outside VA networks, or have no corresponding structured field in the VA EHR, and that would otherwise be inaccessible for further analysis.

NLP can be used to identify concepts relevant to a patient’s menopausal status in clinical notes. Adding NLP-derived data to an algorithm classifying a patient’s menopausal status significantly increases the number of patients classified using EHR data, ultimately enabling more detailed assessments of the impact of menopause on health outcomes.
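
To make the idea of rule-based evidence capture concrete, here is a minimal sketch in Python. It is not the authors' system: the abstract does not give the actual rules, so the concept names, the regular expressions, and the `extract_evidence` helper are all hypothetical illustrations. The sketch only shows how pattern rules could flag mentions of a last menstrual period date, reproductive surgery, postmenopause, or birth control in a note. It ignores negation and context (for example "denies hysterectomy"), which a production clinical NLP system would need to handle.

```python
import re

# Hypothetical rule set for illustration only; the paper's real rules are
# not described in the abstract.
RULES = {
    "last_menstrual_period": re.compile(
        r"\b(?:lmp|last menstrual period)\b\s*[:\-]?\s*"
        r"(?P<date>\d{1,2}/\d{1,2}/\d{2,4})",
        re.IGNORECASE,
    ),
    "reproductive_surgery": re.compile(
        r"\b(?:hysterectomy|bilateral oophorectomy)\b", re.IGNORECASE
    ),
    "postmenopause": re.compile(r"\bpost[- ]?menopaus(?:e|al)\b", re.IGNORECASE),
    "birth_control": re.compile(
        r"\b(?:oral contraceptives?|birth control|iud)\b", re.IGNORECASE
    ),
}


def extract_evidence(note):
    """Return (concept, matched text, date or None) for every rule hit."""
    hits = []
    for concept, pattern in RULES.items():
        for match in pattern.finditer(note):
            # Only the LMP rule defines a named "date" group; other rules
            # yield None here.
            date = match.groupdict().get("date")
            hits.append((concept, match.group(0), date))
    return hits


if __name__ == "__main__":
    example_note = "LMP: 03/14/2019. Pt uses IUD."
    for hit in extract_evidence(example_note):
        print(hit)
```

In a pipeline like the one the abstract describes, hits of this kind would then be combined with structured EHR codes before a patient is assigned a menopausal status; that combination step is not sketched here because the abstract does not specify it.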