Exploiting word meaning for negation identification in electronic health records

Ioana Barbantan, R. Potolea
Published in: 2014 IEEE International Conference on Automation, Quality and Testing, Robotics
Publication date: 2014-05-22
DOI: https://doi.org/10.1109/AQTR.2014.6857880
Citations: 7

Abstract

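Topic extraction from electronic health records is a sensitive step in the knowledge extraction process. Because negation can completely distort the meaning of a discourse, correctly distinguishing terms from negated terms is mandatory. Our work is an attempt at automated negation identification in unstructured health records. We analyzed a corpus of medical documents containing 5103 sentences and found that, while adverbs have a distribution of 3%, negation covers almost 2% of the words used in the corpus, which justifies an in-depth analysis of negation. The main contribution of the paper addresses a drawback of existing negation identification approaches in the literature, which do not consider negation expressed through negation prefixes. We address both syntactic and morphological negation identification. To identify morphological negation we propose the PreNex algorithm, which breaks each term into a prefix and a root word and checks the root's validity against an additional available resource (WordNet). Syntactic negation identification relies on pattern matching, in which negated concepts are identified from a predefined list of negation identifiers. The results are promising and provide a reliable negation identification approach for medical documents. We report a precision of 92.62% and a recall of 93.60% for morphological negation identification, and an overall performance for morphological and syntactic negation identification of 95.96% precision and 94.23% recall.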
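The abstract gives only the outline of the two steps, so the following is a minimal sketch of the general idea and not the authors' PreNex implementation. The prefix list, the cue list, and the tiny `LEXICON` set are all hypothetical; in particular, `LEXICON` stands in for the WordNet lookup that the paper uses to validate a root word.

```python
import re

# Hypothetical negation prefixes; the paper's actual list is not given in the abstract.
NEGATION_PREFIXES = ("non", "dis", "un", "in", "im", "ir", "il")

# Hypothetical stand-in for a WordNet lookup: the set of words treated as valid roots.
LEXICON = frozenset({"responsive", "regular", "stable", "specific", "continent"})

# Hypothetical negation identifiers for the syntactic (pattern matching) step.
NEGATION_CUES = ("no", "not", "without", "denies", "negative for")


def split_negation_prefix(term, lexicon=LEXICON, prefixes=NEGATION_PREFIXES):
    """Return (prefix, root) if stripping a negation prefix leaves a valid root, else None."""
    word = term.lower()
    # Try longer prefixes first so "non" is tested before "in" or "un".
    for prefix in sorted(prefixes, key=len, reverse=True):
        root = word[len(prefix):]
        if word.startswith(prefix) and root in lexicon:
            return prefix, root
    return None


def find_syntactic_negations(sentence, cues=NEGATION_CUES, window=3):
    """Return (cue, phrase) pairs: each cue with up to `window` words that follow it."""
    # Longer cues first so "negative for" wins over any shorter overlapping cue.
    alternation = "|".join(re.escape(c) for c in sorted(cues, key=len, reverse=True))
    pattern = re.compile(
        r"\b(%s)\b\s+((?:[\w-]+\s*){1,%d})" % (alternation, window),
        re.IGNORECASE,
    )
    return [
        (m.group(1).lower(), m.group(2).strip().lower())
        for m in pattern.finditer(sentence)
    ]


if __name__ == "__main__":
    for term in ("unresponsive", "irregular", "infection"):
        print(term, "->", split_negation_prefix(term))
    print(find_syntactic_negations("Patient denies chest pain."))
```

The root check is what keeps a word such as "infection" from being treated as negated: it starts with "in", but "fection" is not a valid word in the lexicon, so no split is returned. With a real lexical resource in place of `LEXICON`, the same check would be a dictionary lookup on the candidate root.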