PharmacoNER Tagger: a deep learning-based tool for automatically finding chemicals and drugs in Spanish medical texts

Jordi Armengol-Estapé, Felipe Soares, M. Marimon, Martin Krallinger
Genomics & informatics. Published 2019-06-01. DOI: 10.5808/GI.2019.17.2.e15 (https://doi.org/10.5808/GI.2019.17.2.e15)
Citations: 14

Abstract

Automatically detecting mentions of pharmaceutical drugs and chemical substances is key for the subsequent extraction of relations between chemicals and other biomedical entities such as genes, proteins, diseases, adverse reactions, or symptoms. Identifying drug mentions is also a prerequisite for recognizing complex event types such as drug dosage, duration of medical treatments, or drug repurposing. Formally, this task is known as named entity recognition (NER): automatically identifying mentions of predefined entities of interest in running text. In the domain of medical texts, techniques based on hand-crafted rules and graph-based models can provide adequate performance for chemical entity recognition (CER). In recent years, the field of natural language processing has largely pivoted to deep learning, and state-of-the-art results for most natural language tasks are usually obtained with artificial neural networks. Competitive resources for drug name recognition in English medical texts are already available and heavily used, whereas for other languages such as Spanish these tools, although clearly needed, were missing. In this work, we adapt an existing neural NER system, NeuroNER, to the particular domain of Spanish clinical case texts, and extend the neural network so that it can take into account additional features apart from the plain text. NeuroNER can be considered a competitive baseline system for Spanish drug and chemical entity recognition, promoted by the Spanish national plan for the advancement of language technologies (Plan TL).
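
To make the NER task described in the abstract concrete, the following is a minimal Python sketch of how token-level tagger output could be turned into entity mentions. It assumes a BIO tagging scheme and a single hypothetical label `DRUG`; the abstract does not state which tagging scheme or label set the tool uses, and the function name `bio_to_spans` and the example sentence are illustrative only, not part of NeuroNER.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, label) pairs from BIO-tagged tokens.

    Assumption: tags look like "B-DRUG", "I-DRUG" or "O". An "I-" tag that
    does not continue an open entity of the same label is treated like "O".
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_label:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # Outside any entity: close the open entity, if there is one.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens = []
            current_label = None

    # Close an entity that runs to the end of the sentence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_label))
    return spans


# Hypothetical Spanish clinical sentence with hand-written tags.
tokens = ["Se", "inició", "tratamiento", "con", "ácido",
          "acetilsalicílico", "y", "heparina", "."]
tags = ["O", "O", "O", "O", "B-DRUG", "I-DRUG", "O", "B-DRUG", "O"]

print(bio_to_spans(tokens, tags))
# → [('ácido acetilsalicílico', 'DRUG'), ('heparina', 'DRUG')]
```

In a real pipeline the `tags` list would come from the trained neural tagger rather than being written by hand; the span-collection step itself is independent of the model.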