Clinical Information Extraction at the CLEF eHealth Evaluation lab 2016.

CEUR Workshop Proceedings. Publication date: 2016-09-01
Aurélie Névéol, K Bretonnel Cohen, Cyril Grouin, Thierry Hamon, Thomas Lavergne, Liadh Kelly, Lorraine Goeuriot, Grégoire Rey, Aude Robert, Xavier Tannier, Pierre Zweigenbaum
Citation count: 0

Abstract

This paper reports on Task 2 of the 2016 CLEF eHealth evaluation lab, which extended the previous information extraction tasks of the ShARe/CLEF eHealth evaluation labs. The task continued with named entity recognition and normalization in French narratives, as offered in CLEF eHealth 2015. Named entity recognition involved ten types of entities, including disorders, that were defined according to Semantic Groups in the Unified Medical Language System® (UMLS®), which was also used for normalizing the entities. In addition, we introduced a large-scale classification task on French death certificates, which consisted of extracting causes of death as coded in the International Classification of Diseases, tenth revision (ICD10). Participant systems were evaluated using Precision, Recall and F-measure against a blind reference standard of 832 titles of scientific articles indexed in MEDLINE, 4 drug monographs published by the European Medicines Agency (EMEA) and 27,850 death certificates. In total, seven teams participated, including five in the entity recognition and normalization task and five in the death certificate coding task. Three teams submitted their systems to our newly offered reproducibility track. For entity recognition, the highest performance was achieved on the EMEA corpus, with an overall F-measure of 0.702 for plain entity recognition and 0.529 for normalized entity recognition. For entity normalization, the highest performance was achieved on the MEDLINE corpus, with an overall F-measure of 0.552. For death certificate coding, the highest performance was an F-measure of 0.848.
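The abstract names Precision, Recall and F-measure as the evaluation metrics but does not spell out how system output was matched against the reference standard. The following is a minimal sketch of how such scores can be computed, assuming exact matching of (document, span, entity type) tuples and a balanced F-measure (equal weight on precision and recall). The function name `precision_recall_f1` and the sample annotations are hypothetical illustrations, not the lab's official scorer or data.

```python
from typing import Hashable, Iterable, Tuple


def precision_recall_f1(
    predicted: Iterable[Hashable], reference: Iterable[Hashable]
) -> Tuple[float, float, float]:
    """Score predictions against a reference set using exact matching.

    Assumption: an item counts as correct only if it is identical to a
    reference item. The lab's actual matching rules may differ.
    """
    pred = set(predicted)
    ref = set(reference)
    true_positives = len(pred & ref)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # Balanced F-measure: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: (document id, start offset, end offset, entity type).
# The type labels follow UMLS Semantic Group abbreviations.
reference = {
    ("doc1", 0, 8, "DISO"),
    ("doc1", 15, 22, "CHEM"),
    ("doc2", 3, 11, "ANAT"),
}
predicted = {
    ("doc1", 0, 8, "DISO"),    # correct
    ("doc1", 15, 22, "PROC"),  # right span, wrong type
    ("doc2", 3, 11, "ANAT"),   # correct
    ("doc2", 20, 25, "DISO"),  # spurious
}

p, r, f = precision_recall_f1(predicted, reference)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

The same function could in principle score the death certificate task by replacing the tuples with (certificate id, ICD10 code) pairs, although that too is an assumption about how the comparison was set up.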