UMLS 元词库中生物医学词汇的大规模对齐。

Vinh Nguyen, Hong Yung Yip, Olivier Bodenreider
{"title":"UMLS 元词库中生物医学词汇的大规模对齐。","authors":"Vinh Nguyen, Hong Yung Yip, Olivier Bodenreider","doi":"10.1145/3442381.3450128","DOIUrl":null,"url":null,"abstract":"<p><p>With 214 source vocabularies, the construction and maintenance process of the UMLS (Unified Medical Language System) Metathesaurus terminology integration system is costly, time-consuming, and error-prone as it primarily relies on (1) lexical and semantic processing for suggesting groupings of synonymous terms, and (2) the expertise of UMLS editors for curating these synonymy predictions. This paper aims to improve the UMLS Metathesaurus construction process by developing a novel supervised learning approach for improving the task of suggesting synonymous pairs that can scale to the size and diversity of the UMLS source vocabularies. We evaluate this deep learning (DL) approach against a rule-based approach (RBA) that approximates the current UMLS Metathesaurus construction process. The key to the generalizability of our approach is the use of various degrees of lexical similarity in negative pairs during the training process. Our initial experiments demonstrate the strong performance across multiple datasets of our DL approach in terms of recall (91-92%), precision (88-99%), and F1 score (89-95%). Our DL approach largely outperforms the RBA method in recall (+23%), precision (+2.4%), and F1 score (+14.1%). This novel approach has great potential for improving the UMLS Metathesaurus construction process by providing better synonymy suggestions to the UMLS editors.</p>","PeriodicalId":74532,"journal":{"name":"Proceedings of the ... International World-Wide Web Conference. International WWW Conference","volume":"2021 ","pages":"2672-2683"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8434895/pdf/","citationCount":"0","resultStr":"{\"title\":\"Biomedical Vocabulary Alignment at Scale in the UMLS Metathesaurus.\",\"authors\":\"Vinh Nguyen, Hong Yung Yip, Olivier Bodenreider\",\"doi\":\"10.1145/3442381.3450128\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>With 214 source vocabularies, the construction and maintenance process of the UMLS (Unified Medical Language System) Metathesaurus terminology integration system is costly, time-consuming, and error-prone as it primarily relies on (1) lexical and semantic processing for suggesting groupings of synonymous terms, and (2) the expertise of UMLS editors for curating these synonymy predictions. This paper aims to improve the UMLS Metathesaurus construction process by developing a novel supervised learning approach for improving the task of suggesting synonymous pairs that can scale to the size and diversity of the UMLS source vocabularies. We evaluate this deep learning (DL) approach against a rule-based approach (RBA) that approximates the current UMLS Metathesaurus construction process. The key to the generalizability of our approach is the use of various degrees of lexical similarity in negative pairs during the training process. Our initial experiments demonstrate the strong performance across multiple datasets of our DL approach in terms of recall (91-92%), precision (88-99%), and F1 score (89-95%). Our DL approach largely outperforms the RBA method in recall (+23%), precision (+2.4%), and F1 score (+14.1%). This novel approach has great potential for improving the UMLS Metathesaurus construction process by providing better synonymy suggestions to the UMLS editors.</p>\",\"PeriodicalId\":74532,\"journal\":{\"name\":\"Proceedings of the ... International World-Wide Web Conference. International WWW Conference\",\"volume\":\"2021 \",\"pages\":\"2672-2683\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-04-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8434895/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the ... International World-Wide Web Conference. International WWW Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3442381.3450128\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2021/4/19 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... International World-Wide Web Conference. International WWW Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3442381.3450128","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2021/4/19 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

UMLS(统一医学语言系统)元词库术语集成系统有 214 个源词汇表,其构建和维护过程成本高、耗时长且容易出错,因为它主要依赖于(1)词汇和语义处理来建议同义词的分组,以及(2)UMLS 编辑的专业知识来整理这些同义预测。本文旨在改进 UMLS 元词库的构建过程,为此开发了一种新颖的监督学习方法,用于改进同义词对的建议任务,该方法可以扩展到 UMLS 源词库的规模和多样性。我们将这种深度学习(DL)方法与基于规则的方法(RBA)进行了对比评估,后者近似于当前的 UMLS 元词库构建过程。我们的方法之所以具有通用性,关键在于在训练过程中使用了不同程度的词性相似性负对。我们的初步实验表明,在召回率(91-92%)、精确率(88-99%)和 F1 分数(89-95%)方面,我们的 DL 方法在多个数据集上都表现出色。我们的 DL 方法在召回率(+23%)、精确率(+2.4%)和 F1 分数(+14.1%)方面大大优于 RBA 方法。通过为 UMLS 编辑提供更好的同义词建议,这种新方法在改进 UMLS 元词库构建过程方面具有巨大潜力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

摘要图片

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Biomedical Vocabulary Alignment at Scale in the UMLS Metathesaurus.

With 214 source vocabularies, the construction and maintenance process of the UMLS (Unified Medical Language System) Metathesaurus terminology integration system is costly, time-consuming, and error-prone as it primarily relies on (1) lexical and semantic processing for suggesting groupings of synonymous terms, and (2) the expertise of UMLS editors for curating these synonymy predictions. This paper aims to improve the UMLS Metathesaurus construction process by developing a novel supervised learning approach for improving the task of suggesting synonymous pairs that can scale to the size and diversity of the UMLS source vocabularies. We evaluate this deep learning (DL) approach against a rule-based approach (RBA) that approximates the current UMLS Metathesaurus construction process. The key to the generalizability of our approach is the use of various degrees of lexical similarity in negative pairs during the training process. Our initial experiments demonstrate the strong performance across multiple datasets of our DL approach in terms of recall (91-92%), precision (88-99%), and F1 score (89-95%). Our DL approach largely outperforms the RBA method in recall (+23%), precision (+2.4%), and F1 score (+14.1%). This novel approach has great potential for improving the UMLS Metathesaurus construction process by providing better synonymy suggestions to the UMLS editors.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
DPAR: Decoupled Graph Neural Networks with Node-Level Differential Privacy. Exploring Representations for Singular and Multi-Concept Relations for Biomedical Named Entity Normalization. Context-Enriched Learning Models for Aligning Biomedical Vocabularies at Scale in the UMLS Metathesaurus. Contrastive Lexical Diffusion Coefficient: Quantifying the Stickiness of the Ordinary. Communication Efficient Federated Generalized Tensor Factorization for Collaborative Health Data Analytics.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1