Fangyi Chen, Gongbo Zhang, Si Chen, Tiffany Callahan, Chunhua Weng
Clinical Note Structural Knowledge Improves Word Sense Disambiguation
AMIA Joint Summits on Translational Science Proceedings (eCollection), published 2024-05-31
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141859/pdf/
Abstract
Clinical notes are full of ambiguous medical abbreviations. Recent learning-based approaches to sense disambiguation rely on contextual knowledge. Previous findings indicated that the structural elements of clinical notes carry useful cues for choosing among different interpretations of an abbreviation, yet these elements remain underutilized and have not been fully investigated. To the best of our knowledge, the only study exploring note structure simply enumerated the headers in the notes, a representation that is not semantically meaningful. This paper describes a learning-based approach that represents note structure with the semantic types predefined in the Unified Medical Language System (UMLS). We evaluated this representation as an addition to the widely used N-gram features, with three learning models on two different datasets. Experiments indicate that our feature augmentation consistently improved model performance for abbreviation disambiguation, with a best F1 score of 0.93.
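The feature augmentation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the structural feature is combined with the N-grams, so appending it as one extra sparse feature is an assumption. The header-to-semantic-type table, the function names, and the example sentence are all hypothetical, and the table is hand-written rather than derived from a UMLS lookup.

```python
from collections import Counter

# Hypothetical header -> semantic type table, hand-written for illustration.
# The type names are UMLS semantic types, but no UMLS resource is queried here.
SECTION_SEMANTIC_TYPE = {
    "medications": "Pharmacologic Substance",
    "laboratory data": "Laboratory Procedure",
    "physical exam": "Finding",
}


def ngram_features(tokens, n_max=2):
    """Count every n-gram of length 1..n_max in the context tokens."""
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats["ngram=" + " ".join(tokens[i:i + n])] += 1
    return feats


def augmented_features(context, section_header, n_max=2):
    """N-gram features plus one feature for the section's semantic type.

    Assumption: the structural feature is simply added to the sparse
    N-gram feature dictionary.
    """
    feats = ngram_features(context.lower().split(), n_max)
    sem_type = SECTION_SEMANTIC_TYPE.get(section_header.strip().lower(), "UNKNOWN")
    feats["section_type=" + sem_type] = 1
    return dict(feats)


# Illustrative input: "MS" could mean morphine sulfate or multiple sclerosis;
# the section header supplies a structural cue alongside the local context.
example = augmented_features("pt started on MS 2 mg", "Medications")
```

A dictionary of this shape could be fed to any classifier that accepts sparse features; the abstract does not name the three learning models, so none is assumed here.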