From Free-text Drug Labels to Structured Medication Terminology with BERT and GPT.

AMIA ... Annual Symposium proceedings. AMIA Symposium. Pub Date: 2024-01-11. eCollection Date: 2023-01-01.

Duy-Hoa Ngo, Bevan Koopman

Volume 2023, pages 540-549. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785872/pdf/


Abstract

We present a method to enrich controlled medication terminology from free-text drug labels. This is important because, while controlled medication terminologies capture well-structured medication information, much of the information pertaining to medications is still found in free text. First, we compared different Named Entity Recognition (NER) models, including rule-based, feature-based, and deep learning-based models with Transformers, as well as ChatGPT and few-shot and fine-tuned GPT-3, to find the model that most accurately extracts medication entities (ingredients, brand, dose, etc.) from free text. Then, a rule-based Relation Extraction algorithm transforms the NER results into a well-structured medication knowledge graph. Finally, a Medication Searching method takes the knowledge graph and matches it to relevant medications in the terminology server. An empirical evaluation on real-world drug labels shows that BERT-CRF was the most effective NER model, with an F-measure of 95%. After term normalization, Medication Searching achieved an accuracy of 77% when matching a label to the relevant medication in the terminology server. The NER and Medication Searching models could be deployed as a web service that accepts free-text queries and returns structured medication information, thus providing a useful means of better managing medication information found in different health systems.
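The three stages described in the abstract (entity extraction, rule-based relation extraction, and normalized search) can be illustrated with a small sketch. This is not the authors' implementation: the entity types, the "dose attaches to the preceding ingredient" rule, the `normalize` function, and the in-memory `terminology` dictionary with its `MED-001` style codes are all assumptions made for illustration, and the NER output is hard-coded rather than produced by a BERT-CRF model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical NER output: (entity_type, text) pairs in label order.
Entity = Tuple[str, str]


@dataclass
class Ingredient:
    name: str
    dose: Optional[str] = None


@dataclass
class Medication:
    brand: Optional[str] = None
    form: Optional[str] = None
    ingredients: List[Ingredient] = field(default_factory=list)


def extract_relations(entities: List[Entity]) -> Medication:
    """Toy rule set: a DOSE attaches to the nearest preceding INGREDIENT."""
    med = Medication()
    for etype, text in entities:
        if etype == "BRAND":
            med.brand = text
        elif etype == "FORM":
            med.form = text
        elif etype == "INGREDIENT":
            med.ingredients.append(Ingredient(text))
        elif etype == "DOSE" and med.ingredients:
            med.ingredients[-1].dose = text
    return med


def normalize(term: str) -> str:
    """Lowercase and drop whitespace so '500 MG' and '500mg' compare equal."""
    return "".join(term.lower().split())


def search(med: Medication, terminology: Dict[str, Set[str]]) -> List[str]:
    """Return codes whose term set covers every extracted ingredient and dose."""
    wanted: Set[str] = set()
    for ing in med.ingredients:
        wanted.add(normalize(ing.name))
        if ing.dose:
            wanted.add(normalize(ing.dose))
    if not wanted:
        return []  # nothing extracted, so nothing to match
    return [code for code, terms in terminology.items() if wanted <= terms]


# Hard-coded stand-in for NER output on the label "Panadol Paracetamol 500 MG tablet".
entities = [
    ("BRAND", "Panadol"),
    ("INGREDIENT", "Paracetamol"),
    ("DOSE", "500 MG"),
    ("FORM", "tablet"),
]
# Hypothetical terminology entries keyed by made-up codes.
terminology = {
    "MED-001": {"paracetamol", "500mg"},
    "MED-002": {"ibuprofen", "200mg"},
}

medication = extract_relations(entities)
matches = search(medication, terminology)
print(medication)
print(matches)
```

In this sketch, normalization is what lets the label's "500 MG" match the stored "500mg"; a real terminology server would expose a richer search interface and a more careful normalization than whitespace and case folding.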


