Classification of Use Status for Dietary Supplements in Clinical Notes.

Yadan Fan, Lu He, Rui Zhang
{"title":"Classification of Use Status for Dietary Supplements in Clinical Notes.","authors":"Yadan Fan,&nbsp;Lu He,&nbsp;Rui Zhang","doi":"10.1109/BIBM.2016.7822668","DOIUrl":null,"url":null,"abstract":"<p><p>Clinical notes contain rich information about dietary supplements, which are critical for detecting signals of dietary supplement side effects and interactions between drugs and supplements. One of the important factors of supplement documentation is usage status, such as started and discontinuation. Such information is usually stored in the unstructured clinical notes. We developed a rule-based classifier to identify supplement usage status in clinical notes. The categories referring to the patient's status of supplement use were classified into four classes: Continuing (C), Discontinued (D), Started (S), and Unclassified (U). Clinical notes containing 10 of the most commonly consumed supplements (i.e., alfalfa, echinacea, fish oil, garlic, ginger, ginkgo, ginseng, melatonin, St. John's Wort, and Vitamin E) were retrieved from the University of Minnesota Clinical Data Repository. The gold standard was defined by manually annotating 1000 randomly selected sentences or statements mentioning at least one of these 10 supplements. The rules in the classifier was initially developed on two-thirds of the set of 7 supplements (i.e., alfalfa, garlic, ginger, ginkgo, ginseng, St. John's Wort, and Vitamin E); the performance was evaluated on the remaining one-third of this set. To evaluate the generalizability of rules, we further validated the second testing set on other 3 supplements (i.e., echinacea, fish oil, and melatonin). The performance of the classifier achieved F-measures of 0.95, 0.97, 0.96, and 0.96 for status C, D, S, and U on 7 supplements, respectively. The classifier also showed good generalizability when it was applied to the other 3 supplements with F-measures of 0.96 for C, 0.96 for D, 0.95 for S, and 0.89 for U. This study demonstrated that the classifier can accurately classify supplement usage status, which can be further integrated as a module into the existing natural language processing pipeline for supporting dietary supplement knowledge discovery.</p>","PeriodicalId":74563,"journal":{"name":"Proceedings. IEEE International Conference on Bioinformatics and Biomedicine","volume":"2016 ","pages":"1054-1061"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/BIBM.2016.7822668","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. IEEE International Conference on Bioinformatics and Biomedicine","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/BIBM.2016.7822668","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2017/1/19 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

Clinical notes contain rich information about dietary supplements, which are critical for detecting signals of dietary supplement side effects and interactions between drugs and supplements. One of the important factors of supplement documentation is usage status, such as started and discontinuation. Such information is usually stored in the unstructured clinical notes. We developed a rule-based classifier to identify supplement usage status in clinical notes. The categories referring to the patient's status of supplement use were classified into four classes: Continuing (C), Discontinued (D), Started (S), and Unclassified (U). Clinical notes containing 10 of the most commonly consumed supplements (i.e., alfalfa, echinacea, fish oil, garlic, ginger, ginkgo, ginseng, melatonin, St. John's Wort, and Vitamin E) were retrieved from the University of Minnesota Clinical Data Repository. The gold standard was defined by manually annotating 1000 randomly selected sentences or statements mentioning at least one of these 10 supplements. The rules in the classifier was initially developed on two-thirds of the set of 7 supplements (i.e., alfalfa, garlic, ginger, ginkgo, ginseng, St. John's Wort, and Vitamin E); the performance was evaluated on the remaining one-third of this set. To evaluate the generalizability of rules, we further validated the second testing set on other 3 supplements (i.e., echinacea, fish oil, and melatonin). The performance of the classifier achieved F-measures of 0.95, 0.97, 0.96, and 0.96 for status C, D, S, and U on 7 supplements, respectively. The classifier also showed good generalizability when it was applied to the other 3 supplements with F-measures of 0.96 for C, 0.96 for D, 0.95 for S, and 0.89 for U. This study demonstrated that the classifier can accurately classify supplement usage status, which can be further integrated as a module into the existing natural language processing pipeline for supporting dietary supplement knowledge discovery.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
临床记录中膳食补充剂使用状况的分类。
临床记录包含丰富的膳食补充剂信息,这对于检测膳食补充剂副作用和药物与补充剂之间的相互作用至关重要。补充文档的一个重要因素是使用状态,如启动和停止。这些信息通常存储在非结构化的临床记录中。我们开发了一个基于规则的分类器来识别临床记录中的补充剂使用状况。将患者服用补充剂的情况分为四类:持续(C)、停止(D)、开始(S)和未分类(U)。临床记录中包含10种最常服用的补充剂(即苜蓿、紫锥菊、鱼油、大蒜、生姜、银杏、人参、褪黑素、圣约翰草和维生素E)从明尼苏达大学临床数据存储库中检索。黄金标准是通过手动标注1000个随机选择的句子或语句来定义的,这些句子或语句至少提到了这10个补充内容中的一个。分类器中的规则最初是针对7种补充剂(即苜蓿、大蒜、生姜、银杏、人参、圣约翰草和维生素E)中的三分之二制定的;对剩下的三分之一进行性能评估。为了评估规则的普遍性,我们进一步验证了其他3种补充剂(即紫锥菊、鱼油和褪黑素)的第二组测试集。分类器在7种补充剂上的C、D、S和U状态的f测量值分别为0.95、0.97、0.96和0.96。该分类器对C、D、S、u的f值分别为0.96、0.96、0.95和0.89的其他3种补充剂也表现出了良好的泛化性。研究表明,该分类器可以准确地对补充剂的使用状态进行分类,可以作为模块进一步集成到现有的自然语言处理管道中,支持膳食补充剂知识的发现。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Interpreting Lung Cancer Health Disparity between African American Males and European American Males. Causal Explanation from Mild Cognitive Impairment Progression using Graph Neural Networks. Predicting HIV Diagnosis Among Emerging Adults Using Electronic Health Records and Health Survey Data in All of Us Research Program. A generalizable physiological model for detection of Delayed Cerebral Ischemia using Federated Learning. Harnessing Transfer Learning for Dementia Prediction: Leveraging Sex-Different Mild Cognitive Impairment Prognosis.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1