12. Contextualizing clinical significance using FDA label supplemented DGI data

IF 1.4 | Zone 4, Medicine | Q4 GENETICS & HEREDITY | Cancer Genetics | Pub Date: 2024-08-01 | DOI: 10.1016/j.cancergen.2024.08.014
Matthew Cannon, James Stevenson, Kathryn Stahl, Rohit Basu, Adam Coffman, Susanna Kiwala, Joshua McMichael, Elaine Mardis, Obi Griffith, Malachi Griffith, Alex Wagner

Abstract

The Drug-Gene Interaction Database (DGIdb) aggregates interaction data from more than 40 sources into one platform, with the primary goal of making the druggable genome accessible to clinicians and researchers. By providing a public, computationally accessible database, DGIdb enables therapeutic insights through broad aggregation of drug-gene interaction (DGI) data.
As part of our aggregation process, DGIdb preserves data on interaction types, directionality, and other attributes that enable filtering or biochemical insight. However, source data are often incomplete and may not contain the original physiological context of the interaction. Without this context, the therapeutic relevance of an interaction may be compromised or lost. In this report, we address these missing data by extracting therapeutic context from free-text sources. We apply existing large language models (LLMs) that have been fine-tuned on additional medical corpora to tag and extract indications, cancer types, and relevant pharmacogenomics from free-text, FDA-approved labels. We then use our in-house normalization services to link the extracted data back to formally grouped concepts.
In a preliminary test set of 355 FDA labels, we were able to normalize 59.4%, 49.8%, and 49.1% of extracted chemical, disease, and genetic entities, respectively, back to harmonized concepts. Extracting these data allows us to supplement our existing interactions with context that may inform the therapeutic relevance of a particular interaction. Inclusion of these data will be particularly valuable for variant interpretation pipelines, where mutational status can lead to the identification of a lifesaving therapeutic and a positive patient outcome.
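The per-type normalization percentages reported above can be read as the fraction of extracted entities of each type that resolve to a known concept. The sketch below illustrates that calculation only. It assumes the entities have already been tagged upstream, and it swaps in a plain dictionary lookup for the normalization services, which the abstract does not describe. The `CONCEPTS` table, the `example.*` identifiers, the `Entity` class, and the sample entities are all hypothetical and are not taken from the study.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

# Hypothetical concept table standing in for a normalization service.
# Keys are lowercased surface forms; values are made-up concept IDs.
CONCEPTS: Dict[str, Dict[str, str]] = {
    "chemical": {"imatinib": "example.drug:1", "gleevec": "example.drug:1"},
    "disease": {"chronic myeloid leukemia": "example.disease:1"},
    "gene": {"abl1": "example.gene:1", "kit": "example.gene:2"},
}


@dataclass(frozen=True)
class Entity:
    """One entity tagged in a label: its surface text and its type."""
    text: str
    kind: str  # "chemical", "disease", or "gene"


def normalize(entity: Entity) -> Optional[str]:
    """Return a concept ID for the entity, or None if it cannot be resolved."""
    key = entity.text.strip().lower()
    return CONCEPTS.get(entity.kind, {}).get(key)


def normalization_rates(entities: Iterable[Entity]) -> Dict[str, float]:
    """Percentage of entities of each type that resolve to a concept."""
    total: Counter = Counter()
    hits: Counter = Counter()
    for entity in entities:
        total[entity.kind] += 1
        if normalize(entity) is not None:
            hits[entity.kind] += 1
    return {kind: 100.0 * hits[kind] / total[kind] for kind in total}


# Illustrative entities only (not data from the 355-label test set).
extracted = [
    Entity("Imatinib", "chemical"),
    Entity("Gleevec", "chemical"),
    Entity("compound X", "chemical"),
    Entity("chronic myeloid leukemia", "disease"),
    Entity("blast crisis", "disease"),
    Entity("ABL1", "gene"),
    Entity("KIT", "gene"),
    Entity("Philadelphia chromosome", "gene"),
    Entity("BCR-ABL", "gene"),
]

rates = normalization_rates(extracted)
for kind, rate in sorted(rates.items()):
    print(f"{kind}: {rate:.1f}% normalized")
```

Counting per entity type, rather than pooling all entities, mirrors how the abstract reports separate figures for chemical, disease, and genetic entities.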