Matrix factorization-based data fusion for drug-induced liver injury prediction

M. Zitnik, B. Zupan
Systems Biomedicine (Austin, Tex.), pp. 16-22. Published 2014-01-02. DOI: 10.4161/sysb.29072 (https://doi.org/10.4161/sysb.29072)
Citations: 16

Abstract

Traditional studies of liver toxicity screen compounds through in vivo and in vitro tests. These studies need to distinguish compounds that pose little or no health concern from those most likely to cause adverse effects in humans. High-throughput and toxicogenomic screening methods, coupled with a plethora of circumstantial evidence, pose a challenge for improved toxicity prediction and require appropriate computational methods that integrate various biological, chemical, and toxicological data. We report on a data fusion approach for predicting drug-induced liver injury potential in humans using microarray data from the Japanese Toxicogenomics Project (TGP), as provided for the CAMDA 2013 Conference contest. Our aim was to investigate whether data from different TGP studies could be fused to boost prediction accuracy. We were also interested in whether in vitro studies provide sufficient information to forgo studies in animals. We show that our recently proposed matrix factorization-based data fusion provides an elegant computational framework for integrating the TGP and related data sets, 29 data sets in total. Fusion yields a high cross-validated accuracy (AUC of 0.819 for in vivo assays), which exceeds the accuracy of the established machine learning procedure of stacked classification with feature selection. Our data analysis shows that animal studies may be replaced with in vitro assays (AUC = 0.799) and that liver injury in humans can be predicted from animal data (AUC = 0.811). Our principal contribution is a demonstration that the analysis of toxicogenomic data can benefit substantially from data fusion with directly and circumstantially related data sets.
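
The abstract does not spell out the factorization itself, so the following is only an illustrative sketch of the general idea behind matrix factorization-based fusion: a relation matrix R between two object types (for example drugs and genes) is approximated as G S H^T, where G and H are low-rank representations of each object type and S links them. It uses generic nonnegative tri-factorization with multiplicative updates on a single matrix; it is not the authors' algorithm, which fuses 29 data sets. The function name `tri_factorize`, its parameters, and the toy data are assumptions made for illustration.

```python
import numpy as np


def tri_factorize(R, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix R (n x m) as G @ S @ H.T.

    Hypothetical helper for illustration only. G (n x k1) and H (m x k2)
    are low-rank factors for the row and column object types; S (k1 x k2)
    relates the two latent spaces. Uses standard multiplicative updates
    for the squared Frobenius reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    G = rng.random((n, k1))
    S = rng.random((k1, k2))
    H = rng.random((m, k2))

    norm_R = np.linalg.norm(R)
    # errors[0] is the relative error of the random initialization.
    errors = [np.linalg.norm(R - G @ S @ H.T) / norm_R]

    for _ in range(n_iter):
        # Each update multiplies a factor by the ratio of the negative to
        # the positive part of the gradient, which keeps entries nonnegative.
        G *= (R @ H @ S.T) / (G @ S @ H.T @ H @ S.T + eps)
        H *= (R.T @ G @ S) / (H @ S.T @ G.T @ G @ S + eps)
        S *= (G.T @ R @ H) / (G.T @ G @ S @ H.T @ H + eps)
        errors.append(np.linalg.norm(R - G @ S @ H.T) / norm_R)

    return G, S, H, errors


if __name__ == "__main__":
    # Toy rank-2 "drug x gene" matrix; purely synthetic, not TGP data.
    rng = np.random.default_rng(0)
    R = rng.random((20, 2)) @ rng.random((2, 15))
    G, S, H, errors = tri_factorize(R, k1=2, k2=2)
    print(f"relative error: {errors[0]:.3f} -> {errors[-1]:.3f}")
```

In a fusion setting, several such relation matrices would share factors for the object types they have in common, so that each data set constrains the others; the sketch above omits that sharing and any cross-validated AUC evaluation.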