Deep IDA: A Deep Learning Approach for Integrative Discriminant Analysis of Multi-omics Data with Feature Ranking - An Application to COVID-19

Bioinformatics Advances. IF 2.4; JCR Q2 (Mathematical & Computational Biology). Pub Date: 2024-04-24. DOI: 10.1093/bioadv/vbae060
Jiuzhou Wang, S. Safo
Citations: 0

Abstract

Many diseases are complex, heterogeneous conditions that affect multiple organs and depend on the interplay of several factors, including molecular and environmental ones, so a holistic approach is needed to better understand disease pathobiology. Most existing methods for integrating data from multiple sources and classifying individuals into one of multiple classes or disease groups have focused mainly on linear relationships, despite the complexity of these relationships. Methods for nonlinear association and classification studies, on the other hand, are limited in their ability to identify variables that aid our understanding of the complexity of the disease, or can be applied to only two data types.

We propose Deep IDA (Integrative Discriminant Analysis), a deep learning method that learns complex nonlinear transformations of two or more views such that the resulting projections have maximum association and maximum separation. Further, we propose a feature ranking approach based on ensemble learning for interpretable results. We test Deep IDA on both simulated data and two large real-world datasets, including RNA sequencing, metabolomics, and proteomics data pertaining to COVID-19 severity. We identified signatures that better discriminated COVID-19 patient groups and were related to neurological conditions, cancer, and metabolic diseases, corroborating current research findings and heightening the need to study the sequelae of COVID-19 in order to devise effective treatments and improve patient care.

Our algorithms are implemented in PyTorch and available at: https://github.com/JiuzhouW/DeepIDA.

Supplementary materials are available at Bioinformatics Advances online.
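To make the two criteria named in the abstract concrete, the following is a minimal sketch, not the authors' implementation (that is the PyTorch code at the URL above). It assumes each view has already been projected to a single score per sample, measures separation with a Fisher-style ratio of between-class to within-class variation, and measures association with the squared Pearson correlation between views. The function names, the toy data, and the weight `rho` are hypothetical and chosen only for illustration.

```python
from statistics import mean


def fisher_ratio(z, labels):
    """Between-class over within-class sum of squares for 1-D projections z.

    Assumes at least one class has some within-class spread (nonzero denominator).
    """
    overall = mean(z)
    between = 0.0
    within = 0.0
    for c in sorted(set(labels)):
        zc = [v for v, y in zip(z, labels) if y == c]
        mc = mean(zc)
        between += len(zc) * (mc - overall) ** 2
        within += sum((v - mc) ** 2 for v in zc)
    return between / within


def squared_correlation(a, b):
    """Squared Pearson correlation between two equally long score lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov * cov / (va * vb)


def toy_objective(projections, labels, rho=0.5):
    """Hypothetical trade-off: rho * mean separation + (1 - rho) * mean association."""
    sep = mean([fisher_ratio(z, labels) for z in projections])
    pairs = [(i, j)
             for i in range(len(projections))
             for j in range(i + 1, len(projections))]
    assoc = mean([squared_correlation(projections[i], projections[j])
                  for i, j in pairs])
    return rho * sep + (1 - rho) * assoc


# Toy data: six samples, two classes, two views (one projected score per sample).
labels = [0, 0, 0, 1, 1, 1]
view1 = [0.1, 0.2, 0.0, 1.0, 1.1, 0.9]
view2 = [0.0, 0.3, 0.1, 0.8, 1.2, 1.0]
# Same values as view1, reordered so that the classes overlap.
shuffled = [1.0, 0.1, 0.9, 0.2, 1.1, 0.0]

good = toy_objective([view1, view2], labels)
bad = toy_objective([shuffled, view2], labels)
print(good > bad)
```

In this toy setting, projections that separate the classes and agree across views score higher than projections that do not. A deep model in the spirit of the abstract would learn the nonlinear transformations that produce such projections, rather than receive them as fixed inputs as this sketch does.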