MOCAT: multi-omics integration with auxiliary classifiers enhanced autoencoder

IF 4 3区 生物学 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Biodata Mining Pub Date : 2024-03-05 DOI:10.1186/s13040-024-00360-6
Xiaohui Yao, Xiaohan Jiang, Haoran Luo, Hong Liang, Xiufen Ye, Yanhui Wei, Shan Cong
{"title":"MOCAT: multi-omics integration with auxiliary classifiers enhanced autoencoder","authors":"Xiaohui Yao, Xiaohan Jiang, Haoran Luo, Hong Liang, Xiufen Ye, Yanhui Wei, Shan Cong","doi":"10.1186/s13040-024-00360-6","DOIUrl":null,"url":null,"abstract":"Integrating multi-omics data is emerging as a critical approach in enhancing our understanding of complex diseases. Innovative computational methods capable of managing high-dimensional and heterogeneous datasets are required to unlock the full potential of such rich and diverse data. We propose a Multi-Omics integration framework with auxiliary Classifiers-enhanced AuToencoders (MOCAT) to utilize intra- and inter-omics information comprehensively. Additionally, attention mechanisms with confidence learning are incorporated for enhanced feature representation and trustworthy prediction. Extensive experiments were conducted on four benchmark datasets to evaluate the effectiveness of our proposed model, including BRCA, ROSMAP, LGG, and KIPAN. Our model significantly improved most evaluation measurements and consistently surpassed the state-of-the-art methods. Ablation studies showed that the auxiliary classifiers significantly boosted classification accuracy in the ROSMAP and LGG datasets. Moreover, the attention mechanisms and confidence evaluation block contributed to improvements in the predictive accuracy and generalizability of our model. The proposed framework exhibits superior performance in disease classification and biomarker discovery, establishing itself as a robust and versatile tool for analyzing multi-layer biological data. This study highlights the significance of elaborated designed deep learning methodologies in dissecting complex disease phenotypes and improving the accuracy of disease predictions.","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":"42 1","pages":""},"PeriodicalIF":4.0000,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biodata Mining","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s13040-024-00360-6","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

Integrating multi-omics data is emerging as a critical approach in enhancing our understanding of complex diseases. Innovative computational methods capable of managing high-dimensional and heterogeneous datasets are required to unlock the full potential of such rich and diverse data. We propose a Multi-Omics integration framework with auxiliary Classifiers-enhanced AuToencoders (MOCAT) to utilize intra- and inter-omics information comprehensively. Additionally, attention mechanisms with confidence learning are incorporated for enhanced feature representation and trustworthy prediction. Extensive experiments were conducted on four benchmark datasets to evaluate the effectiveness of our proposed model, including BRCA, ROSMAP, LGG, and KIPAN. Our model significantly improved most evaluation measurements and consistently surpassed the state-of-the-art methods. Ablation studies showed that the auxiliary classifiers significantly boosted classification accuracy in the ROSMAP and LGG datasets. Moreover, the attention mechanisms and confidence evaluation block contributed to improvements in the predictive accuracy and generalizability of our model. The proposed framework exhibits superior performance in disease classification and biomarker discovery, establishing itself as a robust and versatile tool for analyzing multi-layer biological data. This study highlights the significance of elaborated designed deep learning methodologies in dissecting complex disease phenotypes and improving the accuracy of disease predictions.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
MOCAT:带辅助分类器的多组学集成增强型自动编码器
整合多组学数据正在成为增进我们对复杂疾病了解的一种重要方法。我们需要能够管理高维异构数据集的创新计算方法,以充分挖掘这些丰富多样数据的潜力。我们提出了一个多组学集成框架,该框架带有辅助分类器增强型 AuToencoders(MOCAT),可全面利用组学内部和组学之间的信息。此外,还纳入了具有置信度学习的注意力机制,以增强特征表示和可信预测。我们在四个基准数据集(包括 BRCA、ROSMAP、LGG 和 KIPAN)上进行了广泛的实验,以评估我们提出的模型的有效性。我们的模型明显改善了大多数评估指标,并一直超越最先进的方法。消融研究表明,在 ROSMAP 和 LGG 数据集中,辅助分类器大大提高了分类准确率。此外,注意力机制和置信度评估块也有助于提高我们模型的预测准确性和普适性。所提出的框架在疾病分类和生物标记物发现方面表现出卓越的性能,使其成为分析多层生物数据的稳健而通用的工具。这项研究凸显了精心设计的深度学习方法在剖析复杂疾病表型和提高疾病预测准确性方面的重要意义。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Biodata Mining
Biodata Mining MATHEMATICAL & COMPUTATIONAL BIOLOGY-
CiteScore
7.90
自引率
0.00%
发文量
28
审稿时长
23 weeks
期刊介绍: BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data. Topical areas include, but are not limited to: -Development, evaluation, and application of novel data mining and machine learning algorithms. -Adaptation, evaluation, and application of traditional data mining and machine learning algorithms. -Open-source software for the application of data mining and machine learning algorithms. -Design, development and integration of databases, software and web services for the storage, management, retrieval, and analysis of data from large scale studies. -Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.
期刊最新文献
Transcriptome-based network analysis related to regulatory T cells infiltration identified RCN1 as a potential biomarker for prognosis in clear cell renal cell carcinoma. Deciphering the tissue-specific functional effect of Alzheimer risk SNPs with deep genome annotation. Investigating potential drug targets for IgA nephropathy and membranous nephropathy through multi-queue plasma protein analysis: a Mendelian randomization study based on SMR and co-localization analysis. Deep joint learning diagnosis of Alzheimer's disease based on multimodal feature fusion. Modeling heterogeneity of Sudanese hospital stay in neonatal and maternal unit: non-parametric random effect models with Gamma distribution.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1