MOCAT: multi-omics integration with auxiliary classifiers enhanced autoencoder

Biodata Mining | Pub Date: 2024-03-05 | DOI: 10.1186/s13040-024-00360-6 | IF 4 | Zone 3 (Biology) | JCR Q1 (Mathematical & Computational Biology)

Xiaohui Yao, Xiaohan Jiang, Haoran Luo, Hong Liang, Xiufen Ye, Yanhui Wei, Shan Cong

Citations: 0

Abstract

Integrating multi-omics data is emerging as a critical approach in enhancing our understanding of complex diseases. Innovative computational methods capable of managing high-dimensional and heterogeneous datasets are required to unlock the full potential of such rich and diverse data. We propose a Multi-Omics integration framework with auxiliary Classifiers-enhanced AuToencoders (MOCAT) to utilize intra- and inter-omics information comprehensively. Additionally, attention mechanisms with confidence learning are incorporated for enhanced feature representation and trustworthy prediction. Extensive experiments were conducted on four benchmark datasets, BRCA, ROSMAP, LGG, and KIPAN, to evaluate the effectiveness of our proposed model. Our model significantly improved most evaluation measurements and consistently surpassed the state-of-the-art methods. Ablation studies showed that the auxiliary classifiers significantly boosted classification accuracy in the ROSMAP and LGG datasets. Moreover, the attention mechanisms and confidence evaluation block contributed to improvements in the predictive accuracy and generalizability of our model. The proposed framework exhibits superior performance in disease classification and biomarker discovery, establishing itself as a robust and versatile tool for analyzing multi-layer biological data. This study highlights the significance of elaborately designed deep learning methodologies in dissecting complex disease phenotypes and improving the accuracy of disease predictions.
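
The abstract does not state MOCAT's training objective. As a rough illustration of the general idea of an autoencoder "enhanced" by an auxiliary classifier, the sketch below combines a reconstruction error with a classification loss computed from the same input's latent code. This is a generic pattern, not the authors' formulation: the function names, the weighting factor `lam`, and the toy numbers are all assumptions made for illustration.

```python
import math

def mse(x, x_hat):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[label])

def joint_loss(x, x_hat, aux_logits, label, lam=1.0):
    """Reconstruction loss plus a weighted auxiliary-classifier loss.

    `lam` is a hypothetical trade-off weight, not a value from the paper.
    """
    return mse(x, x_hat) + lam * cross_entropy(aux_logits, label)

# Toy values standing in for one omics sample, its reconstruction,
# and the auxiliary classifier's scores for a two-class problem.
x = [0.2, 0.8, 0.5]
x_hat = [0.1, 0.9, 0.5]
aux_logits = [2.0, 0.5]
loss = joint_loss(x, x_hat, aux_logits, label=0, lam=0.5)
print(f"joint loss: {loss:.4f}")
```

In a pattern like this, the classification term pushes the latent representation to be label-relevant rather than only reconstruction-friendly, which is one plausible reading of why an auxiliary classifier could help downstream accuracy.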

Source journal

Biodata Mining (Mathematical & Computational Biology)

CiteScore: 7.90

Self-citation rate: 0.00%

Articles published: (not listed)

Review time: 23 weeks

About the journal: BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data. Topical areas include, but are not limited to:

- Development, evaluation, and application of novel data mining and machine learning algorithms.
- Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
- Open-source software for the application of data mining and machine learning algorithms.
- Design, development, and integration of databases, software, and web services for the storage, management, retrieval, and analysis of data from large-scale studies.
- Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.