UnPaSt: unsupervised patient stratification by differentially expressed biclusters in omics data

arXiv - QuanBio - Genomics. Pub Date: 2024-07-31. DOI: arxiv-2408.00200

Michael Hartung, Andreas Maier, Fernando Delgado-Chaves, Yuliya Burankova, Olga I. Isaeva, Fábio Malta de Sá Patroni, Daniel He, Casey Shannon, Katharina Kaufmann, Jens Lohmann, Alexey Savchik, Anne Hartebrodt, Zoe Chervontseva, Farzaneh Firoozbakht, Niklas Probul, Evgenia Zotova, Olga Tsoy, David B. Blumenthal, Martin Ester, Tanja Laske, Jan Baumbach, Olga Zolotareva


Abstract

Most complex diseases, including cancer and non-malignant diseases such as asthma, have distinct molecular subtypes that require distinct clinical approaches. However, existing computational patient stratification methods have been benchmarked almost exclusively on cancer omics data and perform well only when mutually exclusive subtypes can be characterized by many biomarkers. Here, we contribute a large-scale evaluation, quantitatively exploring the power of 22 unsupervised patient stratification methods on both simulated and real transcriptome data. Building on this experience, we developed UnPaSt (https://apps.cosy.bio/unpast/), which optimizes unsupervised patient stratification and works even with only a limited number of subtype-predictive biomarkers. We evaluated all 23 methods (the 22 existing methods plus UnPaSt) on real-world breast cancer and asthma transcriptomics data. Although many methods reliably detected major breast cancer subtypes, only a few identified Th2-high asthma, and UnPaSt significantly outperformed its closest competitors on both test datasets. Overall, we show that UnPaSt can detect many biologically insightful and reproducible patterns in omics datasets.


