CopyMix: Mixture model based single-cell clustering and copy number profiling using variational inference

IF 3.1 4区生物学 Q2 BIOLOGY Computational Biology and Chemistry Pub Date : 2024-12-01 Epub Date: 2024-10-23 DOI:10.1016/j.compbiolchem.2024.108257

Negar Safinianaini , Camila P.E. De Souza , Andrew Roth , Hazal Koptagel , Hosein Toosi , Jens Lagergren

{"title":"CopyMix: Mixture model based single-cell clustering and copy number profiling using variational inference","authors":"Negar Safinianaini , Camila P.E. De Souza , Andrew Roth , Hazal Koptagel , Hosein Toosi , Jens Lagergren","doi":"10.1016/j.compbiolchem.2024.108257","DOIUrl":null,"url":null,"abstract":"<div><div>Investigating tumor heterogeneity using single-cell sequencing technologies is imperative to understand how tumors evolve since each cell subpopulation harbors a unique set of genomic features that yields a unique phenotype, which is bound to have clinical relevance. Clustering of cells based on copy number data obtained from single-cell DNA sequencing provides an opportunity to identify different tumor cell subpopulations. Accordingly, computational methods have emerged for single-cell copy number profiling and clustering; however, these two tasks have been handled sequentially by applying various ad-hoc pre- and post-processing steps; hence, a procedure vulnerable to introducing clustering artifacts. We avoid the clustering artifact issues in our method, CopyMix, a Variational Inference for a novel mixture model, by jointly inferring cell clusters and their underlying copy number profile. Our probabilistic graphical model is an improved version of the mixture of hidden Markov models, which is designed uniquely to infer single-cell copy number profiling and clustering. For the evaluation, we used likelihood-ratio test, CH index, Silhouette, V-measure, total variation scores. CopyMix performs well on both biological and simulated data. Our favorable results indicate a considerable potential to obtain clinical impact by using CopyMix in studies of cancer tumor heterogeneity.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"113 ","pages":"Article 108257"},"PeriodicalIF":3.1000,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational Biology and Chemistry","FirstCategoryId":"99","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1476927124002457","RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/10/23 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"BIOLOGY","Score":null,"Total":0}

引用次数: 0

Abstract

Investigating tumor heterogeneity using single-cell sequencing technologies is imperative to understand how tumors evolve since each cell subpopulation harbors a unique set of genomic features that yields a unique phenotype, which is bound to have clinical relevance. Clustering of cells based on copy number data obtained from single-cell DNA sequencing provides an opportunity to identify different tumor cell subpopulations. Accordingly, computational methods have emerged for single-cell copy number profiling and clustering; however, these two tasks have been handled sequentially by applying various ad-hoc pre- and post-processing steps; hence, a procedure vulnerable to introducing clustering artifacts. We avoid the clustering artifact issues in our method, CopyMix, a Variational Inference for a novel mixture model, by jointly inferring cell clusters and their underlying copy number profile. Our probabilistic graphical model is an improved version of the mixture of hidden Markov models, which is designed uniquely to infer single-cell copy number profiling and clustering. For the evaluation, we used likelihood-ratio test, CH index, Silhouette, V-measure, total variation scores. CopyMix performs well on both biological and simulated data. Our favorable results indicate a considerable potential to obtain clinical impact by using CopyMix in studies of cancer tumor heterogeneity.

Abstract Image

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

CopyMix：利用变异推理进行基于混合模型的单细胞聚类和拷贝数分析

利用单细胞测序技术研究肿瘤异质性是了解肿瘤如何演变的当务之急，因为每个细胞亚群都有一套独特的基因组特征，从而产生独特的表型，这必然与临床相关。根据单细胞 DNA 测序获得的拷贝数数据对细胞进行聚类，为识别不同的肿瘤细胞亚群提供了机会。因此，出现了用于单细胞拷贝数分析和聚类的计算方法；然而，这两项任务是通过应用各种临时的前处理和后处理步骤来顺序处理的；因此，这种程序很容易引入聚类伪影。在我们的方法 "CopyMix--新型混合模型的变量推理 "中，我们通过联合推断细胞簇及其基本拷贝数特征，避免了聚类伪影问题。我们的概率图形模型是隐马尔可夫模型混合物的改进版，其设计独特，可用于推断单细胞拷贝数剖析和聚类。在评估中，我们使用了似然比检验、CH 指数、Silhouette、V-measure 和总变异分数。CopyMix 在生物数据和模拟数据上都表现良好。我们的良好结果表明，在癌症肿瘤异质性研究中使用 CopyMix 有很大的潜力产生临床影响。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Computational Biology and Chemistry 生物-计算机：跨学科应用

CiteScore

6.10

自引率

3.20%

发文量

142

审稿时长

24 days

期刊介绍： Computational Biology and Chemistry publishes original research papers and review articles in all areas of computational life sciences. High quality research contributions with a major computational component in the areas of nucleic acid and protein sequence research, molecular evolution, molecular genetics (functional genomics and proteomics), theory and practice of either biology-specific or chemical-biology-specific modeling, and structural biology of nucleic acids and proteins are particularly welcome. Exceptionally high quality research work in bioinformatics, systems biology, ecology, computational pharmacology, metabolism, biomedical engineering, epidemiology, and statistical genetics will also be considered. Given their inherent uncertainty, protein modeling and molecular docking studies should be thoroughly validated. In the absence of experimental results for validation, the use of molecular dynamics simulations along with detailed free energy calculations, for example, should be used as complementary techniques to support the major conclusions. Submissions of premature modeling exercises without additional biological insights will not be considered. Review articles will generally be commissioned by the editors and should not be submitted to the journal without explicit invitation. However prospective authors are welcome to send a brief (one to three pages) synopsis, which will be evaluated by the editors.