Unsupervised manifold alignment for single-cell multi-omics data.

ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine. Pub Date: 2020-09-01. DOI: 10.1145/3388440.3412410

Ritambhara Singh, Pinar Demetci, Giancarlo Bonora, Vijay Ramani, Choli Lee, He Fang, Zhijun Duan, Xinxian Deng, Jay Shendure, Christine Disteche, William Stafford Noble

Volume: 2020. Pages: 1-10.

Citations: 38

Abstract

Integrating single-cell measurements that capture different properties of the genome is vital to extending our understanding of genome biology. This task is challenging due to the lack of a shared axis across datasets obtained from different types of single-cell experiments. For most such datasets, we lack corresponding information among the cells (samples) and the measurements (features). In this scenario, unsupervised algorithms that are capable of aligning single-cell experiments are critical to learning an in silico co-assay that can help draw correspondences among the cells. Maximum mean discrepancy-based manifold alignment (MMD-MA) is such an unsupervised algorithm. Without requiring correspondence information, it can align single-cell datasets from different modalities in a common shared latent space, showing promising results on simulations and a small-scale single-cell experiment with 61 cells. However, it is essential to explore the applicability of this method to larger single-cell experiments with thousands of cells so that it can be of practical interest to the community. In this paper, we apply MMD-MA to two recent datasets that measure transcriptome and chromatin accessibility in ~2000 single cells. To scale the runtime of MMD-MA to a more substantial number of cells, we extend the original implementation to run on GPUs. We also introduce a method to automatically select one of the user-defined parameters, thus reducing the hyperparameter search space. We demonstrate that the proposed extensions allow MMD-MA to accurately align state-of-the-art single-cell experiments.
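To make the "maximum mean discrepancy" in the method's name concrete, here is a minimal sketch of a biased estimate of the squared MMD between two sets of points that already live in a shared latent space. It is an illustration only, not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, the function names, and the synthetic data are all assumptions. The sketch covers only the discrepancy term, not the full MMD-MA objective, the learned mappings, the GPU extension, or the automatic parameter selection described in the abstract.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Guard against tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def mmd_squared(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    k_xx = gaussian_kernel(x, x, sigma)
    k_yy = gaussian_kernel(y, y, sigma)
    k_xy = gaussian_kernel(x, y, sigma)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


# Hypothetical latent embeddings: rows are cells, columns are latent dimensions.
rng = np.random.default_rng(0)
latent_a = rng.normal(0.0, 1.0, size=(200, 5))  # modality A
latent_b = rng.normal(0.0, 1.0, size=(300, 5))  # modality B, same distribution
latent_c = rng.normal(3.0, 1.0, size=(300, 5))  # modality B, shifted distribution

mmd_same = mmd_squared(latent_a, latent_b)
mmd_shifted = mmd_squared(latent_a, latent_c)
print(f"well-aligned embeddings:   {mmd_same:.4f}")
print(f"poorly aligned embeddings: {mmd_shifted:.4f}")
```

The two sample sets need not contain the same number of cells or any known cell-to-cell pairing, which is why a distribution-level measure of this kind suits the unsupervised setting described above. A smaller value indicates that the two embedded datasets overlap more closely.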

