Haiyi Mao, Minxue Jia, Jason Xiaotian Dou, Haotian Zhang, Panayiotis V. Benos
COEM: Cross-Modal Embedding for MetaCell Identification
arXiv - CS - General Literature, published 2022-07-15
DOI: arxiv-2207.07734 (https://doi.org/arxiv-2207.07734)
Citations: 0
Abstract
Metacells are disjoint and homogeneous groups of single-cell profiles,
representing discrete and highly granular cell states. Existing metacell
algorithms tend to use only one modality to infer metacells, even though
single-cell multi-omics datasets profile multiple molecular modalities within
the same cell. Here, we present Cross-Modal
Embedding for MetaCell Identification (COEM), which utilizes
an embedded space leveraging the information of both scATAC-seq and scRNA-seq
to perform aggregation, balancing the trade-off between fine resolution and
sufficient sequencing coverage. COEM outperforms the state-of-the-art method
SEACells by efficiently identifying accurate and well-separated metacells
across datasets with continuous and discrete cell types. Furthermore, COEM
significantly improves peak-to-gene association analyses, and facilitates
complex gene regulatory inference tasks.
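
The trade-off between resolution and coverage mentioned above can be illustrated with a minimal sketch of the aggregation step alone. This is not COEM's embedding or its metacell assignment procedure, neither of which the abstract specifies; the function name `aggregate_metacells`, the toy count matrix, and the labels are all hypothetical. The sketch assumes each cell already carries exactly one metacell label (so the groups are disjoint) and simply sums raw counts within each group.

```python
import numpy as np


def aggregate_metacells(counts, labels):
    """Sum per-cell counts within each metacell (hypothetical helper).

    counts: (n_cells, n_features) array of raw counts (e.g. genes or peaks).
    labels: (n_cells,) metacell assignment, one label per cell, so the
            resulting groups are disjoint.
    Returns (metacell_ids, aggregated), where aggregated has shape
    (n_metacells, n_features).
    """
    counts = np.asarray(counts)
    labels = np.asarray(labels)
    # Map each cell to the row index of its metacell.
    ids, inverse = np.unique(labels, return_inverse=True)
    inverse = inverse.ravel()
    aggregated = np.zeros((ids.size, counts.shape[1]), dtype=counts.dtype)
    # Unbuffered add so cells sharing a metacell accumulate correctly.
    np.add.at(aggregated, inverse, counts)
    return ids, aggregated


# Toy data (assumed): 4 sparse cells, 3 features, 2 metacells.
toy_counts = np.array([[1, 0, 2],
                       [0, 1, 0],
                       [3, 0, 0],
                       [0, 0, 1]])
toy_labels = np.array([0, 0, 1, 1])
metacell_ids, metacell_counts = aggregate_metacells(toy_counts, toy_labels)
```

Fewer, larger metacells give each aggregated profile more counts per feature but blur finer cell states; more, smaller metacells do the reverse. In a multi-omics setting the same labels could be applied to both the scRNA-seq and scATAC-seq matrices of the same cells.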