Title: A comparative study of cell type annotation methods for immune cells using single-cell sequencing technology
Author: Tian-Yu Zhang
Published in: Proceedings of the 2021 10th International Conference on Bioinformatics and Biomedical Science
Publication date: 2021-10-29
DOI: 10.1145/3498731.3498735 (https://doi.org/10.1145/3498731.3498735)
Abstract
Single-cell sequencing is an emerging technique that enables high-throughput analysis at single-cell resolution and is applied across diverse fields of biology. Because the datasets are large, downstream analysis is complicated; cell type annotation is a critical step, yet accurate annotations remain difficult to obtain. Single-cell sequencing has also led to advances in multi-omics, such as CITE-seq (Cellular Indexing of Transcriptomes and Epitopes by sequencing), which measures surface marker proteins and mRNA simultaneously at the single-cell level. In this study, a CITE-seq dataset of human PBMCs (peripheral blood mononuclear cells) was annotated using popular reference-based annotation methods: SingleR, Seurat with the RNA-seq dataset, and Seurat with both the RNA and protein datasets. The results were then compared with RNA and protein expression levels to determine the role of proteins in cell annotation. The results indicate that protein expression can supplement the data for markers whose mRNA expression is low, improving accuracy. When verified against single-cell biomarkers, the multi-omics approach, Seurat with both the RNA and protein datasets, performed best, especially in distinguishing NK cells from T cells and dendritic cells from monocytes. This study shows the significance of multi-omics information for improving cell annotation and suggests that these annotations could be refined further with more supporting data.
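The general idea of reference-based annotation, and of adding protein features to it, can be illustrated with a toy sketch. This is not the implementation used by SingleR or Seurat, and it is not the pipeline from this study: it simply scores a query cell against reference profiles by Spearman rank correlation and picks the best match. All feature names (`g1`, `p1`, ...), profile values, and the helper `annotate` are hypothetical and chosen only so that the RNA-only comparison is uninformative while the combined comparison is not.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def annotate(cell, reference, features):
    """Score a cell against every reference profile; return (label, scores)."""
    query = [cell[f] for f in features]
    scores = {
        label: spearman(query, [profile[f] for f in features])
        for label, profile in reference.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical reference profiles: g* are RNA features, p* are protein features.
reference = {
    "T": {"g1": 5, "g2": 1, "g3": 3, "p1": 8, "p2": 0.5},
    "NK": {"g1": 1, "g2": 5, "g3": 3, "p1": 0.5, "p2": 8},
}
# Hypothetical query cell whose discriminating mRNAs (g1, g2) were not detected.
cell = {"g1": 0, "g2": 0, "g3": 2, "p1": 0.4, "p2": 7}

rna_features = ["g1", "g2", "g3"]
all_features = rna_features + ["p1", "p2"]

label_rna, scores_rna = annotate(cell, reference, rna_features)
label_all, scores_all = annotate(cell, reference, all_features)
print("RNA only:", scores_rna)
print("RNA + protein:", label_all, scores_all)
```

In this constructed example the RNA-only scores are tied, so the label is arbitrary, while the combined feature set separates the two profiles. Ranking RNA and protein values together is a simplification made for brevity; a real workflow would normalize each modality separately.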