Impacts of Cell Ranger versions on Chromium gene expression data

Imad Abugessaisa, Akira Hasegawa, Scott Walker, Shintaro Katayama, Juha Kere, Takeya Kasukawa
bioRxiv - Bioinformatics. Published 2024-08-10. DOI: 10.1101/2024.08.10.607413 (https://doi.org/10.1101/2024.08.10.607413)

Abstract

In droplet-based Chromium single-cell gene expression data from the 10x Genomics platform, cell barcode calling by Cell Ranger (CR) is a standard pipeline. However, no systematic evaluation of how the released versions of CR affect Chromium single-cell gene expression data has been conducted. To evaluate this impact comprehensively, we considered six molecular quality criteria, quantified gene expression, and performed downstream analysis for 12 single-cell Chromium gene expression datasets. Each dataset was processed by 10 versions of CR, resulting in 180 datasets and a total of 702,493 cell barcodes. We demonstrated that, for the same dataset, different versions of CR yield different numbers of cell barcodes, with significant variation in molecular quality and average gene expression. Our analysis distinguishes two categories of cell barcodes: common barcodes, called (unmasked) by all versions of CR, and specific barcodes, called (unmasked/masked) by only some versions. Surprisingly, we observed variation in molecular quality indices among common cell barcodes when they were called by different versions of CR. The specific barcodes yield skewed gene body coverage and form distinct clusters at the edges of UMAP plots. The choice of CR version affects quality scores, average gene expression, clustering results, and top cluster marker genes for each dataset. Our study indicates a demonstrable, quantitative effect of the choice of CR version on downstream analysis, resulting in widely different Chromium single-cell gene expression data across CR versions.
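The distinction between common and specific barcodes can be illustrated with plain set operations. The following is a minimal sketch, not the authors' implementation: it assumes the barcodes called by each CR version are already available as Python sets, and the function name `split_barcodes`, the version labels, and the barcode strings are all hypothetical placeholders.

```python
from typing import Dict, Set, Tuple


def split_barcodes(calls: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split barcodes into (common, specific) across pipeline versions.

    common:   barcodes called by every version
    specific: barcodes called by at least one version, but not by all
    """
    if not calls:
        return set(), set()
    per_version = list(calls.values())
    common = set.intersection(*per_version)
    specific = set.union(*per_version) - common
    return common, specific


# Toy input: version labels and barcodes are made up for illustration.
calls = {
    "version_A": {"AAAC", "AAAG", "AACT"},
    "version_B": {"AAAC", "AAAG"},
    "version_C": {"AAAC", "AAAG", "TTTG"},
}

common, specific = split_barcodes(calls)
print("common:", sorted(common))
print("specific:", sorted(specific))
```

In this toy input, two barcodes are shared by all three versions and two appear in only one version each. Comparing quality metrics within the common set across versions, as the abstract describes, would require the per-version count matrices as well, which this sketch does not cover.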