Dataset including whole blood gene expression profiles and matched leukocyte counts with utility for benchmarking cellular deconvolution pipelines
Author: Grant C O'Connell
Journal: BMC Genomic Data, 25(1):45
Published: 2024-05-07
DOI: https://doi.org/10.1186/s12863-024-01223-z
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11077736/pdf/
Citation count: 0
Abstract
Objectives: Cellular deconvolution is a valuable computational process that can infer the cellular composition of heterogeneous tissue samples from bulk RNA-sequencing data. Benchmark testing is a crucial step in the development and evaluation of new cellular deconvolution algorithms, and also plays a key role in the process of building and optimizing deconvolution pipelines for specific experimental applications. However, few in vivo benchmarking datasets exist, particularly for whole blood, which is the single most profiled human tissue. Here, we describe a unique dataset containing whole blood gene expression profiles and matched circulating leukocyte counts from a large cohort of human donors with utility for benchmarking cellular deconvolution pipelines.
Data description: To produce this dataset, venous whole blood was sampled from a total of 138 donors recruited at an academic medical center. Genome-wide expression profiling was then performed via next-generation RNA sequencing, and white blood cell differentials were collected in parallel using flow cytometry. The final dataset contains donor-level expression data for over 45,000 protein-coding and non-protein-coding genes, as well as matched neutrophil, lymphocyte, monocyte, and eosinophil counts.
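As an illustration of how matched leukocyte counts could serve as ground truth, the following is a minimal sketch of one possible benchmarking step. It is not the author's pipeline. It assumes measured counts are converted to fractions of the four reported cell types and compared against a deconvolution tool's estimated fractions using Pearson correlation and root-mean-square error. The function names, the data layout, and the toy values are all hypothetical.

```python
from math import sqrt

# The four leukocyte populations reported in the dataset description.
CELL_TYPES = ["neutrophil", "lymphocyte", "monocyte", "eosinophil"]


def counts_to_fractions(counts):
    """Convert one donor's absolute counts into fractions that sum to 1.

    Assumption: fractions are taken relative to the four listed cell types only.
    """
    total = sum(counts[cell] for cell in CELL_TYPES)
    return {cell: counts[cell] / total for cell in CELL_TYPES}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def benchmark(measured_counts, estimated_fractions):
    """Score estimated fractions against measured counts, per cell type.

    measured_counts: list of dicts (one per donor) of absolute counts.
    estimated_fractions: list of dicts (same donor order) of estimated fractions.
    """
    measured = [counts_to_fractions(c) for c in measured_counts]
    scores = {}
    for cell in CELL_TYPES:
        truth = [m[cell] for m in measured]
        est = [e[cell] for e in estimated_fractions]
        rmse = sqrt(sum((t - e) ** 2 for t, e in zip(truth, est)) / len(truth))
        scores[cell] = {"pearson_r": pearson(truth, est), "rmse": rmse}
    return scores


# Toy, made-up values for three hypothetical donors (not taken from the dataset).
toy_counts = [
    {"neutrophil": 4.0, "lymphocyte": 2.0, "monocyte": 0.5, "eosinophil": 0.2},
    {"neutrophil": 6.0, "lymphocyte": 1.5, "monocyte": 0.6, "eosinophil": 0.1},
    {"neutrophil": 3.0, "lymphocyte": 2.5, "monocyte": 0.4, "eosinophil": 0.3},
]
# Made-up output of a hypothetical deconvolution tool for the same donors.
toy_estimates = [
    {"neutrophil": 0.58, "lymphocyte": 0.31, "monocyte": 0.08, "eosinophil": 0.03},
    {"neutrophil": 0.70, "lymphocyte": 0.20, "monocyte": 0.08, "eosinophil": 0.02},
    {"neutrophil": 0.50, "lymphocyte": 0.39, "monocyte": 0.06, "eosinophil": 0.05},
]

for cell, score in benchmark(toy_counts, toy_estimates).items():
    print(cell, round(score["pearson_r"], 3), round(score["rmse"], 3))
```

Per-cell-type scoring is used here because a single pooled correlation can be dominated by abundant populations such as neutrophils. Other metrics or normalizations may suit a given pipeline better.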