Recovery of biological signals lost in single-cell batch integration with CellANOVA
Zhaojun Zhang, Divij Mathew, Tristan L. Lim, Kaishu Mason, Clara Morral Martinez, Sijia Huang, E. John Wherry, Katalin Susztak, Andy J. Minn, Zongming Ma, Nancy R. Zhang
Nature Biotechnology, published 2024-11-26. DOI: 10.1038/s41587-024-02463-1 (https://doi.org/10.1038/s41587-024-02463-1)
Abstract
Data integration to align cells across batches has become a cornerstone of single-cell data analysis, critically affecting downstream results. Currently, there are no guidelines for when the biological differences between samples are separable from batch effects. Here we show that current paradigms for single-cell data integration remove biologically meaningful variation and introduce distortion. We present a statistical model and computationally scalable algorithm, CellANOVA (cell state space analysis of variance), that harnesses experimental design to explicitly recover biological signals that are erased during single-cell data integration. CellANOVA uses a ‘pool-of-controls’ design concept, applicable across diverse settings, to separate unwanted variation from biological variation of interest and allow the recovery of subtle biological signals. We apply CellANOVA to diverse contexts and validate the recovered biological signals by orthogonal assays. In particular, we show that CellANOVA is effective in the challenging case of single-cell and single-nucleus data integration, where it recovers subtle biological signals that can be validated and replicated by external data.
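The 'pool-of-controls' idea in the abstract can be illustrated with a toy sketch. This is not the CellANOVA model or its software interface. It is a simplified stand-in that assumes batch effects are a low-rank linear shift in expression space, that control samples are present in every batch, and that the treatment effect is orthogonal to the batch direction. All function names (`estimate_batch_basis`, `remove_batch_variation`, `simulate`) and all simulated values are hypothetical. The sketch estimates unwanted variation from control cells only, so a treatment signal that never enters the estimate is not removed along with it.

```python
import numpy as np


def estimate_batch_basis(X, batch, is_control, rank):
    """Return an orthonormal (genes x rank) basis for between-batch
    variation, estimated from the per-batch means of control cells only."""
    control_batches = np.unique(batch[is_control])
    means = np.stack(
        [X[is_control & (batch == b)].mean(axis=0) for b in control_batches]
    )
    # Center across batches so only between-batch differences remain.
    centered = means - means.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank].T


def remove_batch_variation(X, basis):
    """Project every cell onto the orthogonal complement of the basis."""
    return X - (X @ basis) @ basis.T


def simulate(seed=0, n_genes=20, n_batches=4, cells_per_group=200):
    """Toy data (assumed, not from the paper): batch b shifts gene 0 by 3*b;
    treatment shifts gene 1 by 2. Only batches 2 and 3 contain treated cells."""
    rng = np.random.default_rng(seed)
    rows, batch, is_control = [], [], []
    for b in range(n_batches):
        groups = [True] + ([False] if b >= 2 else [])
        for control in groups:
            cells = rng.normal(0.0, 0.1, size=(cells_per_group, n_genes))
            cells[:, 0] += 3.0 * b          # batch effect
            if not control:
                cells[:, 1] += 2.0          # biological (treatment) effect
            rows.append(cells)
            batch += [b] * cells_per_group
            is_control += [control] * cells_per_group
    return np.vstack(rows), np.array(batch), np.array(is_control)


X, batch, is_control = simulate()
basis = estimate_batch_basis(X, batch, is_control, rank=1)
X_corrected = remove_batch_variation(X, basis)


def group_mean(data, b, control):
    """Mean expression profile of control or treated cells in batch b."""
    return data[(is_control == control) & (batch == b)].mean(axis=0)


# Distance between control means of the two most different batches.
spread_before = np.linalg.norm(group_mean(X, 3, True) - group_mean(X, 0, True))
spread_after = np.linalg.norm(
    group_mean(X_corrected, 3, True) - group_mean(X_corrected, 0, True)
)
# Treated-minus-control difference on gene 1 within batch 2, after correction.
effect_after = (
    group_mean(X_corrected, 2, False) - group_mean(X_corrected, 2, True)
)[1]

print(f"control spread before: {spread_before:.3f}")
print(f"control spread after:  {spread_after:.3f}")
print(f"treatment effect kept: {effect_after:.3f}")
```

In this toy setting the controls from different batches collapse together after correction while the simulated treatment shift survives. Real single-cell data would not satisfy these assumptions so cleanly, which is the separability question the abstract raises.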
About the journal:
Nature Biotechnology is a monthly journal that focuses on the science and business of biotechnology. It covers a wide range of topics including technology/methodology advancements in the biological, biomedical, agricultural, and environmental sciences. The journal also explores the commercial, political, ethical, legal, and societal aspects of this research.
The journal serves researchers by providing peer-reviewed research papers in the field of biotechnology. It also serves the business community by delivering news about research developments. This approach ensures that both the scientific and business communities are well-informed and able to stay up-to-date on the latest advancements and opportunities in the field.
Some key areas of interest in which the journal actively seeks research papers include molecular engineering of nucleic acids and proteins, molecular therapy, large-scale biology, computational biology, regenerative medicine, imaging technology, analytical biotechnology, applied immunology, food and agricultural biotechnology, and environmental biotechnology.
In summary, Nature Biotechnology is a comprehensive journal that covers both the scientific and business aspects of biotechnology. It strives to provide researchers with valuable research papers and news while also delivering important scientific advancements to the business community.