Enhancing single-cell classification accuracy using image conversion and deep learning
Authors: Bingxi Gao, Huaxuan Wu, Zhiqiang Du
Journal: 遗传 (Hereditas), volume 47, issue 3, pages 382-392
Publication date: 2025-03-01
Publication type: Journal Article
DOI: 10.16288/j.yczz.24-213 (https://doi.org/10.16288/j.yczz.24-213)
Citation count: 0
Abstract
Single-cell transcriptome sequencing (scRNA-seq) measures transcript abundance in individual cells at high throughput and is widely used in animal and plant developmental biology and in the analysis of important traits, where it reveals cell types, subtype composition, specific gene markers, and functional differences. However, scRNA-seq data often suffer from high noise, high dimensionality, and batch effects, which result in a large number of lowly expressed genes and variants that seriously affect the accuracy and reliability of data analysis. These problems not only increase the complexity of data processing but also limit the effectiveness of feature selection and downstream analysis. Although several statistical inference and machine learning methods have been used to address these challenges, existing methods still have limitations in cell type identification, feature selection, and batch effect correction, and struggle to meet the needs of complex biological research. In this study, we propose an innovative single-cell classification method, scIC (single-cell image classification), which converts scRNA-seq data into image form and applies deep learning techniques to classify cells. The image conversion lets us capture complex patterns in the data more efficiently and build classification models using convolutional neural networks (CNN) and residual networks (ResNet). In tests on scRNA-seq data from four cell types (mouse skin basal cells, mouse lymphocytes, human neuronal cells, and mouse spinal cord cells), the accuracy of the classification models exceeded 94%, and the mouse skin basal cell dataset reached a classification accuracy of 99.8% with the ResNet50 model. These results indicate that converting scRNA-seq data into images and combining them with deep learning techniques can significantly improve classification accuracy, providing new ideas and effective tools for solving key challenges in single-cell data analysis. The code for this study is publicly available at: https://github.com/Bingxi-Gao/SCImageClassify.
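
The abstract does not say how an expression profile is mapped to pixels, so the following is only a minimal sketch of one plausible conversion, not the scIC implementation. It assumes a per-cell vector of gene counts, applies a log transform and min-max scaling, pads the vector with zeros to the next perfect square, and reshapes it into a grayscale image. The function name `expression_to_image` and every preprocessing choice here are hypothetical.

```python
import math

import numpy as np


def expression_to_image(counts: np.ndarray) -> np.ndarray:
    """Map one cell's gene-count vector to a square 8-bit grayscale image.

    Assumed scheme (not taken from the paper): log1p transform, min-max
    scaling to [0, 1], zero padding to the next perfect square, reshape.
    """
    counts = np.asarray(counts, dtype=np.float64)
    if counts.ndim != 1 or counts.size == 0:
        raise ValueError("expected a non-empty 1-D vector of per-gene counts")

    # Compress the dynamic range of raw counts.
    logged = np.log1p(counts)

    # Scale to [0, 1]; a constant vector maps to all zeros.
    low, high = logged.min(), logged.max()
    if high > low:
        scaled = (logged - low) / (high - low)
    else:
        scaled = np.zeros_like(logged)

    # Smallest side length whose square holds every gene.
    side = math.isqrt(scaled.size - 1) + 1

    # Pad with zeros so the vector fills the square exactly.
    padded = np.zeros(side * side, dtype=np.float64)
    padded[: scaled.size] = scaled

    return np.round(padded.reshape(side, side) * 255).astype(np.uint8)


# Toy example: 10 genes become a 4 x 4 image (6 padding pixels).
toy_counts = np.array([0, 3, 10, 0, 1, 250, 7, 0, 2, 5])
image = expression_to_image(toy_counts)
print(image.shape)  # (4, 4)
```

An image produced this way could be passed to any standard image classifier, such as a CNN or a ResNet variant. Gene ordering determines which pixels are neighbours, so a real pipeline would need a deliberate, fixed ordering across all cells.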
About the journal:
Hereditas is a national academic journal sponsored by the Institute of Genetics and Developmental Biology of the Chinese Academy of Sciences and the Chinese Society of Genetics and published by Science Press. It is a Chinese core journal and a Chinese high-quality scientific journal. The journal mainly publishes innovative research papers in the fields of genetics, genomics, cell biology, developmental biology, biological evolution, genetic engineering and biotechnology; new technologies and new methods; monographs and reviews on hot issues in the discipline; academic debates and discussions; experience in genetics teaching; introductions to famous geneticists at home and abroad; genetic counseling; information on academic conferences at home and abroad, etc. Main columns: review, frontier focus, research report, technology and method, resources and platform, experimental operation guide, genetic resources, genetics teaching, scientific news, etc.