Inductive graph neural network framework for imputation of single-cell RNA sequencing data

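Computers & Chemical Engineering, Volume 195, Article 109031. Pub Date: 2025-02-06. DOI: 10.1016/j.compchemeng.2025.109031. IF 3.9; Zone 2 (Engineering & Technology); JCR Q2 (Computer Science, Interdisciplinary Applications). URL: https://www.sciencedirect.com/science/article/pii/S0098135425000353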

Boneshwar V K, Deepesh Agarwal, Bala Natarajan, Babji Srinivasan


Citations: 0

Abstract

Single-cell RNA sequencing (scRNA-seq) has transformed biological research, enabling detailed analysis of disease pathways, cellular differentiation, and immune responses at the level of individual cells. However, the noisy and sparse nature of scRNA-seq datasets often impedes accurate downstream analyses. Cell clustering and gene imputation are foundational tasks in harnessing scRNA-seq data for complex biological insights. While various graph-based methods have been developed to improve imputation and clustering accuracy, traditional transductive models require the entire graph during training, which limits computational efficiency on large biological networks. This study introduces a novel inductive framework that learns relationships among graph nodes by using subgraphs rather than full neighbor sets to generate node embeddings, significantly reducing computational demands while maintaining robust performance. The proposed model achieves up to 60% improvement in Silhouette score, 14.9% in Adjusted Rand Index, 48% in runtime, and 4.5% in $L_1$ median error over baseline models, validating the effectiveness of inductive graph learning. The framework is evaluated on three diverse scRNA-seq datasets: GSE75748 (progenitor cell types derived from human embryonic stem cells (hESCs)), GSE131928 (adult and pediatric IDH-wildtype glioblastomas (GBM)), and Goolam et al. (blastomeres from early-stage Mus musculus (mouse) embryos collected at the 2-cell, 4-cell, 8-cell, 16-cell, and 32-cell stages of preimplantation development). On these datasets it demonstrates scalability and adaptability, offering a reliable approach for future applications in trajectory inference and gene pathway analysis.
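The abstract does not specify the model architecture, so the following is only a minimal sketch of the general idea it names: building a node embedding from a sampled subgraph instead of the full neighbor set. It assumes a GraphSAGE-style mean aggregator with no learned weights; the names `sample_neighbors`, `embed_node`, and the toy `features` and `adjacency` dictionaries are hypothetical and are not taken from the paper.

```python
import random

def sample_neighbors(adjacency, node, num_samples, rng):
    """Pick at most num_samples neighbors of node (a small sampled subgraph)."""
    neigh = adjacency[node]
    if len(neigh) <= num_samples:
        return list(neigh)
    return rng.sample(neigh, num_samples)

def mean_vector(vectors):
    """Element-wise mean of equally sized feature vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def embed_node(features, adjacency, node, num_samples, rng):
    """Concatenate a node's own features with the mean of its sampled neighbors.

    Assumption: a trained model would additionally apply learned weight
    matrices and a nonlinearity; they are omitted to keep the sketch short.
    """
    picked = sample_neighbors(adjacency, node, num_samples, rng)
    if picked:
        agg = mean_vector([features[j] for j in picked])
    else:
        agg = [0.0] * len(features[node])
    return features[node] + agg

# Toy cell graph: node -> expression-like feature vector (made-up values).
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0],
            3: [0.0, 0.0], 4: [2.0, 2.0]}
adjacency = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0], 4: [0]}

rng = random.Random(0)
emb0 = embed_node(features, adjacency, 0, 2, rng)  # uses 2 of node 0's 4 neighbors

# Inductive use: a cell that was not in the original graph is embedded
# from its own features and local neighborhood, without rebuilding the graph.
features[5] = [3.0, 1.0]
adjacency[5] = [1, 4]
emb5 = embed_node(features, adjacency, 5, 2, rng)
print(emb5)  # [3.0, 1.0, 1.0, 1.5]
```

Because each embedding depends only on a fixed-size sample of neighbors, the per-node cost stays bounded even when some cells have very large neighborhoods. That bounded cost is the usual motivation for sampling-based inductive methods over transductive ones that need the whole graph.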


Source journal

Computers & Chemical Engineering (Engineering & Technology; Engineering: Chemical)

CiteScore: 8.70

Self-citation rate: 14.00%

Publications: 374

Review time: 70 days

Journal description: Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.