MVCLST: A spatial transcriptome data analysis pipeline for cell type classification based on multi-view comparative learning

Methods (IF 4.2; Zone 3, Biology; JCR Q1, Biochemical Research Methods) Pub Date: 2024-11-13 DOI: 10.1016/j.ymeth.2024.11.001

Wei Peng, Zhihao Zhang, Wei Dai, Zhihao Ping, Xiaodong Fu, Li Liu, Lijun Liu, Ning Yu

Methods, volume 232, pages 115-128. Full text: https://www.sciencedirect.com/science/article/pii/S104620232400238X

Citations: 0

Abstract

Recent advances in spatial transcriptomics sequencing technologies provide not only gene expression within individual cells or cell clusters (spots) in a tissue but also the exact location of that expression, along with detailed images of stained tissue sections. Together, these data offer invaluable insights into cell type identification and cell function exploration. However, effectively integrating the gene expression data, spatial location information, and tissue images remains a significant challenge for computational cell classification methods. In this work, we propose MVCLST, a multi-view comparative learning method that analyzes spatial transcriptomics data for accurate cell type classification. MVCLST constructs two views based on gene expression profiles, cell coordinates, and image features. The proposed multi-view approach can significantly enhance the effectiveness of feature extraction while limiting the impact of erroneous information in the tissue image or gene expression data. The model employs four separate encoders to capture the shared and private (view-specific) features of each view. To ensure consistency and facilitate information exchange between the two views, MVCLST incorporates a contrastive learning loss function. The shared and private features extracted from both views are fused using corresponding decoders. Finally, the model uses the Leiden algorithm to cluster the learned features for cell type identification. Additionally, we establish a framework called MVCLST-CCFS for spatial transcriptomics data analysis based on MVCLST and consistent clustering. Our method achieves excellent clustering results on human dorsolateral prefrontal cortex data and mouse brain tissue data. It also outperforms state-of-the-art techniques in the subsequent search for highly variable genes across cell types on mouse olfactory bulb data.
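The abstract does not give the form of the contrastive loss, so the following is only a minimal sketch of one common choice, an InfoNCE-style cross-view loss, and not the authors' implementation. It assumes that each spot's shared embedding in one view should be most similar to the same spot's shared embedding in the other view. The function name `info_nce_loss`, the `temperature` value, and the random embeddings are all hypothetical and serve illustration only.

```python
import numpy as np


def info_nce_loss(z1: np.ndarray, z2: np.ndarray, temperature: float = 0.5) -> float:
    """InfoNCE-style loss; row i of z1 and row i of z2 form the positive pair."""
    # L2-normalise rows so that dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature  # (n_spots, n_spots) cross-view similarities
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive for spot i is the same spot in the other view (the diagonal).
    return float(-np.mean(np.diag(log_prob)))


rng = np.random.default_rng(0)
# Hypothetical shared embeddings for 8 spots in view 1 (assumed data, not from the paper).
shared_view1 = rng.normal(size=(8, 16))
# View 2 agrees with view 1 up to small noise.
aligned_view2 = shared_view1 + 0.01 * rng.normal(size=(8, 16))
# Reversing the row order deliberately mismatches every positive pair.
shuffled_view2 = aligned_view2[::-1]

loss_aligned = info_nce_loss(shared_view1, aligned_view2)
loss_shuffled = info_nce_loss(shared_view1, shuffled_view2)
print(loss_aligned < loss_shuffled)
```

Minimising a loss of this kind pulls the two views' representations of the same spot together and pushes apart representations of different spots, which is one way to obtain the cross-view consistency the abstract describes. The Leiden clustering step is omitted here because it depends on an external graph-clustering library.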

Source journal

Methods (Biology: Biochemical Research Methods)

CiteScore: 9.80

Self-citation rate: 2.10%

Articles published: 222

Review time: 11.3 weeks

Journal description: Methods focuses on rapidly developing techniques in the experimental biological and medical sciences. Each topical issue, organized by a guest editor who is an expert in the area covered, consists solely of invited quality articles by specialist authors, many of them reviews. Issues are devoted to specific technical approaches with emphasis on clear detailed descriptions of protocols that allow them to be reproduced easily. The background information provided enables researchers to understand the principles underlying the methods; other helpful sections include comparisons of alternative methods giving the advantages and disadvantages of particular methods, guidance on avoiding potential pitfalls, and suggestions for troubleshooting.