利用组织学图像进行空间基因表达预测的多模态对比学习。

IF 6.8 2区生物学 Q1 BIOCHEMICAL RESEARCH METHODS Briefings in bioinformatics Pub Date : 2024-09-23 DOI:10.1093/bib/bbae551

Wenwen Min, Zhiceng Shi, Jun Zhang, Jun Wan, Changmiao Wang

{"title":"利用组织学图像进行空间基因表达预测的多模态对比学习。","authors":"Wenwen Min, Zhiceng Shi, Jun Zhang, Jun Wan, Changmiao Wang","doi":"10.1093/bib/bbae551","DOIUrl":null,"url":null,"abstract":"In recent years, the advent of spatial transcriptomics (ST) technology has unlocked unprecedented opportunities for delving into the complexities of gene expression patterns within intricate biological systems. Despite its transformative potential, the prohibitive cost of ST technology remains a significant barrier to its widespread adoption in large-scale studies. An alternative, more cost-effective strategy involves employing artificial intelligence to predict gene expression levels using readily accessible whole-slide images stained with Hematoxylin and Eosin (H&E). However, existing methods have yet to fully capitalize on multimodal information provided by H&E images and ST data with spatial location. In this paper, we propose mclSTExp, a multimodal contrastive learning with Transformer and Densenet-121 encoder for Spatial Transcriptomics Expression prediction. We conceptualize each spot as a \"word\", integrating its intrinsic features with spatial context through the self-attention mechanism of a Transformer encoder. This integration is further enriched by incorporating image features via contrastive learning, thereby enhancing the predictive capability of our model. We conducted an extensive evaluation of highly variable genes in two breast cancer datasets and a skin squamous cell carcinoma dataset, and the results demonstrate that mclSTExp exhibits superior performance in predicting spatial gene expression. Moreover, mclSTExp has shown promise in interpreting cancer-specific overexpressed genes, elucidating immune-related genes, and identifying specialized spatial domains annotated by pathologists. Our source code is available at https://github.com/shizhiceng/mclSTExp.","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8000,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multimodal contrastive learning for spatial gene expression prediction using histology images.\",\"authors\":\"Wenwen Min, Zhiceng Shi, Jun Zhang, Jun Wan, Changmiao Wang\",\"doi\":\"10.1093/bib/bbae551\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, the advent of spatial transcriptomics (ST) technology has unlocked unprecedented opportunities for delving into the complexities of gene expression patterns within intricate biological systems. Despite its transformative potential, the prohibitive cost of ST technology remains a significant barrier to its widespread adoption in large-scale studies. An alternative, more cost-effective strategy involves employing artificial intelligence to predict gene expression levels using readily accessible whole-slide images stained with Hematoxylin and Eosin (H&E). However, existing methods have yet to fully capitalize on multimodal information provided by H&E images and ST data with spatial location. In this paper, we propose mclSTExp, a multimodal contrastive learning with Transformer and Densenet-121 encoder for Spatial Transcriptomics Expression prediction. We conceptualize each spot as a \\\"word\\\", integrating its intrinsic features with spatial context through the self-attention mechanism of a Transformer encoder. This integration is further enriched by incorporating image features via contrastive learning, thereby enhancing the predictive capability of our model. We conducted an extensive evaluation of highly variable genes in two breast cancer datasets and a skin squamous cell carcinoma dataset, and the results demonstrate that mclSTExp exhibits superior performance in predicting spatial gene expression. Moreover, mclSTExp has shown promise in interpreting cancer-specific overexpressed genes, elucidating immune-related genes, and identifying specialized spatial domains annotated by pathologists. Our source code is available at https://github.com/shizhiceng/mclSTExp.\",\"PeriodicalId\":9209,\"journal\":{\"name\":\"Briefings in bioinformatics\",\"volume\":\"25 6\",\"pages\":\"\"},\"PeriodicalIF\":6.8000,\"publicationDate\":\"2024-09-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Briefings in bioinformatics\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1093/bib/bbae551\",\"RegionNum\":2,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Briefings in bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/bib/bbae551","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}

引用次数: 0

摘要

近年来，空间转录组学（ST）技术的出现为深入研究复杂生物系统中复杂的基因表达模式带来了前所未有的机遇。尽管空间转录组学技术具有变革性的潜力，但其高昂的成本仍然是阻碍其在大规模研究中广泛应用的一大障碍。另一种更具成本效益的策略是利用人工智能预测基因表达水平，这种方法使用的是易于获取的经苏木精和伊红（H&E）染色的整张切片图像。然而，现有方法尚未充分利用 H&E 图像提供的多模态信息和带有空间位置的 ST 数据。在本文中，我们提出了 mclSTExp，这是一种利用 Transformer 和 Densenet-121 编码器进行多模态对比学习的空间转录组学表达预测方法。我们将每个点概念化为一个 "词"，通过 Transformer 编码器的自我注意机制将其内在特征与空间上下文整合在一起。通过对比学习结合图像特征进一步丰富了这种整合，从而增强了我们模型的预测能力。我们对两个乳腺癌数据集和一个皮肤鳞状细胞癌数据集中的高变异基因进行了广泛评估，结果表明 mclSTExp 在预测空间基因表达方面表现出色。此外，mclSTExp 在解读癌症特异性过表达基因、阐明免疫相关基因以及识别病理学家注释的特殊空间域方面也表现出了良好的前景。我们的源代码见 https://github.com/shizhiceng/mclSTExp。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Multimodal contrastive learning for spatial gene expression prediction using histology images.

In recent years, the advent of spatial transcriptomics (ST) technology has unlocked unprecedented opportunities for delving into the complexities of gene expression patterns within intricate biological systems. Despite its transformative potential, the prohibitive cost of ST technology remains a significant barrier to its widespread adoption in large-scale studies. An alternative, more cost-effective strategy involves employing artificial intelligence to predict gene expression levels using readily accessible whole-slide images stained with Hematoxylin and Eosin (H&E). However, existing methods have yet to fully capitalize on multimodal information provided by H&E images and ST data with spatial location. In this paper, we propose mclSTExp, a multimodal contrastive learning with Transformer and Densenet-121 encoder for Spatial Transcriptomics Expression prediction. We conceptualize each spot as a "word", integrating its intrinsic features with spatial context through the self-attention mechanism of a Transformer encoder. This integration is further enriched by incorporating image features via contrastive learning, thereby enhancing the predictive capability of our model. We conducted an extensive evaluation of highly variable genes in two breast cancer datasets and a skin squamous cell carcinoma dataset, and the results demonstrate that mclSTExp exhibits superior performance in predicting spatial gene expression. Moreover, mclSTExp has shown promise in interpreting cancer-specific overexpressed genes, elucidating immune-related genes, and identifying specialized spatial domains annotated by pathologists. Our source code is available at https://github.com/shizhiceng/mclSTExp.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Briefings in bioinformatics 生物-生化研究方法

CiteScore

13.20

自引率

13.70%

发文量

549

审稿时长

6 months

期刊介绍： Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data. The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.