LncLocFormer: a Transformer-based deep learning model for multi-label lncRNA subcellular localization prediction by using localization-specific attention mechanism

IF 5.4 | Zone 3 (Biology) | JCR Q1, Biochemical Research Methods | Bioinformatics | Pub Date: 2023-12-18 | DOI: 10.1093/bioinformatics/btad752

Min Zeng, Yifan Wu, Yiming Li, Rui Yin, Chengqian Lu, Junwen Duan, Min Li



Abstract

Motivation: There is mounting evidence that the subcellular localization of lncRNAs can provide valuable insights into their biological functions. In the real world of transcriptomes, lncRNAs are usually localized in multiple subcellular localizations. Furthermore, lncRNAs have specific localization patterns for different subcellular localizations. Although several computational methods have been developed to predict the subcellular localization of lncRNAs, few of them are designed for lncRNAs that have multiple subcellular localizations, and none of them take motif specificity into consideration.

Results: In this study, we proposed a novel deep learning model, called LncLocFormer, which uses only lncRNA sequences to predict multi-label lncRNA subcellular localization. LncLocFormer utilizes 8 Transformer blocks to model long-range dependencies within the lncRNA sequence and share information across the lncRNA sequence. To exploit the relationship between different subcellular localizations and find distinct localization patterns for different subcellular localizations, LncLocFormer employs a localization-specific attention mechanism. The results demonstrate that LncLocFormer outperforms existing state-of-the-art predictors on the hold-out test set. Furthermore, we conducted a motif analysis and found LncLocFormer can capture known motifs. Ablation studies confirmed the contribution of the localization-specific attention mechanism in improving the prediction performance.

Availability: The LncLocFormer web server is available at http://csuligroup.com:9000/LncLocFormer. The source code can be obtained from https://github.com/CSUBioGroup/LncLocFormer.

Supplementary information: Supplementary data are available at Bioinformatics online.
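The idea of a localization-specific attention mechanism can be illustrated with a minimal sketch. This is not the authors' implementation (see the GitHub repository for that); it assumes a generic label-wise attention scheme in which each localization label has its own query vector that attends over the per-position sequence embeddings, so every label gets its own attention map and its own independent sigmoid probability. The function name `label_specific_attention`, the parameter names, the dimensions, and the random inputs are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def label_specific_attention(tokens, label_queries, label_weights, label_biases):
    """Hypothetical label-wise attention pooling (an assumption, not the paper's code).

    tokens:        L position embeddings, each of dimension d
                   (e.g. the output of a stack of Transformer blocks)
    label_queries: K query vectors of dimension d, one per localization label
    label_weights: K output weight vectors of dimension d
    label_biases:  K output biases
    Returns (probs, attn_maps): K independent probabilities and K attention
    distributions over the L sequence positions.
    """
    d = len(tokens[0])
    probs, attn_maps = [], []
    for q, w, b in zip(label_queries, label_weights, label_biases):
        # Each label scores every position with its own query.
        attn = softmax([dot(q, t) / math.sqrt(d) for t in tokens])
        # Attention-weighted average of the position embeddings.
        pooled = [sum(a * t[j] for a, t in zip(attn, tokens)) for j in range(d)]
        # Sigmoid rather than softmax across labels: labels are not exclusive.
        logit = dot(w, pooled) + b
        probs.append(1.0 / (1.0 + math.exp(-logit)))
        attn_maps.append(attn)
    return probs, attn_maps


def random_vectors(n, d, rng):
    """Toy stand-in for learned parameters or encoder outputs."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n)]


# Toy usage: 6 sequence positions, embedding size 4, 3 hypothetical labels.
rng = random.Random(0)
toy_tokens = random_vectors(6, 4, rng)
toy_probs, toy_maps = label_specific_attention(
    toy_tokens,
    random_vectors(3, 4, rng),
    random_vectors(3, 4, rng),
    [0.0, 0.0, 0.0],
)
predicted = [k for k, p in enumerate(toy_probs) if p >= 0.5]  # multi-label decision
```

Because each label keeps its own attention distribution over positions, the per-label maps can in principle be inspected to see which sequence regions drive each prediction, which is the kind of signal a motif analysis would build on.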

Source journal

Bioinformatics (Biology: Biochemical Research Methods)

CiteScore: 11.20
Self-citation rate: 5.20%
Articles published: 753
Review time: 2.1 months

About the journal: The leading journal in its field, Bioinformatics publishes the highest quality scientific papers and review articles of interest to academic and industrial researchers. Its main focus is on new developments in genome bioinformatics and computational biology. Two distinct sections within the journal, Discovery Notes and Application Notes, focus on shorter papers; the former reports biologically interesting discoveries made using computational methods, and the latter explores the applications used for experiments.