LncLocFormer: a Transformer-based deep learning model for multi-label lncRNA subcellular localization prediction by using localization-specific attention mechanism

IF 4.4 | Zone 3 (Biology) | JCR Q1, Biochemical Research Methods | Bioinformatics | Pub Date: 2023-12-18 | DOI: 10.1093/bioinformatics/btad752
Min Zeng, Yifan Wu, Yiming Li, Rui Yin, Chengqian Lu, Junwen Duan, Min Li
Citations: 0

Abstract

Motivation: There is mounting evidence that the subcellular localization of lncRNAs can provide valuable insights into their biological functions. In the real world of transcriptomes, lncRNAs are usually localized in multiple subcellular localizations. Furthermore, lncRNAs have specific localization patterns for different subcellular localizations. Although several computational methods have been developed to predict the subcellular localization of lncRNAs, few of them are designed for lncRNAs that have multiple subcellular localizations, and none of them take motif specificity into consideration.

Results: In this study, we proposed a novel deep learning model, called LncLocFormer, which uses only lncRNA sequences to predict multi-label lncRNA subcellular localization. LncLocFormer utilizes 8 Transformer blocks to model long-range dependencies within the lncRNA sequence and share information across the lncRNA sequence. To exploit the relationship between different subcellular localizations and find distinct localization patterns for different subcellular localizations, LncLocFormer employs a localization-specific attention mechanism. The results demonstrate that LncLocFormer outperforms existing state-of-the-art predictors on the hold-out test set. Furthermore, we conducted a motif analysis and found LncLocFormer can capture known motifs. Ablation studies confirmed the contribution of the localization-specific attention mechanism in improving the prediction performance.

Availability: The LncLocFormer web server is available at http://csuligroup.com:9000/LncLocFormer. The source code can be obtained from https://github.com/CSUBioGroup/LncLocFormer.

Supplementary information: Supplementary data are available at Bioinformatics online.
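To make the idea of localization-specific attention with multi-label output more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation (their code is in the GitHub repository named in the abstract). It assumes that each localization has its own query vector, that attention weights come from a dot product followed by a softmax over sequence positions, and that each label gets an independent sigmoid output. The names `label_specific_attention`, `predict`, `label_queries`, `out_weights`, and the sizes `SEQ_LEN`, `DIM`, `NUM_LABELS` are all hypothetical, and the `hidden` vectors are random stand-ins for the outputs of the Transformer blocks.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_specific_attention(hidden, label_queries):
    """Pool a sequence once per label.

    hidden:        SEQ_LEN x DIM position embeddings (stand-in for encoder output).
    label_queries: NUM_LABELS x DIM, one hypothetical query vector per localization.
    Returns (pooled, weights): NUM_LABELS x DIM and NUM_LABELS x SEQ_LEN.
    """
    dim = len(hidden[0])
    pooled, weights = [], []
    for query in label_queries:
        # Dot-product score of this label's query against every position.
        scores = [sum(q * h for q, h in zip(query, pos)) for pos in hidden]
        attn = softmax(scores)
        # Attention-weighted average of the position embeddings.
        context = [
            sum(attn[t] * hidden[t][j] for t in range(len(hidden)))
            for j in range(dim)
        ]
        pooled.append(context)
        weights.append(attn)
    return pooled, weights


def predict(hidden, label_queries, out_weights, out_bias):
    """Independent sigmoid per label, so several labels can be active at once."""
    pooled, weights = label_specific_attention(hidden, label_queries)
    probs = []
    for context, w_out, bias in zip(pooled, out_weights, out_bias):
        logit = sum(c * w for c, w in zip(context, w_out)) + bias
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs, weights


# Toy run with random values; all sizes are illustrative assumptions.
random.seed(0)
SEQ_LEN, DIM, NUM_LABELS = 6, 8, 4
hidden = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(SEQ_LEN)]
label_queries = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_LABELS)]
out_weights = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_LABELS)]
out_bias = [0.0] * NUM_LABELS

probs, weights = predict(hidden, label_queries, out_weights, out_bias)
print("per-label probabilities:", [round(p, 3) for p in probs])
```

Because each label has its own attention row in `weights`, the positions that matter for one localization can differ from those that matter for another, which is the kind of per-label signal a motif analysis could inspect. Using a sigmoid per label rather than a single softmax across labels is what allows a sequence to be assigned to several localizations at the same time.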
Source journal: Bioinformatics (Biology: Biochemical Research Methods)
CiteScore: 11.20
Self-citation rate: 5.20%
Articles published: 753
Review time: 2.1 months
About the journal: The leading journal in its field, Bioinformatics publishes the highest quality scientific papers and review articles of interest to academic and industrial researchers. Its main focus is on new developments in genome bioinformatics and computational biology. Two distinct sections within the journal, Discovery Notes and Application Notes, focus on shorter papers; the former reports biologically interesting discoveries made using computational methods, and the latter explores the applications used for experiments.