DeepES: deep learning-based enzyme screening to identify orphan enzyme genes.

Keisuke Hirota, Felix Salim, Takuji Yamada
Bioinformatics (Oxford, England). Published 2025-03-04. DOI: https://doi.org/10.1093/bioinformatics/btaf053
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11881691/pdf/

Abstract

Motivation: Advances in sequencing technology have led to the determination of large numbers of protein sequences, and large enzyme databases are now available. Although many computational tools for enzyme annotation have been developed, sequence information is unavailable for many enzymes, known as orphan enzymes. These orphan enzymes hinder sequence similarity-based functional annotation, leaving gaps in our understanding of the association between sequences and enzymatic reactions.

Results: We therefore developed DeepES, a deep learning-based enzyme screening tool that identifies orphan enzyme genes by focusing on biosynthetic gene clusters and reaction classes. DeepES takes protein sequences as input and evaluates whether the input genes contain a biosynthetic gene cluster of interest by integrating the outputs of binary classifiers, one per reaction class. Validation results suggest that DeepES captures functional similarity between protein sequences and can be used to explore orphan enzyme genes. Applying DeepES to 4744 metagenome-assembled genomes, we identified candidate genes for 236 orphan enzymes, including those involved in short-chain fatty acid production, a characteristic pathway of human gut bacteria.
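
The idea of integrating per-reaction-class classifier outputs over neighbouring genes can be illustrated with a minimal sketch. This is not the DeepES implementation: the abstract does not state the integration rule, so the window-based scoring below (best probability per class within a window, then the mean across classes) is an assumption made for illustration, as are the function names and the placeholder class labels `RC_A` and `RC_B`. The per-gene probabilities stand in for the outputs of hypothetical binary classifiers.

```python
from typing import Dict, List, Sequence, Tuple


def score_windows(
    gene_probs: Sequence[Dict[str, float]],
    target_classes: Sequence[str],
    window: int,
) -> List[Tuple[int, float]]:
    """Score every window of consecutive genes against a set of reaction classes.

    gene_probs holds, for each gene in genomic order, a mapping from
    reaction class to the probability given by that class's binary classifier.
    The scoring rule is an illustrative assumption, not the published method.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not target_classes:
        raise ValueError("target_classes must not be empty")

    results: List[Tuple[int, float]] = []
    for start in range(max(len(gene_probs) - window + 1, 0)):
        genes = gene_probs[start:start + window]
        # For each reaction class, keep the highest probability in the window.
        best = [max(g.get(c, 0.0) for g in genes) for c in target_classes]
        # Integrate the per-class evidence into one score (simple mean).
        results.append((start, sum(best) / len(best)))
    return results


def best_window(
    gene_probs: Sequence[Dict[str, float]],
    target_classes: Sequence[str],
    window: int,
) -> Tuple[int, float]:
    """Return the (start_index, score) of the highest-scoring window."""
    scored = score_windows(gene_probs, target_classes, window)
    if not scored:
        raise ValueError("fewer genes than the window size")
    return max(scored, key=lambda item: item[1])


# Hypothetical classifier outputs for four consecutive genes.
example_genes = [
    {"RC_A": 0.1, "RC_B": 0.2},
    {"RC_A": 0.9, "RC_B": 0.1},
    {"RC_A": 0.2, "RC_B": 0.8},
    {"RC_A": 0.1, "RC_B": 0.1},
]
start, score = best_window(example_genes, ["RC_A", "RC_B"], window=2)
print(f"best window starts at gene {start}, score {score:.2f}")
```

In this toy input the window covering the second and third genes scores highest, because each of the two placeholder reaction classes has a confident hit there. A real screen would also need a threshold on the score and a rule preventing one gene from satisfying several classes, neither of which is specified here.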

Availability and implementation: DeepES is available at https://github.com/yamada-lab/DeepES. Model weights and the candidate genes are available at Zenodo (https://doi.org/10.5281/zenodo.11123900).

Supplementary information: Supplementary data are available at Bioinformatics online.