FEEDS, the Food wastE biopEptiDe claSsifier: From microbial genomes and substrates to biopeptides function

Current Research in Biotechnology (IF 3.6; JCR Q2, Biotechnology & Applied Microbiology) | Pub Date: 2024-01-01 | DOI: 10.1016/j.crbiot.2024.100186

Victor Borin Centurion, Edoardo Bizzotto, Stefano Tonini, Pasquale Filannino, Raffaella Di Cagno, Guido Zampieri, Stefano Campanaro


Citations: 0

Abstract



FEEDS, the Food wastE biopEptiDe claSsifier: From microbial genomes and substrates to biopeptides function

The production of biopeptides from food waste through microbial fermentation faces challenges arising from the diverse proteolytic abilities of microorganisms and from substrate variability, both of which affect the quality and yield of the generated biopeptides. To address these challenges, preliminary in silico bioinformatics analyses play a crucial role in evaluating suitable substrates and proteases for the fermentation process. However, existing tools lack comprehensive predictive capabilities for relevant proteases, substrate performance assessment, and final biopeptide family characterization. To overcome these limitations, we developed FEEDS (Food wastE biopEptiDe claSsifier), a novel biopeptide prediction and classification tool. FEEDS predicts biopeptide families based on microbial genome protease profiles and substrate composition during proteolysis. The tool also employs a machine learning approach for functional biopeptide classification. Results from testing on 1000 microbial genomes demonstrate the effectiveness of biopeptide classification, particularly in categorizing peptides derived from substrates such as Hordeum vulgare and Vitis vinifera seed storage proteins. In addition to biopeptide classification, our study examines the distinctive protease profiles of bacterial and yeast genomes. Bacterial genomes exhibited 60 to 100 proteases across 40 to 55 families. In contrast, yeast genomes displayed a more evenly distributed pattern, with 150 to 160 protease-encoding genes across 60 to 67 families, surpassing bacterial counts. Remarkably, a substantial portion of yeast proteases (~66%) was secreted. Moreover, our integration of a machine learning methodology within the FEEDS pipeline proved highly effective, achieving over 80% accuracy in predicting the function of peptides derived from seed storage proteins. Notably, peptide sequences longer than 20 amino acids consistently displayed a higher probability of correct assignment than shorter ones.
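To make the idea of in silico proteolysis concrete, the following is a minimal sketch, not the FEEDS implementation. It assumes a single trypsin-like cleavage rule (cut after K or R unless the next residue is P) as a stand-in for the genome-derived protease profiles described in the abstract, and it uses an invented toy sequence. The function names `digest` and `split_by_length` and the 20-residue threshold parameter are illustrative assumptions; the threshold only mirrors the length cut-off mentioned in the abstract.

```python
def digest(sequence: str, min_len: int = 2) -> list[str]:
    """Cleave a protein sequence with a trypsin-like rule.

    Assumption: cut after K or R unless the next residue is P.
    This single rule stands in for a full protease profile.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep whatever remains after the last cleavage site.
    if start < len(sequence):
        peptides.append(sequence[start:])
    # Drop fragments shorter than min_len residues.
    return [p for p in peptides if len(p) >= min_len]


def split_by_length(peptides: list[str], threshold: int = 20):
    """Separate peptides longer than `threshold` residues from the rest."""
    long_peptides = [p for p in peptides if len(p) > threshold]
    short_peptides = [p for p in peptides if len(p) <= threshold]
    return long_peptides, short_peptides


# Hypothetical toy substrate, not a real seed storage protein.
toy_substrate = "MAK" + "G" * 25 + "R" + "PEPTIDEK"
toy_peptides = digest(toy_substrate)
toy_long, toy_short = split_by_length(toy_peptides)
print(toy_peptides)
print(len(toy_long), "long;", len(toy_short), "short")
```

In a real pipeline the peptides from each bin would then be passed to a trained classifier; that step is omitted here because the abstract does not specify the model or its features.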


Source journal

Journal: Current Research in Biotechnology
Subject area: Biochemistry, Genetics and Molecular Biology - Biotechnology
CiteScore: 6.70
Self-citation rate: 3.60%
Articles published: (not listed)
Review time: 38 days

About the journal: Current Research in Biotechnology (CRBIOT) is a new primary research, gold open access journal from Elsevier. CRBIOT publishes original papers, reviews, and short communications (including viewpoints and perspectives) resulting from research in biotechnology and biotech-associated disciplines. The journal is peer-reviewed, and upon acceptance all articles are permanently and freely available. It is a companion to the highly regarded review journal Current Opinion in Biotechnology (2018 CiteScore 8.450) and is part of the Current Opinion and Research (CO+RE) suite of journals. All CO+RE journals leverage the Current Opinion legacy of editorial excellence, high impact, and global reach to ensure they are a widely read resource that is integral to scientists' workflow.