PanKB: An interactive microbial pangenome knowledgebase for research, biotechnological innovation, and knowledge mining.

IF 16.6 2区生物学 Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Nucleic Acids Research Pub Date : 2025-01-06 DOI:10.1093/nar/gkae1042

Binhuan Sun, Liubov Pashkova, Pascal Aldo Pieters, Archana Sanjay Harke, Omkar Satyavan Mohite, Alberto Santos, Daniel C Zielinski, Bernhard O Palsson, Patrick Victor Phaneuf

{"title":"PanKB: An interactive microbial pangenome knowledgebase for research, biotechnological innovation, and knowledge mining.","authors":"Binhuan Sun, Liubov Pashkova, Pascal Aldo Pieters, Archana Sanjay Harke, Omkar Satyavan Mohite, Alberto Santos, Daniel C Zielinski, Bernhard O Palsson, Patrick Victor Phaneuf","doi":"10.1093/nar/gkae1042","DOIUrl":null,"url":null,"abstract":"<p><p>The exponential growth of microbial genome data presents unprecedented opportunities for unlocking the potential of microorganisms. The burgeoning field of pangenomics offers a framework for extracting insights from this big biological data. Recent advances in microbial pangenomic research have generated substantial data and literature, yielding valuable knowledge across diverse microbial species. PanKB (pankb.org), a knowledgebase designed for microbial pangenomics research and biotechnological applications, was built to capitalize on this wealth of information. PanKB currently includes 51 pangenomes from 8 industrially relevant microbial families, comprising 8402 genomes, over 500 000 genes and over 7M mutations. To describe this data, PanKB implements four main components: (1) Interactive pangenomic analytics to facilitate exploration, intuition, and potential discoveries; (2) Alleleomic analytics, a pangenomic-scale analysis of variants, providing insights into intra-species sequence variation and potential mutations for applications; (3) A global search function enabling broad and deep investigations across pangenomes to power research and bioengineering workflows; (4) A bibliome of 833 open-access pangenomic papers and an interface with an LLM that can answer in-depth questions using its knowledge. PanKB empowers researchers and bioengineers to harness the potential of microbial pangenomics and serves as a valuable resource bridging the gap between pangenomic data and practical applications.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":"D806-D818"},"PeriodicalIF":16.6000,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11701538/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nucleic Acids Research","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/nar/gkae1042","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMISTRY & MOLECULAR BIOLOGY","Score":null,"Total":0}

引用次数: 0

Abstract

The exponential growth of microbial genome data presents unprecedented opportunities for unlocking the potential of microorganisms. The burgeoning field of pangenomics offers a framework for extracting insights from this big biological data. Recent advances in microbial pangenomic research have generated substantial data and literature, yielding valuable knowledge across diverse microbial species. PanKB (pankb.org), a knowledgebase designed for microbial pangenomics research and biotechnological applications, was built to capitalize on this wealth of information. PanKB currently includes 51 pangenomes from 8 industrially relevant microbial families, comprising 8402 genomes, over 500 000 genes and over 7M mutations. To describe this data, PanKB implements four main components: (1) Interactive pangenomic analytics to facilitate exploration, intuition, and potential discoveries; (2) Alleleomic analytics, a pangenomic-scale analysis of variants, providing insights into intra-species sequence variation and potential mutations for applications; (3) A global search function enabling broad and deep investigations across pangenomes to power research and bioengineering workflows; (4) A bibliome of 833 open-access pangenomic papers and an interface with an LLM that can answer in-depth questions using its knowledge. PanKB empowers researchers and bioengineers to harness the potential of microbial pangenomics and serves as a valuable resource bridging the gap between pangenomic data and practical applications.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

PanKB：用于研究、生物技术创新和知识挖掘的交互式微生物泛基因组知识库。

微生物基因组数据的指数级增长为发掘微生物的潜力提供了前所未有的机遇。蓬勃发展的泛基因组学领域为从这一生物大数据中获取洞察力提供了一个框架。微生物庞基因组研究的最新进展产生了大量数据和文献，为不同微生物物种提供了宝贵的知识。PanKB（pankb.org）是一个专为微生物泛基因组学研究和生物技术应用而设计的知识库，它的建立就是为了充分利用这些丰富的信息。PanKB 目前包括来自 8 个工业相关微生物家族的 51 个庞基因组，包含 8402 个基因组、500 000 多个基因和 700 多万个突变。为了描述这些数据，PanKB 实现了四个主要组件：(1)交互式泛基因组分析，促进探索、直觉和潜在发现；(2)等位基因分析，泛基因组规模的变异分析，洞察种内序列变异和潜在突变的应用；(3)全局搜索功能，实现泛基因组的广泛和深入研究，为研究和生物工程工作流提供动力；(4)包含 833 篇开放获取的泛基因组论文的书目，以及可利用其知识回答深入问题的 LLM 接口。PanKB 使研究人员和生物工程人员能够利用微生物庞基因组学的潜力，是弥合庞基因组学数据与实际应用之间差距的宝贵资源。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Nucleic Acids Research 生物-生化与分子生物学

CiteScore

27.10

自引率

4.70%

发文量

1057

审稿时长

2 months

期刊介绍： Nucleic Acids Research (NAR) is a scientific journal that publishes research on various aspects of nucleic acids and proteins involved in nucleic acid metabolism and interactions. It covers areas such as chemistry and synthetic biology, computational biology, gene regulation, chromatin and epigenetics, genome integrity, repair and replication, genomics, molecular biology, nucleic acid enzymes, RNA, and structural biology. The journal also includes a Survey and Summary section for brief reviews. Additionally, each year, the first issue is dedicated to biological databases, and an issue in July focuses on web-based software resources for the biological community. Nucleic Acids Research is indexed by several services including Abstracts on Hygiene and Communicable Diseases, Animal Breeding Abstracts, Agricultural Engineering Abstracts, Agbiotech News and Information, BIOSIS Previews, CAB Abstracts, and EMBASE.