Eric W Sayers, Mark Cavanaugh, Linda Frisse, Kim D Pruitt, Valerie A Schneider, Beverly A Underwood, Linda Yankie, Ilene Karsch-Mizrachi
{"title":"GenBank 2025 update.","authors":"Eric W Sayers, Mark Cavanaugh, Linda Frisse, Kim D Pruitt, Valerie A Schneider, Beverly A Underwood, Linda Yankie, Ilene Karsch-Mizrachi","doi":"10.1093/nar/gkae1114","DOIUrl":null,"url":null,"abstract":"<p><p>GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public data repository that contains 34 trillion base pairs from over 4.7 billion nucleotide sequences for 581 000 formally described species. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. We summarize the content of the database in 2025 and recent updates such as accelerated processing of influenza sequences and the ability to upload feature tables to Submission Portal for messenger RNA sequences. We provide an overview of the web, application programming and command-line interfaces that allow users to access GenBank data. We also discuss the importance of creating BioProject and BioSample records during submissions, particularly for viruses and metagenomes. Finally, we summarize educational materials and recent community outreach efforts.</p>","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":" ","pages":""},"PeriodicalIF":16.6000,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nucleic Acids Research","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/nar/gkae1114","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMISTRY & MOLECULAR BIOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public data repository that contains 34 trillion base pairs from over 4.7 billion nucleotide sequences for 581 000 formally described species. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. We summarize the content of the database in 2025 and recent updates such as accelerated processing of influenza sequences and the ability to upload feature tables to Submission Portal for messenger RNA sequences. We provide an overview of the web, application programming and command-line interfaces that allow users to access GenBank data. We also discuss the importance of creating BioProject and BioSample records during submissions, particularly for viruses and metagenomes. Finally, we summarize educational materials and recent community outreach efforts.
期刊介绍:
Nucleic Acids Research (NAR) is a scientific journal that publishes research on various aspects of nucleic acids and proteins involved in nucleic acid metabolism and interactions. It covers areas such as chemistry and synthetic biology, computational biology, gene regulation, chromatin and epigenetics, genome integrity, repair and replication, genomics, molecular biology, nucleic acid enzymes, RNA, and structural biology. The journal also includes a Survey and Summary section for brief reviews. Additionally, each year, the first issue is dedicated to biological databases, and an issue in July focuses on web-based software resources for the biological community. Nucleic Acids Research is indexed by several services including Abstracts on Hygiene and Communicable Diseases, Animal Breeding Abstracts, Agricultural Engineering Abstracts, Agbiotech News and Information, BIOSIS Previews, CAB Abstracts, and EMBASE.