Ethalinda K S Cannon, David C Molik, Adam J Wright, Huiting Zhang, Loren Honaas, Kapeel Chougule, Sarah Dyer
Genetics, 2025-01-15. DOI: https://doi.org/10.1093/genetics/iyaf006
Guidelines for Gene and Genome Assembly Nomenclature.
The rapid increase in the number of reference-quality genome assemblies presents significant new opportunities for genomic research. However, the absence of standardized naming conventions for genome assemblies and annotations across datasets creates substantial challenges. Inconsistent naming hinders the identification of correct assemblies, complicates the integration of bioinformatics pipelines, and makes it difficult to link assemblies across multiple resources. To address this, we developed a specification for standardizing the naming of reference genome assemblies, to improve consistency across datasets and facilitate interoperability. This specification was created with FAIR (Findable, Accessible, Interoperable, and Reusable) practices in mind, ensuring that reference assemblies are easier to locate, access, and reuse across research communities. Additionally, it has been designed to comply with primary genomic data repositories, including members of the International Nucleotide Sequence Database Collaboration (INSDC) consortium, ensuring compatibility with widely used databases. While initially tailored to the agricultural genomics community, the specification is adaptable for use across different taxa. Widespread adoption of this standardized nomenclature would streamline assembly management, better enable cross-species analyses, and improve the reproducibility of research. It would also enhance natural language processing applications that depend on consistent reference assembly names in genomic literature, promoting greater integration and automated analysis of genomic data. This is a good time to consider more consistent genomic data nomenclature as many research communities and data resources are now finding themselves juggling multiple datasets from multiple data providers.
About the journal:
GENETICS is published by the Genetics Society of America, a scholarly society that seeks to deepen our understanding of the living world by advancing our understanding of genetics. Since 1916, GENETICS has published high-quality, original research presenting novel findings bearing on genetics and genomics. The journal publishes empirical studies of organisms ranging from microbes to humans, as well as theoretical work.
While it has an illustrious history, GENETICS has changed along with the communities it serves: it is not your mentor's journal.
The editors make decisions quickly – in around 30 days – without sacrificing the excellence and scholarship for which the journal has long been known. GENETICS is a peer-reviewed, peer-edited journal, with an international reach and increasing visibility and impact. All editorial decisions are made through collaboration of at least two editors who are practicing scientists.
GENETICS is constantly innovating: expanded types of content include Reviews, Commentary (current issues of interest to geneticists), Perspectives (historical), Primers (to introduce primary literature into the classroom), Toolbox Reviews, plus YeastBook, FlyBook, and WormBook (coming spring 2016). For particularly time-sensitive results, we publish Communications. As part of our mission to serve our communities, we've published thematic collections, including Genomic Selection, Multiparental Populations, Mouse Collaborative Cross, and the Genetics of Sex.