Chromosome-level genome assembly of the bay scallop Argopecten irradians.

Scientific Data (IF 6.9; Zone 2, multidisciplinary journals; JCR Q1, Multidisciplinary Sciences). Pub Date: 2024-09-28. DOI: 10.1038/s41597-024-03904-x

Denis Grouzdev, Emmanuelle Pales Espinosa, Stephen Tettelbach, Sarah Farhat, Arnaud Tanguy, Isabelle Boutet, Nadège Guiglielmoni, Jean-François Flot, Harrison Tobi, Bassem Allam

Scientific Data 11(1), 1057 (2024). Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11439060/pdf/

Citations: 0

Abstract

The bay scallop, Argopecten irradians, is a species of major commercial, cultural, and ecological importance. It is endemic to the eastern coast of the United States, but has also been introduced to China, where it supports a significant aquaculture industry. Here, we provide an annotated chromosome-level reference genome assembly for the bay scallop, assembled using PacBio and Hi-C data. The total genome size is 845.9 Mb, distributed over 1,503 scaffolds with a scaffold N50 of 44.3 Mb. The majority (92.9%) of the assembled genome is contained within the 16 largest scaffolds, corresponding to the 16 chromosomes confirmed by Hi-C analysis. The assembly also includes the complete mitochondrial genome. Approximately 36.2% of the genome consists of repetitive elements. BUSCO analysis showed a completeness of 96.2%. We identified 33,772 protein-coding genes. This genome assembly will be a valuable resource for future research on evolutionary dynamics and adaptive mechanisms, and it will support genome-assisted breeding, contributing to the conservation and management of this iconic species in the face of environmental and pathogenic challenges.
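For readers unfamiliar with the contiguity figures quoted in the abstract, the following is a minimal sketch of how a scaffold N50 and the share of an assembly held in its k largest scaffolds are conventionally computed. The function names `n50` and `top_k_fraction` and the toy scaffold lengths are illustrative assumptions; they are not taken from the paper or its pipeline, and the toy values are not the bay scallop data.

```python
def n50(lengths):
    """Return the N50: the largest length L such that scaffolds of
    length >= L together cover at least half of the total assembly."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


def top_k_fraction(lengths, k):
    """Return the fraction of the total assembly length contained in
    the k largest scaffolds."""
    ordered = sorted(lengths, reverse=True)
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical toy scaffold lengths in Mb (NOT the bay scallop assembly).
toy_lengths = [50, 40, 30, 20, 10]

print(n50(toy_lengths))                # N50 of the toy assembly
print(top_k_fraction(toy_lengths, 2))  # share held by the 2 largest scaffolds
```

Applied to the real scaffold lengths of an assembly, the same two calculations would yield the kind of statistics reported above (a scaffold N50 of 44.3 Mb and 92.9% of the assembly in the 16 largest scaffolds).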

Source journal: Scientific Data, Social Sciences-Education

CiteScore: 11.20

Self-citation rate: 4.10%

Articles published: 689

Review time: 16 weeks

About the journal: Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across the natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data. The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than to test hypotheses or present new interpretations, methods, or in-depth analyses.