
GigaByte (Hong Kong, China): latest articles

Jellyfish in Hong Kong: a citizen science dataset.
Pub Date : 2024-05-20 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.125
John Terenzini, Yannan Fan, Melissa Jean-Yi Liu, Laura J Falkenberg

The Hong Kong Jellyfish Project is a citizen science initiative started in early 2021 to enhance our understanding of jellyfish in Hong Kong. Here, we present a dataset of jellyfish sightings collected by citizen scientists from 2021 through 2023 within local waters. Citizen scientists submitted photographs and other data (time, date, and location) using a website, an iNaturalist project, and social media. Sightings were validated using references from the literature. A total of 1,020 usable observations are included in this dataset, showing the occurrence and distribution of jellyfish in Hong Kong in 2021-2023. This dataset is now publicly available for discovery and download in the Global Biodiversity Information Facility database. The data can be used to enhance our understanding of the biodiversity of local marine ecosystems.

Citations: 0
De novo transcriptome assembly and genome annotation of the fat-tailed dunnart (Sminthopsis crassicaudata).
Pub Date : 2024-05-02 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.118
Neke Ibeh, Charles Y Feigin, Stephen R Frankenberg, Davis J McCarthy, Andrew J Pask, Irene Gallego Romero

Marsupials exhibit distinctive modes of reproduction and early development that set them apart from their eutherian counterparts and render them invaluable for comparative studies. However, marsupial genomic resources still lag far behind those of eutherian mammals. We present a series of novel genomic resources for the fat-tailed dunnart (Sminthopsis crassicaudata), a mouse-like marsupial that, due to its ease of husbandry and ex-utero development, is emerging as a laboratory model. We constructed a highly representative multi-tissue de novo transcriptome assembly of dunnart RNA-seq reads spanning 12 tissues. The transcriptome includes 2,093,982 assembled transcripts and has a mammalian transcriptome BUSCO completeness score of 93.3%, the highest amongst currently published marsupial transcriptomes. This global transcriptome, along with ab initio predictions, supported annotation of the existing dunnart genome, revealing 21,622 protein-coding genes. Altogether, these resources will enable wider use of the dunnart as a model marsupial and deepen our understanding of mammalian genome evolution.

Citations: 0
Chromosomal-level genome assembly of golden birdwing Troides aeacus (Felder & Felder, 1860).
Pub Date : 2024-04-25 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.122

The golden birdwing Troides aeacus (Lepidoptera, Papilionidae), a significant species in Asia, faces habitat loss due to urbanization and human activities, necessitating its protection. However, the lack of genomic resources hinders our understanding of its biology and diversity, and impedes conservation efforts based on genetic information or markers. Here, we present the first chromosomal-level genome assembly of T. aeacus using PacBio SMRT and Omni-C scaffolding technologies. The assembled genome (351 Mb) contains 98.94% of the sequences anchored to 30 pseudo-molecules. The genome assembly has high sequence continuity with contig length N50 = 11.67 Mb and L50 = 14, and scaffold length N50 = 12.2 Mb and L50 = 13. A total of 24,946 protein-coding genes were predicted, with high BUSCO score completeness (98.8% and 94.7% of genome and proteome BUSCO, respectively). This genome offers a significant resource for understanding swallowtail butterfly biology and supporting conservation of the species.
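For readers unfamiliar with the contiguity statistics quoted above, the following is a minimal sketch of how N50 and L50 are conventionally defined: sort the sequences from longest to shortest and walk down the list until at least half of the total assembly length is covered. The function name `n50_l50` and the toy lengths are illustrative assumptions, not values or code from the paper.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths.

    N50 is the length of the sequence at which the running total,
    taken from longest to shortest, first reaches half of the
    assembly size. L50 is the number of sequences needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly (made-up lengths, total = 300).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)
```

A larger N50 paired with a smaller L50 indicates that fewer, longer sequences make up most of the assembly, which is the sense in which the abstract describes "high sequence continuity".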

Citations: 0
Chromosomal-level genome assembly of the long-spined sea urchin Diadema setosum (Leske, 1778).
Pub Date : 2024-04-25 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.121

The long-spined sea urchin Diadema setosum is an algal and coral feeder widely distributed in the Indo-Pacific that can cause severe bioerosion on the reef community. However, the lack of genomic information has hindered the study of its ecology and evolution. Here, we report the chromosomal-level genome (885.8 Mb) of the long-spined sea urchin D. setosum using a combination of PacBio long-read sequencing and Omni-C scaffolding technology. The assembled genome contains a scaffold N50 length of 38.3 Mb, 98.1% of complete BUSCO (Geno, metazoa_odb10) genes (the single copy score is 97.8% and the duplication score is 0.3%), and 98.6% of the sequences are anchored to 22 pseudo-molecules/chromosomes. A total of 27,478 gene models were annotated, reaching a total of 28,414 transcripts, including 5,384 tRNA and 23,030 protein-coding genes. The high-quality genome of D. setosum presented here is a valuable resource for the ecological and evolutionary studies of this coral reef-associated sea urchin.

Citations: 0
Chromosome-level genome assembly of the common chiton, Liolophura japonica (Lischke, 1873).
Pub Date : 2024-04-25 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.123

Chitons (Polyplacophora) are marine molluscs that can be found worldwide from cold waters to the tropics, and play important ecological roles in the environment. However, only two chiton genomes have been sequenced to date. The chiton Liolophura japonica (Lischke, 1873) is one of the most abundant polyplacophorans found throughout East Asia. Our PacBio HiFi reads and Omni-C sequencing data resulted in a high-quality near chromosome-level genome assembly of ∼609 Mb with a scaffold N50 length of 37.34 Mb (96.1% BUSCO). A total of 28,233 genes were predicted, including 28,010 protein-coding ones. The repeat content (27.89%) was similar to that of other Chitonidae species and approximately three times lower than that of the Hanleyidae chiton genome. The genomic resources provided by this work will help to expand our understanding of the evolution of molluscs and the ecological adaptation of chitons.

Citations: 0
Genome assembly of the edible jelly fungus Dacryopinax spathularia (Dacrymycetaceae).
Pub Date : 2024-04-25 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.120

The edible jelly fungus Dacryopinax spathularia (Dacrymycetaceae) is wood-decaying and can be commonly found worldwide. It has found application in food additives, given its ability to synthesize long-chain glycolipids, among other uses. In this study, we present the genome assembly of D. spathularia using a combination of PacBio HiFi reads and Omni-C data. The genome size is 29.2 Mb. It has high sequence contiguity and completeness, with a scaffold N50 of 1.925 Mb and a 92.0% BUSCO score. A total of 11,510 protein-coding genes and 474.7 kb repeats (accounting for 1.62% of the genome) were predicted. The D. spathularia genome assembly generated in this study provides a valuable resource for understanding their ecology, such as their wood-decaying capability, their evolutionary relationships with other fungi, and their unique biology and applications in the food industry.

Citations: 0
Genome assembly of the milky mangrove Excoecaria agallocha.
Pub Date : 2024-04-25 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.119

The milky mangrove Excoecaria agallocha is a latex-secreting mangrove that is distributed in tropical and subtropical regions. While its poisonous latex is regarded as a potential source of phytochemicals for biomedical applications, the genomic resources of E. agallocha remain limited. Here, we present a chromosomal-level genome of E. agallocha, assembled from a combination of PacBio long-read sequencing and Omni-C data. The resulting assembly size is 1,332.45 Mb and has high contiguity and completeness with a scaffold N50 of 58.9 Mb and a BUSCO score of 98.4%, with 86.08% of sequences anchored to 18 pseudomolecules. A total of 73,740 protein-coding genes were also predicted. The milky mangrove genome provides a useful resource for further understanding the biosynthesis of phytochemical compounds in E. agallocha.

Citations: 0
Bridging Biodiversity and Health: The Global Biodiversity Information Facility's initiative on open data on vectors of human diseases.
Pub Date : 2024-04-11 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.117
Paloma Shimabukuro, Quentin Groom, Florence Fouque, Lindsay Campbell, Theeraphap Chareonviriyaphap, Josiane Etang, Sylvie Manguin, Marianne Sinka, Dmitry Schigel, Kate Ingenloff

There is an increased awareness of the importance of data publication, data sharing, and open science to support research, monitoring and control of vector-borne disease (VBD). Here we describe the efforts of the Global Biodiversity Information Facility (GBIF) as well as the World Health Special Programme on Research and Training in Diseases of Poverty (TDR) to promote publication of data related to vectors of diseases. In 2020, a GBIF task group of experts was formed to provide advice and support efforts aimed at enhancing the coverage and accessibility of data on vectors of human diseases within GBIF. Various strategies, such as organizing training courses and publishing data papers, were used to increase this content. This editorial introduces the outcome of a second call for data papers partnered by the TDR, GBIF and GigaScience Press in the journal GigaByte. Biodiversity and infectious diseases are linked in complex ways. These links can involve changes from the microorganism level to that of the habitat, and there are many ways in which these factors interact to affect human health. One way to tackle disease control, and possibly elimination, is to provide stakeholders with access to a wide range of data shared under the FAIR principles, making it possible to support early detection, analyses and evaluation, and to promote policy improvements and/or development.

Citations: 0
Snakemake workflows for long-read bacterial genome assembly and evaluation.
Pub Date : 2024-04-01 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.116
Peter Menzel

With the advancement of long-read sequencing technologies and their increasing use for bacterial genomics, several methods for generating genome assemblies from error-prone long reads have been developed. These are complemented by various tools for assembly polishing using either long reads, short reads, or reference genomes. End users are therefore left with a plethora of possible combinations of programs for obtaining a final trusted assembly. Hence, there is also a need to measure the completeness and accuracy of such assemblies, for which, again, several evaluation methods implemented in various programs are available. In order to automatically run multiple genome assembly and evaluation programs at once, I developed two workflows for the workflow management system Snakemake, which provide end users with an easy-to-run solution for testing various genome assemblies from their sequencing data. Both workflows use the conda packaging system, so there is no need for manual installation of each program.

Availability & implementation: The workflows are available as open source software under the MIT license at github.com/pmenzel/ont-assembly-snake and github.com/pmenzel/score-assemblies.

Citations: 0
Whole genome assembly and annotation of the King Angelfish (Holacanthus passer) gives insight into the evolution of marine fishes of the Tropical Eastern Pacific.
Pub Date : 2024-03-21 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.115
Remy Gatins, Carlos F Arias, Carlos Sánchez, Giacomo Bernardi, Luis F De León

Holacanthus angelfishes are some of the most iconic marine fishes of the Tropical Eastern Pacific (TEP). However, very limited genomic resources currently exist for the genus. In this study, we: (i) assembled and annotated the nuclear genome of the King Angelfish (Holacanthus passer), and (ii) examined the demographic history of H. passer in the TEP. We generated 43.8 Gb of ONT reads and 97.3 Gb of Illumina reads, representing 75× and 167× coverage, respectively. The final genome assembly size was 583 Mb with a contig N50 of 5.7 Mb, and it captured 97.5% of the complete Actinopterygii Benchmarking Universal Single-Copy Orthologs (BUSCOs). Repetitive elements accounted for 5.09% of the genome, and 33,889 protein-coding genes were predicted, of which 22,984 were functionally annotated. Our demographic analysis suggests that population expansions of H. passer occurred prior to the last glacial maximum (LGM) and were more likely shaped by events associated with the closure of the Isthmus of Panama. This result is surprising, given that most rapid population expansions in both freshwater and marine organisms have been reported to occur globally after the LGM. Overall, this annotated genome assembly provides a novel molecular resource to study the evolution of Holacanthus angelfishes, while facilitating research into local adaptation, speciation, and introgression in marine fishes.
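
The sequencing depth and annotation figures in the abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration and is not taken from the paper: it assumes that fold coverage was computed as sequenced bases divided by the final assembly size (583 Mb), and the function and variable names are hypothetical. Under that assumption, the results round to the 75× and 167× quoted in the abstract.

```python
# Sanity check of the figures quoted in the abstract.
# Assumption (not stated in the source): coverage = sequenced bases / final assembly size.

ASSEMBLY_SIZE_BP = 583e6  # final genome assembly size, 583 Mb


def fold_coverage(sequenced_bp: float, genome_bp: float = ASSEMBLY_SIZE_BP) -> float:
    """Return mean fold coverage: total sequenced bases divided by genome size."""
    return sequenced_bp / genome_bp


ont_cov = fold_coverage(43.8e9)       # 43.8 Gb of ONT reads
illumina_cov = fold_coverage(97.3e9)  # 97.3 Gb of Illumina reads

# Share of predicted protein-coding genes that received a functional annotation.
annotated_fraction = 22_984 / 33_889

print(f"ONT coverage:           {ont_cov:.1f}x")
print(f"Illumina coverage:      {illumina_cov:.1f}x")
print(f"Functionally annotated: {annotated_fraction:.1%}")
```

By the same arithmetic, roughly two thirds of the predicted genes carry a functional annotation.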

Citations: 0