GigaByte (Hong Kong, China): latest articles


Portable-CELLxGENE: standalone executables of CELLxGENE for easy installation.

GigaByte (Hong Kong, China)

Pub Date : 2025-03-03 eCollection Date: 2025-01-01 DOI: 10.46471/gigabyte.151

George T Hall

Biologists who want to analyse their single-cell transcriptomics dataset must install and use specialist software via the command line. This is often impractical for non-bioinformaticians. Whilst the popular CELLxGENE software provides an intuitive graphical interface to facilitate analysis outside the command line, its server-side installation and execution remain complex. A version that is easier to install and run would allow non-bioinformaticians to take advantage of this valuable tool without needing to use the command line. This work introduces Portable-CELLxGENE, a standalone distribution of CELLxGENE that can be installed via a graphical interface. It contains an easy-to-use extension of the CELLxGENE-Gateway Python package to allow the analysis of multiple datasets. This tool enables non-bioinformaticians to carry out simple analyses independently.

Availability and implementation: Versions of Portable-CELLxGENE for Windows and MacOS, along with source code, are available at https://george-hall-ucl.github.io/Portable-CELLxGENE-Docs. It is licensed under the GNU General Public License v3.


Citations: 0

The assembly and annotation of two teinturier grapevine varieties, Dakapo and Rubired.

GigaByte (Hong Kong, China)

Pub Date : 2025-02-27 eCollection Date: 2025-01-01 DOI: 10.46471/gigabyte.149

Eleanore J Ritter, Noé Cochetel, Andrea Minio, Peter Cousins, Dario Cantu, Chad Niederhuth

Teinturier grapevines, known for their pigmented-flesh berries due to anthocyanin production, are valuable for enhancing the pigmentation of wine, for potential health benefits, and for investigating anthocyanin production in plants. Here, we assembled and annotated the genomes of two teinturier varieties, Dakapo and Rubired. For Dakapo, we combined Nanopore sequencing, Illumina sequencing, and scaffolding to the existing grapevine assembly to generate a final assembly of 508.5 Mbp. Combining de novo annotation with annotations lifted over from the existing grapevine reference produced 36,940 gene annotations for Dakapo. For Rubired, PacBio HiFi reads were assembled, scaffolded, and phased to generate a diploid assembly with two haplotypes 474.7-476.0 Mbp long. De novo annotation of the diploid Rubired genome yielded annotations for 56,681 genes. Both genomes are highly contiguous and complete. The Dakapo and Rubired genome assemblies provide genetic resources for investigations into berry flesh pigmentation and other traits of interest in grapevine.


Citations: 0

Draft genome of the endangered Visayan spotted deer (Rusa alfredi), a Philippine endemic species.

GigaByte (Hong Kong, China)

Pub Date : 2025-02-24 eCollection Date: 2025-01-01 DOI: 10.46471/gigabyte.150

Ma Carmel F Javier, Albert C Noblezada, Persie Mark Q Sienes, Robert S Guino-O, Nadia Palomar-Abesamis, Maria Celia D Malay, Carmelo S Del Castillo, Victor Marco Emmanuel N Ferriols

The Visayan Spotted Deer (VSD), or Rusa alfredi, is an endangered and endemic species in the Philippines. Despite its status, genomic information on R. alfredi, and the genus Rusa in general, is missing. This study presents the first draft genome assembly of the VSD using the Illumina short-read sequencing technology. The resulting RusAlf_1.1 assembly has a 2.52 Gb total length, with a contig N50 of 46 Kb and scaffold N50 size of 75 Mb. The assembly has a BUSCO complete score of 95.5%, demonstrating the genome's completeness, and includes the annotation of 24,531 genes. Our phylogenetic analysis based on single-copy orthologs revealed a close evolutionary relationship between R. alfredi and the genus Cervus. RusAlf_1.1 represents a significant advancement in our understanding of the VSD. It opens opportunities for further research in population genetics and evolutionary biology, potentially contributing to more effective conservation and management strategies for this endangered species.
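The contig and scaffold N50 values quoted above follow a standard definition: sort the sequences from longest to shortest and report the length at which the running total first reaches half of the assembly size. A minimal sketch of that calculation, using made-up lengths rather than the RusAlf_1.1 data (the function name `n50` is illustrative, not taken from the paper):

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly: total length 300, half is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70, because 80 + 70 reaches 150
```

A large gap between contig N50 and scaffold N50, as reported here, is typical when short contigs are ordered and joined into much longer scaffolds.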


Citations: 0

SqueezeCall: nanopore basecalling using a Squeezeformer network.

GigaByte (Hong Kong, China)

Pub Date : 2025-02-14 eCollection Date: 2025-01-01 DOI: 10.46471/gigabyte.148

Zhongxu Zhu

Nanopore sequencing, a third-generation sequencing technique, enables direct RNA sequencing, real-time analysis, and long-read length. Nanopore sequencers measure electrical current changes as nucleotides pass through nanopores; a basecaller identifies base sequences according to the raw current measurements. However, accurate basecalling remains challenging due to molecular variations and sequencing noise. Here, we introduce SqueezeCall, a novel Squeezeformer-based model for accurate nanopore basecalling. SqueezeCall uses convolution layers to down-sample raw signals and model local dependencies. A Squeezeformer network captures the global context, and a connectionist temporal classification (CTC) decoder with beam search generates DNA sequences. Experimental results demonstrated SqueezeCall's ability to resist noise, improving basecalling accuracy. We trained SqueezeCall combining three types of loss, and found that all three loss types contribute to basecalling accuracy. Experiments across multiple species demonstrated the potential of a Squeezeformer-based model to improve basecalling accuracy and its superiority over recurrent neural network-based models and Transformer-based models.
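To make the CTC decoding step concrete, here is a small sketch of how per-frame network outputs can be turned into a base sequence. It uses greedy (best-path) decoding, which is a simpler stand-in for the beam search the abstract describes, and it is not SqueezeCall's code. The label layout (index 0 as the blank, then A, C, G, T) and the name `ctc_greedy_decode` are assumptions made for illustration.

```python
BLANK = 0
LABELS = "-ACGT"  # assumed layout: index 0 is the CTC blank


def ctc_greedy_decode(frame_scores):
    """Collapse the best label per frame into a base sequence.

    frame_scores: list of per-frame score lists, one score per label.
    """
    # Pick the highest-scoring label in every frame (the "best path").
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]
    decoded = []
    previous = BLANK
    for label in best_path:
        # CTC rule: merge consecutive repeats, then drop blanks.
        if label != BLANK and label != previous:
            decoded.append(LABELS[label])
        previous = label
    return "".join(decoded)
```

A blank between two identical labels keeps them as separate bases, which is how CTC represents homopolymers such as "AA". Beam search extends this idea by tracking several candidate prefixes per frame instead of a single best path.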


Citations: 0

A practical DNA data storage using an expanded alphabet introducing 5-methylcytosine.

GigaByte (Hong Kong, China)

Pub Date : 2025-01-24 eCollection Date: 2025-01-01 DOI: 10.46471/gigabyte.147

Deruilin Liu, Demin Xu, Liuxin Shi, Jiayuan Zhang, Kewei Bi, Bei Luo, Chen Liu, Yuxiang Li, Guangyi Fan, Wen Wang, Zhi Ping

The DNA molecule is a promising next-generation data storage medium. Recently, it has been theoretically proposed that non-natural or modified bases can serve as extra molecular letters to increase the information density. However, this strategy is challenging because non-natural DNA sequences are difficult to synthesize and have complex structures. Here, we describe a practical DNA data storage transcoding scheme named R+, based on an expanded molecular alphabet that introduces 5-methylcytosine (5mC). We validated it experimentally by encoding one representative file into several 1.3–1.6 kbp in vitro DNA fragments for nanopore sequencing. Our results show average data recovery rates of 98.97% and 86.91% with and without a reference, respectively. Our work validates the practicability of 5mC in DNA storage systems, with a potentially wide range of applications.

Availability and implementation: R+ is implemented in Python and the code is available under an MIT license at https://github.com/Incpink-Liu/DNA-storage-R_plus.
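The density gain from a fifth letter can be illustrated with a toy base-5 transcoder. This is not the R+ scheme, whose rules the abstract does not give; it is a bare radix conversion with no synthesis or sequencing constraints. The letter "M" for 5mC and the function names are assumptions chosen for the sketch.

```python
ALPHABET = "ACGTM"  # "M" is a hypothetical symbol standing in for 5mC


def encode(data: bytes) -> str:
    """Write the bytes as a fixed-length base-5 number over ALPHABET."""
    n = len(data)
    value = int.from_bytes(data, "big")
    # Smallest number of base-5 digits able to hold any n-byte value.
    length = 0
    while 5 ** length < 256 ** n:
        length += 1
    digits = []
    for _ in range(length):
        value, remainder = divmod(value, 5)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode(sequence: str, n_bytes: int) -> bytes:
    """Invert encode(); the original byte count must be supplied."""
    value = 0
    for letter in sequence:
        value = value * 5 + ALPHABET.index(letter)
    return value.to_bytes(n_bytes, "big")


payload = b"GigaByte"            # 8 bytes = 64 bits
strand = encode(payload)
print(len(strand))               # 28 letters, versus 32 with a 4-letter alphabet
assert decode(strand, len(payload)) == payload
```

Each letter of a five-symbol alphabet carries at most log2(5) ≈ 2.32 bits, compared with 2 bits for A, C, G and T alone, which is where the shorter strand comes from.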


Citations: 0

Biodepot Launcher: an app to install, manage and launch bioinformatics workflows.

GigaByte (Hong Kong, China)

Pub Date : 2025-01-14 eCollection Date: 2025-01-01 DOI: 10.46471/gigabyte.146

Ling-Hong Hung, Thomas J Dahlstrom, Johnalbert Garnica, Emmanuel Munoz, Robert Schmitz, Ka Yee Yeung

We present the Biodepot Launcher, a desktop application that facilitates installation, management and deployment of bioinformatics workflows using the Biodepot-workflow-builder (Bwb). With the new app, Bwb can be started by double-clicking on an icon, eliminating the need to type cryptic startup commands into the terminal. This creates an end-to-end graphical and easy-to-use interface to manage and launch containerized workflows on the local computer or cloud instances. Biodepot Launcher is written in React and JavaScript, and uses the Node.js framework Neutralinojs and web browser routines to allow the application to execute on Linux, Windows and Mac desktop environments.


Citations: 0

The genome of the sapphire damselfish Chrysiptera cyanea: a new resource to support further investigation of the evolution of Pomacentrids.

GigaByte (Hong Kong, China)

Pub Date : 2024-12-31 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.144

Emma Gairin, Saori Miura, Hiroki Takamiyagi, Marcela Herrera, Vincent Laudet

The number of high-quality genomes is rapidly increasing across taxa. However, it remains limited for coral reef fish of the Pomacentrid family, with most research focused on anemonefish. Here, we present the first assembly for a Pomacentrid of the genus Chrysiptera. Using PacBio long-read sequencing with 94.5× coverage, the genome of the Sapphire Devil, Chrysiptera cyanea, was assembled and annotated. The final assembly comprises 896 Mbp across 91 contigs, with a BUSCO completeness of 97.6%, and 28,173 genes. Comparative analyses with chromosome-scale assemblies of related species identified contig-chromosome correspondences. This genome will be useful as a comparison to study specific adaptations linked to the symbiotic life of closely related anemonefish. Furthermore, C. cyanea is found in most tropical coastal areas of the Indo-West Pacific and could become a model for environmental monitoring. This work will expand coral reef research efforts, highlighting the power of long-read assemblies to retrieve high-quality genomes.


Citations: 0

Polyploid genome assembly of Cardamine chenopodiifolia.

GigaByte (Hong Kong, China)

Pub Date : 2024-12-23 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.145

Aurélia Emonet, Mohamed Awad, Nikita Tikhomirov, Maria Vasilarou, Miguel Pérez-Antón, Xiangchao Gan, Polina Yu Novikova, Angela Hay

Cardamine chenopodiifolia is an amphicarpic plant in the Brassicaceae family. Plants develop two fruit types, one above and another below ground. This rare trait is associated with octoploidy in C. chenopodiifolia. The absence of genomic data for C. chenopodiifolia currently limits our understanding of the development and evolution of amphicarpy. Here, we produced a chromosome-scale assembly of the C. chenopodiifolia genome using high-fidelity long-read sequencing with the Pacific Biosciences platform. We assembled 32 chromosomes and two organelle genomes with a total length of 597.2 Mb and an N50 of 18.8 Mb. Genome completeness was estimated at 99.8%. We observed structural variation among homeologous chromosomes, suggesting that C. chenopodiifolia originated via allopolyploidy, and phased the octoploid genome into four sub-genomes using orthogroup trees. This fully phased, chromosome-level genome assembly is an important resource to help investigate amphicarpy in C. chenopodiifolia and the origin of trait novelties by allopolyploidy.


Citations: 0

NeuroVar: an open-source tool for the visualization of gene expression and variation data for biomarkers of neurological diseases.

GigaByte (Hong Kong, China)

Pub Date : 2024-11-25 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.143

Hiba Ben Aribi, Najla Abassi, Olaitan I Awe

The expanding availability of large-scale genomic data and the growing interest in uncovering gene-disease associations call for efficient tools to visualize and evaluate gene expression and genetic variation data. Here, we developed a comprehensive pipeline that was implemented as an interactive Shiny application and a standalone desktop application. NeuroVar is a tool for visualizing genetic variation (single nucleotide polymorphisms and insertions/deletions) and gene expression profiles of biomarkers of neurological diseases. Data collection involved filtering biomarkers related to multiple neurological diseases from the ClinGen database. NeuroVar provides a user-friendly graphical user interface to visualize genomic data and is freely accessible on the project's GitHub repository (https://github.com/omicscodeathon/neurovar).


Citations: 0

Whole-genome re-sequencing of the Baikal seal and other phocid seals for a glimpse into their genetic diversity, demographic history, and phylogeny.

GigaByte (Hong Kong, China)

Pub Date : 2024-11-20 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.142

Marcel Nebenführ, Ulfur Arnason, Axel Janke

The Baikal seal (Pusa sibirica) is a freshwater seal endemic to Lake Baikal, where it became landlocked millions of years ago. It is an abundant species of least concern despite its limited habitat. Before its reference genome was published, research on its genetic diversity had been limited to mitochondrial genes, restriction fragment analyses, and microsatellites. Here, we report the genome sequences of six Baikal seals and one individual each of the Caspian, ringed, and harbor seals, re-sequenced from Illumina paired-end short-read data. Heterozygosity estimates for the six newly sequenced individuals are similar to those of previously reported genomes. In addition, the novel genome data of the other species contributed to a more complete phocid seal phylogeny based on whole-genome data. Despite the isolation of the landlocked Baikal seal, its genetic diversity is comparable to that of other seal species. Future targeted genome studies need to explore the genomic diversity throughout their distribution.
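One common way to summarise per-individual heterozygosity is the fraction of called sites at which the two alleles differ. The abstract does not say which estimator or software the authors used, so the sketch below is only an assumed, simplified version that works on VCF-style genotype strings; the function name and the toy genotypes are hypothetical.

```python
def heterozygosity(genotypes):
    """Fraction of called diploid genotypes that are heterozygous.

    genotypes: VCF-style strings such as "0/1" or "1|0";
    entries containing "." are treated as missing and skipped.
    """
    called = [g.replace("|", "/").split("/")
              for g in genotypes if "." not in g]
    if not called:
        raise ValueError("no called genotypes")
    heterozygous = sum(1 for alleles in called if len(set(alleles)) > 1)
    return heterozygous / len(called)


# Toy example: four called sites, two of them heterozygous.
sites = ["0/0", "0/1", "1/1", "./.", "1|0"]
print(heterozygosity(sites))  # 0.5
```

For a genome-wide estimate, the denominator has to include invariant sites that passed the same filters, otherwise the value is inflated.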


Citations: 0
