GigaByte (Hong Kong, China): latest articles

Whole genome sequencing and assembly of the house sparrow, Passer domesticus.
IF: 1.2; Pub Date: 2025-07-21; eCollection Date: 2025-01-01; DOI: 10.46471/gigabyte.161
Vikas Kumar, Gopesh Sharma, Sankalp Sharma, Samvrutha Prasad, Shailesh Desai, Toral Vaishnani, Dalia Vishnudasan, Gopinathan Maheswaran, Kaomud Tyagi, Inderjeet Tyagi, Polavarapu B Kavi Kishor, Gyaneshwer Chaubey, Prashanth Suravajhala

The common house sparrow, Passer domesticus, is a small bird belonging to the family Passeridae. Here, we provide high-quality whole-genome sequencing data along with its assembly for the house sparrow. The final genome assembly was generated using a workflow that included Shovill, SPAdes, MaSuRCA, and BUSCO. The assembly consists of contigs spanning 268,193 bases and coalescing around a 922 MB sized reference genome. We used rigorous statistical thresholds to check the coverage, as the Passer genome showed considerable similarity to the Gallus gallus (chicken) and Taeniopygia guttata (Zebra finch) genomes, also providing functional annotations. This new annotated genome assembly will be a valuable resource for comparative and population genomic analyses of passerine, avian, and vertebrate evolution.

Citations: 0
Galaxy QCxMS for straightforward semi-empirical quantum mechanical EI-MS prediction.
Pub Date: 2025-07-04; eCollection Date: 2025-01-01; DOI: 10.46471/gigabyte.160
Wudmir Y Rojas, Zargham Ahmad, Julia Jakiela, Helge Hecht, Jana Klánová, Elliott J Price

High-performance computing (HPC) environments are crucial for computational research, including quantum chemistry (QC), but pose challenges for non-expert users. Researchers with limited computational knowledge struggle to utilise domain-specific software and access mass spectra prediction for in silico annotation. Here, we provide a robust workflow that leverages interoperable file formats for molecular structures to ensure integration across various QC tools. The quantum chemistry package for mass spectral predictions after electron ionization or collision-induced dissociation has been integrated into the Galaxy platform, enabling automated analysis of fragmentation mechanisms. The extended tight binding quantum chemistry package, chosen for its balance between accuracy and computational efficiency, provides molecular geometry optimisation. A Docker image encapsulates the necessary software stack. We demonstrated the workflow for four molecules, highlighting the scalability and efficiency of our solution via runtime performance analysis. This work shows how non-HPC users can make these predictions effortlessly, using advanced computational tools without needing in-depth expertise.

Citations: 0
Chevreul: an R bioconductor package for exploratory analysis of full-length single cell sequencing.
IF: 1.2; Pub Date: 2025-06-24; eCollection Date: 2025-01-01; DOI: 10.46471/gigabyte.158
Kevin Stachelek, Bhavana Bhat, David Cobrinik

Chevreul is an open-source R Bioconductor package and interactive R Shiny app for processing and visualising single-cell RNA sequencing (scRNA-seq) data. Chevreul differs from other scRNA-seq analysis packages in its ease of use, capacity to analyze full-length RNA sequencing data for exon coverage and transcript isoform inference, and support for batch correction. Chevreul enables exploratory analyses of scRNA-seq data using Bioconductor SingleCellExperiment objects (or converted Seurat objects), including batch integration, quality control filtering, read count normalization and transformation, dimensionality reduction, clustering at a range of resolutions, and cluster marker gene identification. Processed data can be visualized in the R Shiny app. Gene or transcript expression can be visualized using PCA, tSNE, UMAP, heatmaps, or violin plots; differential expression can be evaluated with several statistical tests. Chevreul also provides accessible tools for isoform-level analyses and alternative splicing detection. Chevreul empowers researchers without programming experience to analyze full-length scRNA-seq data.

Availability & implementation: Chevreul is implemented in R, and the R package and integrated Shiny application are freely available at https://github.com/cobriniklab/chevreul with constituent packages hosted on Bioconductor at https://bioconductor.org/packages/chevreulProcess, https://bioconductor.org/packages/chevreulPlot, and https://bioconductor.org/packages/chevreulShiny.

Citations: 0
A draft genome assembly for the dart-poison frog Phyllobates terribilis.
IF: 1.2; Pub Date: 2025-06-20; eCollection Date: 2025-01-01; DOI: 10.46471/gigabyte.157
Roberto Márquez, Denis Jacob Machado, Reyhaneh Nouri, Kerry L Gendreau, Daniel Janies, Ralph A Saporito, Marcus R Kronforst, Taran Grant

Dendrobatid poison frogs have become well established as model systems in several fields of biology. Nevertheless, the development of molecular and genetic resources for these frogs has been hindered by their large, highly repetitive genomes, which have proven difficult to assemble. Here we present a draft assembly for Phyllobates terribilis (12.6 Gb), generated using a combination of sequencing platforms and bioinformatic approaches. Similar to other poison frog sequencing efforts, we recovered a highly fragmented assembly, likely due to the genome's large size and very high repeat content, which we estimated to be ≈88%. Despite the assembly's low contiguity, we were able to annotate multiple members of three gene sets of interest (voltage-gated sodium channels and Notch and Wnt signaling pathways), demonstrating the usefulness of our assembly to the amphibian research community.

Citations: 0
Chromosome-level genome assembly of the lemon sole, Microstomus kitt (Pleuronectiformes: Pleuronectidae).
Pub Date: 2025-05-27; eCollection Date: 2025-01-01; DOI: 10.46471/gigabyte.156
Marcel Nebenführ, David Prochotta, Maria A Nilsson, Menno J de Jong, Tunca D Yazici, Fabienne Langefeld, Malambo Muloongo, Helena Woköck, Jakob Jilg, Sina C Bender, Marvin M Zangl, Juan-Manuel Ortega Guatame, Kimberley Williams, Moritz Sonnewald, Axel Janke

Background: The lemon sole (Microstomus kitt) is a culinary fish from the family of righteye flounders (Pleuronectidae), inhabiting sandy, shallow offshore grounds of the North Sea, western Baltic Sea, English Channel, Great Britain and Ireland, Bay of Biscay, and coastal waters of Norway.

Findings: Here, we present a chromosome-level genome assembly of the lemon sole. We applied PacBio HiFi sequencing on the PacBio Revio system to generate a highly complete and contiguous reference genome. The resulting assembly has a contig N50 of 17.2 Mbp and a scaffold N50 of 27.2 Mbp. The total assembly length is 628 Mbp, comprising 24 chromosome-length scaffolds. The identification of 99.7% complete BUSCO genes indicates a high level of assembly completeness.

Conclusions: The chromosome-level genome assembly of the lemon sole provides a high-quality reference genome for future population-level genomic analyses of this commercially valuable, edible fish.
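
The contig and scaffold N50 values quoted in the Findings are contiguity statistics: the N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. As a minimal sketch of that definition (the function name and the toy scaffold lengths are illustrative assumptions, not data from the paper), it can be computed as follows:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and return the length at which
    the running total first reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths in Mbp, chosen only to illustrate the statistic.
toy_scaffolds = [40, 30, 20, 5, 5]
print(n50(toy_scaffolds))
```

For the toy input the total is 100 Mbp, so the running total passes the halfway point at the second-longest scaffold. A higher N50 for the same total length means fewer, longer pieces, which is why the abstract reports it alongside the number of chromosome-length scaffolds.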

Citations: 0
Chromosome-level genome assemblies of five Sinocyclocheilus species.
Pub Date: 2025-05-09; eCollection Date: 2025-01-01; DOI: 10.46471/gigabyte.155
Chao Bian, Ruihan Li, Yuqian Ouyang, Junxing Yang, Xidong Mu, Qiong Shi

Sinocyclocheilus, a genus of tetraploid fishes endemic to Southwest China's karst regions, are classified as second-class nationally protected species due to their fragile habitat. Limited high-quality genomic resources have hampered studies on their phylogenetic relationships and the origin of their polyploidy. Here, we present a high-quality genome assembly of the most abundant Sinocyclocheilus species, the golden-line barbel (Sinocyclocheilus grahami), by integrating PacBio long-read and Hi-C sequencing. The resulting scaffold-level genome-assembly is 1.6 Gb long, with a scaffold N50 of up to 30.7 Mb. We annotated 42,806 protein-coding genes. Also, 93.1% of the assembled genome sequences (about 1.5 Gb) and 93.8% of the total predicted genes were successfully anchored onto 48 chromosomes. Furthermore, we obtained chromosome-level genome assemblies for four other Sinocyclocheilus species (S. anophthalmus, S. maitianheensis, S. anshuiensis, and S. rhinocerous) based on homologous comparisons. These genomic resources will enable in-depth investigations on cave adaptation, improvement of economic values, and conservation of diverse Sinocyclocheilus fishes.

Citations: 0
Efficiently constructing complete genomes with CycloneSEQ to fill gaps in bacterial draft assemblies.
Pub Date: 2025-04-25; eCollection Date: 2025-01-01; DOI: 10.46471/gigabyte.154
Hewei Liang, Yuanqiang Zou, Mengmeng Wang, Tongyuan Hu, Haoyu Wang, Wenxin He, Yanmei Ju, Ruijin Guo, Junyi Chen, Fei Guo, Tao Zeng, Yuliang Dong, Yuning Zhang, Bo Wang, Chuanyu Liu, Xin Jin, Wenwei Zhang, Xun Xu, Liang Xiao

Current microbial sequencing relies on short-read platforms like Illumina and DNBSEQ, which are cost-effective and accurate but often produce fragmented draft genomes. Here, we used CycloneSEQ for long-read sequencing of ATCC BAA-835, producing long reads with an average length of 11.6 kbp and an average quality score of 14.4. Hybrid assembly with short-read data resulted in an error rate of only 0.04 mismatches and 0.08 indels per 100 kbp compared to the reference genome. This method, validated across nine species, successfully assembled complete circular genomes. Hybrid assembly significantly enhances genome completeness by using long reads to fill gaps and accurately assembling multi-copy rRNA genes, unlike short reads alone. Data subsampling showed that combining over 500 Mbp of short-read data with 100 Mbp of long-read data yields high-quality circular assemblies. CycloneSEQ long reads improve the assembly of circular complete genomes from mixed microbial communities; however, their base quality needs improving. Integrating DNBSEQ short reads improved accuracy, resulting in complete and accurate assemblies.
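
To put the two accuracy figures in this abstract on a common scale, the sketch below converts between a quality score and a per-base error probability. It assumes the reported "average quality score of 14.4" is a Phred-scaled value, which the abstract does not state explicitly, and it treats mismatches and indels per 100 kbp as a single per-base error rate; both are simplifying assumptions made for illustration, and the function names are not from the paper.

```python
import math


def phred_to_error(q):
    """Per-base error probability implied by a Phred-scaled quality score."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Phred-scaled quality score implied by a per-base error probability."""
    return -10 * math.log10(p)


# Assumption: the abstract's average read quality of 14.4 is Phred-scaled.
read_error = phred_to_error(14.4)

# Assumption: mismatches and indels per 100 kbp are pooled into one rate.
assembly_error = (0.04 + 0.08) / 100_000
assembly_q = error_to_phred(assembly_error)

print(f"long-read per-base error: {read_error:.4f}")
print(f"hybrid assembly quality:  Q{assembly_q:.1f}")
```

Under these assumptions the raw long reads carry an error rate of a few percent per base, while the hybrid assembly corresponds to roughly one error per million bases, which is consistent with the abstract's point that short reads are what bring the final accuracy up.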

Citations: 0
Genome assembly and annotation of Acropora pulchra from Mo'orea French Polynesia.
Pub Date: 2025-04-10; eCollection Date: 2025-01-01; DOI: 10.46471/gigabyte.153
Trinity Conn, Jill Ashey, Ross Cunning, Hollie M Putnam

Reef-building corals are integral ecosystem engineers of tropical reefs but face threats from climate change. Investigating genetic, epigenetic, and environmental factors influencing their adaptation is critical. Genomic resources are essential for understanding coral biology and guiding conservation efforts. However, genomes of the coral genus Acropora are limited to highly-studied species. Here, we present the assembly and annotation of the genome and DNA methylome of Acropora pulchra from Mo'orea, French Polynesia. Using long-read PacBio HiFi and Illumina RNASeq, we generated the most complete Acropora genome to date (BUSCO completeness of 96.7% metazoan genes). The assembly size is 518 Mbp, with 174 scaffolds, and a scaffold N50 of 17 Mbp. We predicted 40,518 protein-coding genes and 16.74% of the genome in repeats. DNA methylation in the CpG context is 14.6%. This assembly of the A. pulchra genome and DNA methylome will support studies of coastal corals in French Polynesia, aiding conservation and comparative studies of Acropora and cnidarians.

Citations: 0
CompactTree: a lightweight header-only C++ library and Python wrapper for ultra-large phylogenetics.
Pub Date: 2025-03-07; eCollection Date: 2025-01-01; DOI: 10.46471/gigabyte.152
Niema Moshiri

The study of viral and bacterial species requires the ability to load and traverse ultra-large phylogenies with tens of millions of tips, but existing tree libraries struggle to scale to these sizes. We introduce CompactTree, a lightweight header-only C++ library with a user-friendly Python wrapper for traversing ultra-large trees that can be easily incorporated into other tools. We show that CompactTree is orders of magnitude faster and requires orders of magnitude less memory than existing tree packages. CompactTree is freely accessible as an open source project: https://github.com/niemasd/CompactTree.
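
The abstract does not describe CompactTree's internals or its API, so the following is only an illustrative sketch of one general idea that helps with trees of this size: storing nodes in flat typed arrays instead of one object per node, and traversing them without recursion. The names `FlatTree`, `add_node`, `bottom_up`, and `count_tips` are hypothetical and are not part of CompactTree.

```python
from array import array


class FlatTree:
    """Hypothetical flat-array tree; NOT CompactTree's API or internal layout."""

    def __init__(self):
        self.parent = array("q")        # parent index of each node, -1 for the root
        self.length = array("d")        # branch length leading to each node
        self.num_children = array("q")  # number of children of each node

    def add_node(self, parent=-1, length=0.0):
        """Append a node; a parent must be added before its children."""
        if parent >= len(self.parent):
            raise ValueError("parent must already exist")
        node = len(self.parent)
        self.parent.append(parent)
        self.length.append(length)
        self.num_children.append(0)
        if parent >= 0:
            self.num_children[parent] += 1
        return node

    def is_leaf(self, node):
        return self.num_children[node] == 0

    def bottom_up(self):
        """Visit children before parents (valid because child index > parent index)."""
        return range(len(self.parent) - 1, -1, -1)


def count_tips(tree):
    """Number of tips (leaves) below each node, computed without recursion."""
    tips = array("q", [0]) * len(tree.parent)
    for node in tree.bottom_up():
        if tree.is_leaf(node):
            tips[node] = 1
        parent = tree.parent[node]
        if parent >= 0:
            tips[parent] += tips[node]
    return tips


# Toy tree ((A,B),C): root, one internal node, three tips.
tree = FlatTree()
root = tree.add_node()
inner = tree.add_node(root, 0.5)
a = tree.add_node(inner, 0.1)
b = tree.add_node(inner, 0.2)
c = tree.add_node(root, 0.7)
tip_counts = count_tips(tree)
```

Because every node is a few fixed-width array slots rather than a Python object, memory grows linearly with a small constant, and the index-order loop avoids recursion limits on very deep trees. For real analyses, the library's own documentation at the URL given in the abstract is the authoritative source.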

Citations: 0
Portable-CELLxGENE: standalone executables of CELLxGENE for easy installation.
Pub Date : 2025-03-03 eCollection Date: 2025-01-01 DOI: 10.46471/gigabyte.151
George T Hall

Biologists who want to analyse their single-cell transcriptomics dataset must install and use specialist software via the command line. This is often impractical for non-bioinformaticians. Whilst the popular CELLxGENE software provides an intuitive graphical interface to facilitate analysis outside the command line, its server-side installation and execution remain complex. A version that is easier to install and run would allow non-bioinformaticians to take advantage of this valuable tool without needing to use the command line. This work introduces Portable-CELLxGENE, a standalone distribution of CELLxGENE that can be installed via a graphical interface. It contains an easy-to-use extension of the CELLxGENE-Gateway Python package to allow the analysis of multiple datasets. This tool enables non-bioinformaticians to carry out simple analyses independently.

Availability and implementation: Versions of Portable-CELLxGENE for Windows and MacOS, along with source code, are available at https://george-hall-ucl.github.io/Portable-CELLxGENE-Docs. It is licensed under the GNU General Public License v3.

Citations: 0