
GigaByte (Hong Kong, China): Latest Articles

Draft genome of the endangered visayan spotted deer (Rusa alfredi), a Philippine endemic species.
Pub Date : 2025-02-24 eCollection Date: 2025-01-01 DOI: 10.46471/gigabyte.150
Ma Carmel F Javier, Albert C Noblezada, Persie Mark Q Sienes, Robert S Guino-O, Nadia Palomar-Abesamis, Maria Celia D Malay, Carmelo S Del Castillo, Victor Marco Emmanuel N Ferriols

The Visayan Spotted Deer (VSD), or Rusa alfredi, is an endangered and endemic species in the Philippines. Despite its status, genomic information on R. alfredi, and the genus Rusa in general, is missing. This study presents the first draft genome assembly of the VSD using the Illumina short-read sequencing technology. The resulting RusAlf_1.1 assembly has a 2.52 Gb total length, with a contig N50 of 46 Kb and scaffold N50 size of 75 Mb. The assembly has a BUSCO complete score of 95.5%, demonstrating the genome's completeness, and includes the annotation of 24,531 genes. Our phylogenetic analysis based on single-copy orthologs revealed a close evolutionary relationship between R. alfredi and the genus Cervus. RusAlf_1.1 represents a significant advancement in our understanding of the VSD. It opens opportunities for further research in population genetics and evolutionary biology, potentially contributing to more effective conservation and management strategies for this endangered species.
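The contig and scaffold N50 values quoted above are length-weighted medians: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The following is a minimal sketch of that standard definition in Python; the function name and the toy lengths are illustrative assumptions and are not taken from the RusAlf_1.1 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half = sum(ordered) / 2                  # half of the total assembly length
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            # this sequence is the one that pushes the running total past 50%
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly (arbitrary units), not real data
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Applied separately to contig lengths and scaffold lengths, the same calculation yields the two kinds of N50 that the abstract reports.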

Citations: 0
SqueezeCall: nanopore basecalling using a Squeezeformer network.
Pub Date : 2025-02-14 eCollection Date: 2025-01-01 DOI: 10.46471/gigabyte.148
Zhongxu Zhu

Nanopore sequencing, a third-generation sequencing technique, enables direct RNA sequencing, real-time analysis, and long-read length. Nanopore sequencers measure electrical current changes as nucleotides pass through nanopores; a basecaller identifies base sequences according to the raw current measurements. However, accurate basecalling remains challenging due to molecular variations and sequencing noise. Here, we introduce SqueezeCall, a novel Squeezeformer-based model for accurate nanopore basecalling. SqueezeCall uses convolution layers to down-sample raw signals and model local dependencies. A Squeezeformer network captures the global context, and a connectionist temporal classification (CTC) decoder with beam search generates DNA sequences. Experimental results demonstrated SqueezeCall's ability to resist noise, improving basecalling accuracy. We trained SqueezeCall combining three types of loss, and found that all three loss types contribute to basecalling accuracy. Experiments across multiple species demonstrated the potential of a Squeezeformer-based model to improve basecalling accuracy and its superiority over recurrent neural network-based models and Transformer-based models.
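The decoding step mentioned above, a CTC decoder with beam search, can be illustrated with a small prefix beam search. This is a generic sketch of the textbook algorithm over a toy alphabet, not SqueezeCall's implementation; the alphabet layout, the `peaked` helper, the beam width, and the toy frame probabilities are all assumptions made for illustration.

```python
from collections import defaultdict

ALPHABET = "-ACGT"  # assumption: index 0 ("-") is the CTC blank symbol
BLANK = 0


def ctc_beam_search(probs, beam_width=4):
    """Decode per-frame probabilities (one list per time step) into a base string."""
    # Each prefix tracks (probability of paths ending in blank, ending in non-blank)
    beams = {(): (1.0, 0.0)}
    for frame in probs:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_blank, p_nonblank) in beams.items():
            total = p_blank + p_nonblank
            for s, p in enumerate(frame):
                if s == BLANK:
                    # a blank keeps the prefix unchanged
                    next_beams[prefix][0] += total * p
                elif prefix and prefix[-1] == s:
                    # repeated symbol without a blank in between collapses
                    next_beams[prefix][1] += p_nonblank * p
                    # repeated symbol after a blank extends the prefix
                    next_beams[prefix + (s,)][1] += p_blank * p
                else:
                    next_beams[prefix + (s,)][1] += total * p
        ranked = sorted(next_beams.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {prefix: tuple(scores) for prefix, scores in ranked[:beam_width]}
    best = max(beams.items(), key=lambda kv: sum(kv[1]))[0]
    return "".join(ALPHABET[i] for i in best)


def peaked(symbol, p=0.96):
    """Toy frame: probability p on one symbol, the remainder spread evenly."""
    rest = (1.0 - p) / (len(ALPHABET) - 1)
    return [p if ch == symbol else rest for ch in ALPHABET]


# Hypothetical network output for five time steps: A, A, blank, A, C
frames = [peaked(s) for s in "AA-AC"]
print(ctc_beam_search(frames))
```

Unlike greedy decoding, which keeps only the most likely symbol per frame, the beam search sums the probability of every alignment that collapses to the same sequence, so a sequence supported by many moderately likely paths can outrank the single most likely path.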

Citations: 0
A practical DNA data storage using an expanded alphabet introducing 5-methylcytosine.
Pub Date : 2025-01-24 eCollection Date: 2025-01-01 DOI: 10.46471/gigabyte.147
Deruilin Liu, Demin Xu, Liuxin Shi, Jiayuan Zhang, Kewei Bi, Bei Luo, Chen Liu, Yuxiang Li, Guangyi Fan, Wen Wang, Zhi Ping

The DNA molecule is a promising next-generation data storage medium. Recently, it has been theoretically proposed that non-natural or modified bases can serve as extra molecular letters to increase the information density. However, this strategy is challenging due to the difficulty in synthesizing non-natural DNA sequences and their complex structure. Here, we described a practical DNA data storage transcoding scheme named R+ based on an expanded molecular alphabet that introduces 5-methylcytosine (5mC). We demonstrated its experimental validation by encoding one representative file into several 1.3∼1.6 kbps in vitro DNA fragments for nanopore sequencing. Our results show an average data recovery rate of 98.97% and 86.91% with and without reference, respectively. Our work validates the practicability of 5mC in DNA storage systems, with a potentially wide range of applications.

Availability and implementation: R+ is implemented in Python and the code is available under an MIT license at https://github.com/Incpink-Liu/DNA-storage-R_plus.
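To see why an extra molecular letter raises information density, note that a five-letter alphabet can carry up to log2(5) ≈ 2.32 bits per nucleotide, against 2 bits for the four natural bases. The sketch below is a naive base-5 mapping written only to illustrate that arithmetic. It is not the R+ transcoding scheme, and the letter "M" for 5-methylcytosine, the function names, and the fixed-length convention are assumptions.

```python
ALPHABET = "ACGTM"  # assumption: "M" stands in for 5-methylcytosine (5mC)


def encode(data: bytes) -> str:
    """Map bytes to a five-letter sequence by rewriting them as a base-5 number."""
    # Smallest sequence length that can represent every input of this size
    length = 0
    while 5 ** length < 256 ** len(data):
        length += 1
    n = int.from_bytes(data, "big")
    digits = []
    for _ in range(length):
        n, remainder = divmod(n, 5)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode(sequence: str, n_bytes: int) -> bytes:
    """Invert encode(); the original byte count must be known."""
    n = 0
    for letter in sequence:
        n = n * 5 + ALPHABET.index(letter)
    return n.to_bytes(n_bytes, "big")


payload = b"Hi"
strand = encode(payload)
print(strand, len(strand))            # 7 letters, versus 8 with a four-letter code
print(decode(strand, len(payload)))
```

A practical scheme also has to respect synthesis and sequencing constraints, which this toy mapping ignores entirely.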

Citations: 0
Biodepot Launcher: an app to install, manage and launch bioinformatics workflows.
Pub Date : 2025-01-14 eCollection Date: 2025-01-01 DOI: 10.46471/gigabyte.146
Ling-Hong Hung, Thomas J Dahlstrom, Johnalbert Garnica, Emmanuel Munoz, Robert Schmitz, Ka Yee Yeung

We present the Biodepot Launcher, a desktop application that facilitates installation, management and deployment of bioinformatics workflows using the Biodepot-workflow-builder (Bwb). With the new app, Bwb can be started by double-clicking on an icon, eliminating the need for typing cryptic startup commands into the terminal. This creates an end-to-end graphical and easy-to-use interface to manage and launch containerized workflows on the local computer or cloud instances. Biodepot Launcher is written in React and JavaScript, and uses the node.js framework Neutralinojs and web browser routines to allow the application to execute on Linux, Windows and Mac desktop environments.

Citations: 0
The genome of the sapphire damselfish Chrysiptera cyanea: a new resource to support further investigation of the evolution of Pomacentrids.
Pub Date : 2024-12-31 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.144
Emma Gairin, Saori Miura, Hiroki Takamiyagi, Marcela Herrera, Vincent Laudet

The number of high-quality genomes is rapidly increasing across taxa. However, it remains limited for coral reef fish of the Pomacentrid family, with most research focused on anemonefish. Here, we present the first assembly for a Pomacentrid of the genus Chrysiptera. Using PacBio long-read sequencing with 94.5× coverage, the genome of the Sapphire Devil, Chrysiptera cyanea, was assembled and annotated. The final assembly comprises 896 Mb pairs across 91 contigs, with a BUSCO completeness of 97.6%, and 28,173 genes. Comparative analyses with chromosome-scale assemblies of related species identified contig-chromosome correspondences. This genome will be useful as a comparison to study specific adaptations linked to the symbiotic life of closely related anemonefish. Furthermore, C. cyanea is found in most tropical coastal areas of the Indo-West Pacific and could become a model for environmental monitoring. This work will expand coral reef research efforts, highlighting the power of long-read assemblies to retrieve high quality genomes.

Citations: 0
Polyploid genome assembly of Cardamine chenopodiifolia.
Pub Date : 2024-12-23 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.145
Aurélia Emonet, Mohamed Awad, Nikita Tikhomirov, Maria Vasilarou, Miguel Pérez-Antón, Xiangchao Gan, Polina Yu Novikova, Angela Hay

Cardamine chenopodiifolia is an amphicarpic plant in the Brassicaceae family. Plants develop two fruit types, one above and another below ground. This rare trait is associated with octoploidy in C. chenopodiifolia. The absence of genomic data for C. chenopodiifolia currently limits our understanding of the development and evolution of amphicarpy. Here, we produced a chromosome-scale assembly of the C. chenopodiifolia genome using high-fidelity long read sequencing with the Pacific Biosciences platform. We assembled 32 chromosomes and two organelle genomes with a total length of 597.2 Mb and an N50 of 18.8 Mb. Genome completeness was estimated at 99.8%. We observed structural variation among homeologous chromosomes, suggesting that C. chenopodiifolia originated via allopolyploidy, and phased the octoploid genome into four sub-genomes using orthogroup trees. This fully phased, chromosome-level genome assembly is an important resource to help investigate amphicarpy in C. chenopodiifolia and the origin of trait novelties by allopolyploidy.

Citations: 0
NeuroVar: an open-source tool for the visualization of gene expression and variation data for biomarkers of neurological diseases.
Pub Date : 2024-11-25 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.143
Hiba Ben Aribi, Najla Abassi, Olaitan I Awe

The expanding availability of large-scale genomic data and the growing interest in uncovering gene-disease associations call for efficient tools to visualize and evaluate gene expression and genetic variation data. Here, we developed a comprehensive pipeline that was implemented as an interactive Shiny application and a standalone desktop application. NeuroVar is a tool for visualizing genetic variation (single nucleotide polymorphisms and insertions/deletions) and gene expression profiles of biomarkers of neurological diseases. Data collection involved filtering biomarkers related to multiple neurological diseases from the ClinGen database. NeuroVar provides a user-friendly graphical user interface to visualize genomic data and is freely accessible on the project's GitHub repository (https://github.com/omicscodeathon/neurovar).

Citations: 0
Whole-genome re-sequencing of the Baikal seal and other phocid seals for a glimpse into their genetic diversity, demographic history, and phylogeny.
Pub Date : 2024-11-20 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.142
Marcel Nebenführ, Ulfur Arnason, Axel Janke

The Baikal seal (Pusa sibirica) is a freshwater seal endemic to Lake Baikal, where it became landlocked million years ago. It is an abundant species of least concern despite the limited habitat. Research on its genetic diversity had only been done on mitochondrial genes, restriction fragment analyses, and microsatellites, before its reference genome was published. Here, we report the genome sequences of six Baikal seals, and one individual of the Caspian, ringed, and harbor seal, re-sequenced from Illumina paired-end short read data. Heterozygosity calculations of the six newly sequenced individuals are similar to previously reported genomes. Also, the novel genome data of the other species contributed to a more complete phocid seal phylogeny based on whole-genome data. Despite the isolation of the land-locked Baikal seal, its genetic diversity is comparable to that of other seal species. Future targeted genome studies need to explore the genomic diversity throughout their distribution.
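The abstract does not say how heterozygosity was calculated. One common per-individual estimate is the fraction of called genomic sites at which the two alleles differ. The sketch below illustrates that estimate on VCF-style genotype strings; the function name, the input format, and the toy genotypes are assumptions, not the authors' pipeline.

```python
def heterozygosity(genotypes):
    """Fraction of called sites whose two alleles differ (VCF-style 'a/b' or 'a|b')."""
    # Sites with a missing allele (".") are treated as uncalled and skipped
    called = [g for g in genotypes if "." not in g]
    if not called:
        raise ValueError("no called genotypes")
    heterozygous = sum(
        1 for g in called if len(set(g.replace("|", "/").split("/"))) > 1
    )
    return heterozygous / len(called)


# Hypothetical genotypes for one individual at five sites, not real seal data
toy_sites = ["0/0", "0/1", "1/1", "./.", "1|0"]
print(heterozygosity(toy_sites))
```

For a genome-wide figure, the denominator should include invariant sites that passed the same calling filters, not only variant positions.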

Citations: 0
Chromosome-level genome assembly and annotation of the crested gecko, Correlophus ciliatus, a lizard incapable of tail regeneration.
Pub Date : 2024-11-06 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.140
Marc A Gumangan, Zheyu Pan, Thomas P Lozito

The vast majority of gecko species are capable of tail regeneration, but singular geckos of Correlophus, Uroplatus, and Nephrurus genera are unable to regrow lost tails. Of these non-regenerative geckos, the crested gecko (Correlophus ciliatus) is distinguished by ready availability, ease of care, high productivity, and hybridization potential. These features make C. ciliatus particularly suited as a model for studying the genetic, molecular, and cellular mechanisms underlying loss of tail regeneration capabilities. We report a contiguous genome of C. ciliatus with a total size of 1.65 Gb, 152 scaffolds, L50 of 6, and N50 of 109 Mb. Repetitive content makes up 40.41% of the genome, and a total of 30,780 genes were annotated. Our assembly of the crested gecko genome provides a valuable resource for future comparative genomic studies between non-regenerative and regenerative geckos and other squamate reptiles.

Findings: We report genome sequencing, assembly, and annotation for the crested gecko, Correlophus ciliatus.

Citations: 0
TSTA: thread and SIMD-based trapezoidal pairwise/multiple sequence-alignment method.
Pub Date : 2024-11-05 eCollection Date: 2024-01-01 DOI: 10.46471/gigabyte.141
Peiyu Zong, Wenpeng Deng, Jian Liu, Jue Ruan

The rapid advancements in sequencing length necessitate the adoption of increasingly efficient sequence alignment algorithms. The Needleman-Wunsch method introduces the foundational dynamic-programming matrix calculation for global alignment, which evaluates the overall alignment of sequences. However, this method is known to be highly time-consuming. The proposed TSTA algorithm leverages both vector-level and thread-level parallelism to accelerate pairwise and multiple sequence alignments.

Availability and implementation: Source codes are available at https://github.com/bxskdh/TSTA.
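
For readers unfamiliar with the dynamic-programming matrix mentioned in the abstract, the following is a minimal textbook sketch of Needleman-Wunsch global alignment scoring. It is not the TSTA implementation and uses no SIMD or threading; the function name and the scoring values (match +1, mismatch -1, linear gap -1) are illustrative assumptions, not parameters taken from the paper.

```python
def needleman_wunsch_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the optimal global alignment score of a and b.

    Scoring values are illustrative assumptions (linear gap penalty).
    """
    # prev holds row i-1 of the DP matrix.
    # Row 0 aligns each prefix of b against gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0 aligns the first i characters of a against gaps only.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap.
            up = prev[j] + gap
            # Align b[j-1] with a gap.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    # The bottom-right cell is the score of the best global alignment.
    return prev[-1]
```

Every cell depends on its upper, left, and upper-left neighbours, so the scalar version performs one update per pair of positions, that is, len(a) × len(b) updates in total. This quadratic cost is why the method becomes slow as sequences get longer.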

Citations: 0