Pub Date: 2025-02-24; eCollection Date: 2025-01-01; DOI: 10.46471/gigabyte.150
Ma Carmel F Javier, Albert C Noblezada, Persie Mark Q Sienes, Robert S Guino-O, Nadia Palomar-Abesamis, Maria Celia D Malay, Carmelo S Del Castillo, Victor Marco Emmanuel N Ferriols
The Visayan Spotted Deer (VSD), or Rusa alfredi, is an endangered and endemic species in the Philippines. Despite its status, genomic information on R. alfredi, and the genus Rusa in general, is missing. This study presents the first draft genome assembly of the VSD using the Illumina short-read sequencing technology. The resulting RusAlf_1.1 assembly has a 2.52 Gb total length, with a contig N50 of 46 Kb and scaffold N50 size of 75 Mb. The assembly has a BUSCO complete score of 95.5%, demonstrating the genome's completeness, and includes the annotation of 24,531 genes. Our phylogenetic analysis based on single-copy orthologs revealed a close evolutionary relationship between R. alfredi and the genus Cervus. RusAlf_1.1 represents a significant advancement in our understanding of the VSD. It opens opportunities for further research in population genetics and evolutionary biology, potentially contributing to more effective conservation and management strategies for this endangered species.
Title: Draft genome of the endangered visayan spotted deer (Rusa alfredi), a Philippine endemic species. Journal: GigaByte (Hong Kong, China), 2025, gigabyte150. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11876970/pdf/
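The contig and scaffold N50 values quoted in the abstract above summarize assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that definition; the function name `n50` and the toy contig lengths are illustrative assumptions, not values or code from the RusAlf_1.1 study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    # Walk from the longest sequence downward, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half
        # of the total assembly length is the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (total length 300): 80 + 70 = 150 reaches half.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70
```

A higher N50 for the same total length indicates that more of the assembly sits in long sequences, which is why scaffolding raises the figure from the contig level to the scaffold level.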
Pub Date: 2025-02-14; eCollection Date: 2025-01-01; DOI: 10.46471/gigabyte.148
Zhongxu Zhu
Nanopore sequencing, a third-generation sequencing technique, enables direct RNA sequencing, real-time analysis, and long read lengths. Nanopore sequencers measure electrical current changes as nucleotides pass through nanopores; a basecaller identifies base sequences from the raw current measurements. However, accurate basecalling remains challenging due to molecular variations and sequencing noise. Here, we introduce SqueezeCall, a novel Squeezeformer-based model for accurate nanopore basecalling. SqueezeCall uses convolution layers to down-sample raw signals and model local dependencies. A Squeezeformer network captures the global context, and a connectionist temporal classification (CTC) decoder with beam search generates DNA sequences. Experimental results demonstrated SqueezeCall's ability to resist noise, improving basecalling accuracy. We trained SqueezeCall with a combination of three loss types and found that all three contribute to basecalling accuracy. Experiments across multiple species demonstrated the potential of a Squeezeformer-based model to improve basecalling accuracy and its superiority over recurrent neural network-based models and Transformer-based models.
Title: SqueezeCall: nanopore basecalling using a Squeezeformer network. Journal: GigaByte (Hong Kong, China), 2025, gigabyte148. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11851125/pdf/
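To illustrate the CTC decoding step mentioned in the abstract above, here is a minimal sketch of greedy (best-path) CTC decoding: take the highest-scoring label per frame, collapse consecutive repeats, and drop blanks. This is a deliberate simplification. The abstract states that SqueezeCall uses beam search, which is not shown here, and the label layout (`BLANK = 0`, then A, C, G, T) and all names are assumptions made for illustration.

```python
BLANK = 0  # assumed index of the CTC blank label
ALPHABET = {1: "A", 2: "C", 3: "G", 4: "T"}  # assumed label layout


def ctc_greedy_decode(frame_scores):
    """Decode per-frame scores over [blank, A, C, G, T] into a base string."""
    # Best label per frame (argmax over the score vector).
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    decoded = []
    previous = BLANK
    for label in best_path:
        # Emit a base only when it is not blank and differs from the
        # previous frame's label (collapses repeats within a run).
        if label != BLANK and label != previous:
            decoded.append(ALPHABET[label])
        previous = label
    return "".join(decoded)


def one_hot(index, size=5):
    """Hypothetical helper: a score vector with all mass on one label."""
    return [1.0 if j == index else 0.0 for j in range(size)]


# Best path A A - A C C - T: the blank separates the two A runs.
frames = [one_hot(i) for i in [1, 1, 0, 1, 2, 2, 0, 4]]
print(ctc_greedy_decode(frames))  # AACT
```

Beam search keeps several candidate prefixes per frame instead of a single best label, which can recover sequences that the greedy path misses at the cost of more computation.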
The DNA molecule is a promising next-generation data storage medium. Recently, it has been theoretically proposed that non-natural or modified bases can serve as extra molecular letters to increase the information density. However, this strategy is challenging due to the difficulty in synthesizing non-natural DNA sequences and their complex structure. Here, we describe a practical DNA data storage transcoding scheme named R+, based on an expanded molecular alphabet that introduces 5-methylcytosine (5mC). We validated it experimentally by encoding one representative file into several 1.3 to 1.6 kbp in vitro DNA fragments for nanopore sequencing. Our results show average data recovery rates of 98.97% and 86.91% with and without a reference, respectively. Our work validates the practicability of 5mC in DNA storage systems, with a potentially wide range of applications.
Availability and implementation: R+ is implemented in Python and the code is available under an MIT license at https://github.com/Incpink-Liu/DNA-storage-R_plus.
Title: A practical DNA data storage using an expanded alphabet introducing 5-methylcytosine. Authors: Deruilin Liu, Demin Xu, Liuxin Shi, Jiayuan Zhang, Kewei Bi, Bei Luo, Chen Liu, Yuxiang Li, Guangyi Fan, Wen Wang, Zhi Ping. Pub Date: 2025-01-24; DOI: 10.46471/gigabyte.147. Journal: GigaByte (Hong Kong, China), 2025, gigabyte147. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11791762/pdf/
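The claim that extra molecular letters increase information density can be made concrete with a short calculation: an alphabet of k equally usable symbols carries at most log2(k) bits per position. The sketch below assumes, for illustration only, that 5mC acts as a fifth letter alongside A, C, G, and T; the density that the R+ scheme actually achieves depends on its transcoding rules, which the abstract does not specify.

```python
import math


def bits_per_base(alphabet_size):
    """Theoretical upper bound on information per position, in bits."""
    if alphabet_size < 2:
        raise ValueError("an alphabet needs at least two symbols")
    return math.log2(alphabet_size)


natural = bits_per_base(4)   # A, C, G, T
expanded = bits_per_base(5)  # assumed: A, C, G, T plus 5mC

print(f"natural alphabet:  {natural:.2f} bits/base")
print(f"expanded alphabet: {expanded:.2f} bits/base")
print(f"relative gain:     {(expanded / natural - 1) * 100:.1f}%")
```

Under this assumption the ceiling rises from 2.00 to about 2.32 bits per base, roughly a 16% gain, before any losses from error correction or sequence constraints.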
Pub Date: 2025-01-14; eCollection Date: 2025-01-01; DOI: 10.46471/gigabyte.146
Ling-Hong Hung, Thomas J Dahlstrom, Johnalbert Garnica, Emmanuel Munoz, Robert Schmitz, Ka Yee Yeung
We present the Biodepot Launcher, a desktop application that facilitates installation, management and deployment of bioinformatics workflows using the Biodepot-workflow-builder (Bwb). With the new app, Bwb can be started by double-clicking on an icon, eliminating the need to type cryptic start-up commands into the terminal. This creates an end-to-end graphical and easy-to-use interface to manage and launch containerized workflows on the local computer or cloud instances. Biodepot Launcher is written in React and JavaScript, and uses the Node.js framework Neutralinojs and web browser routines to allow the application to execute on Linux, Windows and Mac desktop environments.
Title: Biodepot Launcher: an app to install, manage and launch bioinformatics workflows. Journal: GigaByte (Hong Kong, China), 2025, gigabyte146. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11751628/pdf/
Pub Date: 2024-12-31; eCollection Date: 2024-01-01; DOI: 10.46471/gigabyte.144
Emma Gairin, Saori Miura, Hiroki Takamiyagi, Marcela Herrera, Vincent Laudet
The number of high-quality genomes is rapidly increasing across taxa. However, it remains limited for coral reef fish of the Pomacentrid family, with most research focused on anemonefish. Here, we present the first assembly for a Pomacentrid of the genus Chrysiptera. Using PacBio long-read sequencing with 94.5× coverage, the genome of the Sapphire Devil, Chrysiptera cyanea, was assembled and annotated. The final assembly comprises 896 Mb across 91 contigs, with a BUSCO completeness of 97.6%, and 28,173 genes. Comparative analyses with chromosome-scale assemblies of related species identified contig-chromosome correspondences. This genome will be useful as a comparison to study specific adaptations linked to the symbiotic life of closely related anemonefish. Furthermore, C. cyanea is found in most tropical coastal areas of the Indo-West Pacific and could become a model for environmental monitoring. This work will expand coral reef research efforts, highlighting the power of long-read assemblies to retrieve high-quality genomes.
Title: The genome of the sapphire damselfish Chrysiptera cyanea: a new resource to support further investigation of the evolution of Pomacentrids. Journal: GigaByte (Hong Kong, China), 2024, gigabyte144. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11711634/pdf/
Pub Date: 2024-12-23; eCollection Date: 2024-01-01; DOI: 10.46471/gigabyte.145
Aurélia Emonet, Mohamed Awad, Nikita Tikhomirov, Maria Vasilarou, Miguel Pérez-Antón, Xiangchao Gan, Polina Yu Novikova, Angela Hay
Cardamine chenopodiifolia is an amphicarpic plant in the Brassicaceae family. Plants develop two fruit types, one above and another below ground. This rare trait is associated with octoploidy in C. chenopodiifolia. The absence of genomic data for C. chenopodiifolia currently limits our understanding of the development and evolution of amphicarpy. Here, we produced a chromosome-scale assembly of the C. chenopodiifolia genome using high-fidelity long-read sequencing on the Pacific Biosciences platform. We assembled 32 chromosomes and two organelle genomes with a total length of 597.2 Mb and an N50 of 18.8 Mb. Genome completeness was estimated at 99.8%. We observed structural variation among homeologous chromosomes, suggesting that C. chenopodiifolia originated via allopolyploidy, and phased the octoploid genome into four sub-genomes using orthogroup trees. This fully phased, chromosome-level genome assembly is an important resource to help investigate amphicarpy in C. chenopodiifolia and the origin of trait novelties by allopolyploidy.
Title: Polyploid genome assembly of Cardamine chenopodiifolia. Journal: GigaByte (Hong Kong, China), 2024, gigabyte145. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11693932/pdf/
Pub Date: 2024-11-25; eCollection Date: 2024-01-01; DOI: 10.46471/gigabyte.143
Hiba Ben Aribi, Najla Abassi, Olaitan I Awe
The expanding availability of large-scale genomic data and the growing interest in uncovering gene-disease associations call for efficient tools to visualize and evaluate gene expression and genetic variation data. Here, we developed a comprehensive pipeline that was implemented as an interactive Shiny application and a standalone desktop application. NeuroVar is a tool for visualizing genetic variation (single nucleotide polymorphisms and insertions/deletions) and gene expression profiles of biomarkers of neurological diseases. Data collection involved filtering biomarkers related to multiple neurological diseases from the ClinGen database. NeuroVar provides a user-friendly graphical user interface to visualize genomic data and is freely accessible on the project's GitHub repository (https://github.com/omicscodeathon/neurovar).
Title: NeuroVar: an open-source tool for the visualization of gene expression and variation data for biomarkers of neurological diseases. Journal: GigaByte (Hong Kong, China), 2024, gigabyte143. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11612633/pdf/
Pub Date: 2024-11-20; eCollection Date: 2024-01-01; DOI: 10.46471/gigabyte.142
Marcel Nebenführ, Ulfur Arnason, Axel Janke
The Baikal seal (Pusa sibirica) is a freshwater seal endemic to Lake Baikal, where it became landlocked million years ago. It is an abundant species of least concern despite its limited habitat. Before its reference genome was published, research on its genetic diversity had been limited to mitochondrial genes, restriction fragment analyses, and microsatellites. Here, we report the genome sequences of six Baikal seals and of one individual each of the Caspian, ringed, and harbor seals, re-sequenced from Illumina paired-end short-read data. Heterozygosity estimates for the six newly sequenced individuals are similar to those of previously reported genomes. The novel genome data of the other species also contributed to a more complete phocid seal phylogeny based on whole-genome data. Despite the isolation of the landlocked Baikal seal, its genetic diversity is comparable to that of other seal species. Future targeted genome studies need to explore the genomic diversity throughout their distribution.
Title: Whole-genome re-sequencing of the Baikal seal and other phocid seals for a glimpse into their genetic diversity, demographic history, and phylogeny. Journal: GigaByte (Hong Kong, China), 2024, gigabyte142. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11602651/pdf/
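The heterozygosity comparison in the abstract above rests on a simple ratio: the number of heterozygous sites divided by the number of sites with a usable genotype call. The sketch below shows that ratio on toy data. The genotype representation (allele pairs, with `None` for an uncalled site) and the function name are assumptions for illustration; the abstract does not describe the study's actual variant-calling or filtering pipeline.

```python
def heterozygosity(genotypes):
    """Fraction of called sites whose two alleles differ.

    genotypes: iterable of (allele_1, allele_2) tuples, with None
    standing in for sites that have no genotype call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called sites")
    heterozygous = sum(1 for first, second in called if first != second)
    return heterozygous / len(called)


# Hypothetical individual: 4 called sites, 2 of them heterozygous.
sample = [("A", "A"), ("A", "G"), None, ("C", "C"), ("T", "C")]
print(heterozygosity(sample))  # 0.5
```

Excluding uncalled sites from the denominator matters when comparing individuals sequenced at different depths, since low-coverage regions would otherwise deflate the estimate.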
Pub Date: 2024-11-06; eCollection Date: 2024-01-01; DOI: 10.46471/gigabyte.140
Marc A Gumangan, Zheyu Pan, Thomas P Lozito
The vast majority of gecko species are capable of tail regeneration, but singular geckos of the Correlophus, Uroplatus, and Nephrurus genera are unable to regrow lost tails. Of these non-regenerative geckos, the crested gecko (Correlophus ciliatus) is distinguished by ready availability, ease of care, high productivity, and hybridization potential. These features make C. ciliatus particularly suited as a model for studying the genetic, molecular, and cellular mechanisms underlying loss of tail regeneration capabilities. We report a contiguous genome of C. ciliatus with a total size of 1.65 Gb, 152 scaffolds, an L50 of 6, and an N50 of 109 Mb. Repetitive content makes up 40.41% of the genome, and a total of 30,780 genes were annotated. Our assembly of the crested gecko genome provides a valuable resource for future comparative genomic studies between non-regenerative and regenerative geckos and other squamate reptiles.
Findings: We report genome sequencing, assembly, and annotation for the crested gecko, Correlophus ciliatus.
Title: Chromosome-level genome assembly and annotation of the crested gecko, Correlophus ciliatus, a lizard incapable of tail regeneration. Journal: GigaByte (Hong Kong, China), 2024, gigabyte140. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11558660/pdf/
Pub Date: 2024-11-05; eCollection Date: 2024-01-01; DOI: 10.46471/gigabyte.141
Peiyu Zong, Wenpeng Deng, Jian Liu, Jue Ruan
Rapid increases in sequencing read length call for increasingly efficient sequence alignment algorithms. The Needleman-Wunsch method introduced the foundational dynamic-programming matrix calculation for global alignment, which evaluates the overall alignment of sequences. However, this method is known to be highly time-consuming. The proposed TSTA algorithm leverages both vector-level and thread-level parallelism to accelerate pairwise and multiple sequence alignments.
Availability and implementation: Source code is available at https://github.com/bxskdh/TSTA.
Title: TSTA: thread and SIMD-based trapezoidal pairwise/multiple sequence-alignment method. Journal: GigaByte (Hong Kong, China), 2024, gigabyte141. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11558659/pdf/
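For readers unfamiliar with the dynamic-programming calculation that the abstract above refers to, the following is a minimal scalar sketch of the Needleman-Wunsch global alignment score with a linear gap penalty. It is the textbook baseline, not the TSTA method: no SIMD, threading, or trapezoidal scheduling is shown, and the scoring values (match +1, mismatch -1, gap -1) are arbitrary assumptions chosen for illustration.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of strings a and b (linear gap penalty)."""
    # Row 0: aligning the empty prefix of a against prefixes of b
    # costs one gap per character.
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b.
        current = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            diagonal = previous[j - 1] + substitution  # align a[i-1] with b[j-1]
            up = previous[j] + gap                     # gap in b
            left = current[j - 1] + gap                # gap in a
            current.append(max(diagonal, up, left))
        previous = current
    return previous[-1]


print(needleman_wunsch_score("ACGT", "ACGT"))  # 4
print(needleman_wunsch_score("ACGT", "AGT"))   # 2
```

Each cell depends on its left, upper, and upper-left neighbours, so the work grows with the product of the two sequence lengths. Cells on the same anti-diagonal are independent of one another, which is the general property that vector-level and thread-level parallel schemes exploit.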