Pub Date: 2023-09-04; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.88
Xinyu Wang, Lirong Liu, Wenbiao Zhu, Shiqing Wang, Minhui Shi, Shuhui Yang, Haorong Lu, Jun Cao
The study of the currently known >3,000 species of snakes can provide valuable insights into the evolution of their genomes. Deinagkistrodon acutus, also known as Sharp-nosed Pit Viper, one hundred-pacer viper or five-pacer viper, is a venomous snake with significant economic, medicinal and scientific importance. Widely distributed in southeastern China and South-East Asia, D. acutus has been primarily studied for its venom. Here, we employed next-generation sequencing to assemble and annotate a highly continuous genome of D. acutus. The genome size is 1.46 Gb; its scaffold N50 length is 6.21 Mb, the repeat content is 42.81%, and 24,402 functional genes were annotated. This study helps to further understand and utilize D. acutus and its venom at the genetic level.
Title: Genome assembly and annotation of the Sharp-nosed Pit Viper Deinagkistrodon acutus based on next-generation sequencing data. Journal: GigaByte (Hong Kong, China), 2023, gigabyte88. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10498098/pdf/
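Several of the assembly abstracts in this listing report an N50 length. As a reminder of what that statistic measures, here is a minimal sketch of the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from any of the papers.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly; real inputs would be scaffold lengths in bp.
toy_lengths = [8, 5, 4, 3, 2, 1]
print(n50(toy_lengths))
```

Because N50 depends only on the length distribution, it describes contiguity and says nothing about correctness or completeness, which is why the abstracts pair it with measures such as BUSCO scores or gene counts.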
Pub Date: 2023-09-01; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.89
Lucia Corte, Lathan Liou, Paul F O'Reilly, Judit García-González
Recent advances in genome-wide association and sequencing studies have shown that the genetic architecture of complex traits and diseases involves a combination of rare and common genetic variants distributed throughout the genome. One way to better understand this architecture is to visualize genetic associations across a wide range of allele frequencies. However, there is currently no standardized or consistent graphical representation for effectively illustrating these results. Here we propose a standardized approach for visualizing the effect size of risk variants across the allele frequency spectrum. The proposed plots have a distinctive trumpet shape, with the majority of variants having high frequency and small effects and a small number of variants having lower frequency and larger effects. To demonstrate the utility of trumpet plots in illustrating the relationship between the number of variants, their frequency, and the magnitude of their effects in shaping the genetic architecture of complex traits and diseases, we generated trumpet plots for more than one hundred traits in the UK Biobank. To facilitate their broader use, we developed an R package, 'TrumpetPlots' (available at the Comprehensive R Archive Network), and an R Shiny application, 'Shiny Trumpets' (available at https://juditgg.shinyapps.io/shinytrumpets/), which allow users to explore these results and submit their own data.
Title: Trumpet plots: visualizing the relationship between allele frequency and effect size in genetic association studies. Journal: GigaByte (Hong Kong, China), 2023, gigabyte89. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10498096/pdf/
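One way to see why such plots flare out at low allele frequencies is the standard additive-model approximation that a variant with minor-allele frequency p and per-allele effect β (in phenotype standard-deviation units) explains about 2p(1−p)β² of the trait variance. The sketch below inverts that relation to show how large an effect must be, at each frequency, to explain a fixed share of variance. This is an illustration under that assumption only; it is not the computation performed by the 'TrumpetPlots' package, and the function name and the 0.1% threshold are hypothetical.

```python
import math


def effect_size_for_variance(maf, variance_explained):
    """Absolute per-allele effect needed to explain a given variance share.

    Assumes an additive model with a standardized phenotype, where
    variance_explained = 2 * maf * (1 - maf) * beta**2.
    """
    if not 0.0 < maf <= 0.5:
        raise ValueError("minor-allele frequency must be in (0, 0.5]")
    if variance_explained < 0.0:
        raise ValueError("variance_explained must be non-negative")
    return math.sqrt(variance_explained / (2.0 * maf * (1.0 - maf)))


# Hypothetical threshold: a variant explaining 0.1% of trait variance.
for maf in (0.5, 0.05, 0.005, 0.0005):
    beta = effect_size_for_variance(maf, 0.001)
    print(f"MAF={maf:<7} |beta|={beta:.3f}")
```

Plotting ±|β| against frequency on a log scale traces the outline of the trumpet: common variants sit near zero effect, while only rare variants can carry large effects at the same variance share.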
Pub Date: 2023-08-23; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.87
Sonia García-Ruiz, Regina Hertfelder Reynolds, Melissa Grant-Peters, Emil Karl Gustavsson, Aine Fairbrother-Browne, Zhongbo Chen, Jonathan William Brenton, Mina Ryten
Amazon Simple Storage Service (Amazon S3) is a widely used platform for storing large biomedical datasets. Unintended data alterations can occur during data writing and transmission, altering the original content and generating unexpected results. However, no open-source and easy-to-use tool exists to verify end-to-end data integrity. Here, we present aws-s3-integrity-check, a user-friendly, lightweight, and reliable bash tool to verify the integrity of a dataset stored in an Amazon S3 bucket. Using this tool, we only needed ∼114 min to verify the integrity of 1,045 records ranging between 5 bytes and 10 gigabytes and occupying ∼935 gigabytes of the Amazon S3 cloud. Our aws-s3-integrity-check tool also provides file-by-file on-screen and log-file-based information about the status of each integrity check. To our knowledge, this tool is the only open-source one that allows verifying the integrity of a dataset uploaded to the Amazon S3 Storage quickly, reliably, and efficiently. The tool is freely available for download and use at https://github.com/SoniaRuiz/aws-s3-integrity-check and https://hub.docker.com/r/soniaruiz/aws-s3-integrity-check.
Title: aws-s3-integrity-check: an open-source bash tool to verify the integrity of a dataset stored on Amazon S3. Journal: GigaByte (Hong Kong, China), 2023, gigabyte87. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10448181/pdf/
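The abstract does not spell out how the integrity check is performed. One commonly used approach, sketched here in Python rather than bash, is to recompute an S3-style ETag from the local file and compare it with the ETag reported for the stored object. The sketch assumes the widely observed (but not contractually guaranteed) behaviour that a single-request upload has an ETag equal to the MD5 hex digest of the content, while a multipart upload has the MD5 of the concatenated per-part MD5 digests followed by `-<number of parts>`. The function name `s3_style_etag` and the 8 MiB default part size are assumptions for illustration, and objects encrypted with some server-side schemes will not match.

```python
import hashlib
import io


def s3_style_etag(stream, part_size=8 * 1024 * 1024):
    """Compute an S3-style ETag for a binary stream (illustrative sketch).

    Assumption: content that fits in one part was uploaded in a single
    request; anything larger was uploaded in fixed-size parts.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    part_digests = []
    while True:
        chunk = stream.read(part_size)
        if not chunk:
            break
        part_digests.append(hashlib.md5(chunk).digest())
    if not part_digests:
        # Empty object: plain MD5 of zero bytes.
        return hashlib.md5(b"").hexdigest()
    if len(part_digests) == 1:
        # Single-part case: plain MD5 hex digest of the content.
        return part_digests[0].hex()
    # Multipart case: MD5 over the concatenated binary part digests.
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"


# Toy usage with an in-memory stream and a deliberately tiny part size.
print(s3_style_etag(io.BytesIO(b"a" * 10), part_size=4))
```

For a real file, the stream would be opened with `open(path, "rb")`, and the part size must match the one used at upload time; a mismatch in part size changes the multipart ETag even when the content is identical.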
Pub Date: 2023-08-14; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.86
Angel Borja
Microbes have often been overlooked as indicators of how the ecological status is affected by human pressures. Recently, the biotic index microgAMBI was proposed to assess the status of marine sediments and waters, and it has been tested under different pressures and biogeographical areas. This index is based on the assignation of microbial taxa to one of two ecological groups: sensitive or tolerant to pollution or disturbance. The resulting taxa list has grown significantly since its first publication. Given the growing use of microgAMBI, it is crucial to make it more FAIR: Findable, Accessible, Interoperable and Reusable. Hence, this work provides the calculation template, the updated taxa list (1,974 taxa currently), and instructions on how to access and use them for assessing marine microbial ecological status.
Title: A dataset and template for assessing the ecological status of marine sediments and waters, based on microbial taxa. Journal: GigaByte (Hong Kong, China), 2023, gigabyte86. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10427998/pdf/
Pub Date: 2023-07-18; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.85
Pasquale Ciliberti, Astrid Roquas, Becky Desjardins, Bibiche Berkholst, Frank Loggen, Menno Hooft, Gideon Gijswijt, Dick de Graaff
Natural history collections contain a wealth of information on species diversity, distribution and ecology. However, due to historical and practical constraints, this valuable information is not always available to researchers. Our project aimed at unlocking data handwritten in notebooks owned by Johanna Bonne-Wepster, a Culicidae researcher. These handwritten notes refer to specimens labeled with a number only. The notebooks were scanned and entered into a Google spreadsheet. The specimens were provided with a unique identifier, labeled with the information from the notebooks and the data exported to the Global Biodiversity Information Facility. In addition, the type specimens were photographed. Besides Johanna Bonne-Wepster's collection, mosquitoes from the former Rijksmuseum van Natuurlijk Historie collection and the former Zoölogisch Museum Amsterdam Nederland collection were digitized. All specimens are now housed at the Naturalis Biodiversity Center museum in Leiden. This paper describes the efforts to mobilize this data and the problems we encountered.
Title: Digitizing the Culicidae collection of Naturalis Biodiversity Center, with a special focus on the former Bonne-Wepster subcollection. Journal: GigaByte (Hong Kong, China), 2023, gigabyte85. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10355122/pdf/
Pub Date: 2023-07-03; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.84
Sagar Patel, Zachary N Harris, Jason P Londo, Allison Miller, Anne Fennell
'Chambourcin' is a French-American interspecific hybrid grape grown in the eastern and midwestern United States and used for making wine. Few genomic resources are available for hybrid grapevines like 'Chambourcin'. Here, we assembled the genome of 'Chambourcin' using PacBio HiFi long-read, Bionano optical map, and Illumina short-read sequencing technologies. We generated an assembly for 'Chambourcin' with 26 scaffolds, with an N50 length of 23.3 Mb and an estimated BUSCO completeness of 97.9%. We predicted 33,791 gene models and identified 16,056 common orthologs between 'Chambourcin', V. vinifera 'PN40024' 12X.v2, VCOST.v3, Shine Muscat and V. riparia Gloire. We found 1,606 plant transcription factors from 58 gene families. Finally, we identified 304,571 simple sequence repeats (up to six base pairs long). Our work provides the genome assembly, annotation and the protein and coding sequences of 'Chambourcin'. Our genome assembly is a valuable resource for genome comparisons, functional genomic analyses and genome-assisted breeding research.
Title: Genome assembly of the hybrid grapevine Vitis 'Chambourcin'. Journal: GigaByte (Hong Kong, China), 2023, gigabyte84. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10318349/pdf/
Pub Date: 2023-06-30; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.83
Paul Taconet, Barnabas Zogo, Dieudonné Diloma Soma, Ludovic P Ahoua Alou, Karine Mouline, Roch Kounbobr Dabiré, Alphonsine Amanan Koffi, Cédric Pennetier, Nicolas Moiroux
Characterizing the entomological profile of malaria transmission at fine spatiotemporal scales is essential for developing and implementing effective vector control strategies. Here, we present a fine-grained dataset of Anopheles mosquitoes (Diptera: Culicidae) collected in 55 villages of the rural districts of Korhogo (Northern Côte d'Ivoire) and Diébougou (South-West Burkina Faso) between 2016 and 2018. In the framework of a randomized controlled trial, Anopheles mosquitoes were periodically collected by Human Landing Catches experts inside and outside households, and analyzed individually to identify the genus and, for a subsample, species, insecticide resistance genetic mutations, Plasmodium falciparum infection, and parity status. More than 3,000 collection sessions were carried out, achieving about 45,000 h of sampling efforts. Over 60,000 Anopheles were collected (mainly A. gambiae s.s., A. coluzzii, and A. funestus). The dataset is published as a Darwin Core archive in the Global Biodiversity Information Facility, comprising four files: events, occurrences, mosquito characterizations, and environmental data.
Title: Anopheles sampling collections in the health districts of Korhogo (Côte d'Ivoire) and Diébougou (Burkina Faso) between 2016 and 2018. Journal: GigaByte (Hong Kong, China), 2023, gigabyte83. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10318348/pdf/
Snakes are a vital component of wildlife resources and are widely distributed across the globe. The many-banded krait Bungarus multicinctus is a highly venomous snake found across Southern Asia and central and southern China. Snakes are an ancient reptile group, and their genomes can provide important clues for understanding the evolutionary history of reptiles. Additionally, genomic resources play a crucial role in comprehending the evolution of all species. However, snake genomic resources are still scarce. Here, we present a highly contiguous genome of B. multicinctus with a size of 1.51 Gb. The genome contains a repeat content of 40.15%, with a total length exceeding 620 Mb. Additionally, we annotated a total of 24,869 functional genes. This research is of great significance for comprehending the evolution of B. multicinctus and provides genomic information on the genes involved in venom gland functions.
Title: The genome assembly and annotation of the many-banded krait, Bungarus multicinctus. Authors: Boyang Liu, Liangyu Cui, Zhangwen Deng, Yue Ma, Diancheng Yang, Yanan Gong, Yanchun Xu, Tianming Lan, Shuhui Yang, Song Huang. Pub Date: 2023-06-29; DOI: 10.46471/gigabyte.82. Journal: GigaByte (Hong Kong, China), 2023, gigabyte82. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10315667/pdf/
Pub Date: 2023-05-15; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.81
André Gomes-Dos-Santos, Manuel Lopes-Lima, André M Machado, Thomas Forest, Guillaume Achaz, Amílcar Teixeira, Vincent Prié, L Filipe C Castro, Elsa Froufe
Contiguous assemblies are fundamental to deciphering the composition of extant genomes. In molluscs, this is considerably challenging owing to the large size of their genomes, heterozygosity, and widespread repetitive content. Consequently, long-read sequencing technologies are fundamental for high contiguity and quality. The first genome assembly of Margaritifera margaritifera (Linnaeus, 1758) (Mollusca: Bivalvia: Unionida), a culturally relevant, widespread, and highly threatened species of freshwater mussels, was recently generated. However, the resulting genome is highly fragmented since the assembly relied on short-read approaches. Here, an improved reference genome assembly was generated using a combination of PacBio CLR long reads and Illumina paired-end short reads. This genome assembly is 2.4 Gb long, organized into 1,700 scaffolds with a contig N50 length of 3.4 Mbp. The ab initio gene prediction resulted in 48,314 protein-coding genes. Our new assembly is a substantial improvement and an essential resource for studying this species' unique biological and evolutionary features, helping promote its conservation.
Title: The Crown Pearl V2: an improved genome assembly of the European freshwater pearl mussel Margaritifera margaritifera (Linnaeus, 1758). Journal: GigaByte (Hong Kong, China), 2023, gigabyte81. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10189783/pdf/
Pub Date: 2023-04-30; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.80
Bridget P Bannerman, Alexandru Oarga, Jorge Júlvez
Antibiotic resistance is increasing at an alarming rate, and three related mycobacteria are sources of widespread infections in humans. According to the World Health Organization, Mycobacterium leprae, which causes leprosy, is still endemic in tropical countries; Mycobacterium tuberculosis is the second leading infectious killer worldwide after COVID-19; and Mycobacteroides abscessus, a group of non-tuberculous mycobacteria, causes lung infections and other healthcare-associated infections in humans. Due to the rise in resistance to common antibacterial drugs, it is critical that we develop alternatives to traditional treatment procedures. Furthermore, an understanding of the biochemical mechanisms underlying pathogenic evolution is important for the treatment and management of these diseases. In this study, metabolic models have been developed for two bacterial pathogens, M. leprae and My. abscessus, and a new computational tool has been used to identify potential drug targets, which are referred to as bottleneck reactions. The genes, reactions, and pathways in each of these organisms have been highlighted; the potential drug targets can be further explored as broad-spectrum antibacterials and the unique drug targets for each pathogen are significant for precision medicine initiatives. The models and associated datasets described in this paper are available in GigaDB, Biomodels, and PatMeDB repositories.
Title: Mycobacterial metabolic model development for drug target identification. Journal: GigaByte (Hong Kong, China), 2023, gigabyte80. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10154535/pdf/