Pub Date: 2023-12-04; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.100
David Simons, Lauren A Attfield, Kate E Jones, Deborah Watson-Jones, Richard Kock
Rodents, a globally distributed and ecologically important mammalian order, serve as hosts for various zoonotic pathogens. However, sampling of rodents and their pathogens suffers from taxonomic and spatial biases. This affects consolidated databases, such as IUCN and GBIF, limiting inference regarding the spillover hazard of zoonotic pathogens into human populations. Here, we synthesised data from 127 rodent trapping studies conducted in 14 West African countries between 1964 and 2022. We combined occurrence data with pathogen screening results to produce a dataset containing detection/non-detection data for 65,628 individual small mammals identified to the species level from at least 1,611 trapping sites. We also included 32 microorganisms, identified to the species or genus levels, that are known or potential pathogens. The dataset is formatted to Darwin Core Standard with associated metadata. This dataset can mitigate spatial and taxonomic biases of current databases, improving understanding of rodent-associated zoonotic pathogen spillover across West Africa.
Title: A dataset of small-mammal detections in West Africa and their associated micro-organisms. Journal: GigaByte (Hong Kong, China), volume 2023, article gigabyte100. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10711198/pdf/
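The detection/non-detection layout described in the abstract above can be illustrated with a minimal sketch. It assumes one record per species per trapping site and uses standard Darwin Core term names (`occurrenceID`, `scientificName`, `eventDate`, `decimalLatitude`, `decimalLongitude`, `individualCount`, `occurrenceStatus`). The function name, site identifier, coordinates, and counts are hypothetical and are not taken from the published dataset.

```python
def to_occurrences(site_id, latitude, longitude, event_date, counts):
    """Turn per-species trap counts at one site into Darwin Core style records.

    A count of zero becomes an explicit non-detection ("absent") instead of
    being dropped, which is what separates detection/non-detection data from
    presence-only data.
    """
    records = []
    for species, n in counts.items():
        records.append({
            "occurrenceID": f"{site_id}:{species.replace(' ', '_')}",
            "scientificName": species,
            "eventDate": event_date,
            "decimalLatitude": latitude,
            "decimalLongitude": longitude,
            "individualCount": n,
            "occurrenceStatus": "present" if n > 0 else "absent",
        })
    return records


# Hypothetical trapping session; all values are illustrative only.
records = to_occurrences(
    site_id="site-001",
    latitude=8.0,
    longitude=-11.0,
    event_date="2021-03-15",
    counts={"Mastomys natalensis": 12, "Rattus rattus": 0},
)
for record in records:
    print(record["scientificName"], record["occurrenceStatus"])
```

Keeping the zero-count rows is the design choice that matters here: without them, a site where a species was looked for but not found is indistinguishable from a site that was never sampled.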
Pub Date: 2023-11-20; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.99
Jiangang Wang, Yuxin Wu, Shiqing Wang, Weiwu Mu, Wenmei Zeng, Xi Chen, Kangfeng Jiang, Liangyu Yang, Guohai Hu, Fengping He
In China, 65 types of venomous snakes exist, with the Chinese Cobra Naja atra being prominent and a major cause of snakebites in humans. Furthermore, N. atra is a protected animal in some areas, as it has been listed as vulnerable by the International Union for Conservation of Nature. Recently, due to the medical value of snake venoms, venomics has experienced growing research interest. In particular, genomic resources are crucial for understanding the molecular mechanisms of venom production. Here, we report a highly continuous genome assembly of N. atra, based on a snake sample from Huangshan, Anhui, China. The size of this genome is 1.67 Gb, while its repeat content constitutes 37.8% of the genome. A total of 26,432 functional genes were annotated. This data provides an essential resource for studying venom production in N. atra. It may also provide guidance for the protection of this species.
Title: The genome assembly and annotation of the Chinese cobra, Naja atra. Journal: GigaByte (Hong Kong, China), volume 2023, article gigabyte99. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10682346/pdf/
Tyson Fuller, Derek M. Bickhart, Lisa M. Koch, Lisa Kissing Kucek, Shahjahan Ali, Haley Mangelson, Maria J. Monteros, Timothy Hernandez, Timothy P. L. Smith, Heathcliffe Riday, Michael L. Sullivan
Vicia villosa is an incompletely domesticated annual legume of the Fabaceae family native to Europe and Western Asia. V. villosa is widely used as a cover crop and forage due to its ability to withstand harsh winters. Here, we generated a reference-quality genome assembly (Vvill1.0) from low error-rate long-sequence reads to improve the genetic-based trait selection of this species. Our Vvill1.0 assembly includes seven scaffolds corresponding to the seven estimated linkage groups and comprising approximately 68% of the total genome size of 2.03 Gbp. This assembly is expected to be a useful resource for genetically improving this emerging cover crop species and provide useful insights into legume genomics and plant genome evolution.
Title: A reference assembly for the legume cover crop hairy vetch (Vicia villosa). Journal: GigaByte (Hong Kong, China). Published: 2023-11-13. DOI: 10.46471/gigabyte.98
Victoire Nsabatien, Josue Zanga, Fiacre Agossa, Nono Mvuama, Maxwell Bamba, Osée Mansiangi, Leon Mbashi, Vanessa Mvudi, Glodie Diza, Dorcas Kantin, Narcisse Basosila, Hyacinthe Lukoki, Arsene Bokulu, Christelle Bosulu, Erick Bukaka, Jonas Nagahuedi, Jean Claude Palata, Emery Metelo
Arbovirus epidemics (chikungunya, dengue, West Nile fever, yellow fever and zika) are a growing threat in African areas where Aedes (Stegomyia) aegypti (Linnaeus, 1762) and Aedes albopictus (Skuse, 1895) are present. The lack of comprehensive sampling of these two vectors limits our understanding of their propagation dynamics in areas at risk of arboviruses. Here, we collected 6,943 observations (both larval and human capture) of Ae. aegypti and Ae. albopictus between 2020 and 2022. The study was carried out in the Vallee de la Funa, a post-epidemic zone in the city of Kinshasa, Democratic Republic of Congo. Our results provide important information for future basic and advanced studies on the ecology and phenology of these vectors, as well as on vector dynamics after a post-epidemic period. The data from this study are published in the public domain as the Darwin Core Archive in the Global Biodiversity Information Facility.
Title: Data from Entomological Collections of Aedes (Diptera: Culicidae) in a post-epidemic area of Chikungunya, City of Kinshasa, Democratic Republic of Congo. Journal: GigaByte (Hong Kong, China). Published: 2023-11-08. DOI: 10.46471/gigabyte.96
The Brown-Spotted Pit viper (Protobothrops mucrosquamatus), also known as the Chinese habu, is a widespread and highly venomous snake distributed from Northeastern India to Eastern China. Genomics research can contribute to our understanding of venom components and natural selection in vipers. Here, we collected, sequenced and assembled the genome of a male P. mucrosquamatus individual from China. We generated a highly continuous reference genome with a length of 1.53 Gb and a repeat content of 41.18%. Using this genome, we identified 24,799 genes, 97.97% of which could be annotated. We verified the validity of our genome assembly and annotation process by generating a phylogenetic tree based on the nuclear single-copy genes of six other reptile species. The results of our research will contribute to future studies on Protobothrops biology and the genetic basis of snake venom.
Title: Genome assembly and annotation of the Brown-Spotted Pit viper Protobothrops mucrosquamatus. Authors: Xiaotong Niu, Haorong Lu, Minhui Shi, Shiqing Wang, Yajie Zhou, Huan Liu. Journal: GigaByte (Hong Kong, China). Published: 2023-11-07. DOI: 10.46471/gigabyte.97
Pub Date: 2023-10-27; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.95
Cristina Sánchez Gutierrez, Erika Santamaría, Carlos Andrés Morales, María Camila Lesmes, Horacio Cadena, Alvaro Avila-Diaz, Patricia Fuya, Catalina Marceló-Díaz
Aedes aegypti mosquitoes are the main vector of human arbovirosis in tropical and subtropical areas. Their adaptation to urban and rural environments generates infestations inside households. Therefore, entomological surveillance associated with spatio-temporal analysis is an innovative approach for vector control and dengue management. Here, our main aim was to inspect immature pupal stages in households belonging to municipalities at high risk of dengue in Cauca, Colombia, by implementing entomological indices and relating how they influence adult mosquitos' density. We provide novel data for the geographical distribution of 3,806 immature pupal stages of Ae. aegypti. We also report entomological indices and spatial characterization. Our results suggest that, for Ae. aegypti species, pupal productivity generates high densities of adult mosquitos in neighbouring households, evidencing seasonal behaviour. Our dataset is essential as it provides an innovative strategy for mitigating vector-borne diseases using vector spatial patterns. It also delineates the association between these vector spatial patterns, entomological indicators, and breeding sites in high-risk neighbourhoods.
Title: Spatial patterns associated with the distribution of immature stages of Aedes aegypti in three dengue high-risk municipalities of Southwestern Colombia. Journal: GigaByte (Hong Kong, China), volume 2023, article gigabyte95. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10620433/pdf/
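The abstract above mentions entomological indices without naming them. As a hedged illustration, the sketch below computes the three classic Stegomyia indices used in Aedes surveillance (House Index, Container Index, Breteau Index). It is an assumption that these are among the indices the authors report; the function name and the survey counts are hypothetical.

```python
def stegomyia_indices(houses_inspected, houses_positive,
                      containers_inspected, containers_positive):
    """Compute the classic Stegomyia indices from household survey counts.

    House Index     = percentage of inspected houses with immature Aedes
    Container Index = percentage of inspected water-holding containers
                      with immature Aedes
    Breteau Index   = positive containers per 100 inspected houses
    """
    if houses_inspected <= 0 or containers_inspected <= 0:
        raise ValueError("inspected counts must be positive")
    return {
        "house_index": 100 * houses_positive / houses_inspected,
        "container_index": 100 * containers_positive / containers_inspected,
        "breteau_index": 100 * containers_positive / houses_inspected,
    }


# Hypothetical survey: 200 houses (30 positive), 500 containers (45 positive).
indices = stegomyia_indices(200, 30, 500, 45)
print(indices)
```

The Breteau Index links containers to houses, so it can exceed the House Index whenever positive houses hold more than one positive container, as in this toy survey.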
Pub Date: 2023-10-05; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.94
Robert E Bruccoleri, Edward J Oakeley, Ann Marie E Faust, Marc Altorfer, Sophie Dessus-Babus, David Burckhardt, Mevion Oertli, Ulrike Naumann, Frank Petersen, Joanne Wong
Irises are perennial plants, representing a large genus with hundreds of species. While cultivated extensively for their ornamental value, commercial interest in irises lies in the secondary metabolites present in their rhizomes. The Dalmatian Iris (Iris pallida Lam.) is an ornamental plant that also produces secondary metabolites with potential value to the fragrance and pharmaceutical industries. In addition to providing base notes for the fragrance industry, iris tissues and extracts possess antioxidant, anti-inflammatory and immunomodulatory effects. However, study of these secondary metabolites has been hampered by a lack of genomic information, requiring difficult extraction and analysis techniques. Here, we report the genome sequence of Iris pallida Lam., generated with Pacific Bioscience long-read sequencing, resulting in a 10.04-Gbp assembly with a scaffold N50 of 14.34 Mbp and 91.8% complete BUSCOs. This reference genome will allow researchers to study the biosynthesis of these secondary metabolites in much greater detail, opening new avenues of investigation for drug discovery and fragrance formulations.
Title: Genome assembly of the bearded iris, Iris pallida Lam. Journal: GigaByte (Hong Kong, China), volume 2023, article gigabyte94. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10565908/pdf/
Pub Date: 2023-09-20; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.92
Jiangang Wang, Shiqing Wang, Song Huang, Qing Wang, Tianming Lan, Ming Jiang, Haitao Wu, Yuxiang Yuan
The Oriental rat snake Ptyas mucosa is a common non-venomous snake of the colubrid family, found across most of South and Southeast Asia. P. mucosa is widely bred for its uses in traditional medicine, scientific research, and handicrafts. Therefore, genome resources of P. mucosa could play an important role in the efficacy of traditional medicine and the analysis of the living environment of this species. Here, we present a highly continuous P. mucosa genome with a size of 1.74 Gb. Its scaffold N50 length is 9.57 Mb, and the maximal scaffold length is 78.3 Mb. Its GC content is 37.9%, and its gene integrity reaches 86.6%. The genome was assembled using long reads; the total length of its repeat sequences reaches 735 Mb, and its repeat content is 42.19%. Finally, 24,869 functional genes were annotated in this genome. This study may assist in understanding P. mucosa and supporting medicinal research.
Title: The genome assembly and annotation of the Oriental rat snake Ptyas mucosa. Journal: GigaByte (Hong Kong, China), volume 2023, article gigabyte92. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10518451/pdf/
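The scaffold N50 quoted in the abstract above is a contiguity statistic: the length of the scaffold at which the running total of scaffold lengths, sorted from longest to shortest, first reaches half of the assembly size. A minimal sketch of that calculation follows; the scaffold lengths are toy values, not the real P. mucosa assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to at least half
        # of the assembly size defines N50.
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), total 300.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)
print(toy_n50)  # 70: 80 + 70 = 150, which is half of 300
```

A larger N50 at the same total assembly size means the sequence is held in fewer, longer scaffolds, which is why it is reported alongside genome size as a measure of continuity.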
Pub Date: 2023-09-20; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.93
Eamon Winden, Alejandro Vasquez-Echeverri, Susana Calle-Castañeda, Yumin Lian, Juan Pablo Hernandez Ortiz, David C Schwartz
While Bacterial Artificial Chromosome (BAC) libraries were once a key resource for the genomic community, they have been obviated, for sequencing purposes, by long-read technologies. Such libraries may now serve as a valuable resource for manipulating and assembling large genomic constructs. To enhance accessibility and comparison, we have developed a BAC restriction map database. Using information from the National Center for Biotechnology Information's cloneDB FTP site, we constructed a database containing the restriction maps for both uniquely placed and insert-sequenced BACs from 11 libraries, covering the recognition sequences of the available restriction enzymes. Along with the database, we generated a set of Python functions to reconstruct the database and more easily access the information within. These data are valuable for researchers simply using BACs, as well as those working with larger sections of the genome in terms of synthetic genes, large-scale editing, and mapping.
Title: A database of restriction maps to expand the utility of bacterial artificial chromosomes. Journal: GigaByte (Hong Kong, China), volume 2023, article gigabyte93. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10518450/pdf/
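To make the idea of a restriction map concrete, the sketch below scans an insert sequence for enzyme recognition sequences and records where each one occurs. This is not the authors' Python code or database schema; the function names and the toy insert are hypothetical. EcoRI (GAATTC) and BamHI (GGATCC) are used because both sites are palindromic, so scanning one strand is sufficient.

```python
def find_sites(sequence, recognition):
    """Return 0-based start positions of every occurrence, overlaps included."""
    sequence = sequence.upper()
    recognition = recognition.upper()
    positions = []
    start = sequence.find(recognition)
    while start != -1:
        positions.append(start)
        # Step forward by one base so overlapping sites are not missed.
        start = sequence.find(recognition, start + 1)
    return positions


def restriction_map(sequence, enzymes):
    """Map each enzyme name to the positions of its recognition sequence."""
    return {name: find_sites(sequence, site) for name, site in enzymes.items()}


ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}
toy_insert = "ATGAATTCGGGATCCTTGAATTCA"  # hypothetical sequence
toy_map = restriction_map(toy_insert, ENZYMES)
print(toy_map)  # {'EcoRI': [2, 17], 'BamHI': [9]}
```

For non-palindromic recognition sequences the reverse complement would also need to be searched, and a real BAC insert would be read from a sequence file instead of a literal string.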
Pub Date: 2023-09-14; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.91
Aine Fairbrother-Browne, Sonia García-Ruiz, Regina Hertfelder Reynolds, Mina Ryten, Alan Hodgkinson
We present ensemblQueryR, an R package for querying Ensembl linkage disequilibrium (LD) endpoints. This package is flexible, fast and user-friendly, and optimised for high-throughput querying. Its functions are intuitive and amenable to custom code integration, take and return familiar R object types, and offer parallelisation. For each Ensembl LD endpoint, ensemblQueryR provides two functions, permitting both single- and multi-query modes of operation. The multi-query functions are optimised for large query sizes and provide optional parallelisation to leverage available computational resources and minimise processing time. We demonstrate improved computational performance of ensemblQueryR over an existing tool in terms of random access memory (RAM) usage and speed, delivering a 10-fold speed increase whilst using a third of the RAM. Finally, ensemblQueryR is near-agnostic to operating system and computational architecture through Docker and Singularity images, making this tool widely accessible to the scientific community.
Title: ensemblQueryR: fast, flexible and high-throughput querying of Ensembl LD API endpoints in R. Journal: GigaByte (Hong Kong, China), volume 2023, pages 1-10. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10507293/pdf/
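For readers unfamiliar with the endpoints being wrapped, the sketch below shows the single-query and multi-query pattern from the abstract above against the public Ensembl REST LD endpoint (`/ld/:species/:id/:population_name`). It is written in Python, not R, and does not reproduce ensemblQueryR's functions; `ld_url`, `fetch_ld`, and `fetch_ld_many` are hypothetical names, and the thread pool merely stands in for the package's optional parallelisation.

```python
import json
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor

ENSEMBL_REST = "https://rest.ensembl.org"


def ld_url(variant_id, population, species="human"):
    """Build the URL for the per-variant Ensembl LD endpoint."""
    parts = [urllib.parse.quote(p, safe=":") for p in (species, variant_id, population)]
    return f"{ENSEMBL_REST}/ld/{'/'.join(parts)}"


def fetch_ld(variant_id, population, species="human", timeout=30):
    """Single-query mode: return the decoded JSON response for one variant."""
    request = urllib.request.Request(
        ld_url(variant_id, population, species),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def fetch_ld_many(variant_ids, population, workers=4):
    """Multi-query mode: query several variants concurrently."""
    ids = list(variant_ids)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda v: fetch_ld(v, population), ids)
        return dict(zip(ids, results))


# Only the URL is built here, so the example runs without network access.
example_url = ld_url("rs1042779", "1000GENOMES:phase_3:KHV")
print(example_url)
```

Calling `fetch_ld("rs1042779", "1000GENOMES:phase_3:KHV")` would issue the actual request; the variant and population shown are illustrative, and a production client should also respect the service's rate limits.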