GigaByte (Hong Kong, China): Latest Articles


ensemblQueryR: fast, flexible and high-throughput querying of Ensembl LD API endpoints in R.

GigaByte (Hong Kong, China)

Pub Date : 2023-09-14 eCollection Date: 2023-01-01 DOI: 10.46471/gigabyte.91

Aine Fairbrother-Browne, Sonia García-Ruiz, Regina Hertfelder Reynolds, Mina Ryten, Alan Hodgkinson

We present ensemblQueryR, an R package for querying Ensembl linkage disequilibrium (LD) endpoints. This package is flexible, fast and user-friendly, and optimised for high-throughput querying. ensemblQueryR provides functions that are intuitive and amenable to custom code integration, uses familiar R object types as inputs and outputs, and offers parallelisation functionality. For each Ensembl LD endpoint, ensemblQueryR provides two functions, permitting both single- and multi-query modes of operation. The multi-query functions are optimised for large query sizes and provide optional parallelisation to leverage available computational resources and minimise processing time. We demonstrate improved computational performance of ensemblQueryR over an existing tool in terms of random access memory (RAM) usage and speed, delivering a 10-fold speed increase whilst using a third of the RAM. Finally, ensemblQueryR is near-agnostic to operating system and computational architecture through Docker and Singularity images, making this tool widely accessible to the scientific community.
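The single- and multi-query pattern described in this abstract can be sketched in Python. This is not ensemblQueryR's API: the function names (`ld_url`, `query_ld_single`, `query_ld_multi`), the thread-pool design and the injectable `fetch` argument are assumptions made for illustration. The URL follows the Ensembl REST `ld/:species/:id/:population_name` endpoint.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import quote

ENSEMBL_REST = "https://rest.ensembl.org"


def ld_url(rsid, population, species="human"):
    """Build the URL for the Ensembl REST ld/:species/:id/:population_name endpoint."""
    return (
        f"{ENSEMBL_REST}/ld/{quote(species)}/{quote(rsid)}/"
        f"{quote(population, safe=':')}?content-type=application/json"
    )


def fetch_json(url):
    """Perform one GET request and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def query_ld_single(rsid, population, fetch=fetch_json):
    """Single-query mode: one variant, one request."""
    return fetch(ld_url(rsid, population))


def query_ld_multi(rsids, population, fetch=fetch_json, workers=4):
    """Multi-query mode: many variants, requests issued from a thread pool."""
    rsids = list(rsids)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with rsids.
        results = pool.map(lambda r: query_ld_single(r, population, fetch), rsids)
        return dict(zip(rsids, results))
```

Calling `query_ld_multi(["rs1042779"], "1000GENOMES:phase_3:KHV")` would contact the live Ensembl server, so it needs network access. Passing a stub as `fetch` keeps the sketch testable offline, and a real client would also need to respect the server's rate limits.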


Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10507293/pdf/

Citations: 0

Genome assembly and annotation of the Sharp-nosed Pit Viper Deinagkistrodon acutus based on next-generation sequencing data.

GigaByte (Hong Kong, China)

Pub Date : 2023-09-04 eCollection Date: 2023-01-01 DOI: 10.46471/gigabyte.88

Xinyu Wang, Lirong Liu, Wenbiao Zhu, Shiqing Wang, Minhui Shi, Shuhui Yang, Haorong Lu, Jun Cao

The study of the currently known >3,000 species of snakes can provide valuable insights into the evolution of their genomes. Deinagkistrodon acutus, also known as Sharp-nosed Pit Viper, one hundred-pacer viper or five-pacer viper, is a venomous snake with significant economic, medicinal and scientific importance. Widely distributed in southeastern China and South-East Asia, D. acutus has been primarily studied for its venom. Here, we employed next-generation sequencing to assemble and annotate a highly continuous genome of D. acutus. The genome size is 1.46 Gb; its scaffold N50 length is 6.21 Mb, the repeat content is 42.81%, and 24,402 functional genes were annotated. This study helps to further understand and utilize D. acutus and its venom at the genetic level.
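The scaffold N50 quoted in this abstract is a contiguity statistic: the length L such that scaffolds of length L or longer together cover at least half of the assembly. A minimal sketch of the calculation follows; the function name `n50` is an illustrative choice, and the numbers in the usage note are toy values, not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        # Accumulate from the longest sequence down until half the
        # assembly is covered; the length reached at that point is the N50.
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")
```

For toy lengths `[2, 3, 4, 5, 6]` the total is 20, and the two longest sequences (6 + 5 = 11) already cover half of it, so `n50` returns 5.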


Citations: 0

Trumpet plots: visualizing the relationship between allele frequency and effect size in genetic association studies.

GigaByte (Hong Kong, China)

Pub Date : 2023-09-01 eCollection Date: 2023-01-01 DOI: 10.46471/gigabyte.89

Lucia Corte, Lathan Liou, Paul F O'Reilly, Judit García-González

Recent advances in genome-wide association and sequencing studies have shown that the genetic architecture of complex traits and diseases involves a combination of rare and common genetic variants distributed throughout the genome. One way to better understand this architecture is to visualize genetic associations across a wide range of allele frequencies. However, there is currently no standardized or consistent graphical representation for effectively illustrating these results. Here we propose a standardized approach for visualizing the effect size of risk variants across the allele frequency spectrum. The proposed plots have a distinctive trumpet shape: the majority of variants have high frequency and small effects, and a small number of variants have lower frequency and larger effects. To demonstrate the utility of trumpet plots in illustrating the relationship between the number of variants, their frequency, and the magnitude of their effects in shaping the genetic architecture of complex traits and diseases, we generated trumpet plots for more than one hundred traits in the UK Biobank. To facilitate their broader use, we developed an R package, 'TrumpetPlots' (available at the Comprehensive R Archive Network), and an R Shiny application, 'Shiny Trumpets' (available at https://juditgg.shinyapps.io/shinytrumpets/), that allow users to explore these results and submit their own data.


Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10498096/pdf/

Citations: 0

aws-s3-integrity-check: an open-source bash tool to verify the integrity of a dataset stored on Amazon S3.

GigaByte (Hong Kong, China)

Pub Date : 2023-08-23 eCollection Date: 2023-01-01 DOI: 10.46471/gigabyte.87

Sonia García-Ruiz, Regina Hertfelder Reynolds, Melissa Grant-Peters, Emil Karl Gustavsson, Aine Fairbrother-Browne, Zhongbo Chen, Jonathan William Brenton, Mina Ryten

Amazon Simple Storage Service (Amazon S3) is a widely used platform for storing large biomedical datasets. Unintended data alterations can occur during data writing and transmission, altering the original content and generating unexpected results. However, no open-source and easy-to-use tool exists to verify end-to-end data integrity. Here, we present aws-s3-integrity-check, a user-friendly, lightweight, and reliable bash tool to verify the integrity of a dataset stored in an Amazon S3 bucket. Using this tool, we only needed ∼114 min to verify the integrity of 1,045 records ranging between 5 bytes and 10 gigabytes and occupying ∼935 gigabytes of the Amazon S3 cloud. Our aws-s3-integrity-check tool also provides file-by-file on-screen and log-file-based information about the status of each integrity check. To our knowledge, this tool is the only open-source one that allows verifying the integrity of a dataset uploaded to the Amazon S3 Storage quickly, reliably, and efficiently. The tool is freely available for download and use at https://github.com/SoniaRuiz/aws-s3-integrity-check and https://hub.docker.com/r/soniaruiz/aws-s3-integrity-check.
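The abstract does not say how the tool computes its checksums. One widely used way to check an S3 object against a local copy is to recompute the object's ETag locally, and the sketch below illustrates that idea in Python rather than reproducing the bash tool. It rests on stated assumptions: the object is not encrypted with SSE-KMS or SSE-C (so a single-part ETag is the MD5 of the content), data no larger than one part was uploaded with a single PUT, and larger data was uploaded in fixed-size parts of `part_size` bytes. The function names are hypothetical.

```python
import hashlib


def s3_style_etag(data, part_size=8 * 1024 * 1024):
    """Recompute an S3-style ETag for `data` under the assumptions above."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    if len(data) <= part_size:
        # Single-part upload: the ETag is the hex MD5 of the content.
        return hashlib.md5(data).hexdigest()
    # Multipart upload: MD5 over the concatenated binary MD5 digests of
    # each part, followed by a dash and the number of parts.
    digests = [
        hashlib.md5(data[i:i + part_size]).digest()
        for i in range(0, len(data), part_size)
    ]
    return f"{hashlib.md5(b''.join(digests)).hexdigest()}-{len(digests)}"


def matches_remote(data, remote_etag, part_size=8 * 1024 * 1024):
    """Compare the local ETag with the (possibly quoted) ETag reported by S3."""
    return s3_style_etag(data, part_size) == remote_etag.strip('"')
```

For files of several gigabytes, like those described above, a practical version would stream each part from disk instead of holding the whole object in memory, and the part size would have to match the one used at upload time.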


Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10448181/pdf/

Citations: 0

A dataset and template for assessing the ecological status of marine sediments and waters, based on microbial taxa.

GigaByte (Hong Kong, China)

Pub Date : 2023-08-14 eCollection Date: 2023-01-01 DOI: 10.46471/gigabyte.86

Angel Borja

Microbes have often been overlooked as indicators of how ecological status is affected by human pressures. Recently, the biotic index microgAMBI was proposed to assess the status of marine sediments and waters, and it has been tested under different pressures and biogeographical areas. This index is based on the assignment of microbial taxa to one of two ecological groups: sensitive or tolerant to pollution or disturbance. The resulting taxa list has grown significantly since its first publication. Given the growing use of microgAMBI, it is crucial to make it more FAIR: Findable, Accessible, Interoperable and Reusable. Hence, this work provides the calculation template, the updated taxa list (1,974 taxa currently), and instructions on how to access and use them for assessing marine microbial ecological status.


Citations: 0

Digitizing the Culicidae collection of Naturalis Biodiversity Center, with a special focus on the former Bonne-Wepster subcollection.

GigaByte (Hong Kong, China)

Pub Date : 2023-07-18 eCollection Date: 2023-01-01 DOI: 10.46471/gigabyte.85

Pasquale Ciliberti, Astrid Roquas, Becky Desjardins, Bibiche Berkholst, Frank Loggen, Menno Hooft, Gideon Gijswijt, Dick de Graaff

Natural history collections contain a wealth of information on species diversity, distribution and ecology. However, due to historical and practical constraints, this valuable information is not always available to researchers. Our project aimed at unlocking data handwritten in notebooks owned by Johanna Bonne-Wepster, a Culicidae researcher. These handwritten notes refer to specimens labeled with a number only. The notebooks were scanned and entered into a Google spreadsheet. The specimens were provided with a unique identifier and labeled with the information from the notebooks, and the data were exported to the Global Biodiversity Information Facility. In addition, the type specimens were photographed. Besides Johanna Bonne-Wepster's collection, mosquitoes from the former Rijksmuseum van Natuurlijk Historie collection and the former Zoölogisch Museum Amsterdam Nederland collection were digitized. All specimens are now housed at the Naturalis Biodiversity Center museum in Leiden. This paper describes the efforts to mobilize this data and the problems we encountered.


Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10355122/pdf/

Citations: 0

Genome assembly of the hybrid grapevine Vitis 'Chambourcin'.

GigaByte (Hong Kong, China)

Pub Date : 2023-07-03 eCollection Date: 2023-01-01 DOI: 10.46471/gigabyte.84

Sagar Patel, Zachary N Harris, Jason P Londo, Allison Miller, Anne Fennell

'Chambourcin' is a French-American interspecific hybrid grape grown in the eastern and midwestern United States and used for making wine. Few genomic resources are available for hybrid grapevines like 'Chambourcin'. Here, we assembled the genome of 'Chambourcin' using PacBio HiFi long-read, Bionano optical map, and Illumina short-read sequencing technologies. We generated an assembly for 'Chambourcin' with 26 scaffolds, with an N50 length of 23.3 Mb and an estimated BUSCO completeness of 97.9%. We predicted 33,791 gene models and identified 16,056 common orthologs between 'Chambourcin', V. vinifera 'PN40024' 12X.v2, VCOST.v3, Shine Muscat and V. riparia Gloire. We found 1,606 plant transcription factors from 58 gene families. Finally, we identified 304,571 simple sequence repeats (up to six base pairs long). Our work provides the genome assembly, annotation and the protein and coding sequences of 'Chambourcin'. Our genome assembly is a valuable resource for genome comparisons, functional genomic analyses and genome-assisted breeding research.


Citations: 0

Anopheles sampling collections in the health districts of Korhogo (Côte d'Ivoire) and Diébougou (Burkina Faso) between 2016 and 2018.

GigaByte (Hong Kong, China)

Pub Date : 2023-06-30 eCollection Date: 2023-01-01 DOI: 10.46471/gigabyte.83

Paul Taconet, Barnabas Zogo, Dieudonné Diloma Soma, Ludovic P Ahoua Alou, Karine Mouline, Roch Kounbobr Dabiré, Alphonsine Amanan Koffi, Cédric Pennetier, Nicolas Moiroux

Characterizing the entomological profile of malaria transmission at fine spatiotemporal scales is essential for developing and implementing effective vector control strategies. Here, we present a fine-grained dataset of Anopheles mosquitoes (Diptera: Culicidae) collected in 55 villages of the rural districts of Korhogo (Northern Côte d'Ivoire) and Diébougou (South-West Burkina Faso) between 2016 and 2018. In the framework of a randomized controlled trial, Anopheles mosquitoes were periodically collected by Human Landing Catches experts inside and outside households, and analyzed individually to identify the genus and, for a subsample, species, insecticide resistance genetic mutations, Plasmodium falciparum infection, and parity status. More than 3,000 collection sessions were carried out, achieving about 45,000 h of sampling efforts. Over 60,000 Anopheles were collected (mainly A. gambiae s.s., A. coluzzii, and A. funestus). The dataset is published as a Darwin Core archive in the Global Biodiversity Information Facility, comprising four files: events, occurrences, mosquito characterizations, and environmental data.


Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10318348/pdf/

Citations: 0

The genome assembly and annotation of the many-banded krait, Bungarus multicinctus.

GigaByte (Hong Kong, China)

Pub Date : 2023-06-29 eCollection Date: 2023-01-01 DOI: 10.46471/gigabyte.82

Boyang Liu, Liangyu Cui, Zhangwen Deng, Yue Ma, Diancheng Yang, Yanan Gong, Yanchun Xu, Tianming Lan, Shuhui Yang, Song Huang

Snakes are a vital component of wildlife resources and are widely distributed across the globe. The many-banded krait Bungarus multicinctus is a highly venomous snake found across Southern Asia and central and southern China. Snakes are an ancient reptile group, and their genomes can provide important clues for understanding the evolutionary history of reptiles. Additionally, genomic resources play a crucial role in comprehending the evolution of all species. However, snake genomic resources are still scarce. Here, we present a highly contiguous genome of B. multicinctus with a size of 1.51 Gb. The genome contains a repeat content of 40.15%, with a total length exceeding 620 Mb. Additionally, we annotated a total of 24,869 functional genes. This research is of great significance for comprehending the evolution of B. multicinctus and provides genomic information on the genes involved in venom gland functions.


Citations: 0

The Crown Pearl V2: an improved genome assembly of the European freshwater pearl mussel Margaritifera margaritifera (Linnaeus, 1758).

GigaByte (Hong Kong, China)

Pub Date : 2023-05-15 eCollection Date: 2023-01-01 DOI: 10.46471/gigabyte.81

André Gomes-Dos-Santos, Manuel Lopes-Lima, André M Machado, Thomas Forest, Guillaume Achaz, Amílcar Teixeira, Vincent Prié, L Filipe C Castro, Elsa Froufe

Contiguous assemblies are fundamental to deciphering the composition of extant genomes. In molluscs, this is considerably challenging owing to the large size of their genomes, heterozygosity, and widespread repetitive content. Consequently, long-read sequencing technologies are fundamental for high contiguity and quality. The first genome assembly of Margaritifera margaritifera (Linnaeus, 1758) (Mollusca: Bivalvia: Unionida), a culturally relevant, widespread, and highly threatened species of freshwater mussels, was recently generated. However, the resulting genome is highly fragmented since the assembly relied on short-read approaches. Here, an improved reference genome assembly was generated using a combination of PacBio CLR long reads and Illumina paired-end short reads. This genome assembly is 2.4 Gb long, organized into 1,700 scaffolds with a contig N50 length of 3.4 Mbp. The ab initio gene prediction resulted in 48,314 protein-coding genes. Our new assembly is a substantial improvement and an essential resource for studying this species' unique biological and evolutionary features, helping promote its conservation.


Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10189783/pdf/

Citations: 0
