Pub Date: 2023-01-25; eCollection Date: 2023-01-01; DOI: 10.46471/gigabyte.75
Adrian Viehweger, Mike Marquet, Martin Hölzer, Nadine Dietze, Mathias W Pletz, Christian Brandt
Rapid screening of hospital admissions to detect asymptomatic carriers of resistant bacteria can prevent pathogen outbreaks. However, the resulting isolates rarely have their genome sequenced due to cost constraints and long turn-around times to get and process the data, limiting their usefulness to the practitioner. Here we used real-time, on-device target enrichment ("adaptive") sequencing as a highly multiplexed assay covering 1,147 antimicrobial resistance genes. We compared its utility against standard and metagenomic sequencing, focusing on an isolate of Raoultella ornithinolytica harbouring three carbapenemases (NDM, KPC, VIM). Based on this experimental data, we then modelled the influence of several variables on the enrichment results and predicted the large effect of nucleotide identity (higher is better) and read length (shorter is better). Lastly, we showed how all relevant resistance genes are detected using adaptive sequencing on a miniature ("Flongle") flow cell, motivating its use in a clinical setting to monitor similar cases and their surroundings.
Title: Nanopore-based enrichment of antimicrobial resistance genes - a case-based study.
Journal: GigaByte (Hong Kong, China), volume 2023, pages gigabyte75
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10027057/pdf/
Pub Date: 2023-01-20; DOI: 10.1101/2023.01.23.525209
B. Ramesh, CM Small, H. Healey, B. Johnson, E. Barker, M. Currey, S. Bassham, M. Myers, WA Cresko, Ag Jones
The Gulf pipefish Syngnathus scovelli has emerged as an important species in the study of sexual selection, development, and physiology, among other topics. The fish family Syngnathidae, which includes pipefishes, seahorses, and seadragons, has become an increasingly attractive target for comparative research in ecological and evolutionary genomics. These endeavors depend on having a high-quality genome assembly and annotation. However, the first version of the S. scovelli genome assembly was generated by short-read sequencing and annotated using a small set of RNA-sequence data, resulting in limited contiguity and a relatively poor annotation. Here, we present an improved genome assembly and an enhanced annotation, resulting in a new official gene set for S. scovelli. By using PacBio long-read high-fidelity (Hi-Fi) sequences and a proximity ligation (Hi-C) library, we fill small gaps and join the contigs to obtain 22 chromosome-level scaffolds. Compared to the previously published genome, the gaps in our novel genome assembly are smaller, the N75 is much larger (13.3 Mb), and this new genome is around 95% BUSCO complete. The precision of the gene models in the NCBI’s eukaryotic annotation pipeline was enhanced by using a large body of RNA-Seq reads from different tissue types, leading to the discovery of 28,162 genes, of which 8,061 were non-coding genes. This new genome assembly and the annotation are tagged as a RefSeq genome by NCBI and thus provide substantially enhanced genomic resources for future research involving S. scovelli.
Title: Improvements to the Gulf pipefish Syngnathus scovelli genome
Journal: GigaByte (Hong Kong, China), volume 2023
DOI URL: https://doi.org/10.1101/2023.01.23.525209
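The pipefish abstract above quotes an N75 value without defining it. The sketch below shows how an Nx statistic is commonly computed, assuming the usual definition (the largest length L such that sequences of length at least L cover x percent of the assembly). The function name `nx` and the toy scaffold lengths are illustrative assumptions, not values or code from the paper.

```python
def nx(lengths, x):
    """Return the Nx statistic: the largest length L such that sequences
    of length >= L together cover at least x percent of the assembly."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    # Only reachable through floating-point round-off; fall back to the shortest.
    return ordered[-1]


# Hypothetical scaffold lengths in Mb (illustrative only, not from the paper).
toy_scaffolds = [30, 25, 20, 15, 10]
print(nx(toy_scaffolds, 50))  # prints 25
print(nx(toy_scaffolds, 75))  # prints 20
```

Because N75 requires covering a larger share of the assembly than N50, it can never exceed N50 for the same set of sequences.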
Mosquitoes play a crucial role as primary vectors for various infectious diseases in Thailand. Therefore, accurate distribution information is vital for effectively combating and better controlling mosquito-borne diseases. Here, we present a curated dataset of the mosquito distribution in Thailand comprising 12,278 records of at least 117 mosquito species (Diptera: Culicidae). The main genera included in the dataset are Aedes, Anopheles, Armigeres, Culex, and Mansonia. From 2007 to 2023, data were collected through routine mosquito surveillance and research projects from 1,725 locations across 66 (out of 77) Thai provinces. The majority of the data were extracted from a Thai database of the Thailand Malaria Elimination Program. To facilitate broader access to mosquito-related data and support further exploration of the Thai mosquito fauna, the data were translated into English. Our dataset has been published in the Global Biodiversity Information Facility, making it available for researchers worldwide.
Title: Distribution of mosquitoes (Diptera: Culicidae) in Thailand: a dataset.
Authors: Chutipong Sukkanon, Wannapa Suwonkerd, Kanutcharee Thanispong, Manop Saeung, Pairpailin Jhaiaun, Suntorn Pimnon, Kanaphot Thongkhao, Sylvie Manguin, Theeraphap Chareonviriyaphap
Pub Date: 2023-01-01; DOI: 10.46471/gigabyte.90
Journal: GigaByte (Hong Kong, China), volume 2023, pages gigabyte90
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10498097/pdf/
Pub Date: 2022-12-02; eCollection Date: 2022-01-01; DOI: 10.46471/gigabyte.74
Lodovico Terzi di Bergamo, Francesca Guidetti, Davide Rossi, Francesco Bertoni, Luciano Cascione
Extraction-free HTG EdgeSeq protocols are used to profile sets of genes and measure their expression. Thus, these protocols are frequently used to characterise tumours and their microenvironments. However, although positive and negative control genes are provided, little indication is given concerning the assessment of the technical success of each sample within the sequencing run. We developed HTGQC, an R package for the quality control of HTG EdgeSeq protocols. Additionally, shinyHTGQC is a shiny application for users without computing knowledge, providing an easy-to-use interface for data quality control and visualisation. Quality checks can be performed on the raw sequencing outputs, and samples are flagged as FAIL or ALERT based on the expression levels of the positive and negative control genes.
Availability & implementation: The code is freely available at https://github.com/LodovicoTerzi/HTGQC (R package) and https://lodovico.shinyapps.io/shinyHTGQC/ (shiny application), including test datasets.
Title: HTGQC and shinyHTGQC: an R package and shinyR application for quality controls of HTG EDGE-seq protocols.
Journal: GigaByte (Hong Kong, China), volume 2022, pages gigabyte74
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10027062/pdf/
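The HTGQC abstract above describes flagging samples as FAIL or ALERT from the expression of positive and negative control genes. The sketch below illustrates that general idea only. It is not the HTGQC package's rule set or API: the function `flag_sample`, the ratio-based criterion, both thresholds, the probe names, and the "PASS" label are all hypothetical assumptions chosen for illustration.

```python
def flag_sample(counts, positive, negative, fail_ratio=2.0, alert_ratio=10.0):
    """Classify one sample from its control-probe counts.

    counts   : dict mapping probe name -> raw read count
    positive : names of positive-control probes (expected high)
    negative : names of negative-control probes (expected near zero)
    The thresholds are hypothetical, not taken from HTGQC.
    """
    pos_mean = sum(counts[g] for g in positive) / len(positive)
    neg_mean = sum(counts[g] for g in negative) / len(negative)
    # Add 1 to both means so an all-zero negative control cannot divide by zero.
    ratio = (pos_mean + 1) / (neg_mean + 1)
    if ratio < fail_ratio:
        return "FAIL"
    if ratio < alert_ratio:
        return "ALERT"
    return "PASS"


# Hypothetical probe names and counts.
good = {"POS1": 900, "POS2": 1100, "NEG1": 9, "NEG2": 11, "GENE_A": 250}
weak = {"POS1": 50, "POS2": 70, "NEG1": 9, "NEG2": 11, "GENE_A": 40}
bad = {"POS1": 10, "POS2": 12, "NEG1": 9, "NEG2": 11, "GENE_A": 8}

for name, sample in [("good", good), ("weak", weak), ("bad", bad)]:
    print(name, flag_sample(sample, ["POS1", "POS2"], ["NEG1", "NEG2"]))
```

A signal-to-background ratio is used here because it yields a single number to threshold. A real tool may instead check each control class separately, as the abstract's wording suggests.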
Pub Date: 2022-11-30; eCollection Date: 2022-01-01; DOI: 10.46471/gigabyte.72
Daniel Bergman, Lauren Marazzi, Mukti Chowkwale, Deepa Maheshvare M, Supriya Bidanta, Tarunendu Mapder, Jialun Li
Pharmacokinetics and pharmacodynamics (PKPD) are key considerations in any study of molecular therapies. It is thus imperative to factor their effects into any in silico model of biological tissue involving such therapies. Furthermore, creating a standardized and flexible framework will benefit the community by increasing access to such modules and enhancing their communicability. PhysiCell is an open-source physics-based cell simulator, i.e., a platform for modeling biological tissue, that is quickly being adopted and utilized by the mathematical biology community. We present here PhysiPKPD, an open-source PhysiCell-based package that allows users to include PKPD in PhysiCell models.
Availability & implementation: The source code for PhysiPKPD is located here: https://github.com/drbergman/PhysiPKPD.
Title: PhysiPKPD: A pharmacokinetics and pharmacodynamics module for PhysiCell.
Journal: GigaByte (Hong Kong, China), volume 2022, pages gigabyte72
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10027063/pdf/
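For readers unfamiliar with the PK and PD terms used in the PhysiPKPD abstract above, the sketch below pairs a textbook one-compartment bolus model (pharmacokinetics) with a Hill-type effect curve (pharmacodynamics). It is a generic illustration under those stated assumptions, not PhysiPKPD's implementation or API. All function names and parameter values are hypothetical.

```python
import math


def concentration(dose, volume, k_elim, t):
    """One-compartment bolus PK: C(t) = (dose / volume) * exp(-k_elim * t)."""
    return dose / volume * math.exp(-k_elim * t)


def effect(conc, e_max, ec50, hill=1.0):
    """Hill-type PD: E = e_max * C^h / (EC50^h + C^h)."""
    return e_max * conc**hill / (ec50**hill + conc**hill)


# Hypothetical drug: 100 units into 10 volume units, half-life of 2 time units.
k = math.log(2) / 2
for t in (0, 2, 4):
    c = concentration(100, 10, k, t)
    print(f"t={t}: C={c:.2f}, effect={effect(c, e_max=1.0, ec50=5.0):.2f}")
```

With these numbers the concentration halves every two time units, and the effect equals half of `e_max` exactly when the concentration reaches `ec50`.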
Pub Date: 2022-11-22; eCollection Date: 2022-01-01; DOI: 10.46471/gigabyte.73
Audrey J Majeske, Alejandro J Mercado Capote, Aleksey Komissarov, Anna Bogdanova, Nikolaos V Schizas, Stephanie O Castro Márquez, Kenneth Hilkert, Walter Wolfsberger, Tarás K Oleksyk
The mitochondrial genome of the long-spined black sea urchin, Diadema antillarum, was sequenced using Illumina next-generation sequencing technology. The complete mitogenome is 15,708 bp in length, containing two rRNA, 22 tRNA and 13 protein-coding genes, plus a noncoding control region of 133 bp. The nucleotide composition is 18.37% G, 23.79% C, 26.84% A and 30.99% T. The A + T bias is 57.84%. Phylogenetic analysis based on 12 complete mitochondrial genomes of sea urchins, including four species of the family Diadematidae, supported familial monophyly; however, the two Diadema species, D. antillarum and D. setosum, were not recovered as sister taxa.
Title: The first complete mitochondrial genome of Diadema antillarum (Diadematoida, Diadematidae).
Journal: GigaByte (Hong Kong, China), volume 2022, pages gigabyte73
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9693923/pdf/
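The base composition quoted in the Diadema antillarum abstract above can be checked with a few lines of arithmetic. The sketch below uses only the four rounded percentages from the abstract. The function name `at_content` is an illustrative assumption.

```python
def at_content(composition):
    """Sum the A and T percentages of a base-composition mapping."""
    return composition["A"] + composition["T"]


# Rounded percentages as quoted in the abstract.
composition = {"G": 18.37, "C": 23.79, "A": 26.84, "T": 30.99}

print(f"A+T = {at_content(composition):.2f}%")          # prints A+T = 57.83%
print(f"total = {sum(composition.values()):.2f}%")      # prints total = 99.99%
```

The rounded values sum to 57.83% A + T and 99.99% overall, so the abstract's 57.84% is consistent with having been computed from unrounded base counts.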
Pub Date: 2022-10-06; eCollection Date: 2022-01-01; DOI: 10.46471/gigabyte.71
Nataly Allasi Canales, Oscar A Pérez-Escobar, Robyn F Powell, Mats Töpel, Catherine Kidner, Mark Nesbitt, Carla Maldonado, Christopher J Barnes, Nina Rønsted, Natalia A S Przelomska, Ilia J Leitch, Alexandre Antonelli
The Andean fever tree (Cinchona L.; Rubiaceae) is a source of bioactive quinine alkaloids used to treat malaria. C. pubescens Vahl is a valuable cash crop within its native range in northwestern South America; however, genomic resources are lacking. Here we provide the first highly contiguous and annotated nuclear and plastid genome assemblies using Oxford Nanopore PromethION-derived long-read and Illumina short-read data. Our nuclear genome assembly comprises 603 scaffolds with a total length of 904 Mbp (∼82% of the full genome based on a genome size of 1.1 Gbp/1C). Using a combination of de novo and reference-based transcriptome assemblies, we annotated 72,305 coding sequences comprising 83% of the BUSCO gene set and 4.6% fragmented sequences. Using additional plastid and nuclear datasets, we place C. pubescens in the Gentianales order. This first genomic resource for C. pubescens opens new research avenues, including the analysis of alkaloid biosynthesis in the fever tree.
Title: A highly contiguous, scaffold-level nuclear genome assembly for the fever tree (Cinchona pubescens Vahl) as a novel resource for Rubiaceae research.
Journal: GigaByte (Hong Kong, China), volume 2022, pages gigabyte71
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10027117/pdf/
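The "∼82% of the full genome" figure in the fever tree abstract above follows directly from the two sizes it quotes. The sketch below reproduces that arithmetic. The function name `assembled_fraction` is an illustrative assumption, and the inputs are the abstract's own numbers.

```python
def assembled_fraction(assembly_mbp, genome_size_mbp):
    """Fraction of the estimated genome size captured by the assembly."""
    if genome_size_mbp <= 0:
        raise ValueError("genome size must be positive")
    return assembly_mbp / genome_size_mbp


# From the abstract: 904 Mbp assembled, 1.1 Gbp (1,100 Mbp) estimated 1C size.
fraction = assembled_fraction(904, 1100)
print(f"{fraction:.1%}")  # prints 82.2%
```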
Pub Date: 2022-10-05; eCollection Date: 2022-01-01; DOI: 10.46471/gigabyte.70
Ruining Dong, Daniel Cameron, Justin Bedo, Anthony T Papenfuss
Nuclear integration of mitochondrial genomes and retrocopied transcript insertion are biologically important but often-overlooked aspects of structural variant (SV) annotation. While tools for their detection exist, these typically rely on reanalysis of primary data using specialised detectors rather than leveraging calls from general purpose structural variant callers. Such reanalysis potentially leads to additional computational expense and does not take advantage of advances in general purpose structural variant calling. Here, we present svaRetro and svaNUMT, R packages that provide functions for annotating novel genomic events, such as nonreference retrocopied transcripts and nuclear integration of mitochondrial DNA. The packages were developed to work within the Bioconductor framework. We evaluate the performance of these packages to detect events using simulations and public benchmarking datasets, and annotate processed transcripts in a public structural variant database. svaRetro and svaNUMT provide modular, SV-caller agnostic tools for downstream annotation of structural variant calls.
Title: svaRetro and svaNUMT: modular packages for annotating retrotransposed transcripts and nuclear integration of mitochondrial DNA in genome sequencing data.
Journal: GigaByte (Hong Kong, China), volume 2022, pages gigabyte70
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9694029/pdf/
Pub Date: 2022-09-19; eCollection Date: 2022-01-01; DOI: 10.46471/gigabyte.69
Awais Khan, Sarah B Carey, Alicia Serrano, Huiting Zhang, Heidi Hargarten, Haley Hale, Alex Harkess, Loren Honaas
The apple cultivar 'Honeycrisp' has superior fruit quality traits, cold hardiness, and disease resistance, making it a popular breeding parent. However, it suffers from several physiological disorders as well as production and postharvest issues. Despite several available apple genome sequences, understanding of the genetic mechanisms underlying cultivar-specific traits remains lacking. Here, we present a highly contiguous, fully phased, chromosome-level genome of 'Honeycrisp' apples, using PacBio HiFi, Omni-C, and Illumina sequencing platforms, with two assembled haplomes of 674 Mbp and 660 Mbp, and contig N50 values of 32.8 Mbp and 31.6 Mbp, respectively. Overall, 47,563 and 48,655 protein-coding genes were annotated from each haplome, capturing 96.8-97.4% complete BUSCOs in the eudicot database. Gene family analysis reveals most 'Honeycrisp' genes are assigned into orthogroups shared with other genomes, with 121 'Honeycrisp'-specific orthogroups. This resource is valuable for understanding the genetic basis of important traits in apples and related Rosaceae species to enhance breeding efforts.
Title: A phased, chromosome-scale genome of 'Honeycrisp' apple (Malus domestica).
Journal: GigaByte (Hong Kong, China), volume 2022, pages gigabyte69
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9693968/pdf/
Pub Date: 2022-09-15; DOI: 10.1101/2022.09.12.507681
Daniel R. Bergman, Lauren Marazzi, Mukti Chowkwale, Deepa Maheshvare M, Supriya Bidanta, T. Mapder, Jialun Li
Pharmacokinetics and pharmacodynamics (PKPD) are key considerations in any study of molecular therapies. It is thus imperative to factor their effects into any in silico model of biological tissue involving such therapies. Furthermore, creation of a standardized and flexible framework will benefit the community by increasing access to such modules and enhancing their communicability. PhysiCell is an open-source physics-based cell simulator, i.e., a platform for modeling biological tissue, that is quickly being adopted and utilized by the mathematical biology community. We present here PhysiPKPD, an open-source PhysiCell-based package that allows users to include PKPD in PhysiCell models.
Availability & implementation: The source code for PhysiPKPD is located here: https://github.com/drbergman/PhysiPKPD.
Title: PhysiPKPD: A pharmacokinetics and pharmacodynamics module for PhysiCell
Journal: GigaByte (Hong Kong, China), volume 2022
DOI URL: https://doi.org/10.1101/2022.09.12.507681