
GigaScience: Latest Articles

Gapless genome assembly and epigenetic profiles reveal gene regulation of whole-genome triplication in lettuce.
IF 3.5 Zone 2 (Biology) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-01-02 DOI: 10.1093/gigascience/giae043
Shuai Cao, Nunchanoke Sawettalake, Lisha Shen

Background: Lettuce, an important member of the Asteraceae family, is a globally cultivated cash vegetable crop. With a highly complex genome (∼2.5 Gb; 2n = 18) rich in repeat sequences, current lettuce reference genomes exhibit thousands of gaps, impeding a comprehensive understanding of the lettuce genome.

Findings: Here, we present a near-complete gapless reference genome for cutting lettuce with high transformability, using long-read PacBio HiFi and Nanopore sequencing data. In comparison to the stem lettuce genome, we identify 127,681 structural variations (SVs, present in 0.41 Gb of sequence), reflecting the divergence of leafy and stem lettuce. Interestingly, these SVs are related to transposons and DNA methylation states. Furthermore, we identify 4,612 whole-genome triplication genes exhibiting high expression levels associated with low DNA methylation levels and high N6-methyladenosine RNA modifications. DNA methylation changes are also associated with activation of genes involved in callus formation.

Conclusions: Our gapless lettuce genome assembly, an unprecedented achievement in the Asteraceae family, establishes a solid foundation for functional genomics, epigenomics, and crop breeding and sheds new light on understanding the complexity of gene regulation associated with the dynamics of DNA and RNA epigenetics in genome evolution.

Citations: 0
PEPhub: a database, web interface, and API for editing, sharing, and validating biological sample metadata.
IF 3.5 Zone 2 (Biology) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-01-02 DOI: 10.1093/gigascience/giae033
Nathan J LeRoy, Oleksandr Khoroshevskyi, Aaron O'Brien, Rafał Stępień, Alip Arslan, Nathan C Sheffield

Background: As biological data increase, we need additional infrastructure to share them and promote interoperability. While major effort has been put into sharing data, relatively less emphasis is placed on sharing metadata. Yet, sharing metadata is also important and in some ways has a wider scope than sharing data themselves.

Results: Here, we present PEPhub, an approach to improve sharing and interoperability of biological metadata. PEPhub provides an API, natural-language search, and user-friendly web-based sharing and editing of sample metadata tables. We used PEPhub to process more than 100,000 published biological research projects and index them with fast semantic natural-language search. PEPhub thus provides a fast and user-friendly way to find existing biological research data or to share new data.

Availability: https://pephub.databio.org.

Citations: 0
Innovative approach for high-throughput exploiting sex-specific markers in Japanese parrotfish Oplegnathus fasciatus.
IF 3.5 Zone 2 (Biology) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-01-02 DOI: 10.1093/gigascience/giae045
Yongshuang Xiao, Zhizhong Xiao, Lin Liu, Yuting Ma, Haixia Zhao, Yanduo Wu, Jinwei Huang, Pingrui Xu, Jing Liu, Jun Li

Background: The use of sex-specific molecular markers has become a prominent method in enhancing fish production and economic value, as well as providing a foundation for understanding the complex molecular mechanisms involved in fish sex determination. Over the past decades, research on male and female sex identification has predominantly employed molecular biology methodologies such as restriction fragment length polymorphism, random amplification of polymorphic DNA, simple sequence repeat, and amplified fragment length polymorphism. The emergence of high-throughput sequencing technologies, particularly Illumina, has led to the utilization of single nucleotide polymorphism and insertion/deletion variants as significant molecular markers for investigating sex identification in fish. The advancement of sex-controlled breeding encounters numerous challenges, including the inefficiency of current methods, intricate experimental protocols, high costs of development, elevated rates of false positives, marker instability, and cumbersome field-testing procedures. Nevertheless, the emergence and swift progress of PacBio high-throughput sequencing technology, characterized by its long-read output capabilities, offers novel opportunities to overcome these obstacles.

Findings: Utilizing male/female assembled genome information in conjunction with short-read sequencing data survey and long-read PacBio sequencing data, a catalog of large-segment (>100 bp) insertion/deletion genetic variants was generated through a genome-wide variant site-scanning approach with bidirectional comparisons. The sequence tagging sites were ranked based on the long-read depth of the insertion/deletion site, with markers exhibiting lower long-read depth being considered more effective for large-segment deletion variants. Subsequently, a catalog of bulk primers and simulated PCR for the male/female variant loci was developed, incorporating primer design for the target region and electronic PCR (e-PCR) technology. The Japanese parrotfish (Oplegnathus fasciatus), belonging to the Oplegnathidae family within the Centrarchiformes order, holds significant economic value as a rocky reef fish indigenous to East Asia. The criteria for rapid identification of male and female differences in Japanese parrotfish were established through agarose gel electrophoresis, which revealed 2 amplified bands for males and 1 amplified band for females. A high-throughput identification catalog of sex-specific markers was then constructed using this method, resulting in the identification of 3,639 (2,786 INS/853 DEL, ♀ as reference) and 3,672 (2,876 INS/833 DEL, ♂ as reference) markers in conjunction with 1,021 and 894 high-quality genetic sex identification markers, respectively. Sixteen differential loci were randomly chosen from the catalog for validation, with 11 of them meeting the criteria for male/female distinctions. The implementation of cost-effective and efficient technological processes would facilitate the rapid advancement of genetic breeding by expediting the high-throughput development of sex genetic markers for various species.

Conclusions: Our study utilized assembled genome information from male and female individuals obtained from PacBio, in addition to data from short-read sequencing data survey and long-read PacBio sequencing data. We extensively employed genome-wide variant site scanning and identification, high-throughput primer design of target regions, and e-PCR batch amplification, along with statistical analysis and ranking of the long-read depth of the variant sites. Through this integrated approach, we successfully compiled a catalog of large insertion/deletion sites (>100 bp) in both male and female Japanese parrotfish.
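
The simulated PCR step described in the parrotfish abstract can be illustrated with a small sketch. This is not the authors' pipeline: the primer and allele sequences are made up, the helper names (`amplicon_length`, `bands`) are hypothetical, primers must match exactly (real e-PCR tools tolerate mismatches), and the "one band versus two bands" outcome assumes that one sex carries both a short allele and an allele with a large (>100 bp) insertion.

```python
def revcomp(seq):
    """Reverse complement of a DNA string made of A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template, fwd, rev):
    """Product length for an exact-match primer pair, or None if a primer fails to bind."""
    start = template.find(fwd)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the given
    # strand we look for its reverse complement downstream of the forward primer.
    end = template.find(revcomp(rev), start + len(fwd))
    if end == -1:
        return None
    return end + len(rev) - start


def bands(alleles, fwd, rev):
    """Distinct product sizes expected on a gel for one individual's alleles."""
    sizes = {amplicon_length(a, fwd, rev) for a in alleles}
    return sorted(s for s in sizes if s is not None)


# Toy data (hypothetical): a 120 bp insertion separates the two alleles.
FWD = "GATTACAGGC"
REV = "TTGACCGTAA"
short_allele = FWD + "A" * 20 + revcomp(REV)
long_allele = FWD + "A" * 20 + "C" * 120 + revcomp(REV)

print(bands([short_allele, short_allele], FWD, REV))  # one band
print(bands([short_allele, long_allele], FWD, REV))   # two bands
```

Counting distinct product sizes per individual mirrors the gel-based criterion in the abstract; which sex shows two bands at a given locus depends on the reference used and is not encoded here.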
Citations: 0
A high-quality chromosomal genome assembly of the sea cucumber Chiridota heheva and its hydrothermal adaptation.
IF 3.5 Zone 2 (Biology) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-01-02 DOI: 10.1093/gigascience/giad107
Yujin Pu, Yang Zhou, Jun Liu, Haibin Zhang

Background: Chiridota heheva is a cosmopolitan holothurian well adapted to diverse deep-sea ecosystems, especially chemosynthetic environments. Besides high hydrostatic pressure and limited light, high concentrations of metal ions also represent harsh conditions in hydrothermal environments. Few holothurian species can live in such extreme conditions. Therefore, it is valuable to elucidate the adaptive genetic mechanisms of C. heheva in hydrothermal environments.

Findings: Herein, we report a high-quality reference genome assembly of C. heheva from the Kairei vent, which is the first chromosome-level genome of Apodida. The chromosome-level genome size was 1.43 Gb, with a scaffold N50 of 53.24 Mb and BUSCO completeness score of 94.5%. Contig sequences were clustered, ordered, and assembled into 19 natural chromosomes. Comparative genome analysis found that the expanded gene families and positively selected genes of C. heheva were involved in the DNA damage repair process. The expanded gene families and the unique genes contributed to maintaining iron homeostasis in an iron-enriched environment. The positively selected gene RFC2 with 10 positively selected sites played an essential role in DNA repair under extreme environments.

Conclusions: This first chromosome-level genome assembly of C. heheva reveals the hydrothermal adaptation of holothurians. As the first chromosome-level genome of order Apodida, this genome will provide a resource for investigating the evolution of class Holothuroidea.
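
For readers unfamiliar with the scaffold N50 quoted in the abstract above (53.24 Mb), the statistic can be sketched as follows: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name and the toy lengths are illustrative only and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that pushes the running total to at least
        # half of the assembly is the N50.
        if running >= half_total:
            return length


# Toy scaffold lengths (arbitrary units, hypothetical values).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 70: 80 + 70 = 150 reaches half of the 300 total
```

A higher N50 means that more of the assembly sits in long scaffolds, which is why it is commonly reported alongside completeness scores such as BUSCO.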

Citations: 0
A graph clustering algorithm for detection and genotyping of structural variants from long reads.
IF 3.5 Zone 2 (Biology) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-01-02 DOI: 10.1093/gigascience/giad112
Nicolás Gaitán, Jorge Duitama

Background: Structural variants (SVs) are genomic polymorphisms defined by their length (>50 bp). The usual types of SVs are deletions, insertions, translocations, inversions, and copy number variants. SV detection and genotyping is fundamental given the role of SVs in phenomena such as phenotypic variation and evolutionary events. Thus, methods to identify SVs using long-read sequencing data have been recently developed.

Findings: We present an accurate and efficient algorithm to predict germline SVs from long-read sequencing data. The algorithm starts by collecting evidence (signatures) of SVs from read alignments. Then, signatures are clustered based on a Euclidean graph with coordinates calculated from lengths and genomic positions. Clustering is performed by the DBSCAN algorithm, which provides the advantage of delimiting clusters with high resolution. Clusters are transformed into SVs, and a Bayesian model allows precise genotyping of SVs based on their supporting evidence. This algorithm is integrated into the single sample variants detector of the Next Generation Sequencing Experience Platform, which facilitates the integration with other functionalities for genomics analysis. We performed multiple benchmark experiments, including simulation and real data, representing different genome profiles, sequencing technologies (PacBio HiFi, ONT), and read depths.

Conclusion: The results show that our approach outperformed state-of-the-art tools on germline SV calling and genotyping, especially at low depths, and in error-prone repetitive regions. We believe this work significantly contributes to the development of bioinformatic strategies to maximize the use of long-read sequencing technologies.
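
The clustering step described in the abstract above can be illustrated with a minimal, self-contained DBSCAN. This is a sketch under stated assumptions, not the tool's implementation: each signature is represented directly as a hypothetical (position, length) point, the values of `eps` and `min_pts` are arbitrary, and the abstract does not say how coordinates are scaled or how the Bayesian genotyping works, so neither is modeled here.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbors(i):
        # Brute-force neighborhood query (includes the point itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbors)
    return labels


# Toy signatures (hypothetical): (genomic position, SV length in bp).
signatures = [
    (1000, 300), (1004, 310), (998, 295),  # ~300 bp event near position 1000
    (1002, 60), (1005, 62), (999, 58),     # ~60 bp event at the same locus
    (50000, 500),                          # isolated signature
]
labels = dbscan(signatures, eps=20, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

Because length is one of the coordinates, two events at the same locus with very different sizes end up in separate clusters, while an isolated signature is left as noise rather than being forced into a call.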

Citations: 0
MMV_Im2Im: an open-source microscopy machine vision toolbox for image-to-image transformation.
IF 3.5 Zone 2 (Biology) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-01-02 DOI: 10.1093/gigascience/giad120
Justin Sonneck, Yu Zhou, Jianxu Chen

Over the past decade, deep learning (DL) research in computer vision has been growing rapidly, with many advances in DL-based image analysis methods for biomedical problems. In this work, we introduce MMV_Im2Im, a new open-source Python package for image-to-image transformation in bioimaging applications. MMV_Im2Im is designed with a generic image-to-image transformation framework that can be used for a wide range of tasks, including semantic segmentation, instance segmentation, image restoration, image generation, and so on. Our implementation takes advantage of state-of-the-art machine learning engineering techniques, allowing researchers to focus on their research without worrying about engineering details. We demonstrate the effectiveness of MMV_Im2Im on more than 10 different biomedical problems, showcasing its general potential and applicability. For computational biomedical researchers, MMV_Im2Im provides a starting point for developing new biomedical image analysis or machine learning algorithms, where they can either reuse the code in this package or fork and extend this package to facilitate the development of new methods. Experimental biomedical researchers can benefit from this work by gaining a comprehensive view of the image-to-image transformation concept through diversified examples and use cases. We hope this work can inspire the community on how DL-based image-to-image transformation can be integrated into the assay development process, enabling new biomedical studies that cannot be done with traditional experimental assays alone. To help researchers get started, we have provided source code, documentation, and tutorials for MMV_Im2Im at [https://github.com/MMV-Lab/mmv_im2im] under the MIT license.

Citations: 0
Celebrating 30 years of access to NASA Space Life Sciences data.
IF 9.2 Zone 2 (Biology) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-01-02 DOI: 10.1093/gigascience/giae066
Lauren M Sanders, Danielle K Lopez, Alan E Wood, Ryan T Scott, Samrawit G Gebre, Amanda M Saravia-Butler, Sylvain V Costes
NASA's space life sciences research programs established a decades-long legacy of enhancing our ability to safely explore the cosmos. From Skylab and the Space Shuttle Program to the NASA Balloon Program and the International Space Station National Lab, these programs generated priceless data that continue to paint a vibrant picture of life in space. These data are available to the scientific community in various data repositories, including the NASA Ames Life Sciences Data Archive (ALSDA) and NASA GeneLab. Here we recognize the 30-year anniversary of data access through ALSDA and the 10-year anniversary of GeneLab.
Citations: 0
The first high-altitude autotetraploid haplotype-resolved genome assembled (Rhododendron nivale subsp. boreale) provides new insights into mountaintop adaptation.
IF 3.5, Zone 2 (Biology), Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae052
Zhen-Yu Lyu, Xiong-Li Zhou, Si-Qi Wang, Gao-Ming Yang, Wen-Guang Sun, Jie-Yu Zhang, Rui Zhang, Shi-Kang Shen

Background: Rhododendron nivale subsp. boreale Philipson et M. N. Philipson is an alpine woody species with ornamental qualities that serves as the predominant species in mountainous scrub habitats found at an altitude of ∼4,200 m. As a high-altitude woody polyploid, this species may serve as a model to understand how plants adapt to alpine environments. Despite its ecological significance, the lack of genomic resources has hindered a comprehensive understanding of its evolutionary and adaptive characteristics in high-altitude mountainous environments.

Findings: We sequenced and assembled the genome of R. nivale subsp. boreale, the first assembly for subgenus Rhododendron and the first for a high-altitude woody flowering tetraploid, contributing an important genomic resource for alpine woody flora. The assembly included 52 pseudochromosomes (scaffold N50 = 42.93 Mb; BUSCO = 98.8%; QV = 45.51; S-AQI = 98.69), which belonged to 4 haplotypes and harbored 127,810 predicted protein-coding genes. Conjoint k-mer analysis, collinearity assessment, and phylogenetic investigation corroborated its autotetraploid identity. Comparative genomic analysis revealed that R. nivale subsp. boreale originated as a neopolyploid of R. nivale and underwent 2 rounds of ancient polyploidy events. Transcriptional expression analysis showed that differences in expression between alleles were common and randomly distributed in the genome. We identified expanded gene families and signatures of positive selection that are involved not only in adaptation to the mountaintop ecosystem (response to stress and developmental regulation) but also in autotetraploid reproduction (meiotic stabilization). Additionally, the expression levels of the group VII ethylene response factor transcription factors (ERF VIIs) were significantly higher than the mean global gene expression. We suspect that these changes have enabled the success of this species at high altitudes.

Conclusions: We assembled the first high-altitude autopolyploid genome and achieved chromosome-level assembly within the subgenus Rhododendron. In addition, we propose a plausible high-altitude adaptation strategy for R. nivale subsp. boreale. This study provides valuable data for the exploration of alpine mountaintop adaptations and the correlation between extreme environments and species polyploidization.

Citations: 0
LRTK: a platform agnostic toolkit for linked-read analysis of both human genome and metagenome.
IF 3.5, Zone 2 (Biology), Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae028
Chao Yang, Zhenmiao Zhang, Yufen Huang, Xuefeng Xie, Herui Liao, Jin Xiao, Werner Pieter Veldsman, Kejing Yin, Xiaodong Fang, Lu Zhang

Background: Linked-read sequencing technologies generate short reads with high base quality that carry additional information on long-range DNA connectedness. These advantages of linked-read technologies are well known and have been demonstrated in many human genomic and metagenomic studies. However, existing linked-read analysis pipelines (e.g., Long Ranger) were primarily developed to process sequencing data from the human genome and are not suited for analyzing metagenomic sequencing data. Moreover, linked-read analysis pipelines are typically limited to 1 specific sequencing platform.

Findings: To address these limitations, we present the Linked-Read ToolKit (LRTK), a unified and versatile toolkit for platform-agnostic processing of linked-read sequencing data from both human genome and metagenome. LRTK provides functions to perform linked-read simulation, barcode sequencing error correction, barcode-aware read alignment and metagenome assembly, reconstruction of long DNA fragments, taxonomic classification and quantification, and barcode-assisted genomic variant calling and phasing. LRTK can process multiple samples automatically and gives users the option to generate reproducible reports during processing of raw sequencing data and at multiple checkpoints throughout downstream analysis. We applied LRTK to linked reads from simulation, mock community, and real datasets for both human genome and metagenome. We showcased LRTK's ability to generate comparative performance results from preceding benchmark studies and to report these results in publication-ready HTML document plots.

Conclusions: LRTK provides comprehensive and flexible modules along with an easy-to-use Python-based workflow for processing linked-read sequencing datasets, thereby filling the current gap in the field caused by platform-centric genome-specific linked-read data analysis tools.

Citations: 0
A 3D printed plant model for accurate and reliable 3D plant phenotyping.
IF 3.5, Zone 2 (Biology), Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae035
Jonas Bömer, Felix Esser, Elias Marks, Radu Alexandru Rosu, Sven Behnke, Lasse Klingbeil, Heiner Kuhlmann, Cyrill Stachniss, Anne-Katrin Mahlein, Stefan Paulus

Background: This study addresses the importance of precise referencing in 3-dimensional (3D) plant phenotyping, which is crucial for advancing plant breeding and improving crop production. Traditionally, reference data in plant phenotyping rely on invasive methods. Recent advancements in 3D sensing technologies offer the possibility to collect parameters that cannot be referenced by manual measurements. This work focuses on evaluating a 3D printed sugar beet plant model as a referencing tool.

Results: Fused deposition modeling has turned out to be a suitable 3D printing technique for creating reference objects in 3D plant phenotyping. Production deviations of the created reference model were in a low and acceptable range, from -10 mm to +5 mm. In parallel, we demonstrated high dimensional stability of the reference model, with only ±4 mm deformation over the course of 1 year. Detailed print files, assembly descriptions, and benchmark parameters are provided, facilitating replication and benefiting the research community.

Conclusion: Consumer-grade 3D printing was utilized to create a stable and reproducible 3D reference model of a sugar beet plant, addressing challenges in referencing morphological parameters in 3D plant phenotyping. The reference model is applicable in 3 demonstrated use cases: evaluating and comparing 3D sensor systems, investigating the potential accuracy of parameter extraction algorithms, and continuously monitoring these algorithms in practical greenhouse and field experiments. Using this approach, it is possible to monitor the extraction of a nonverifiable parameter and create reference data. The process serves as a model for developing reference models for other agricultural crops.

Citations: 0