Tao Zhou, Guoqing Bai, Yiheng Hu, Markus Ruhsam, Yanci Yang, Yuemei Zhao
Gentiana macrophylla is a perennial herb in the Gentianaceae family whose dried roots are used in traditional Chinese medicine. Here, we assembled a chromosome-level genome of G. macrophylla using a combination of Nanopore and Illumina sequencing and Hi-C scaffolding. The final genome size was ~1.79 Gb (contig N50 = 720.804 kb), and 98.89% of the genome sequence was anchored on 13 pseudochromosomes (scaffold N50 = 122.73 Mb). The genome contained 55,337 protein-coding genes, and repetitive sequences made up 73.47% of the assembly. Genome evolution analysis indicated that G. macrophylla underwent two rounds of whole-genome duplication after the core eudicot γ genome triplication event. We further identified candidate genes related to the biosynthesis of iridoids, and the corresponding gene families were mostly expanded in G. macrophylla. In addition, we found that root-specific genes are enriched in pathways involved in defense responses, which may greatly improve the biological adaptability of G. macrophylla. Phylogenomic analyses showed a sister relationship between asterids and rosids, and all Gentianales species formed a monophyletic group. Our study contributes to the understanding of genome evolution and active component biosynthesis in G. macrophylla and provides an important genomic resource for the genetic improvement and breeding of G. macrophylla.
{"title":"De novo genome assembly of the medicinal plant Gentiana macrophylla provides insights into the genomic evolution and biosynthesis of iridoids.","authors":"Tao Zhou, Guoqing Bai, Yiheng Hu, Markus Ruhsam, Yanci Yang, Yuemei Zhao","doi":"10.1093/dnares/dsac034","DOIUrl":"https://doi.org/10.1093/dnares/dsac034","url":null,"PeriodicalId":51014,"journal":{"name":"DNA Research","volume":"29 6","pages":""},"PeriodicalIF":4.1,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/30/92/dsac034.PMC9724787.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10416325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
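The contig and scaffold N50 values quoted in the abstract above follow the usual definition: sort the sequences from longest to shortest and report the length of the sequence at which the running total first reaches half of the assembly size. The sketch below illustrates that definition in Python. The function name `n50` and the toy lengths are assumptions made for illustration; they are not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the sequence at which the cumulative sum,
    taken from longest to shortest, first reaches half of the total.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy contig lengths (arbitrary units), not real assembly data.
toy_contigs = [50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)
```

Because scaffolding joins contigs into longer sequences, applying the same function to scaffold lengths normally gives a larger value than applying it to contig lengths, which matches the pattern of the two N50 figures reported above.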
A high-quality genome assembly is imperative for exploring the evolutionary basis of the characteristic attributes that define a chemotype, and it provides essential resources for molecular breeding strategies aimed at enhanced production of medicinal metabolites. Here, using single-molecule high-fidelity (HiFi) sequencing reads, we report a chromosome-scale genome assembly for Chinese licorice (Glycyrrhiza uralensis), a widely used herbal and natural medicine. The entire genome was assembled into eight chromosomes, with contig and scaffold N50 values of 36.02 and 60.2 Mb, respectively. With only 17 assembly gaps, and half of the chromosomes having no or one assembly gap, the presented genome assembly is among the best plant genomes to date. Our results show the advantage of using highly accurate long-read HiFi sequencing data for assembling a highly heterozygous genome, including its complex repeat content. Additionally, our analysis revealed that G. uralensis experienced a recent whole-genome duplication approximately 59.02 million years ago, after the gamma (γ) whole-genome triplication event, which contributed to its present chemotype features. The metabolic gene cluster analysis identified 355 gene clusters, which included the entire biosynthesis pathway of glycyrrhizin. The genome assembly and its annotations provide an essential resource for licorice improvement through molecular breeding, for the discovery of valuable genes for engineering bioactive components, and for understanding the evolution of specialized metabolite biosynthesis.
{"title":"Chromosome-scale genome assembly of Glycyrrhiza uralensis revealed metabolic gene cluster centred specialized metabolites biosynthesis.","authors":"Amit Rai, Hideki Hirakawa, Megha Rai, Yohei Shimizu, Kenta Shirasawa, Shinji Kikuchi, Hikaru Seki, Mami Yamazaki, Atsushi Toyoda, Sachiko Isobe, Toshiya Muranaka, Kazuki Saito","doi":"10.1093/dnares/dsac043","DOIUrl":"https://doi.org/10.1093/dnares/dsac043","url":null,"PeriodicalId":51014,"journal":{"name":"DNA Research","volume":"29 6","pages":""},"PeriodicalIF":4.1,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9763095/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10422694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mingcheng Wang, Jianwei Huang, Song Liu, Xiaofeng Liu, Rui Li, Junjia Luo, Zhixi Fu
Sesame (Sesamum indicum L.) is an important oilseed crop that produces abundant seed oil and has a pleasant flavor and high nutritional value. To date, several Illumina-based genome assemblies corresponding to different sesame genotypes have been published and widely used in genetic and genomic studies of sesame. However, these assemblies consistently showed low continuity, with numerous gaps. Here, we report a high-quality, reference-level sesame genome assembly produced by integrating PacBio high-fidelity sequencing and Hi-C technology. Our updated sesame assembly was 309.35 Mb in size, with a chromosome anchoring rate (97.54%) and contig N50 size (13.48 Mb) that both exceed those of previously published genomes. We identified 163.38 Mb of repetitive elements and 24,345 high-confidence protein-coding genes in the updated sesame assembly. Comparative genomic analysis showed that sesame shared an ancient whole-genome duplication event with two Lamiales species. A total of 2,782 genes were tandemly duplicated. We also identified several genes that were likely involved in fatty acid and triacylglycerol biosynthesis. Our improved sesame assembly and annotation will facilitate future genetic studies and genomics-assisted breeding of sesame.
{"title":"Improved assembly and annotation of the sesame genome.","authors":"Mingcheng Wang, Jianwei Huang, Song Liu, Xiaofeng Liu, Rui Li, Junjia Luo, Zhixi Fu","doi":"10.1093/dnares/dsac041","DOIUrl":"10.1093/dnares/dsac041","url":null,"PeriodicalId":51014,"journal":{"name":"DNA Research","volume":"29 6","pages":""},"PeriodicalIF":3.9,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9724774/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10710794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bats (Chiroptera) constitute the second largest order of mammals and have several distinctive features, such as true self-powered flight and strong immunity. Pendlebury's roundleaf bat, Hipposideros pendleburyi, is endemic to Thailand and listed as a vulnerable species. We employed 10× Genomics linked-read technology to obtain a genome assembly of H. pendleburyi. The assembly size was 2.17 Gb, with a scaffold N50 length of 15,398,518 bases. Our phylogenetic analysis placed H. pendleburyi within the rhinolophoid clade of the suborder Yinpterochiroptera. A synteny analysis showed that H. pendleburyi shared conserved chromosome segments (up to 105 Mb) with Rhinolophus ferrumequinum and Phyllostomus discolor, despite having a different chromosome number and belonging to a different family. We found positive selection signals in genes involved in inflammation, spermatogenesis and Wnt signalling. The analyses of transposable elements suggested the contraction of short interspersed nuclear elements (SINEs) and the accumulation of young mariner DNA transposons in the analysed hipposiderids. Distinct mariners were likely horizontally transferred to hipposiderid genomes over the evolution of this family. The lineage-specific profiles of SINEs and mariners might be involved in the evolution of hipposiderids and be associated with the phylogenetic separation of these bats from other bat families.
{"title":"Genome assembly of the Pendlebury's roundleaf bat, Hipposideros pendleburyi, revealed the expansion of Tc1/Mariner DNA transposons in Rhinolophoidea.","authors":"Wanapinun Nawae, Chutima Sonthirod, Thippawan Yoocha, Pitchaporn Waiyamitra, Pipat Soisook, Sithichoke Tangphatsornruang, Wirulda Pootakham","doi":"10.1093/dnares/dsac026","DOIUrl":"10.1093/dnares/dsac026","url":null,"PeriodicalId":51014,"journal":{"name":"DNA Research","volume":"29 5","pages":""},"PeriodicalIF":3.9,"publicationDate":"2022-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/86/08/dsac026.PMC9549598.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33497015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Prunus humilis (2n = 2x = 16) is a dwarf shrub fruit tree native to China and widely distributed in the cold and arid northern region. In this study, we obtained the whole-genome sequence of P. humilis by combining Illumina, PacBio and Hi-C sequencing technologies. The genome was 254.38 Mb long and encodes 28,301 putative proteins. Phylogenetic analysis indicated that P. humilis shared a common ancestor with Prunus mume and Prunus armeniaca at ∼29.03 Mya. Gene expansion analysis implied that the expansion of WAX-related and LEA genes might be associated with the high drought tolerance of P. humilis, and that LTRs may be one of the driving factors of drought adaptation by increasing the copy number of LEA genes. Population diversity analysis of 20 P. humilis accessions found that the genetic diversity of P. humilis populations was limited, with only 1.40% of base pairs differing among accessions, so more wild resources need to be collected and utilized in breeding and improvement. This study provides new insights into the drought adaptation and population diversity of P. humilis, which could be used as a potential model plant for horticultural research.
{"title":"The genome of Prunus humilis provides new insights to drought adaption and population diversity.","authors":"Yi Wang, Jun Xie, Hongna Zhang, Weidong Li, Zhanjun Wang, Huayang Li, Qian Tong, Gaixia Qiao, Yujuan Liu, Ying Tian, Yongzan Wei, Ping Li, Rong Wang, Weiping Chen, Zhengchang Liang, Meilong Xu","doi":"10.1093/dnares/dsac021","DOIUrl":"https://doi.org/10.1093/dnares/dsac021","url":null,"PeriodicalId":51014,"journal":{"name":"DNA Research","volume":"29 4","pages":""},"PeriodicalIF":4.1,"publicationDate":"2022-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9278622/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10412596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ATP-binding cassette (ABC) proteins are the largest membrane transporter family in plants. In addition to transporting organic substances, these proteins function as ion channels and molecular switches. The diversification of genes encoding ABC proteins has been associated with their various biological roles. Plants utilize many secondary metabolites to adapt to environmental stresses and to communicate with other organisms, and many ABC proteins are thought to be involved in metabolite transport. Lithospermum erythrorhizon is regarded as a model plant for studying secondary metabolism, as its cultured cells yield high concentrations of meroterpenes and phenylpropanoids. Analysis of the genome and transcriptomes of L. erythrorhizon showed expression of genes encoding 118 ABC proteins, a number similar to that in other plant species. The number of expressed proteins in the half-size ABCA and full-size ABCB subfamilies was ca. 50% lower in L. erythrorhizon than in Arabidopsis, whereas there was no significant difference in the numbers of other expressed ABC proteins. Because many ABCG proteins are involved in the export of organic substances, members of this subfamily may play important roles in the transport of secondary metabolites that are secreted into apoplasts.
{"title":"Inventory of ATP-binding cassette proteins in Lithospermum erythrorhizon as a model plant producing divergent secondary metabolites.","authors":"Hao Li, Hinako Matsuda, Ai Tsuboyama, Ryosuke Munakata, Akifumi Sugiyama, Kazufumi Yazaki","doi":"10.1093/dnares/dsac016","DOIUrl":"https://doi.org/10.1093/dnares/dsac016","url":null,"PeriodicalId":51014,"journal":{"name":"DNA Research","volume":"29 3","pages":""},"PeriodicalIF":4.1,"publicationDate":"2022-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/6e/6a/dsac016.PMC9195045.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9186993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fig wasps have long been regarded as species-specific pollinators of their host figs (Moraceae, Ficus) and, together with their hosts, constitute a model system for studying co-evolution and co-speciation. The availability of a high-quality genome will help to further reveal the mechanisms underlying these characteristics. Here, we present a high-quality chromosome-level genome for Valisia javana generated from a combination of PacBio long reads and Illumina short reads. The assembled genome size is 296.34 Mb from 13 contigs with a contig N50 length of 26.76 Mb. Comparative genomic analysis revealed expanded and positively selected genes related to biological features that help fig wasps live in the syconia of their highly specific hosts. Protein-coding genes associated with chemosensation, detoxification and venom were identified. Several genes differentially expressed between odor-stimulated samples and controls in V. javana transcriptome data were identified in olfactory signal transduction pathways, e.g. olfactory transduction, cAMP, cGMP-PKG, calcium, Ras and Rap1. This study provides a valuable genomic resource for a fig wasp and offers insight into the mechanisms underlying the adaptation of fig wasps to their hosts in different places and their co-speciation with those hosts.
{"title":"A chromosome-level genome assembly of the pollinating fig wasp Valisia javana.","authors":"Lianfu Chen,Chao Feng,Rong Wang,Xiaojue Nong,Xiaoxia Deng,Xiaoyong Chen,Hui Yu","doi":"10.1093/dnares/dsac014","DOIUrl":"https://doi.org/10.1093/dnares/dsac014","url":null,"PeriodicalId":51014,"journal":{"name":"DNA Research","volume":"26 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2022-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138517796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weixiao Lei, Zefu Wang, Man Cao, Hui-Jun Zhu, Min Wang, Yi Zou, Yunchun Han, Dandan Wang, Zeyu Zheng, Ying Li, Bingbing Liu, Dafu Ru
Sophora japonica is a medium-sized deciduous tree belonging to the Leguminosae family and is known for its high ecological, economic and medicinal value. Here, we present a draft genome of S. japonica, which was ∼511.49 Mb long (contig N50 size of 17.34 Mb) based on Illumina, Nanopore and Hi-C data. We reliably assembled 110 contigs into 14 chromosomes, representing 91.62% of the total genome, with an improved N50 size of 31.32 Mb based on Hi-C data. Further investigation identified 271.76 Mb (53.13%) of repetitive sequences and 31,000 protein-coding genes, of which 30,721 (99.1%) were functionally annotated. Phylogenetic analysis indicates that S. japonica separated from Arabidopsis thaliana and Glycine max ∼107.53 and 61.24 million years ago, respectively. We detected evidence of species-specific and common-legume whole-genome duplication events in S. japonica. We further found that multiple TF families (e.g. BBX and PAL) have expanded in S. japonica, which might have led to its enhanced tolerance to abiotic stress. In addition, S. japonica harbours more genes involved in the lignin and cellulose biosynthesis pathways than the other two species. Finally, population genomic analyses revealed no obvious differentiation among geographical groups and showed that the effective population size has declined continuously since 2 Ma. Our genomic data provide a powerful comparative framework for studying adaptation, evolution and active ingredient biosynthesis in S. japonica. More importantly, our high-quality S. japonica genome is important for elucidating the biosynthesis of its main bioactive components and for improving its production and/or processing.
{"title":"Chromosome-level genome assembly and characterization of Sophora Japonica","authors":"Weixiao Lei, Zefu Wang, Man Cao, Hui-Jun Zhu, Min Wang, Yi Zou, Yunchun Han, Dandan Wang, Zeyu Zheng, Ying Li, Bingbing Liu, Dafu Ru","doi":"10.1093/dnares/dsac009","DOIUrl":"https://doi.org/10.1093/dnares/dsac009","url":null,"PeriodicalId":51014,"journal":{"name":"DNA Research","volume":"29 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2022-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"61096958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
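The percentages in the Sophora japonica abstract above can be recomputed from the absolute figures it reports. The short Python sketch below does that arithmetic. The helper name `percent` is an assumption for illustration, and the inputs are only the numbers quoted in the abstract.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return part / whole * 100


# Figures quoted in the abstract (sizes in Mb, genes as counts).
genome_size_mb = 511.49
repeat_mb = 271.76
total_genes = 31_000
annotated_genes = 30_721

# Share of the assembly made up of repetitive sequences.
repeat_pct = round(percent(repeat_mb, genome_size_mb), 2)

# Share of protein-coding genes with a functional annotation.
annotated_pct = round(percent(annotated_genes, total_genes), 1)
```

Both rounded values agree with the abstract's stated 53.13% and 99.1%, so the absolute and relative figures are mutually consistent.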
V. Mixão, Valentina del Olmo, Eva Hegedűsová, E. Saus, Leszek P. Pryszcz, Andrea Cillingová, J. Nosek, T. Gabaldón
The Candida parapsilosis species complex comprises three important pathogenic species: Candida parapsilosis sensu stricto, Candida orthopsilosis and Candida metapsilosis. The majority of C. orthopsilosis and all C. metapsilosis isolates sequenced thus far are hybrids, and most of the parental lineages remain unidentified. This led to the hypothesis that hybrids with pathogenic potential were formed by the hybridization of non-pathogenic lineages that thrive in the environment. In a search for the missing hybrid parentals, and aiming to gain a better understanding of the evolution of the species complex, we sequenced, assembled and analysed the genomes of five close relatives isolated from the environment: Candida jiufengensis, Candida pseudojiufengensis, Candida oxycetoniae, Candida margitis and Candida theae. We found that the linear conformation of mitochondrial genomes in Candida species emerged multiple times independently. Furthermore, our analyses ruled out the involvement of these species in the hybridizations mentioned above, but identified C. theae as an additional hybrid in the species complex. Importantly, C. theae was recently associated with a case of infection, and we also uncovered the hybrid nature of this clinical isolate. Altogether, our results reinforce the hypothesis that hybridization is widespread among Candida species and potentially contributes to the emergence of lineages with opportunistic pathogenic behaviour.
{"title":"Genome analysis of five recently described species of the CUG-Ser clade uncovers Candida theae as a new hybrid lineage with pathogenic potential in the Candida parapsilosis species complex","authors":"V. Mixão, Valentina del Olmo, Eva Hegedűsová, E. Saus, Leszek P. Pryszcz, Andrea Cillingová, J. Nosek, T. Gabaldón","doi":"10.1093/dnares/dsac010","DOIUrl":"https://doi.org/10.1093/dnares/dsac010","url":null,"abstract":"Abstract Candida parapsilosis species complex comprises three important pathogenic species: Candida parapsilosis sensu stricto, Candida orthopsilosis and Candida metapsilosis. The majority of C. orthopsilosis and all C. metapsilosis isolates sequenced thus far are hybrids, and most of the parental lineages remain unidentified. This led to the hypothesis that hybrids with pathogenic potential were formed by the hybridization of non-pathogenic lineages that thrive in the environment. In a search for the missing hybrid parentals, and aiming to get a better understanding of the evolution of the species complex, we sequenced, assembled and analysed the genome of five close relatives isolated from the environment: Candida jiufengensis, Candida pseudojiufengensis, Candida oxycetoniae, Candida margitis and Candida theae. We found that the linear conformation of mitochondrial genomes in Candida species emerged multiple times independently. Furthermore, our analyses discarded the possible involvement of these species in the mentioned hybridizations, but identified C. theae as an additional hybrid in the species complex. Importantly, C. theae was recently associated with a case of infection, and we also uncovered the hybrid nature of this clinical isolate. 
Altogether, our results reinforce the hypothesis that hybridization is widespread among Candida species, and potentially contributes to the emergence of lineages with opportunistic pathogenic behaviour.","PeriodicalId":51014,"journal":{"name":"DNA Research","volume":"29 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2022-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"61096972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Y. Zhong, Yong Chen, Danjing Zheng, Jingyi Pang, Ying Liu, Shukai Luo, Shi-hua Meng, Lei Qian, Dan Wei, S. Dai, R. Zhou
Cercidoideae, one of the six subfamilies of Leguminosae, contains one genus, Cercis, with a chromosome number of 2n = 14, while all other genera have 2n = 28. An allotetraploid origin hypothesis for the common ancestor of non-Cercis genera in this subfamily has been proposed; however, no chromosome-level genomes from Cercidoideae have been available to test it. Here, we generated a chromosome-level genome assembly of Bauhinia variegata to test this hypothesis. The assembled genome is 326.4 Mb with a scaffold N50 of 22.1 Mb and contains 37,996 protein-coding genes. The Ks distribution between gene pairs in the syntenic regions indicates two whole-genome duplications (WGDs): one is B. variegata-specific, and the other is shared among core eudicots. Although Ks between gene pairs generated by the recent WGD in Bauhinia is greater than that between Bauhinia and Cercis, the WGD was not detected in Cercis, which can be explained by an accelerated evolutionary rate in Bauhinia after divergence from Cercis. Ks distribution and phylogenetic analysis for gene pairs generated by the recent WGD in Bauhinia and their corresponding orthologs in Cercis support the allopolyploid origin hypothesis of Bauhinia. The genome of B. variegata also provides a genomic resource for dissecting the genetic basis of its ornamental traits.
{"title":"Chromosomal-level genome assembly of the orchid tree Bauhinia variegata (Leguminosae; Cercidoideae) supports the allotetraploid origin hypothesis of Bauhinia","authors":"Y. Zhong, Yong Chen, Danjing Zheng, Jingyi Pang, Ying Liu, Shukai Luo, Shi-hua Meng, Lei Qian, Dan Wei, S. Dai, R. Zhou","doi":"10.1093/dnares/dsac012","DOIUrl":"https://doi.org/10.1093/dnares/dsac012","url":null,"abstract":"Abstract Cercidoideae, one of the six subfamilies of Leguminosae, contains one genus Cercis with its chromosome number 2n = 14 and all other genera with 2n = 28. An allotetraploid origin hypothesis for the common ancestor of non-Cercis genera in this subfamily has been proposed; however, no chromosome-level genomes from Cercidoideae have been available to test this hypothesis. Here, we conducted a chromosome-level genome assembly of Bauhinia variegata to test this hypothesis. The assembled genome is 326.4 Mb with the scaffold N50 of 22.1 Mb and contains 37,996 protein-coding genes. The Ks distribution between gene pairs in the syntenic regions indicates two whole-genome duplications (WGDs): one is B. variegata-specific, and the other is shared among core eudicots. Although Ks between gene pairs generated by the recent WGD in Bauhinia is greater than that between Bauhinia and Cercis, the WGD was not detected in Cercis, which can be explained by an accelerated evolutionary rate in Bauhinia after divergence from Cercis. Ks distribution and phylogenetic analysis for gene pairs generated by the recent WGD in Bauhinia and their corresponding orthologs in Cercis support the allopolyploidy origin hypothesis of Bauhinia. The genome of B. 
variegata also provides a genomic resource for dissecting genetic basis of its ornamental traits.","PeriodicalId":51014,"journal":{"name":"DNA Research","volume":"29 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2022-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"61097063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}