A telomere-to-telomere reference genome provides genetic insight into the pentacyclic triterpenoid biosynthesis in Chaenomeles speciosa.
Authors: Shaofang He, Duanyang Weng, Yipeng Zhang, Qiusheng Kong, Keyue Wang, Naliang Jing, Fengfeng Li, Yuebin Ge, Hui Xiong, Lei Wu, De-Yu Xie, Shengqiu Feng, Xiaqing Yu, Xuekui Wang, Shaohua Shu, Zhinan Mei
Journal: Horticulture Research (园艺研究(英文)), volume 10, issue 10, article uhad183. Published 2023-09-14.
DOI: https://doi.org/10.1093/hr/uhad183
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10623406/pdf/
Chaenomeles speciosa (2n = 34), a medicinal and edible plant in the Rosaceae, is commonly used in traditional Chinese medicine. To date, the lack of genomic sequence and genetic studies has impeded efforts to improve its medicinal value. Herein, we report the use of an integrative approach combining PacBio HiFi (third-generation) sequencing and Hi-C scaffolding to assemble a high-quality telomere-to-telomere genome of C. speciosa. The assembly comprised 650.4 Mb with a contig N50 of 35.5 Mb. Of this, 632.3 Mb was anchored to 17 pseudo-chromosomes, of which 12, 4, and 1 were represented by a single contig, two contigs, and four contigs, respectively. Eleven pseudo-chromosomes had telomere repeats at both ends, and four had telomere repeats at one end only. Repetitive sequences accounted for 49.5% of the genome, and a total of 45 515 protein-coding genes were annotated. The genome size of C. speciosa was similar to that of Malus domestica. Expanded or contracted gene families were identified and examined for their association with plant metabolic pathways and biological processes. In particular, functional annotation characterized gene families associated with the biosynthetic pathway of oleanolic and ursolic acids, two abundant pentacyclic triterpenoids in the fruits of C. speciosa. Taken together, this telomere-to-telomere, chromosome-level genome of C. speciosa not only provides a valuable resource to enhance understanding of the biosynthesis of medicinal compounds in tissues, but also promotes understanding of the evolution of the Rosaceae.
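For readers unfamiliar with the assembly statistics quoted in the abstract, the following is a minimal sketch of how a contig N50 and an anchoring rate can be computed. The function name `n50` and the toy contig lengths are hypothetical illustrations, not the authors' pipeline or data; only the 632.3 Mb and 650.4 Mb figures come from the abstract.

```python
def n50(lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the total assembly."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating covered length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy contig lengths in Mb (NOT the C. speciosa assembly).
toy_contigs = [40, 30, 20, 5, 5]
toy_n50 = n50(toy_contigs)

# Fraction of the assembly anchored to pseudo-chromosomes,
# using the two sizes reported in the abstract (in Mb).
anchored_fraction = 632.3 / 650.4

print(f"toy N50: {toy_n50} Mb")
print(f"anchored: {anchored_fraction:.1%}")
```

In the toy example, the two longest contigs (40 + 30 Mb) already cover at least half of the 100 Mb total, so the N50 is the length of the second one. Applied to the reported sizes, the same arithmetic gives an anchoring rate of roughly 97%.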