Chromosome-scale genomes of wild and cultivated Morinda officinalis
Ruirui Li, Xiaodie Geng, Min Liu, Guangming Liu, Tong Wei, Huan Liu, Yanqun Li, Sunil Kumar Sahu, Hong Wu
Scientific Data, volume 12, issue 1, article 443. Published 2025-03-17. Journal Article.
DOI: https://doi.org/10.1038/s41597-025-04776-5
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11911396/pdf/
Abstract
Morinda officinalis is a renowned medicinal and edible plant native to southern China and northern Vietnam. Its dried roots, known as bajitian, are extensively used in traditional Chinese medicine to treat various ailments. Increasing market demand has threatened the wild populations of M. officinalis and led to a surge in cultivated varieties. Here, we present chromosome-scale genome assemblies of both wild and cultivated M. officinalis, generated by combining nanopore long-read sequencing with Hi-C technology. The resulting high-quality genomes are 423 Mb (wild) and 425 Mb (cultivated), with scaffold N50 values of 5.91 Mb and 10.99 Mb, respectively. We predicted 31,308 and 29,528 protein-coding genes in wild and cultivated M. officinalis, respectively. Approximately 96.3% and 97.8% of the assembled sequences were anchored to 11 pseudo-chromosomes for the wild and cultivated genomes, respectively. These chromosome-scale genomes could serve as a valuable resource for understanding the genetic basis of medicinal trait variation, improving cultivation practices, and conserving this ecologically and economically important species.
About the journal:
Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data.
The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than to test hypotheses or present new interpretations, methods, or in-depth analyses.