Genome assembly of the grassland caterpillar Gynaephora qinghaiensis
Authors: Youpeng Lai, Shan Xiao, Minggang Qin, Xinhai Ye, Fang Wang, Qi Fang
Journal: Scientific Data, vol. 12, no. 1, article 158 (impact factor 5.8; JCR Q1, Multidisciplinary Sciences)
Published: 2025-01-27
DOI: https://doi.org/10.1038/s41597-025-04466-2
Citations: 0
Abstract
Grassland caterpillars are the most damaging insect pests of the alpine meadows of the Qinghai-Tibetan Plateau in China. In this study, we present a genome assembly of a grassland caterpillar, Gynaephora qinghaiensis, generated using Oxford Nanopore long-read and BGI short-read sequencing. The assembly is 861.04 Mb in size and consists of 107 contigs, with a contig N50 of 18.65 Mb. BUSCO analysis recovered 99.56% of BUSCO genes in the assembly (99.27% complete and 0.29% fragmented). In the G. qinghaiensis genome, 580.2 Mb of repetitive sequences (67.4% of the genome) and 16,618 protein-coding genes were predicted. Phylogenomic analysis indicated that G. qinghaiensis and the rusty tussock moth Orgyia antiqua diverged approximately 18.3 million years ago. Moreover, gene family evolution analysis suggested that 130 gene families significantly expanded and 43 contracted in the G. qinghaiensis genome. This reference genome provides a genetic resource for uncovering the mechanisms by which grassland caterpillars have adapted to high-altitude environments, and could contribute to the development of integrated pest management strategies.
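For readers unfamiliar with the assembly statistics quoted above, the following is a minimal sketch of how a contig N50 is conventionally computed, along with an arithmetic check of the percentages reported in the abstract. The function name `n50` and the toy contig lengths are illustrative assumptions; they are not taken from the paper, whose actual contig lengths are not listed here.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly size."""
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (not data from the paper).
toy_contigs = [50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Consistency checks using only figures stated in the abstract.
repeat_fraction = 580.2 / 861.04   # repeat content (Mb) / assembly size (Mb)
busco_present = 99.27 + 0.29       # complete + fragmented BUSCO genes (%)

print(f"toy N50: {toy_n50}")
print(f"repeat fraction: {repeat_fraction:.1%}")
print(f"BUSCO present: {busco_present:.2f}%")
```

Under this definition, an N50 of 18.65 Mb means that contigs of at least 18.65 Mb account for half or more of the 861.04 Mb assembly.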
About the journal:
Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data.
The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than testing hypotheses or presenting new interpretations, methods, or in-depth analyses.