Over the past 20 years, tremendous advances in sequencing technologies and computational algorithms have spurred plant genomic research into a thriving era with hundreds of genomes decoded already, ranging from those of nonvascular plants to those of flowering plants. However, complex plant genome assembly is still challenging and remains difficult to fully resolve with conventional sequencing and assembly methods due to high heterozygosity, highly repetitive sequences, or high ploidy characteristics of complex genomes. Herein, we summarize the challenges of and advances in complex plant genome assembly, including feasible experimental strategies, upgrades to sequencing technology, existing assembly methods, and different phasing algorithms. Moreover, we list actual cases of complex genome projects for readers to refer to and draw upon to solve future problems related to complex genomes. Finally, we expect that the accurate, gapless, telomere-to-telomere, and fully phased assembly of complex plant genomes could soon become routine.
Recent Advances in Assembly of Complex Plant Genomes. Weilong Kong, Yibin Wang, Shengcheng Zhang, Jiaxin Yu, Xingtan Zhang. Genomics, Proteomics & Bioinformatics 21(3), pages 427-439. Published 2023-06-01. DOI: 10.1016/j.gpb.2023.04.004. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10787022/pdf/
Pub Date: 2023-06-01; DOI: 10.1016/j.gpb.2022.08.003
Sheng Hu Qian, Yu-Li Xiong, Lu Chen, Ying-Jie Geng, Xiao-Man Tang, Zhen-Xia Chen
In the evolutionary model of dosage compensation, the per-allele expression level of the X chromosome has been proposed to be up-regulated twofold to compensate for its dose reduction in males (XY) compared to females (XX). However, the expression regulation of X-linked genes is still controversial, and comprehensive evaluations are still lacking. By integrating multi-omics datasets in mammals, we investigated two expression ratios, X to autosomes (X:AA ratio) and X to orthologs (X:XX ratio), at the transcriptome, translatome, and proteome levels. We revealed a dynamic spatial-temporal X:AA ratio during development in humans and mice. Meanwhile, by tracing the evolution of orthologous gene expression in chickens, platypuses, and opossums, we found a stable expression ratio of X-linked genes in humans to their autosomal orthologs in other species (X:XX ≈ 1) across tissues and developmental stages, demonstrating stable dosage compensation in mammals. We also found that different epigenetic regulations contributed to the high tissue specificity and stage specificity of X-linked gene expression, thus affecting X:AA ratios. We conclude that the dynamics of X:AA ratios are attributable to the different gene contents and expression preferences of the X chromosome rather than to dosage compensation, which remained stable.
Dynamic Spatial-temporal Expression Ratio of X Chromosome to Autosomes but Stable Dosage Compensation in Mammals. Genomics, Proteomics & Bioinformatics 21(3), pages 589-600. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10787176/pdf/
Pub Date: 2023-06-01; DOI: 10.1016/j.gpb.2022.11.002
Le Wang, May Lee, Zi Yi Wan, Bin Bai, Baoqing Ye, Yuzer Alfiko, Rahmadsyah Rahmadsyah, Sigit Purwantomo, Zhuojun Song, Antonius Suwanto, Gen Hua Yue
The palm family (Arecaceae), consisting of ∼ 2600 species, is the third most economically important family of plants. The African oil palm (Elaeis guineensis) is one of the most important palms. However, the genome sequences of palms that are currently available are still limited and fragmented. Here, we report a high-quality chromosome-level reference genome of an oil palm, Dura, assembled by integrating long reads with ∼ 150× genome coverage. The assembled genome was 1.7 Gb in size, covering 94.5% of the estimated genome, of which 91.6% was assigned to 16 pseudochromosomes and 73.7% was repetitive sequences. Relying on conserved synteny with oil palm, we further assembled the existing draft genome sequences of both date palm and coconut to the chromosome level. A burst of transposons, particularly long terminal repeat retrotransposons, following the last whole-genome duplication likely explains the genome size variation across palms. Sequence analysis of the VIRESCENS gene in palms suggests that DNA variations in this gene are related to fruit colors. Recent duplications of highly tandemly repeated pathogenesis-related proteins from the same tandem arrays play an important role in defense responses to Ganoderma. Whole-genome resequencing of both ancestral African oil palms and oil palms introduced to Southeast Asia reveals that genes under putative selection are notably associated with stress responses, suggesting adaptation to stresses in the new habitat. The genomic resources and insights gained in this study could be exploited for accelerating genetic improvement and understanding the evolution of palms.
A Chromosome-level Reference Genome of African Oil Palm Provides Insights into Its Divergence and Stress Adaptation. Genomics, Proteomics & Bioinformatics 21(3), pages 440-454. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10787024/pdf/
Pub Date: 2023-06-01; DOI: 10.1016/j.gpb.2023.02.001
Yinquan Qu, Xulan Shang, Ziyan Zeng, Yanhao Yu, Guoliang Bian, Wenling Wang, Li Liu, Li Tian, Shengcheng Zhang, Qian Wang, Dejin Xie, Xuequn Chen, Zhenyang Liao, Yibin Wang, Jian Qin, Wanxia Yang, Caowen Sun, Xiangxiang Fu, Xingtan Zhang, Shengzuo Fang
Cyclocarya paliurus is a relict plant species that survived the last glacial period and has recently undergone a population expansion. Its leaves have been traditionally used to treat obesity and diabetes, with cyclocaric acid B as the well-known active ingredient. Here, we present three C. paliurus genomes: two from diploids with different flower morphs and one haplotype-resolved tetraploid assembly. Comparative genomic analysis revealed two rounds of recent whole-genome duplication events and identified 691 genes with dosage effects that likely contribute to adaptive evolution through enhanced photosynthesis and increased accumulation of triterpenoids. Resequencing analysis of 45 C. paliurus individuals uncovered two bottlenecks, consistent with known environmental changes, and many selectively swept genes involved in critical biological functions, including plant defense and secondary metabolite biosynthesis. We also propose a biosynthesis pathway for cyclocaric acid B based on multi-omics data and identify key genes, in particular gibberellin-related genes, associated with heterodichogamy in C. paliurus. Our study sheds light on the evolutionary history of C. paliurus and provides genomic resources for studying medicinal herbs.
Whole-genome Duplication Reshaped Adaptive Evolution in A Relict Plant Species, Cyclocarya paliurus. Genomics, Proteomics & Bioinformatics 21(3), pages 455-469. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10787019/pdf/
Pub Date: 2023-06-01; DOI: 10.1016/j.gpb.2023.03.003
Liang Wang, Xiaojie Wang, Chunqi Liu, Wei Xu, Weihong Kuang, Qian Bu, Hongchun Li, Ying Zhao, Linhong Jiang, Yaxing Chen, Feng Qin, Shu Li, Qinfan Wei, Xiaocong Liu, Bin Liu, Yuanyuan Chen, Yanping Dai, Hongbo Wang, Jingwei Tian, Gang Cao, Xiaobo Cen
The expression of linear DNA sequence is precisely regulated by the three-dimensional (3D) architecture of chromatin. Morphine-induced aberrant gene networks of neurons have been extensively investigated; however, how morphine impacts the 3D genomic architecture of neurons is still unknown. Here, we applied digestion-ligation-only high-throughput chromosome conformation capture (DLO Hi-C) technology to investigate the effects of morphine on the 3D chromatin architecture of primate cortical neurons. After 90 days of continuous morphine administration to rhesus monkeys, we discovered that morphine re-arranged chromosome territories, with a total of 391 segmented compartments being switched. Morphine altered over half of the detected topologically associated domains (TADs), most of which exhibited a variety of shifts, followed by separating and fusing types. Analysis of the looping events at kilobase-scale resolution revealed that morphine increased not only the number but also the length of differential loops. Moreover, all identified differentially expressed genes from the RNA sequencing data were mapped to specific TAD boundaries or differential loops, and their changed expression was further validated. Collectively, an altered 3D genomic architecture of cortical neurons may regulate the gene networks associated with morphine effects. Our findings provide critical hubs connecting chromosome spatial organization and gene networks associated with morphine effects in humans.
Morphine Re-arranges Chromatin Spatial Architecture of Primate Cortical Neurons. Genomics, Proteomics & Bioinformatics 21(3), pages 551-572. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10787020/pdf/
Pub Date: 2023-06-01; DOI: 10.1016/j.gpb.2021.10.003
Fang-Yuan Shi, Yu Wang, Dong Huang, Yu Liang, Nan Liang, Xiao-Wei Chen, Ge Gao
Large-scale genome-wide association studies (GWAS) and expression quantitative trait locus (eQTL) studies have identified multiple non-coding variants that are associated with genetic diseases through their effects on gene expression. However, pinpointing causal variants effectively and efficiently remains a serious challenge. Here, we developed CARMEN, a novel algorithm to identify functional non-coding expression-modulating variants. Multiple evaluations demonstrated CARMEN’s superior performance over state-of-the-art tools. Applying CARMEN to GWAS and eQTL datasets further pinpointed several causal variants other than the reported lead single-nucleotide polymorphisms (SNPs). CARMEN scales well with massive datasets and is available online as a web server at http://carmen.gao-lab.org.
Computational Assessment of the Expression-modulating Potential for Non-coding Variants. Genomics, Proteomics & Bioinformatics 21(3), pages 662-673. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10787178/pdf/
Pub Date: 2023-06-01; DOI: 10.1016/j.gpb.2022.11.001
Ruobing Han, Lei Han, Xunwu Zhao, Qianghui Wang, Yanling Xia, Heping Li
Despite the scientific and medicinal importance of diploid sika deer (Cervus nippon), its genome resources are limited and a haplotype-resolved chromosome-scale assembly is urgently needed. To explore mechanisms underlying the expression patterns of the allele-specific genes in antlers and the chromosome evolution in Cervidae, we report, for the first time, a high-quality haplotype-resolved chromosome-scale genome of sika deer, built by integrating multiple sequencing strategies and anchored to 32 homologous groups with a pair of sex chromosomes (XY). Several expanded genes (RET, PPP2R1A, PPP2R1B, YWHAB, YWHAZ, and RPS6) and positively selected genes (eIF4E, Wnt8A, Wnt9B, BMP4, and TP53) were identified, which could contribute to rapid antler growth without carcinogenesis. A comprehensive and systematic genome-wide analysis of allele expression patterns revealed that most alleles were functionally equivalent in regulating rapid antler growth and inhibiting oncogenesis. Comparative genomic analysis revealed that chromosome fission might have occurred during the divergence of sika deer and red deer (Cervus elaphus), and that the olfactory sense of sika deer might be more powerful than that of red deer. Obvious inversion regions containing olfactory receptor genes, which have arisen since the divergence, were also identified. In conclusion, the high-quality allele-aware reference genome provides valuable resources for further study of the unique biological characteristics of antlers, chromosome evolution, and multi-omics research on cervid animals.
Haplotype-resolved Genome of Sika Deer Reveals Allele-specific Gene Expression and Chromosome Evolution. Genomics, Proteomics & Bioinformatics 21(3), pages 470-482. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10787017/pdf/
Pub Date: 2023-06-01; DOI: 10.1016/j.gpb.2022.02.002
Zheng Wang, Guihu Zhao, Bin Li, Zhenghuan Fang, Qian Chen, Xiaomeng Wang, Tengfei Luo, Yijing Wang, Qiao Zhou, Kuokuo Li, Lu Xia, Yi Zhang, Xun Zhou, Hongxu Pan, Yuwen Zhao, Yige Wang, Lin Wang, Jifeng Guo, Beisha Tang, Kun Xia, Jinchen Li
Non-coding variants in the human genome significantly influence human traits and complex diseases via their regulation and modification effects. Hence, an increasing number of computational methods have been developed to predict the effects of variants in human non-coding sequences. However, it is difficult for inexperienced users to select an appropriate method from the dozens available. To address this issue, we assessed 12 performance metrics of 24 methods on four independent non-coding variant benchmark datasets: (1) rare germline variants from clinically relevant sequence variants (ClinVar), (2) rare somatic variants from the Catalogue Of Somatic Mutations In Cancer (COSMIC), (3) common regulatory variants from curated expression quantitative trait locus (eQTL) data, and (4) disease-associated common variants from curated genome-wide association studies (GWAS). All 24 tested methods performed differently under various conditions, indicating varying strengths and weaknesses under different scenarios. Importantly, the performance of existing methods was acceptable for rare germline variants from ClinVar, with the area under the receiver operating characteristic curve (AUROC) ranging from 0.4481 to 0.8033, and poor for rare somatic variants from COSMIC (AUROC = 0.4984–0.7131), common regulatory variants from curated eQTL data (AUROC = 0.4837–0.6472), and disease-associated common variants from curated GWAS (AUROC = 0.4766–0.5188). We also compared the prediction performance of the 24 methods for non-coding de novo mutations in autism spectrum disorder, and found that the combined annotation-dependent depletion (CADD) and context-dependent tolerance score (CDTS) methods showed better performance. In summary, we assessed the performance of 24 computational methods under diverse scenarios, providing preliminary advice for proper tool selection and guiding the development of new techniques for interpreting non-coding variants.
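For readers less familiar with the metric quoted in this abstract, the following is a minimal Python sketch of how an AUROC value can be computed for a single scoring method. It uses the pairwise (Mann-Whitney) definition: the probability that a randomly chosen positive variant receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `auroc` and the toy labels and scores are illustrative assumptions; they are not data or code from the study, and the study's own evaluation pipeline may differ.

```python
def auroc(labels, scores):
    """Pairwise AUROC: P(score of a positive > score of a negative), ties count 0.5.

    labels: iterable of 1 (positive, e.g. pathogenic) or 0 (negative, e.g. benign).
    scores: iterable of numeric predictions, higher meaning "more likely positive".
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (invented values, for illustration only).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(round(auroc(toy_labels, toy_scores), 3))  # 0.889 (8 of 9 positive/negative pairs ranked correctly)
```

Under this definition a value near 0.5 means the scores rank positives above negatives no better than chance, which is why the lower ends of the ranges reported above (around 0.48 to 0.50) indicate essentially uninformative predictions.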
Performance Comparison of Computational Methods for the Prediction of the Function and Pathogenicity of Non-coding Variants. Genomics, Proteomics & Bioinformatics 21(3), pages 649-661. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10787016/pdf/
Pub Date : 2023-06-01 DOI: 10.1016/j.gpb.2022.11.005
Jingqi Zhou, Ake Liu, Funan He, Yunbin Zhang, Libing Shen, Jun Yu, Xiang Zhang
The white-blotched river stingray (Potamotrygon leopoldi) is a cartilaginous fish native to the Xingu River, a tributary of the Amazon River system. As a rare freshwater-dwelling cartilaginous fish in the family Potamotrygonidae, none of whose members has genome sequence information available, P. leopoldi provides evolutionary details on fish phylogeny, niche adaptation, and skeleton formation. In this study, we present its draft genome of 4.11 Gb comprising 16,227 contigs and 13,238 scaffolds, with a contig N50 of 3937 kb and a scaffold N50 of 5675 kb. Our analysis shows that P. leopoldi is a slow-evolving fish that diverged from elephant sharks about 96 million years ago. Moreover, two gene families related to the immune system (immunoglobulin heavy constant delta genes and T-cell receptor alpha/delta variable genes) exhibit expansion in P. leopoldi only. We also identified the Hox gene clusters in P. leopoldi and discovered that seven Hox genes shared by five representative fish species are missing in P. leopoldi. The RNA sequencing data from P. leopoldi and three other fish species demonstrate that fishes have a more diversified tissue expression spectrum than mammals. Our functional studies suggest that the lack of the gc gene encoding vitamin D-binding protein in cartilaginous fishes (both P. leopoldi and Callorhinchus milii) could partly explain the absence of hard bone in their endoskeleton. Overall, this genome resource provides new insights into the niche adaptation, body plan, and skeleton formation of P. leopoldi, as well as the genome evolution in cartilaginous fishes.
{"title":"Draft Genome of White-blotched River Stingray Provides Novel Clues for Niche Adaptation and Skeleton Formation","authors":"Jingqi Zhou, Ake Liu, Funan He, Yunbin Zhang, Libing Shen, Jun Yu, Xiang Zhang","doi":"10.1016/j.gpb.2022.11.005","DOIUrl":"10.1016/j.gpb.2022.11.005","url":null,"abstract":"<div><div>The <strong>white-blotched river stingray</strong> (<strong><em>Potamotrygon leopoldi</em></strong>) is a cartilaginous fish native to the Xingu River, a tributary of the Amazon River system. As a rare freshwater-dwelling cartilaginous fish in the family Potamotrygonidae, none of whose members has genome sequence information available, <em>P</em>. <em>leopoldi</em> provides evolutionary details on fish phylogeny, <strong>niche adaptation</strong>, and skeleton formation. In this study, we present its draft genome of 4.11 Gb comprising 16,227 contigs and 13,238 scaffolds, with a contig N50 of 3937 kb and a scaffold N50 of 5675 kb. Our analysis shows that <em>P</em>. <em>leopoldi</em> is a slow-evolving fish that diverged from elephant sharks about 96 million years ago. Moreover, two gene families related to the immune system (immunoglobulin heavy constant delta genes and T-cell receptor alpha/delta variable genes) exhibit expansion in <em>P</em>. <em>leopoldi</em> only. We also identified the Hox gene clusters in <em>P</em>. <em>leopoldi</em> and discovered that seven <em>Hox</em> genes shared by five representative fish species are missing in <em>P</em>. <em>leopoldi</em>. The RNA sequencing data from <em>P</em>. <em>leopoldi</em> and three other fish species demonstrate that fishes have a more diversified tissue expression spectrum than mammals. Our functional studies suggest that the lack of the <em>gc</em> gene encoding <strong>vitamin D-binding protein</strong> in cartilaginous fishes (both <em>P</em>. <em>leopoldi</em> and <em>Callorhinchus milii</em>) could partly explain the absence of hard bone in their endoskeleton. 
Overall, this genome resource provides new insights into the niche adaptation, body plan, and skeleton formation of <em>P. leopoldi</em>, as well as the genome evolution in cartilaginous fishes.</div></div>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"21 3","pages":"Pages 501-514"},"PeriodicalIF":11.5,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10787021/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35255168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}