Genome Research: latest articles

Degrees of convergent evolution in rodent adaptations to arid environments.
IF 5.5 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2026-01-21 | DOI: 10.1101/gr.280089.124
Domitille Chalopin, Carine Rey, Jeremy Ganofsky, Juliana Blin, Pascale Chevret, Marion Mouginot, Laurent Gueguen, Bastien Boussau, Sophie Pantalacci, Marie Semon

Species adapting to a similar lifestyle may undergo convergent changes in organ structure and cellular function, themselves relying or not on convergent genetic changes. The extent of genomic convergence is thus debated and may further depend on the interplay between temporal factors, such as species relatedness or the age of the transition. Rodents have repeatedly adapted to life in arid conditions, notably with altered renal morphology and physiology. By analyzing kidney transcriptomes from 33 species, we find convergence at all examined biological levels, from the whole kidney transcriptome down to the coding sequences and expression level of individual genes. Transcriptome-level signatures reflect convergent changes in cell proportions, suggesting convergent structural adaptations of the kidney. A large proportion of genes shows convergent substitutions, but those happened in small subsets of species, showing that there are multiple genetic paths repeatedly taken in a mosaic manner. A similar mosaic signal of convergence is found comparing gene expression in species spanning the Rodentia order, but convergence is more widely shared at the lower level of the Murinae family. Therefore we test more directly the influence of temporal factors. We observe more convergent changes when we select species independently adapted from more closely than more distantly related ancestors, and when we select older transitions rather than recent transitions. Our study shows that there are many different, yet repeatedly selected, ways to adapt to aridity, and that the degree of convergent evolution increases with both the age of the transitions and species relatedness.

Citations: 0
Read-level genotyping of short tandem repeats using long reads and single-nucleotide variation with STRkit.
IF 5.5 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2026-01-15 | DOI: 10.1101/gr.280766.125
David R Lougheed, Tomi Pastinen, Guillaume Bourque

Variation in short tandem repeats (STRs) is implicated in Mendelian disease and complex traits, but can be difficult to resolve with short-read genome sequencing. We present STRkit, a software package for genotyping STRs using long read sequencing (LRS) that uses proximate single-nucleotide variants to improve genotyping accuracy without a priori haplotype information. We show that STRkit has unique strengths versus other methods: it can use data from both major LRS technologies (Pacific Biosciences HiFi [PB] and Oxford Nanopore [ONT]) to output both allele- and read-level copy number and sequence, performs best in benchmarking with F1 scores of 0.9631 and 0.9544 with PB and ONT data respectively, achieves higher rates of Mendelian consistency than other genotyping tools, and is open source software. STRkit's features open up new possibilities for association testing, assessing patterns of STR inheritance, and better understanding the functional effects of these notable repeat elements.
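
The two evaluation measures named in this abstract, the F1 score and Mendelian consistency, can be illustrated with a minimal Python sketch. It is not STRkit's code or its benchmarking procedure: the function names are hypothetical, and treating a genotype as a pair of integer copy numbers that must match the parents exactly is an assumption made for illustration.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall = 2*TP / (2*TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0  # no calls and no truth entries: define F1 as 0
    return 2 * tp / denominator


def is_mendelian_consistent(child, mother, father) -> bool:
    """Check that one child allele can come from each parent.

    Each genotype is a pair of repeat copy numbers, e.g. (10, 12).
    Exact copy-number matching is an assumption of this sketch; real
    tools may allow a tolerance or compare allele sequences instead.
    """
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


if __name__ == "__main__":
    print(f1_score(tp=8, fp=2, fn=2))
    print(is_mendelian_consistent((10, 12), (10, 11), (12, 15)))
```

The fraction of loci passing `is_mendelian_consistent` across a trio gives a Mendelian consistency rate under these assumptions.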

Citations: 0
Chromosome engineering to correct a complex rearrangement on Chromosome 8 reveals the effects of 8p syndrome on gene expression and neural differentiation.
IF 5.5 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2026-01-12 | DOI: 10.1101/gr.280425.125
Sophia N Lee, Erin C Banda, Lu Qiao, Sarah L Thompson, Karan Singh, Ryan A Hagenson, Teresa Davoli, Stefan F Pinter, Jason M Sheltzer

Chromosomal rearrangements on the short arm of Chromosome 8 cause 8p syndrome, a rare developmental disorder characterized by neurodevelopmental delays, epilepsy, and cardiac abnormalities. While significant progress has been made in managing the symptoms of 8p syndrome and other conditions caused by large-scale chromosomal aneuploidies, no therapeutic approach has yet been demonstrated to target the underlying disease-causing chromosome. Here, we establish a two-step approach to eliminate the abnormal copy of Chromosome 8 and restore euploidy in cells derived from an individual with a complex rearrangement of Chromosome 8p. Transcriptomic analysis revealed 361 differentially expressed genes between the proband and the euploid revertant, highlighting genes both within and outside the 8p region that may contribute to 8p syndrome pathology. Furthermore, we demonstrate that the proband exhibits a significant defect in neural differentiation that could be partially rescued by treatment with small-molecule inhibitors of cell death. Our work demonstrates the feasibility of using chromosome engineering to correct complex aneuploidies in vitro and establishes a platform to further dissect the pathophysiology of 8p syndrome and other conditions caused by chromosomal rearrangements.

Citations: 0
Early feature extraction drives model performance in high-resolution chromatin accessibility prediction.
IF 5.5 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2026-01-12 | DOI: 10.1101/gr.281042.125
Aayush Grover, Till Muser, Liine Kasak, Lin Zhang, Ekaterina Krymova, Valentina Boeva

Fine-grained prediction of chromatin accessibility from DNA sequence is a foundational step in modeling gene expression changes resulting from sequence variants. Yet, few methods operate at the resolution necessary to capture subtle effects of single-nucleotide changes. Furthermore, it remains unclear which architectural components, such as residual connections, normalization strategies, or attention mechanisms, drive performance in these high-resolution predictions. To address these knowledge gaps, we systematically evaluate classic architectural choices and introduce ConvNeXt V2 blocks, originally developed for computer vision, as high-resolution feature extractors in deep learning models for genomic data. Integrated into diverse architectures such as CNNs, LSTMs, dilated CNNs, and transformers, ConvNeXt V2 blocks consistently improve performance, leading to similar prediction accuracy across these different model types. This reveals that early feature extraction, rather than downstream architecture, is the primary determinant of prediction accuracy. A comprehensive evaluation of these models on ATAC-seq signal prediction at 4 bp resolution in a cell type-specific manner identifies the ConvNeXt-based dilated CNN as the most robust performer, better preserving the signal's shape. Our codebase and benchmarks provide practical tools for high-resolution chromatin modeling.
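
To make "4 bp resolution" concrete, the sketch below bins a per-base signal track into consecutive 4 bp windows. It is only an illustration: the function name is hypothetical, and averaging within each bin and dropping a trailing partial bin are assumptions, since the abstract does not say how the authors aggregate the ATAC-seq signal.

```python
def bin_signal(per_base, bin_size=4):
    """Average a per-base signal into consecutive fixed-size bins.

    A trailing partial bin is dropped (an assumption of this sketch).
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    n_bins = len(per_base) // bin_size
    return [
        sum(per_base[i * bin_size:(i + 1) * bin_size]) / bin_size
        for i in range(n_bins)
    ]


if __name__ == "__main__":
    # Nine per-base values give two full 4 bp bins; the ninth value is dropped.
    print(bin_signal([0, 0, 4, 4, 8, 8, 8, 8, 1]))
```

A model predicting at this resolution would output one value per bin, so a 1,000 bp input window corresponds to 250 predicted values.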

Citations: 0
Automated chromatin profiling with spa-ChIP-seq uncovers the impacts of condition variations.
IF 5.5 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2026-01-05 | DOI: 10.1101/gr.281320.125
Yuwei Cao, Lauren Patel, Lauren Alcoser, Eric Mendenhall, Christopher Benner, Sven Heinz, Alon Goren

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is widely used to study the genomic localization of DNA-associated proteins. However, conventional protocols include multiple manual steps that can introduce inconsistency and limit scalability, thereby restricting the inclusion of appropriate replicates and controls. Although the introduction of liquid handling platforms has improved reproducibility, most existing efforts have automated only a subset of the workflow, and extending automation to efficiently map nonhistone proteins, such as chromatin regulators, remains challenging. Here, we present a fully automated implementation of our previously developed single-pot ChIP-seq protocol, named spa-ChIP-seq, which enables scalable processing of eight to 96 ChIP-seq samples from cross-linked cells to a sequencing-ready library in approximately 3 days with an estimated cost of $70 per sample. Benchmarking spa-ChIP-seq against manual ChIP-seq performed in parallel demonstrates a comparable signal-to-noise ratio between the two workflows. Using spa-ChIP-seq, we systematically evaluate multiple parameters including shearing and cross-linking conditions, buffer compositions, and the ratio of antibody to cell number. We find, for the first time to our knowledge, that weaker genomic localization signals are sensitive to changing the antibody-to-cell-number ratio, whereas the stronger signals remain unaffected. This finding underscores the importance of maintaining consistent antibody-to-cell-number ratio for comparative studies, such as treatment responses or chromatin-QTL mapping. The spa-ChIP-seq protocol is publicly available, including deck setups, operational parameters, and scripts. We envision that this robust, cost-efficient protocol will facilitate high-throughput, reproducible ChIP-seq analyses, supporting large-scale studies of antibody validation, compound screening, population genomics, and diagnostic frameworks.
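
The recommendation to keep the antibody-to-cell-number ratio constant across compared samples amounts to simple proportional scaling. The sketch below shows that arithmetic only; the function name and the reference amounts in the example are hypothetical and are not taken from the spa-ChIP-seq protocol.

```python
def antibody_for_cells(ref_antibody_ug, ref_cells, cells):
    """Scale the antibody amount so antibody / cell number stays constant.

    ref_antibody_ug and ref_cells describe a reference condition chosen
    by the user (hypothetical values, not from the published protocol).
    """
    if ref_cells <= 0 or cells < 0:
        raise ValueError("cell numbers must be positive")
    return ref_antibody_ug * cells / ref_cells


if __name__ == "__main__":
    # Hypothetical reference: 1.0 ug antibody per 1,000,000 cells.
    print(antibody_for_cells(1.0, 1_000_000, 250_000))
```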

Pages: 129-141 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758389/pdf/
Citations: 0
Genome-wide maps of CPD deamination in yeast reveal the impact of DNA sequence context and nucleosome architecture on cytosine deamination rates.
IF 5.5 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2026-01-05 | DOI: 10.1101/gr.280384.124
Marian F Laughery, Bastian Stark, Benjamin Morledge-Hampton, Steven A Roberts, John J Wyrick

UV light induces cyclobutane pyrimidine dimers (CPDs) and other mutagenic lesions in cellular DNA. Cytosine-containing CPDs can subsequently undergo rapid deamination to uracil, a process that has been linked to UV mutagenesis. However, the impact of genomic context and chromatin architecture on CPD deamination rates in cells remains poorly understood. Here, we develop a method known as dCPD-seq to map deaminated CPDs (dCPDs) across the genome of repair-deficient yeast cells at single-nucleotide resolution. Our dCPD-seq data reveal that sequence context significantly modulates CPD deamination rates in UV-irradiated yeast cells, with CPDs in TCG contexts showing particularly rapid deamination rates. Our analysis indicates that rapid CPD deamination can explain why UV-induced mutations are specifically enriched at TCG sequences, both in UV-irradiated yeast cells and in human skin cancers. CPD deamination is suppressed near the transcription start and end sites of yeast genes, which may in part be mediated by DNA-bound transcription factors. Finally, we show that the wrapping of DNA in nucleosomes modulates CPD deamination in yeast cells. Our data indicate that CPD deamination is elevated at minor-in rotational positions where the DNA minor groove faces the histone octamer, likely owing to increased solvent accessibility of the C4 position of the cytosine base. Moreover, we also observe strand-specific enrichment of CPD deamination at rotational positions where the DNA backbone faces out toward the solvent. Taken together, these findings reveal how DNA sequence context and chromatin architecture modulate CPD deamination rates across a eukaryotic genome.
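
Sequence-context analyses such as the TCG enrichment described here depend on reading the trinucleotide around a mapped position on the correct strand. The sketch below shows one way to do that; it is not the dCPD-seq pipeline, and the function names and the 0-based coordinate convention are assumptions of this example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def trinucleotide_context(reference: str, pos: int, strand: str = "+") -> str:
    """Return the 5'->3' trinucleotide centred on 0-based `pos`.

    `reference` is the plus-strand sequence. For strand "-", the context
    is read from the reverse complement so the lesion base stays central.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if pos < 1 or pos > len(reference) - 2:
        raise ValueError("position needs one flanking base on each side")
    tri = reference[pos - 1:pos + 2].upper()
    return tri if strand == "+" else reverse_complement(tri)


if __name__ == "__main__":
    print(trinucleotide_context("AATCGGA", 3, "+"))  # cytosine on the plus strand
    print(trinucleotide_context("ACGAT", 2, "-"))    # cytosine on the minus strand
```

Tallying these contexts over all mapped sites, and normalizing by how often each context occurs in the genome, would give a per-context rate under these assumptions.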

Pages: 183-196 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12887450/pdf/
Citations: 0
Genome-wide nucleosome and transcription factor responses to genetic perturbations reveal chromatin-mediated mechanisms of transcriptional regulation.
IF 5.5 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2026-01-05 | DOI: 10.1101/gr.279637.124
Kevin Moyung, Yulong Li, Heather K MacAlpine, Alexander J Hartemink, David M MacAlpine

Epigenetic mechanisms contribute to gene regulation by altering chromatin accessibility through changes in transcription factor (TF) and nucleosome occupancy across the genome. Despite numerous studies focusing on changes in gene expression, the intricate chromatin-mediated regulatory code remains largely uncharted on a comprehensive scale. We address this by employing a factor-agnostic, reverse-genetics approach that uses MNase-seq to capture genome-wide TF and nucleosome occupancies in response to the individual deletion of 201 transcriptional regulators in Saccharomyces cerevisiae, thereby assaying nearly 1 million mutant-gene interactions. We develop a principled new approach to identify and quantify chromatin changes genome-wide, allowing us to observe differences in TF and nucleosome occupancy that recapitulate well-established pathways identified by gene expression data. We also discover distinct chromatin signatures associated with the up- and downregulation of genes and use these signatures to reveal regulatory mechanisms previously unexplored in expression-based studies. Finally, we demonstrate that chromatin features are predictive of transcriptional activity, and we leverage these features to reconstruct chromatin-based transcriptional regulatory networks. Overall, these results illustrate the power of an approach combining genetic perturbation with high-resolution epigenomic profiling; the latter enables a close examination of the interplay between TFs and nucleosomes genome-wide, providing a deeper, more mechanistic understanding of the complex relationship between chromatin organization and transcription.

Pages: 115-128 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758391/pdf/
Citations: 0
A versatile type VI CRISPR-based approach for targeted m6A demethylation in mRNAs.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2026-01-05 DOI: 10.1101/gr.280476.125
Panagiotis G Adamopoulos, Konstantina Athanasopoulou, Andreas Scorilas

Epitranscriptomics, a rapidly evolving field mainly driven by massive parallel sequencing technologies, explores post-transcriptional RNA modifications. N6-methyladenosine (m6A) has emerged as the most prominent and dynamically regulated modification in human mRNAs, being implicated in the regulation of diverse biological processes, including spermatogenesis, heat shock response, ultraviolet-induced DNA damage response and maternal mRNA clearance. Despite the recognized significance of m6A in mRNA regulation, limited studies have focused on the targeted and efficient manipulation of this modification in mRNAs. Here, we present Dem6A-Vec, an "all-in-one" plasmid vector designed for site-specific m6A demethylation in human mRNAs. Dem6A-Vec integrates the expression of a catalytically inactive RfxCas13d fused to the m6A demethylase ALKBH5 and a U6-driven customizable guide RNA in a single construct, simplifying experimental workflows and enhancing targeting efficiency. Using nanopore direct RNA sequencing, we identify high-confident m6A sites in HeLa cells, which serve as targets for Dem6A-Vec. We validate the targeted demethylation of m6A sites in the EEF2 and RRAGA genes using the established SELECT-qPCR method, confirming the impacts on mRNA stability and highlighting the tool's precision and versatility. The presented approach is implemented in multiple mRNA sites with diverse methylation stoichiometries, underscoring its adaptability to various transcriptomic contexts. This study provides a robust and scalable method for investigating the functional roles of m6A modifications, offering a transformative platform for advancing epitranscriptomic research and potential therapeutic applications.

Genome Research, pp. 169-182. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758396/pdf/
Citations: 0
Cell-type- and chromosome-specific chromatin landscapes and DNA replication programs of Drosophila testis tumor stem cell-like cells.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2026-01-05 DOI: 10.1101/gr.280809.125
Jennifer A Urban, Daniel Ringwalt, John M Urban, Wingel Xue, Ryan Gleason, Keji Zhao, Xin Chen

Stem cells have the unique ability to self-renew and differentiate into specialized cell types. Epigenetic mechanisms, including histones and their post-translational modifications, play a crucial role in regulating programs integral to a cell's identity, like gene expression and DNA replication. However, the transcriptional, chromatin, and replication timing profiles of adult stem cells in vivo remain poorly understood. Containing germline stem cells (GSCs) and somatic cyst stem cells (CySCs), the Drosophila testis provides an excellent in vivo model for studying adult stem cells. However, the small number of stem cells and the cellular heterogeneity of this tissue have limited comprehensive genomic studies. In this study, we develop cell-type-specific genomic techniques to analyze the transcriptome, histone modification patterns, and replication timing of germline stem cell (GSC)-like and somatic cyst stem cell (CySC)-like cells. Single-cell RNA sequencing validates previous findings on GSC-CySC intercellular communication and reveals a high expression of chromatin regulators in GSC-like cells. To characterize chromatin landscapes, we develop a cell-type-specific chromatin profiling assay to map H3K4me3-, H3K27me3-, and H3K9me3-enriched regions, corresponding to the euchromatic, facultative heterochromatic, and constitutive heterochromatic domains, respectively. Finally, we determine cell-type-specific replication timing profiles, integrating our in vivo data sets with published data using cultured cell lines. Our results reveal that GSC-like cells display a distinct replication program, compared with somatic lineages, that aligns with chromatin state differences. Collectively, our integrated transcriptomic, chromatin, and replication data sets provide a comprehensive framework for understanding genome regulation differences between these in vivo stem-cell populations, demonstrating the power of multiomics in uncovering cell-type-specific regulatory features.

Genome Research, pp. 83-101. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758400/pdf/
Citations: 0
The PanOryza pangene catalog of Asian cultivated rice.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2026-01-05 DOI: 10.1101/gr.280790.125
Bruno Contreras-Moreira, Eshan Sharma, Shradha Saraf, Guy Naamati, Parul Gupta, Justin Elser, Dmytro Chebotarov, Kapeel Chougule, Zhenyuan Lu, Sharon Wei, Andrew Olson, Ian Tsang, Disha Lodha, Yong Zhou, Zhichao Yu, Wen Zhao, Jianwei Zhang, Sandeep Amberkar, Kawinnat Sue-Ob, Zhi Sun, Maria Martin, Kenneth L McNally, Doreen Ware, Eric W Deutsch, Dario Copetti, Rod A Wing, Pankaj Jaiswal, Sarah Dyer, Andrew R Jones

The rice genome underpins fundamental research and breeding, but the Nipponbare (japonica) reference does not fully encompass the genetic diversity of Asian rice. To address this gap, the Rice Population Reference Panel (RPRP) was developed, comprising high-quality assemblies of 16 rice cultivars to represent the japonica, indica, aus, and aromatic varietal groups. The RPRP has been consistently annotated and supported by extensive experimental data, and here, we report the computational assignment, characterization, and dissemination of stably identified pangenes, collectively called the PanOryza data set. We identify 25,178 core pangenes shared across all cultivars, alongside cultivar-specific and family-enriched genes. Core genes exhibit higher gene expression and proteomic evidence, higher confidence protein domains, and AlphaFold structures, whereas cultivar-specific genes are enriched for domains under selective breeding pressure, such as for disease resistance. We identify more than 5000 genes absent in the IRGSP rice reference genome and present in at least two other Oryza cultivars. We demonstrate the utility of this resource through various examples of pangenes and their protein domains. This resource, integrated into public databases, enables researchers to explore genetic and functional diversity via a population-aware "reference guide" across rice genomes, advancing both basic and applied research.

Genome Research, pp. 226-238. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758395/pdf/
Citations: 0