
Genome Research: latest articles

Full-length RNA transcript sequencing traces brain isoform diversity in house mouse natural populations
IF 7 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-09-17 DOI: 10.1101/gr.279166.124
Wenyu Zhang, Anja Guenther, Yuanxiao Gao, Kristian Ullrich, Bruno Huettel, Aftab Ahmad, Lei Duan, Kaizong Wei, Diethard Tautz
The ability to generate multiple RNA transcript isoforms from the same gene is a general phenomenon in eukaryotes. However, the complexity and diversity of alternative isoforms in natural populations remain largely unexplored. Using a newly developed full-length transcripts enrichment protocol with 5' CAP selection, we sequenced full-length RNA transcripts of 48 individuals from outbred populations and subspecies of Mus musculus, and from the closely related sister species Mus spretus and Mus spicilegus as outgroups. The dataset represents the most extensive full-length high-quality isoform catalog at the population level to date. In total, we reliably identified 117,728 distinct isoforms, of which only 51% were previously annotated. We show that the population-specific distribution pattern of isoforms is phylogenetically informative and reflects the segregating SNP diversity between the populations. We find that ancient housekeeping genes are a major source of the overall isoform diversity, and that the generation of alternative first exons plays a major role in generating new isoforms. Given that our data allow us to distinguish between population-specific isoforms and isoforms that are conserved across multiple populations, it is possible to refine the annotation of the reference mouse genome to a set of about 40,000 isoforms that should be most relevant for comparative functional analysis across species.
Citations: 0
Long-read RNA sequencing of archival tissues reveals novel genes and transcripts associated with clear cell renal cell carcinoma recurrence and immune evasion
IF 7 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-09-16 DOI: 10.1101/gr.278801.123
Joshua Lee, Elizabeth A Snell, Joanne Brown, Charlotte Elizabeth Booth, Rosamonde E Banks, Daniel J Turner, Naveen Vasudev, Dimitris Lagos
The use of long-read direct RNA sequencing (DRS) and PCR cDNA sequencing (PCS) in clinical oncology remains limited, with no direct comparison between the two methods. We used DRS and PCS to study clear cell renal cell carcinoma (ccRCC), focussing on new transcript and gene discovery. Twelve primary ccRCC archival tumors, six from patients who went on to relapse, were analysed. Results were validated in an independent cohort of twenty patients by qRT-PCR and compared to DRS analysis of RCC4 cells. In archival clinical samples and due to long-term storage, average read length was lower (400-500nt) than that achieved through DRS of RCC4 cells (>1100nt). Still, deconvolution analysis showed a loss of immune infiltrate in primary tumors of patients who relapse as reported by others. Differentially expressed genes in patients who went on to relapse were determined with good overlap between DRS and PCS, identifying LINC04216 and the T cell exhaustion marker TOX as novel candidate recurrence-associated genes. Novel transcript analysis revealed over 10,000 candidate novel transcripts detected by both methods and in ccRCC cells in vitro, including a novel CD274 (PD-L1) transcript encoding for the soluble version of the protein with a longer 3' UTR and lower stability than the annotated transcript. Both methods identified 414 novel genes, also detected in RCC4 cells, including a novel noncoding gene over-expressed in patients who relapse. Overall, we showcase use of PCS and DRS in archival tumor samples to uncover unmapped features of cancer transcriptomes, linked to disease progression and immune evasion.
Citations: 0
Measuring X inactivation skew for X-linked diseases with adaptive nanopore sequencing
IF 7 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-09-16 DOI: 10.1101/gr.279396.124
Sena A Gocuk, James Lancaster, Shian Su, Jasleen K Jolly, Thomas L Edwards, Doron G Hickey, Matthew E Ritchie, Marnie E Blewitt, Lauren N Ayton, Quentin Gouil
X-linked genetic disorders typically affect females less severely than males due to the presence of a second X Chromosome not carrying the deleterious variant. However, the phenotypic expression in females is highly variable, which may be explained by an allelic skew in X-Chromosome inactivation. Accurate measurement of X inactivation skew is crucial to understand and predict disease phenotype in carrier females, with prediction especially relevant for degenerative conditions. We propose a novel approach using nanopore sequencing to quantify skewed X inactivation accurately. By phasing sequence variants and methylation patterns, this single assay reveals the disease variant, X inactivation skew, its directionality, and is applicable to all patients and X-linked variants. Enrichment of X Chromosome reads through adaptive sampling enhances cost-efficiency. Our study includes a cohort of 16 X-linked variant carrier females affected by two X-linked inherited retinal diseases: choroideremia and RPGR-associated retinitis pigmentosa. As retinal DNA cannot be readily obtained, we instead determine the skew from peripheral samples (blood, saliva and buccal mucosa), and correlate it to phenotypic outcomes. This revealed a strong correlation between X inactivation skew and disease presentation, confirming the value in performing this assay and its potential as a way to prioritise patients for early intervention, such as gene therapy currently in clinical trials for these conditions. Our method of assessing skewed X inactivation is applicable to all long-read genomic datasets, providing insights into disease risk and severity and aiding in the development of individualised strategies for X-linked variant carrier females.
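As a rough illustration of what "quantifying X inactivation skew from phased variants and methylation" can mean numerically, the following minimal sketch counts methylated (presumed inactive-X) reads per haplotype and reports their ratio. It is not the authors' pipeline: the input format, the function name `xci_skew`, and the rule that a methylated read marks the inactive copy are all assumptions made for this example, and the read counts are invented.

```python
from collections import Counter

def xci_skew(reads):
    """Estimate X inactivation skew from phased, methylation-called reads.

    reads: iterable of (haplotype, methylated) tuples, haplotype in {1, 2}.
    Assumption for this sketch: a methylated read at an informative
    promoter CpG island comes from the inactive X copy.
    Returns (fraction of inactive reads on haplotype 1,
             haplotype that is preferentially inactivated).
    """
    inactive = Counter(hap for hap, methylated in reads if methylated)
    total = inactive[1] + inactive[2]
    if total == 0:
        raise ValueError("no methylated phased reads to estimate skew from")
    frac_h1_inactive = inactive[1] / total
    preferentially_inactive = 1 if frac_h1_inactive >= 0.5 else 2
    return frac_h1_inactive, preferentially_inactive

# Hypothetical read calls: haplotype 1 is methylated in 8 of 10 inactive reads.
reads = ([(1, True)] * 8 + [(2, True)] * 2 +
         [(1, False)] * 2 + [(2, False)] * 8)
frac, hap = xci_skew(reads)
print(f"skew {frac:.0%}:{1 - frac:.0%}, haplotype {hap} preferentially inactive")
```

Knowing which haplotype carries the disease variant then gives the directionality: in this toy case, if the variant sits on haplotype 1, the variant-carrying copy is the one that is mostly silenced.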
Citations: 0
Long-read transcriptome sequencing of CLL and MDS patients uncovers molecular effects of SF3B1 mutations
IF 7 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-09-13 DOI: 10.1101/gr.279327.124
Alicja Pacholewska, Matthias Lienhard, Mirko Brueggemann, Heike Haenel, Lorina Bilalli, Anja Koenigs, Felix Hess, Kerstin Becker, Karl Koehrer, Jesko Fabian Kaiser, Holger Gohlke, Norbert Gattermann, Michael Hallek, Carmen Diana Herling, Julian Koenig, Christina Grimm, Ralf Herwig, Kathi Zarnack, Michal R. Schweiger
Mutations in splicing factor 3B subunit 1 (SF3B1) frequently occur in patients with chronic lymphocytic leukemia (CLL) and myelodysplastic syndromes (MDS). These mutations have different effects on the disease prognosis with beneficial effect in MDS and worse prognosis in CLL patients. A full-length transcriptome approach can expand our knowledge on SF3B1 mutation effects on RNA splicing and its contribution to patient survival and treatment options. We applied long-read transcriptome sequencing (LRTS) to 44 MDS and CLL patients, as well as two pairs of isogenic cell lines with and without SF3B1 mutations, and found >60% of novel isoforms. Splicing alterations were largely shared between cancer types and specifically affected the usage of introns and 3’ splice sites. Our data highlighted a constrained window at canonical 3’ splice sites in which dynamic splice site switches occurred in SF3B1-mutated patients. Using transcriptome-wide RNA binding maps and molecular dynamics simulations, we showed multimodal SF3B1 binding at 3’ splice sites and predicted reduced RNA binding at the second binding pocket of SF3B1 K700E. Our work presents the hitherto most complete LRTS study of the SF3B1 mutation in CLL and MDS and provides a resource to study aberrant splicing in cancer. Moreover, we showed that different disease prognosis most likely results from the different cell types expanded during carcinogenesis rather than different mechanisms of action of the mutated SF3B1. These results have important implications for understanding the role of SF3B1 mutations in hematological malignancies and other related diseases.
Citations: 0
Enhanced detection of RNA modifications and read mapping with high-accuracy nanopore RNA basecalling models
IF 7 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-09-13 DOI: 10.1101/gr.278849.123
Gregor Diensthuber, Leszek P Pryszcz, Laia Llovera, Morghan C Lucas, Anna Delgado-Tejedor, Sonia Cruciani, Jean-Yves Roignant, Oguzhan Begik, Eva Maria Novoa
In recent years, nanopore direct RNA sequencing (DRS) became a valuable tool for studying the epitranscriptome, due to its ability to detect multiple modifications within the same full-length native RNA molecules. While RNA modifications can be identified in the form of systematic basecalling 'errors' in DRS datasets, N6-methyladenosine (m6A) modifications produce relatively low 'errors' compared to other RNA modifications, limiting the applicability of this approach to m6A sites that are modified at high stoichiometries. Here, we demonstrate that the use of alternative RNA basecalling models, trained with fully unmodified sequences, increases the 'error' signal of m6A, leading to enhanced detection and improved sensitivity even at low stoichiometries. Moreover, we find that high-accuracy alternative RNA basecalling models can show up to 97% median basecalling accuracy, outperforming currently available RNA basecalling models, which show 91% median basecalling accuracy. Notably, the use of high-accuracy basecalling models is accompanied by a significant increase in the number of mapped reads (especially in shorter RNA fractions) and increased basecalling error signatures at pseudouridine (Ψ) and N1-methylpseudouridine (m1Ψ) modified sites. Overall, our work demonstrates that alternative RNA basecalling models can be used to improve the detection of RNA modifications, read mappability, and basecalling accuracy in nanopore DRS datasets.
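The basecalling 'error' signature mentioned in the abstract can be pictured as a per-position mismatch frequency. The sketch below is a simplified illustration, not the authors' method: it assumes the base calls covering one reference position have already been extracted from an alignment, it counts only substitutions (real analyses typically also consider deletions and insertions), and the example calls for the two basecalling models are invented.

```python
def mismatch_frequency(ref_base, called_bases):
    """Fraction of reads whose called base differs from the reference base.

    ref_base: reference nucleotide at the site of interest.
    called_bases: bases called by the reads covering that site.
    """
    if not called_bases:
        raise ValueError("no reads cover this position")
    mismatches = sum(1 for base in called_bases if base != ref_base)
    return mismatches / len(called_bases)

# Hypothetical calls at a candidate m6A site (reference base A).
default_model_calls = list("AAAAAAAAAG")      # 1 mismatch in 10 reads
alternative_model_calls = list("AAAAAAGGTT")  # 4 mismatches in 10 reads

default_freq = mismatch_frequency("A", default_model_calls)
alternative_freq = mismatch_frequency("A", alternative_model_calls)
print(f"default model: {default_freq:.2f}, alternative model: {alternative_freq:.2f}")
```

A larger mismatch frequency at a modified site, relative to an unmodified control, is what makes the site easier to call as modified; the comparison between the two toy models mirrors the increase in 'error' signal described above.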
Citations: 0
Long-read DNA and cDNA sequencing identify cancer-predisposing deep intronic variation in tumor-suppressor genes
IF 7 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-09-13 DOI: 10.1101/gr.279158.124
Suleyman Gulsuner, Amal AbuRayyan, Jessica B. Mandell, Ming K. Lee, Greta V. Bernier, Barbara M. Norquist, Sarah B. Pierce, Mary-Claire King, Tom Walsh
The vast majority of deeply intronic genomic variants are benign, but some extremely rare or private deep intronic variants lead to exonification of intronic sequence with abnormal transcriptional consequences. Damaging variants of this class are likely underreported as causes of disease for several reasons: Most clinical DNA and RNA testing does not include full intronic sequences; many of these variants lie in complex repetitive regions that cannot be aligned from short-read whole-genome sequence; and, until recently, consequences of deep intronic variants were not accurately predicted by in silico tools. We evaluated the frequency and consequences of rare deep intronic variants for families severely affected with breast, ovarian, pancreatic, and/or metastatic prostate cancer, but with no causal variant identified by any previous genomic or cDNA-based approach. For 10 tumor-suppressor genes, we used multiplexed adaptive sampling long-read DNA sequencing and cDNA sequencing, based on patient-derived DNA and RNA, to systematically evaluate deep intronic variation. We identified all variants across the full genomic loci of targeted genes, applied the in silico tools SpliceAI and Pangolin to predict variants of functional consequence, and then carried out long-read cDNA sequencing to identify aberrant transcripts. For eight of the 120 (6%) previously unsolved families, rare deep intronic variants in BRCA1, PALB2, and ATM create intronic pseudoexons that are spliced into transcripts, leading to premature truncations. These results suggest that long-read DNA and cDNA sequencing can be integrated into variant discovery, with strategies for accurately characterizing pathogenic variants.
Citations: 0
Comprehensive identification of genomic and environmental determinants of phenotypic plasticity in maize
IF 7 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-09-13 DOI: 10.1101/gr.279027.124
Laura E. Tibbs-Cortes, Tingting Guo, Carson M. Andorf, Xianran Li, Jianming Yu
Maize phenotypes are plastic, determined by the complex interplay of genetics and environmental variables. Uncovering the genes responsible and understanding how their effects change across a large geographic region are challenging. In this study, we conducted systematic analysis to identify environmental indices that strongly influence 19 traits (including flowering time, plant architecture, and yield component traits) measured in the maize nested association mapping (NAM) population grown in 11 environments. Identified environmental indices based on day length, temperature, moisture, and combinations of these are biologically meaningful. Next, we leveraged a total of more than 20 million SNP and SV markers derived from recent de novo sequencing of the NAM founders for trait prediction and dissection. When combined with identified environmental indices, genomic prediction enables accurate performance predictions. Genome-wide association studies (GWASs) detected genetic loci associated with the plastic response to the identified environmental indices for all examined traits. By systematically uncovering the major environmental and genomic factors underlying phenotypic plasticity in a wide variety of traits and depositing our results as a track on the MaizeGDB genome browser, we provide a community resource as well as a comprehensive analytical framework to facilitate continuing complex trait dissection and prediction in maize and other crops. Our findings also provide a conceptual framework for the genetic architecture of phenotypic plasticity by accommodating two alternative models, regulatory gene model and allelic sensitivity model, as special cases of a continuum.
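One common way to express a plastic response to an environmental index is a reaction norm: a line fitted through one genotype's trait values across environments, where the slope measures sensitivity to the index. The sketch below fits that line by ordinary least squares. It is only an illustration under stated assumptions, not the authors' analysis: the function name `reaction_norm`, the index values, and the trait values are all hypothetical.

```python
def reaction_norm(env_index, phenotype):
    """Fit phenotype = intercept + slope * env_index by least squares.

    env_index: one environmental index value per environment.
    phenotype: the same genotype's trait value in each environment.
    Returns (intercept, slope); the slope is read here as plasticity.
    """
    if len(env_index) != len(phenotype) or len(env_index) < 2:
        raise ValueError("need matching inputs from at least two environments")
    n = len(env_index)
    mean_x = sum(env_index) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in env_index)
    if sxx == 0:
        raise ValueError("environmental index does not vary across environments")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(env_index, phenotype))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: a day-length-based index in four environments and
# one line's flowering time (days) in each of them.
env_index = [10.0, 12.0, 14.0, 16.0]
flowering_time = [80.0, 84.0, 88.0, 92.0]
intercept, slope = reaction_norm(env_index, flowering_time)
print(f"intercept {intercept:.1f}, slope {slope:.1f} days per index unit")
```

Fitting one intercept and slope per genotype turns plasticity into two ordinary traits, which can then be passed to association or prediction models in the usual way.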
Citations: 0
A novel approach for in vivo DNA footprinting using short double-stranded cell-free DNA from plasma
IF 7 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-09-13 DOI: 10.1101/gr.279326.124
Jan Müller, Christina Hartwig, Mirko Sonntag, Lisa Bitzer, Christopher Adelmann, Yevhen Vainshtein, Karolina Glanz, Sebastian O. Decker, Thorsten Brenner, Georg F. Weber, Arndt von Haeseler, Kai Sohn
The diagnostic potential of short double-stranded cell-free DNA (cfDNA) in blood plasma has not yet been recognized. Here, we present a method for enrichment of double-stranded cfDNA with an average length of ∼40 bp from cfDNA for high-throughput DNA sequencing. This class of cfDNA is enriched at gene promoters and binding sites of transcription factors or structural DNA-binding proteins, so that a genome-wide DNA footprint is directly captured from liquid biopsies. In short double-stranded cfDNA from healthy individuals, we find significant enrichment of 203 transcription factor motifs. Additionally, short double-stranded cfDNA signals at specific genomic regions correlate negatively with DNA methylation and positively with H3K4me3 histone modifications and gene transcription. When comparing short double-stranded cfDNA from patient samples of pancreatic ductal adenocarcinoma with colorectal carcinoma, or of septic patients with postoperative controls, we identify 136 and 241 differentially enriched loci, respectively. Using these differentially enriched loci, the disease types can be clearly distinguished by principal component analysis, demonstrating the diagnostic potential of short double-stranded cfDNA signals as a new class of biomarkers for liquid biopsies.
Citations: 0
Evidence for compensatory evolution within pleiotropic regulatory elements
IF 7 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-09-10 DOI: 10.1101/gr.279001.124
Zane Kliesmete, Peter Orchard, Victor Yan Kin Lee, Johanna Geuder, Simon M. Krauß, Mari Ohnuki, Jessica Jocher, Beate Vieth, Wolfgang Enard, Ines Hellmann
Pleiotropy, measured as expression breadth across tissues, is one of the best predictors for protein sequence and expression conservation. In this study, we investigated its effect on the evolution of cis-regulatory elements (CREs). To this end, we carefully reanalyzed the Epigenomics Roadmap data for nine fetal tissues, assigning a measure of pleiotropic degree to nearly half a million CREs. To assess the functional conservation of CREs, we generated ATAC-seq and RNA-seq data from humans and macaques. We found that more pleiotropic CREs exhibit greater conservation in accessibility, and the mRNA expression levels of the associated genes are more conserved. This trend of higher conservation for higher degrees of pleiotropy persists when analyzing the transcription factor binding repertoire. In contrast, simple DNA sequence conservation of orthologous sites between species tends to be even lower for pleiotropic CREs than for species-specific CREs. Combining various lines of evidence, we propose that the lack of sequence conservation in functionally conserved pleiotropic CREs is due to within-element compensatory evolution. In summary, our findings suggest that pleiotropy is also a good predictor for the functional conservation of CREs, even though this is not reflected in the sequence conservation of pleiotropic CREs.
Citations: 0
Differences in activity and stability drive transposable element variation in tropical and temperate maize
IF 7 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-09-09 DOI: 10.1101/gr.278131.123
Shujun Ou, Armin Scheben, Tyler Collins, Yinjie Qiu, Arun S. Seetharam, Claire C. Menard, Nancy Manchanda, Jonathan I. Gent, Michael C. Schatz, Sarah N. Anderson, Matthew B. Hufford, Candice N. Hirsch
Much of the profound interspecific variation in genome content has been attributed to transposable elements (TEs). To explore the extent of TE variation within species, we developed an optimized open-source algorithm, panEDTA, to de novo annotate TEs in a pangenome context. We then generated a unified TE annotation for a maize pangenome derived from 26 reference-quality genomes, which reveals an excess of 35.1 Mb of TE sequences per genome in tropical maize relative to temperate maize. A small number (n = 216) of TE families, mainly LTR retrotransposons, drive these differences. Evidence from the methylome, transcriptome, LTR age distribution, and LTR insertional polymorphisms reveals that 64.7% of the variability is contributed by LTR families that are young, less methylated, and more expressed in tropical maize, whereas 18.5% is driven by LTR families with removal or loss in temperate maize. Additionally, we find enrichment for young LTR families adjacent to nucleotide-binding and leucine-rich repeat (NLR) clusters of varying copy number across lines, suggesting TE activity may be associated with disease resistance in maize.
Citations: 0