Pub Date: 2024-10-19. DOI: 10.1186/s13100-024-00331-y
Herui Liao, Yanni Sun, Shujun Ou
Genome annotation is an important but challenging task. Accurate identification of short interspersed nuclear elements (SINEs) is particularly difficult because they lack highly conserved sequences. AnnoSINE is state-of-the-art software for annotating SINEs in plant genomes, but it is computationally inefficient for large genomes. Moreover, its applicability to animals is limited by the absence of animal profile hidden Markov models (pHMMs) in its HMM library. Therefore, we propose AnnoSINE_v2, which extends accurate SINE annotation to animal genomes with greatly improved computational efficiency. Our results show that AnnoSINE_v2 achieves an F1-score over 20% higher than existing tools on animal genomes and enables the processing of complex genomes, such as human and zebrafish, which were beyond the capabilities of AnnoSINE_v1. AnnoSINE_v2 is freely available on Conda and GitHub: https://github.com/liaoherui/AnnoSINE_v2
Title: Accelerating de novo SINE annotation in plant and animal genomes. Mobile DNA 15(1):24 (2024). Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11490119/pdf/
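The AnnoSINE_v2 abstract above compares tools by F1-score. As a reminder of what that metric measures, the following is a minimal sketch that computes F1 from raw counts of true positives, false positives, and false negatives. The function name `f1_score` and all counts are hypothetical and chosen only for illustration; they are not results from the paper, and the abstract does not say whether its "over 20%" gain is absolute or relative.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall, from raw counts."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # fraction of predicted SINEs that are real
    recall = tp / (tp + fn)     # fraction of real SINEs that were predicted
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for two annotation tools on the same genome.
baseline = f1_score(tp=60, fp=40, fn=40)
improved = f1_score(tp=80, fp=20, fn=20)

print(f"baseline F1 = {baseline:.2f}, improved F1 = {improved:.2f}")
print(f"absolute gain = {improved - baseline:.2f}")
print(f"relative gain = {(improved - baseline) / baseline:.1%}")
```

Because F1 is a harmonic mean, a tool scores well only when precision and recall are both high, which is why it is a common summary metric for annotation benchmarks.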
Pub Date: 2024-10-16. DOI: 10.1186/s13100-024-00332-x
Jacopo Martelossi, Mariangela Iannello, Fabrizio Ghiselli, Andrea Luchetti
Background: Short interspersed nuclear elements (SINEs) are non-autonomous non-LTR retrotransposons widespread across eukaryotes. They exist both as lineage-specific, fast-evolving elements and as ubiquitous superfamilies characterized by highly conserved domains (HCD). Several of these superfamilies have been described in bivalves; however, their overall distribution and impact on host genome evolution are still unknown because transposon libraries for the clade are extremely scarce. In this study, we examined more than 40 bivalve genomes to uncover the distribution of HCD-tRNA-related SINEs, discover novel SINE-LINE partnerships, and understand their possible role in shaping bivalve genome evolution.
Results: We found that bivalve HCD SINEs have an ancient origin and can rely on at least four different LINE clades. According to a "mosaic" evolutionary scenario, multiple LINE partners can promote the amplification of the same HCD SINE superfamilies, while homologous LINE-derived tails are present in different superfamilies. Multiple SINEs were found to be highly similar between phylogenetically related species separated by extremely long evolutionary timescales, up to ~400 million years. Studying their genomic distribution in a subset of five species, we observed different patterns of SINE enrichment in various genomic compartments, as well as differences in the tendency of SINEs to form tandem-like and palindromic structures, including within intronic sequences. Despite these differences, we observed that SINEs, especially older ones, tend to accumulate preferentially within genes or in their close proximity, consistent with a model of survival bias for less harmful, short non-coding transposons in euchromatic genomic regions.
Conclusion: Here we conducted a broad characterization of tRNA-related SINEs in bivalves, revealing their taxonomic distribution and LINE partnerships across the clade. Moreover, by studying their genomic distribution in five species, we highlighted commonalities and differences with previously studied eukaryotes, thus extending our understanding of SINE evolution across the tree of life.
Title: Widespread HCD-tRNA derived SINEs in bivalves rely on multiple LINE partners and accumulate in genic regions. Mobile DNA 15(1):22 (2024). Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11481361/pdf/
Pub Date: 2024-10-12. DOI: 10.1186/s13100-024-00329-6
Weronika Mikina, Paweł Hałakuc, Rafał Milanowski
Title: Correction: Transposon-derived introns as an element shaping the structure of eukaryotic genomes. Mobile DNA 15(1):21 (2024). Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11470543/pdf/
Pub Date: 2024-10-09. DOI: 10.1186/s13100-024-00330-z
Md Fakhrul Azad, Tong Tong, Nelson C Lau
Recent studies have suggested that transposable elements (TEs) residing in introns frequently splice into and alter primary gene-coding transcripts. To re-examine the exonization frequency of TEs into protein-coding gene transcripts, we re-analyzed a Drosophila neuron circadian rhythm RNAseq dataset and a deep long RNA fly midbrain RNAseq dataset using our Transposon Insertion and Depletion Analyzer (TIDAL) program. TIDAL predicted several TE insertions from RNAseq data that were consistent with previously published studies. However, we also uncovered many discrepancies in TE-exonization calls, such as reads that mainly support intron retention of the TE and provide little support for chimeric mRNA spliced to the TE. We then applied rigorous genomic DNA-PCR (gDNA-PCR) and RT-PCR procedures to TE-mRNA fusion candidates to see how many of the bioinformatics predictions could be validated. When we tested the w1118 strain, from which the deeper long RNAseq data were derived, and compared it with an OreR strain, only 9 of 23 TIDAL candidates (<40%) could be validated as novel TE insertions by gDNA-PCR, indicating that deeper study is needed when using RNAseq data as input to current TE-insertion prediction programs. Of these validated calls, our RT-PCR results supported only TE-intron retention. Lastly, in the Dscam2 and Bx genes of the w1118 strain that contained intronic TEs, gene expression was 23 times higher than the OreR genes lacking the TEs. This study's validation approach indicates that chimeric TE-mRNAs are infrequent and cautions that bioinformatics programs need more optimization to call TE insertions from RNAseq datasets.
Title: Transposable Element (TE) insertion predictions from RNAseq inputs and TE impact on RNA splicing and gene expression in Drosophila brain transcriptomes. Mobile DNA 15(1):20 (2024). Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11462757/pdf/
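The abstract above reports that 9 of 23 TIDAL candidates were validated by gDNA-PCR. The short sketch below reproduces that proportion and, as an addition that is not part of the study, attaches a 95% Wilson score interval to show how much uncertainty a sample of 23 carries. The helper name `wilson_interval` and the choice of interval are assumptions made for illustration only.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the abstract: 9 validated insertions out of 23 candidates.
validated, candidates = 9, 23
rate = validated / candidates
low, high = wilson_interval(validated, candidates)

print(f"validation rate: {rate:.1%}")
print(f"95% Wilson interval (illustrative): {low:.1%} to {high:.1%}")
```

The point estimate falls just under 40%, matching the abstract; the interval is wide because the number of tested candidates is small.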
Pub Date: 2024-10-09. DOI: 10.1186/s13100-024-00333-w
Erin E Grundy, Lauren C Shaw, Loretta Wang, Abigail V Lee, James Castro Argueta, Daniel J Powell, Mario Ostrowski, R Brad Jones, C Russell Y Cruz, Heather Gordish-Dressman, Nicole P Chappell, Catherine M Bollard, Katherine B Chiappinelli
Transposable elements (TEs) are often expressed at higher levels in tumor cells than in normal cells, implicating these genomic regions as an untapped pool of tumor-associated antigens. In ovarian cancer (OC), protein from the TE ERV-K is frequently expressed by tumor cells. Here we determined whether targeting a previously identified epitope in the envelope gene (env) of ERV-K resulted in target antigen specificity against cancer cells. We found that transducing healthy donor T cells with an ERV-K-Env-specific T cell receptor construct resulted in antigen specificity only when the cells were co-cultured with HLA-A*03:01 B lymphoblastoid cells. Furthermore, in vitro priming of several healthy donors with this epitope of ERV-K-Env did not result in target antigen specificity. These data suggest that the T cell receptor is a poor candidate for targeting this specific ERV-K-Env epitope and has limited potential as a T cell therapy for OC.
Title: A T cell receptor specific for an HLA-A*03:01-restricted epitope in the endogenous retrovirus ERV-K-Env exhibits limited recognition of its cognate epitope. Mobile DNA 15(1):19 (2024). Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11462856/pdf/
Pub Date: 2024-10-01. DOI: 10.1186/s13100-024-00328-7
Mohadeseh S Tahami, Carlos Vargas-Chavez, Noora Poikela, Marta Coronado-Zamora, Josefa González, Maaria Kankare
Background: Substantial discoveries during the past century have revealed that transposable elements (TEs) can play a crucial role in genome evolution by affecting gene expression and inducing genetic rearrangements, among other molecular and structural effects. Yet, our knowledge of the role of TEs in adaptation to extreme climates is still in its infancy. The availability of long-read sequencing has opened up the possibility of identifying and studying potential functional effects of TEs with higher precision. In this work, we used Drosophila montana as a model for cold-adapted organisms to study the association between TEs and adaptation to harsh climates.
Results: Using the PacBio long-read sequencing technique, we de novo identified and manually curated TE sequences in five Drosophila montana genomes from eco-geographically distinct populations. We identified 489 new TE consensus sequences, which represented 92% of the total TE consensus sequences in D. montana. Overall, 11-13% of the D. montana genome is occupied by TEs, which, as expected, are non-randomly distributed across the genome. We identified five potentially active TE families, most of them from the retrotransposon class of TEs. Additionally, we found TEs present in all five analyzed genomes that were located near previously identified cold-tolerance genes. Some of these TEs contain promoter elements and transcription binding sites. Finally, we detected TEs near fixed and polymorphic inversion breakpoints.
Conclusions: Our research revealed a significant number of newly identified TE consensus sequences in the genome of D. montana, suggesting that non-model species should be studied to get a comprehensive view of the TE repertoire in Drosophila species and beyond. Genome annotations with the new D. montana library allowed us to identify TEs that are located near cold-tolerance genes, are present at high population frequencies, and contain regulatory regions, and are thus good candidates to play a role in the D. montana cold stress response. Finally, our annotations also allowed us to identify, for the first time, TEs present in the breakpoints of three D. montana inversions.
Title: Transposable elements in Drosophila montana from harsh cold environments. Mobile DNA 15(1):18 (2024). Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11445987/pdf/
Pub Date: 2024-09-05. DOI: 10.1186/s13100-024-00327-8
Ezequiel G Mogro, Walter O Draghi, Antonio Lagares, Mauricio J Lozano
Rhizobia are alpha- and beta-Proteobacteria that, through the establishment of symbiotic interactions with leguminous plants, are able to fix atmospheric nitrogen as ammonium. The successful establishment of a symbiotic interaction is highly dependent on the availability of nitrogen sources in the soil and on the specific rhizobial strain. Insertion sequences (ISs) are simple transposable genetic elements that can move to different locations within the host genome and are known to play an important evolutionary role, contributing to genome plasticity by acting as recombination hot-spots and by disrupting coding and regulatory sequences. Disruption of coding sequences may have occurred either in a common ancestor of the species or more recently. Using ISCompare, we identified Differentially Located ISs (DLISs) in closely related rhizobial strains of the genera Bradyrhizobium, Mesorhizobium, Rhizobium and Sinorhizobium. Our results revealed that recent IS transposition could have a role in adaptation by enabling the activation and inactivation of genes that could dynamically affect the competition and survival of rhizobia in the rhizosphere.
Title: Identification and functional analysis of recent IS transposition events in rhizobia. Mobile DNA 15(1):17 (2024). Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11375893/pdf/
Pub Date: 2024-08-05. DOI: 10.1186/s13100-024-00326-9
Matthias Heuberger, Dal-Hoe Koo, Hanin Ibrahim Ahmed, Vijay K Tiwari, Michael Abrouk, Jesse Poland, Simon G Krattinger, Thomas Wicker
Background: Centromere function is highly conserved across eukaryotes, but the underlying centromeric DNA sequences vary dramatically between species. Centromeres often contain a high proportion of repetitive DNA, such as tandem repeats and/or transposable elements (TEs). Einkorn wheat centromeres lack tandem repeat arrays and are instead composed mostly of the two long terminal repeat (LTR) retrotransposon families RLG_Cereba and RLG_Quinta which specifically insert in centromeres. However, it is poorly understood how these two TE families relate to each other and if and how they contribute to centromere function and evolution.
Results: Based on conservation of diagnostic motifs (LTRs, integrase, primer binding site and polypurine tract), we propose that RLG_Cereba and RLG_Quinta are a pair of autonomous and non-autonomous partners, in which the autonomous RLG_Cereba contributes all the proteins required for transposition, while the non-autonomous RLG_Quinta contributes GAG protein. Phylogenetic analysis of predicted GAG proteins showed that the RLG_Cereba lineage has been present for at least 100 million years in monocotyledon plants. In contrast, RLG_Quinta evolved from RLG_Cereba between 28 and 35 million years ago in the common ancestor of oat and wheat. Interestingly, the integrase of RLG_Cereba is fused to a so-called CR-domain, which is hypothesized to guide the integrase to the functional centromere. Indeed, ChIP-seq data and TE population analysis show that only the youngest subfamilies of RLG_Cereba and RLG_Quinta are found in the active centromeres. Importantly, the LTRs of RLG_Quinta and RLG_Cereba are strongly associated with the presence of the centromere-specific CENH3 histone variant. We hypothesize that the LTRs of RLG_Cereba and RLG_Quinta contribute to wheat centromere integrity by phasing and/or placing CENH3 nucleosomes, thus favoring their persistence in the competitive centromere niche.
Conclusion: Our data show that RLG_Cereba cross-mobilizes the non-autonomous RLG_Quinta retrotransposons. New copies of both families are specifically integrated into functional centromeres presumably through direct binding of the integrase CR domain to CENH3 histone variants. The LTRs of newly inserted RLG_Cereba and RLG_Quinta elements, in turn, recruit and/or phase new CENH3 deposition. This mutualistic interplay between the two TE families and the plant host dynamically maintains wheat centromeres.
Title: Evolution of Einkorn wheat centromeres is driven by the mutualistic interplay of two LTR retrotransposons. Mobile DNA 15(1):16 (2024). Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11302176/pdf/
Pub Date: 2024-07-27 DOI: 10.1186/s13100-024-00325-w
Weronika Mikina, Paweł Hałakuc, Rafał Milanowski
A widely accepted hypothesis holds that the first spliceosomal introns originated from group II self-splicing introns. However, it is evident that not all spliceosomal introns in the nuclear genes of modern eukaryotes were inherited through vertical transfer of intronic sequences. Several phenomena contribute to the formation of new introns, but their most common origin seems to be the insertion of transposable elements. Recent analyses have highlighted instances of mass gains of new introns from transposable elements. These events often coincide with an increase or change in the spliceosome's tolerance to splicing signals, including the acceptance of noncanonical borders. Widespread acquisitions of transposon-derived introns occur across diverse evolutionary lineages, indicating convergent processes. These events, though independent, likely require a similar set of conditions: the presence of transposable elements with features enabling their removal at the RNA level as introns, and/or the existence of a splicing mechanism capable of excising unusual sequences that the standard splicing machinery would not otherwise recognize as introns. Herein we summarize those mechanisms across different eukaryotic lineages.
{"title":"Transposon-derived introns as an element shaping the structure of eukaryotic genomes","authors":"Weronika Mikina, Paweł Hałakuc, Rafał Milanowski","doi":"10.1186/s13100-024-00325-w","journal":{"name":"Mobile DNA","volume":"62 1","pages":""},"publicationDate":"2024-07-27","publicationTypes":"Journal Article"}
Pub Date: 2024-06-27 DOI: 10.1186/s13100-024-00324-x
Fatemeh Moadab, Sepideh Sohrabi, Xiaoxing Wang, Rayan Najjar, Justina C Wolters, Hua Jiang, Wenyan Miao, Donna Romero, Dennis M Zaller, Megan Tran, Alison Bays, Martin S Taylor, Rosana Kapeller, John LaCava, Tomas Mustelin
Background: Systemic lupus erythematosus (SLE) is a chronic autoimmune disease with an unpredictable course of recurrent exacerbations alternating with more stable disease. SLE is characterized by broad immune activation and autoantibodies against double-stranded DNA and numerous proteins that exist in cells as aggregates with nucleic acids, such as Ro60, MOV10, and the L1 retrotransposon-encoded ORF1p.
Results: Here we report that these three proteins are co-expressed and co-localized in a subset of SLE granulocytes and are concentrated in cytosolic dots that also contain DNA:RNA heteroduplexes and the DNA sensor ZBP1, but not cGAS. The DNA:RNA heteroduplexes vanished from the neutrophils when they were treated with a selective inhibitor of the L1 reverse transcriptase. We also report that ORF1p granules escape neutrophils during the extrusion of neutrophil extracellular traps (NETs) and, to a lesser degree, from neutrophils dying by pyroptosis, but not apoptosis.
Conclusions: These results bring new insights into the composition of ORF1p granules in SLE neutrophils and may explain, in part, why proteins in these granules become targeted by autoantibodies in this disease.
{"title":"Subcellular location of L1 retrotransposon-encoded ORF1p, reverse transcription products, and DNA sensors in lupus granulocytes.","authors":"Fatemeh Moadab, Sepideh Sohrabi, Xiaoxing Wang, Rayan Najjar, Justina C Wolters, Hua Jiang, Wenyan Miao, Donna Romero, Dennis M Zaller, Megan Tran, Alison Bays, Martin S Taylor, Rosana Kapeller, John LaCava, Tomas Mustelin","doi":"10.1186/s13100-024-00324-x","journal":{"name":"Mobile DNA","volume":"15 1","pages":"14"},"publicationDate":"2024-06-27","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11212426/pdf/"}