Pub Date: 2025-02-26 DOI: 10.1186/s13100-025-00345-0
Filip Wierzbicki, Riccardo Pianezza, Divya Selvaraju, Madeleine Maria Eller, Robert Kofler
The horizontal transfer (HT) of the P-element is one of the best-documented cases of the HT of a transposable element. The P-element invaded natural D. melanogaster populations between 1950 and 1980 following its HT from Drosophila willistoni, a species endemic to South and Central America. Subsequently, it spread in D. simulans populations between 2006 and 2014, following an HT from D. melanogaster. The geographic region where the spread into D. simulans occurred is unclear, as both species involved are cosmopolitan. The P-element differs between these two species by a single base substitution at site 2040, where D. melanogaster carries a 'G' and D. simulans carries an 'A'. It has been hypothesized that this base substitution was a necessary adaptation that enabled the spread of the P-element in D. simulans, potentially explaining the 30-50-year lag between the invasions of D. melanogaster and D. simulans. To test this hypothesis, we monitored the invasion dynamics of P-elements with both alleles in experimental populations of D. melanogaster and D. simulans. Our results indicate that the allele at site 2040 has a minimal impact on the invasion dynamics of the P-element and, therefore, was not necessary for the invasion of D. simulans. However, we found that the host species significantly influenced the invasion dynamics, with higher P-element copy numbers accumulating in D. melanogaster than in D. simulans. Finally, based on SNPs segregating in natural D. melanogaster populations, we suggest that the horizontal transfer of the P-element from D. melanogaster to D. simulans likely occurred around Tasmania.
Title: On the origin of the P-element invasion in Drosophila simulans. Journal: Mobile DNA 16(1):7. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11863927/pdf/
Pub Date: 2025-02-22 DOI: 10.1186/s13100-025-00341-4
Alex Nesta, Diogo F T Veiga, Jacques Banchereau, Olga Anczukow, Christine R Beck
Transposable elements (TEs) drive genome evolution and can affect gene expression through diverse mechanisms. In breast cancer, disrupted regulation of TE sequences may facilitate tumor-specific transcriptomic alterations. We examine 142,514 full-length isoforms derived from long-read RNA sequencing (LR-seq) of 30 breast samples to investigate the effects of TEs on the breast cancer transcriptome. Approximately half of these isoforms contain TE sequences, and these contribute to half of the novel annotated splice junctions. We quantify splicing of these LR-seq-derived isoforms in 1,135 breast tumors from The Cancer Genome Atlas (TCGA) and 1,329 healthy tissue samples from the Genotype-Tissue Expression (GTEx) project, and find 300 TE-overlapping tumor-specific splicing events. Some splicing events are enriched in specific breast cancer subtypes (for example, a TE-driven transcription start site upstream of ERBB2 in HER2+ tumors), and several TE-mediated splicing events are associated with patient survival and poor prognosis. The full-length sequences we capture with LR-seq reveal thousands of isoforms with signatures of RNA editing, including a novel isoform belonging to RHOA, a gene previously implicated in tumor progression. We utilize our full-length isoforms to discover polymorphic TE insertions that alter splicing and validate one of these events in breast cancer cell lines. Together, our results demonstrate the widespread effects of dysregulated TEs on breast cancer transcriptomes and highlight the advantages of long-read isoform sequencing for understanding TE biology. TE-derived isoforms may alter the expression of genes important in cancer and can potentially be used as novel, disease-specific therapeutic targets or biomarkers. One-sentence summary: Transposable elements generate alternative isoforms and alter post-transcriptional regulation in human breast cancer.
Title: Alternative splicing of transposable elements in human breast cancer. Journal: Mobile DNA 16(1):6. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11846448/pdf/
Pub Date: 2025-02-18 DOI: 10.1186/s13100-025-00344-1
Laura Chacon Machado, Joseph E Peters
Tn7 family transposons are mobile genetic elements known for precise target site selection, with some co-opting CRISPR-Cas systems for RNA-guided transposition. We identified a novel group of Tn7-like transposons in Cyanobacteria that preferentially target CRISPR arrays, suggesting a new functional interaction between these elements and CRISPR-Cas systems. Using bioinformatics tools, we characterized their phylogeny, target specificity, and sub-specialization. The array-targeting elements are phylogenetically close to tRNA-targeting elements. Compared to closely related elements, the distinct target preference coincides with loss of a C-terminal region of the TnsD protein, the protein responsible for recognizing target sites. Notably, elements are found integrated into a fixed position within CRISPR spacer regions, a behavior that might minimize negative impacts on the host defense system. These transposons were identified in both plasmid and genomic CRISPR arrays, indicating that their preferred target provides both a means of safe insertion in the host chromosome and a mechanism for dissemination. Attempts to reconstitute these elements in E. coli were unsuccessful, indicating possible dependence on native host factors. Our findings expand the diversity of interactions between Tn7-like transposons and CRISPR systems.
Title: A family of Tn7-like transposons evolved to target CRISPR repeats. Journal: Mobile DNA 16(1):5. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11837452/pdf/
Pub Date: 2025-02-17 DOI: 10.1186/s13100-024-00337-6
Marie Verneret, Caroline Leroux, Thomas Faraut, Vincent Navratil, Emmanuelle Lerat, Jocelyn Turpin
Background: Endogenous retroviruses (ERV) are traces of ancestral retroviral germline infections that constitute a significant portion of mammalian genomes and are classified as LTR-retrotransposons. The exploration of their dynamics and evolutionary history in ruminants remains limited, highlighting the need for a comprehensive and thorough investigation of the ERV landscape in the genomes of cattle, sheep and goat.
Results: Through a de novo bioinformatic analysis, we characterized 24 Class I and II ERV families across four reference assemblies of domestic and wild sheep and goats, and one assembly of cattle. Among these families, 13 are represented by consensus sequences identified in the five analyzed species, while eight are exclusive to small ruminants and three to cattle. The similarity-based approach used to search for the presence of these families in other ruminant species revealed multiple endogenization events over the last 40 million years and distinct evolutionary dynamics among species. The ERV annotation resulted in a high-resolution dataset of 100,534 ERV insertions across the five genomes, representing between 0.5% and 1% of their genomes. Solo-LTRs account for 83.2% of the annotated insertions, demonstrating that most of the ERVs are relics of past events. Two Class II families showed higher abundance and copy conservation in small ruminants. One of them is closely related to circulating exogenous retroviruses and is represented by 22 copies sharing identical LTRs and 12 with complete coding capacities in the domestic goat.
Conclusions: Our results suggest the presence of two ERV families with recent transpositional activity in ruminant genomes, particularly in the domestic goat, illustrating distinct evolutionary dynamics among the analyzed species. This work highlights the ongoing influence of ERVs on genomic landscapes and calls for further investigation of their evolutionary trajectories in these genomes.
Title: A genome-wide study of ruminants uncovers two endogenous retrovirus families recently active in goats. Journal: Mobile DNA 16(1):4. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11831830/pdf/
Pub Date: 2025-02-06 DOI: 10.1186/s13100-024-00339-4
Mathias I Nielsen, Justina C Wolters, Omar G Rosas Bringas, Hua Jiang, Luciano H Di Stefano, Mehrnoosh Oghbaie, Samira Hozeifi, Mats J Nitert, Alienke van Pijkeren, Marieke Smit, Lars Ter Morsche, Apostolos Mourtzinos, Vikram Deshpande, Martin S Taylor, Brian T Chait, John LaCava
Background: Both the expression and activities of LINE-1 (L1) retrotransposons are known to occur in numerous cell types and are implicated in pathobiological contexts such as aging-related inflammation, autoimmunity, and cancers. L1s encode two proteins that are translated from bicistronic transcripts. The translation product of ORF1 (ORF1p) has been robustly detected by immunoassays and shotgun mass spectrometry (MS). Yet, more sensitive detection methods would enhance the use of ORF1p as a clinical biomarker. In contrast, until now, no direct evidence of endogenous L1 ORF2 translation to protein (ORF2p) has been shown. Instead, assays for ORF2p have been limited to ectopic L1 ORF over-expression contexts and to indirect detection of endogenous ORF2p enzymatic activity, such as by the sequencing of de novo genomic insertions. Immunoassays for endogenous ORF2p have been problematic, producing apparent false positives due to cross-reactivities, and shotgun MS has not yielded reliable evidence of ORF2p peptides in biological samples.
Results: Here we present targeted mass spectrometry assays, selected and parallel reaction monitoring (SRM and PRM, respectively), to detect and quantify L1 ORF1p and ORF2p at their endogenous abundances. We were able to quantify ORF1p and ORF2p present in our samples down to the low-attomole range. Confident in our ability to affinity-enrich ORF2p, we describe an interactome associated with endogenous ORF2-containing macromolecular assemblies.
Conclusions: This is the first assay to demonstrate sensitive and robust quantitation of endogenous ORF2p. The ability to assay ORF2p directly and quantitatively will improve our understanding of the developmental and diseased cell states where L1 expression and its activity naturally occur. The ability to simultaneously assay endogenous L1 ORF1p and ORF2p is an important step forward for L1 analytical biochemistry. Endogenous ORF2p interactomes can now be presented with confidence that ORF2p is among the enriched proteins.
Title: Targeted detection of endogenous LINE-1 proteins and ORF2p interactions. Journal: Mobile DNA 16(1):3. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11800616/pdf/
Pub Date: 2025-01-28 DOI: 10.1186/s13100-025-00342-3
Alice M Godden, Benjamin Rix, Simone Immler
Background: Piwi-interacting RNAs (piRNAs) are non-coding small RNAs that post-transcriptionally affect gene expression and regulation. Through complementary seed region binding with transposable elements (TEs), piRNAs protect the genome from transposition. A tool to link piRNAs with complementary TE targets will improve our understanding of the role of piRNAs in genome maintenance and gene regulation. Existing tools such as TEsmall can process sRNA-seq datasets to produce differentially expressed piRNAs, and piRScan, developed for nematodes, can link piRNAs and TEs, but it requires knowledge about the target region of interest and works backwards.
Results: We developed FishPi to predict the pairings between piRNAs and TEs for available genomes from zebrafish, medaka and tilapia, with full user customisation of parameters including orientation of piRNA, mismatches in the piRNA seed binding to TE and scored output lists of piRNA-TE matches. FishPi works with individual piRNAs or a list of piRNA sequences in FASTA format. The software focuses on the piRNA-TE seed region and analyses reference TEs for piRNA complementarity. TE type is examined, counted and stored in a dictionary, with genomic loci recorded. Any updates to piRNA-TE binding rules can easily be incorporated by changing the seed-region options in the graphical user interface. FishPi provides a graphical interface using tkinter for the user to input piRNA sequences and generate comprehensive reports on piRNA-TE interactions. FishPi can easily be adapted to genomes from other species and taxa, opening the interpretation of piRNA functionality to a wide community.
Conclusions: Users will gain insight into genome mobility and FishPi will help further our understanding of the biological role of piRNAs and their interaction with TEs in a similar way that public databases have improved the access to and the understanding of the role of small RNAs.
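The seed-based pairing described in the Results can be illustrated with a minimal sketch. This is not FishPi's code: the function names, the default seed window (piRNA nucleotides 2-8, a common convention assumed here) and the plain mismatch count are hypothetical choices for illustration. The abstract states only that orientation, seed-region options and mismatches are user-configurable and that hits per TE are stored in a dictionary.

```python
# Illustrative sketch only; names and defaults are assumptions, not FishPi's API.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def to_dna(seq):
    """Upper-case a sequence and write RNA 'U' as DNA 'T'."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq):
    """Reverse complement of a DNA/RNA string (A, C, G, T/U, N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(to_dna(seq)))


def seed_matches(pirna, te_seq, seed_start=1, seed_end=8, max_mismatches=0):
    """Find sites in te_seq complementary to the piRNA seed.

    seed_start/seed_end are 0-based, end-exclusive slice bounds
    (the default 1:8 selects nucleotides 2-8; an assumed convention).
    Returns a list of (position, mismatches) tuples.
    """
    seed = to_dna(pirna)[seed_start:seed_end]
    target = reverse_complement(seed)  # sequence the seed would pair with
    te = to_dna(te_seq)
    hits = []
    for i in range(len(te) - len(target) + 1):
        window = te[i:i + len(target)]
        mismatches = sum(a != b for a, b in zip(window, target))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


def count_hits_by_te(pirna, te_library, **kwargs):
    """Count seed matches per TE; te_library maps TE name -> sequence."""
    counts = {}
    for name, seq in te_library.items():
        n_hits = len(seed_matches(pirna, seq, **kwargs))
        if n_hits:
            counts[name] = n_hits
    return counts
```

With the toy piRNA `TGAGGTAGTAGGTTGTATAGTT`, the seed `GAGGTAG` pairs with `CTACCTC`, so `seed_matches` reports a perfect hit at position 3 of the toy TE `AAACTACCTCAAA`. Raising `max_mismatches` relaxes the pairing rule without changing the scan.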
Title: FishPi: a bioinformatic prediction tool to link piRNA and transposable elements. Journal: Mobile DNA 16(1):2. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11773700/pdf/
Pub Date: 2025-01-04 DOI: 10.1186/s13100-024-00338-5
Jessica D Choi, Lelani A Del Pinto, Nathan B Sutter
Background: Messenger RNA 3' untranslated regions (3'UTRs) control many aspects of gene expression and determine where the transcript will terminate. The polyadenylation signal (PAS) AAUAAA (AATAAA in DNA) is a key regulator of transcript termination and this hexamer, or a similar sequence, is very frequently found within 30 bp of 3'UTR ends. Short interspersed element (SINE) retrotransposons are found throughout genomes in high copy numbers. When inserted into genes they can disrupt expression, alter splicing, or cause nuclear retention of mRNAs. The genomes of the domestic dog and other carnivores carry hundreds of thousands of Can-SINEs, a tRNA-related SINE with transcription termination potential. Because of this we asked whether Can-SINEs may terminate transcripts in some dog genes.
Results: Each of the dog's nine Can-SINE consensus sequences carries an average of three AATAAA PASs on its sense strand but zero on its antisense strand. Consistent with the idea that Can-SINEs can terminate transcripts, we find that sense-oriented Can-SINEs are approximately ten times more frequent at 3' ends of 3'UTRs compared to further upstream within 3'UTRs. Furthermore, the count of AATAAA PASs on head-to-tail SINE sequences differs significantly between sense and antisense-oriented retrotransposons in transcripts. Can-SINEs near 3'UTR ends are likely to carry an AATAAA motif on the mRNA sense strand while those further upstream are not. We identified loci where Can-SINE insertion has truncated or altered a 3'UTR of the dog genome (dog 3'UTR) compared to the human ortholog. Dog 3'UTRs have peaks of AATAAA PAS frequency at 28, 32, and 36 bp from the end. The periodicity is partly explained by TAAA(n) repeats within Can-SINE AT-rich tails. We annotated all repeat-masked Can-SINE copies in the Boxer reference genome and found that the young SINEC_Cf type has a mode of 15 bp length for target site duplications (TSDs). All dog Can-SINE types favor integration at TSDs beginning with A(4).
Conclusion: Dog Can-SINE retrotransposition has imported AATAAA PASs into gene transcripts and led to alteration of 3'UTRs. AATAAA sequences are selectively removed from Can-SINEs in introns and upstream 3'UTR regions but are retained at the far downstream end of 3'UTRs, which we infer reflects their role as termination sequences for these transcripts.
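The strand asymmetry and the 4 bp periodicity reported in the Results can be reproduced on toy sequences with simple motif counting. The sketch below is an illustration, not the authors' pipeline: the function names and example sequences are hypothetical, and measuring distance from the first base of the motif to the 3' end is an assumption, since the abstract does not say how "bp from the end" is defined.

```python
# Illustrative sketch only; not the authors' analysis code.
PAS = "AATAAA"  # polyadenylation signal as written in DNA
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def pas_positions(seq, motif=PAS):
    """0-based start positions of every motif occurrence (overlaps allowed)."""
    seq = seq.upper()
    width = len(motif)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == motif]


def strand_pas_counts(seq):
    """Count the PAS motif on the given strand and on its reverse complement."""
    return {
        "sense": len(pas_positions(seq)),
        "antisense": len(pas_positions(reverse_complement(seq))),
    }


def pas_distances_from_end(utr):
    """Distance in bp from each motif start to the 3' end (assumed definition)."""
    return [len(utr) - i for i in pas_positions(utr)]
```

On the toy tail `TAAATAAATAAA`, `pas_positions` finds AATAAA at positions 2 and 6, so consecutive TAAA units place the hexamer 4 bp apart. This is consistent with the spacing of the 28, 32 and 36 bp peaks that the abstract attributes partly to TAAA(n) repeats.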
Title: SINE retrotransposons import polyadenylation signals to 3'UTRs in dog (Canis familiaris). Journal: Mobile DNA 16(1):1. Published: 2025-01-04. DOI: https://doi.org/10.1186/s13100-024-00338-5
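The strand asymmetry and the positional periodicity reported for this entry can be illustrated with a small sketch. It is not the authors' pipeline: the function names, the toy sequences, and the convention of measuring distance from the first base of the hexamer to the last base of the 3'UTR are all assumptions made for illustration.

```python
# Hypothetical helpers (not from the paper) for counting AATAAA hexamers
# on both strands and locating them relative to a 3'UTR end.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def motif_starts(seq: str, motif: str = "AATAAA") -> list[int]:
    """Return 0-based start positions of all (overlapping) motif matches."""
    seq = seq.upper()
    k = len(motif)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] == motif]


def strand_counts(seq: str, motif: str = "AATAAA") -> dict[str, int]:
    """Count the motif on the given (sense) strand and on its reverse complement."""
    return {
        "sense": len(motif_starts(seq, motif)),
        "antisense": len(motif_starts(reverse_complement(seq), motif)),
    }


def distances_from_end(utr: str, motif: str = "AATAAA") -> list[int]:
    """Distance in bp from each motif start to the 3' end (assumed convention)."""
    return [len(utr) - i for i in motif_starts(utr, motif)]


# Toy AT-rich tail: three TAAA units flanked by arbitrary bases.
toy_tail = "GGCC" + "TAAA" * 3 + "GGCC"
counts = strand_counts(toy_tail)
starts = motif_starts(toy_tail)

# Toy 3'UTR with a single hexamer placed 28 bp before the end.
toy_utr = "C" * 10 + "AATAAA" + "C" * 22
dists = distances_from_end(toy_utr)
```

In the toy tail, consecutive hexamer starts are 4 bp apart, which is the kind of spacing a TAAA(n) repeat would produce, and the reverse complement contains no AATAAA at all.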
Pub Date: 2024-12-31 DOI: 10.1186/s13100-024-00340-x
Irina R Arkhipova, Kathleen H Burns, Pascale Lesage
Title: Controlling and controlled elements: highlights of the year in mobile DNA research. Journal: Mobile DNA 15(1):27. Published: 2024-12-31. DOI: 10.1186/s13100-024-00340-x. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11689530/pdf/
Pub Date: 2024-11-15 DOI: 10.1186/s13100-024-00336-7
Alexander Belyayev, Begoña Quirós de la Peña, Simon Villanueva Corrales, Shook Ling Low, Barbora Frejová, Zuzana Sejfová, Jiřina Josefiová, Eliška Záveská, Yann J K Bertrand, Jindřich Chrtek, Patrik Mráz
Background: The centromere is one of the key regions of the eukaryotic chromosome. While maintaining its function, centromeric DNA may differ among closely related species. Here, we explored the composition and structure of the pericentromeres (a chromosomal region including a functional centromere) of Hieracium alpinum (Asteraceae), a member of one of the most diverse genera in the plant kingdom. Previously, we identified a pericentromere-specific tandem repeat that made it possible to distinguish reads within the Oxford Nanopore library attributed to the pericentromeres, separating them into a discrete subset and allowing comparison of the repeatome composition of this subset with the remaining genome.
Results: We found that the main satellite DNA (satDNA) monomer forms long arrays of linear and block types in the pericentromeric heterochromatin of H. alpinum, and single reads very often contain forward and reverse arrays that mirror each other. Besides the major satDNA family, two new minor satDNA families were discovered. In addition to satDNAs, large numbers of LTR retrotransposons (TEs), predominantly of the Tekay lineage, were detected in the pericentromeres. We were able to reconstruct four main TEs of the Ty3-gypsy and Ty1-copia superfamilies and compare their relative positions with those of the satDNAs. This comparison showed that the conserved domains (CDs) of the TE proteins are located between the newly discovered satDNAs, which appear to be parts of ancient Tekay LTRs that we were able to reconstruct. The dominant satDNA monomer shows some similarity to the GAG CD of the Angela retrotransposon.
Conclusions: The species-specific pericentromeric arrays of the H. alpinum genome are heterogeneous, exhibiting both linear and block-type structures. The abundance of forward and reverse arrays of the main satDNA monomer points to multiple microinversions, which could be the main mechanism of rapid structural evolution, stochastically making each individual pericentromeric structure unique. Traces of TE insertion waves remain in pericentromeres for a long time, thus "keeping memories" of past genomic events. We counted at least four waves of TE insertions. In pericentromeres, TE particles can be transformed into satDNA, which constitutes a background pool of minor families that, under certain conditions, can replace the dominant one(s).
Title: Analysis of pericentromere composition and structure elucidated the history of the Hieracium alpinum L. genome, revealing waves of transposable elements insertions. Journal: Mobile DNA 15(1):26. Published: 2024-11-15. DOI: 10.1186/s13100-024-00336-7. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11566620/pdf/
Pub Date: 2024-10-26 DOI: 10.1186/s13100-024-00334-9
Pascale Lesage, Emilie Brasset, Gael Cristofari, Clément Gilbert, Didier Mazel, Rita Rebollo, Clémentine Vitte
From April 20 to 23, 2024, three hundred ten researchers from around the world gathered in Saint-Malo, France, at the fourth International Congress on Transposable Elements (ICTE 2024), to present their most recent discoveries on transposable elements (TEs) and exchange ideas and methodologies. ICTE has been held every four years since 2008 (except in 2020, when it was exceptionally transformed into a seminar series due to the Covid-19 pandemic) and is organized by the French network on Mobile Genetic Elements (CNRS GDR 3546). This fourth edition offered two keynote presentations and four sessions presenting the latest findings and encouraging discussions on the following topics: (1) TEs, genome evolution and adaptation; (2) TEs in health and diseases; (3) TE control and epigenetics; (4) Transposition mechanisms and applications. The 2024 edition also included a half-day satellite workshop on new challenges in TE annotation, organized in collaboration with the TE Hub. The meeting gathered long-term TE enthusiasts, as well as newcomers to the field, with 77% of the participants attending ICTE for the first time.
Title: International congress on transposable elements (ICTE 2024) in Saint Malo: breaking down transposon waves and their impact. Journal: Mobile DNA 15(1):25. Published: 2024-10-26. DOI: 10.1186/s13100-024-00334-9. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11512509/pdf/