RepeatOBserver: Tandem Repeat Visualisation and Putative Centromere Detection
Cassandra Elphinstone, Rob Elphinstone, Marco Todesco, Loren H Rieseberg
Molecular Ecology Resources, e14084. Journal article, published 2025-03-04. DOI: 10.1111/1755-0998.14084 (https://doi.org/10.1111/1755-0998.14084)
Journal impact factor: 5.5; JCR Q1 (Biochemistry & Molecular Biology)
Citations: 0
Abstract
Tandem repeats play an important role in centromere structure, subtelomeric regions, DNA methylation, recombination and the regulation of gene activity. Analysis of their distribution in genomes offers a potential means for predicting putative centromere locations, which continues to be a challenge for genome annotation. Here we present RepeatOBserver (https://github.com/celphin/RepeatOBserverV1), a new tool for visualising repeat patterns and identifying putative centromere locations, using a Fourier transform of DNA walks. RepeatOBserver can identify and visualise a broad range of perfect and imperfect repeats (3-5000 bp long) in genome assemblies without any a priori knowledge of repeat sequences or the need for optimising parameters. RepeatOBserver heatmaps can distinguish between tandem and retrotransposon repeats. We analysed 159 chromosomes with experimentally-verified centromere positions from 12 plant and animal species. We find that 93% of experimentally-verified tandem repeat centromeres occur in regions of low sequence diversity and 97% of retrotransposon centromeres occur in regions with a high abundance of repeat lengths. Depending on the centromere type predicted by the heatmaps, putative centromere locations can be predicted using either a genomic Shannon diversity index or a repeat abundance sum. RepeatOBserver can also locate other regions of interest including potential neocentromeres and gene copy variation. Split and inverted tandem repeats at inversion boundaries suggest that chromosomal inversions or mis-assemblies can also be located. RepeatOBserver is a flexible tool for comprehensive characterisation of repeat patterns that can be used to visualise and identify a variety of regions of interest in genome assemblies.
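The idea of reading repeat lengths off a Fourier spectrum, and of summarising that spectrum with a Shannon diversity index, can be illustrated with a minimal sketch. This is not RepeatOBserver's implementation. It assumes a simple per-base 0/1 step encoding (the increments of four one-dimensional DNA walks, not the cumulative walks), a naive discrete Fourier transform over a single short window, and a Shannon index computed on the normalised power spectrum. The function names and toy sequences are hypothetical.

```python
import cmath
import math
import random

BASES = "ACGT"


def power_spectrum(seq):
    """Summed power of the four base-indicator step sequences.

    Entry k-1 of the result is the power at frequency index k,
    for k = 1 .. len(seq)//2 (repeat length is about len(seq)/k).
    Assumption: a naive O(n^2) DFT, only suitable for short windows.
    """
    n = len(seq)
    spectrum = []
    for k in range(1, n // 2 + 1):
        total = 0.0
        for base in BASES:
            # DFT coefficient of the 0/1 indicator sequence for this base.
            coeff = sum(
                cmath.exp(-2j * math.pi * k * i / n)
                for i, ch in enumerate(seq)
                if ch == base
            )
            total += abs(coeff) ** 2
        spectrum.append(total)
    return spectrum


def dominant_repeat_length(seq):
    """Repeat length (bp) at the strongest peak of the power spectrum."""
    spectrum = power_spectrum(seq)
    k_peak = max(range(len(spectrum)), key=spectrum.__getitem__) + 1
    return len(seq) / k_peak


def shannon_diversity(spectrum):
    """Shannon index H = -sum(p * ln p) over the normalised spectrum."""
    total = sum(spectrum)
    if total == 0:
        return 0.0
    return -sum((s / total) * math.log(s / total) for s in spectrum if s > 0)


# Toy comparison: a perfect tandem repeat versus a random sequence.
tandem = "AAAACCCC" * 8
rng = random.Random(0)  # fixed seed so the toy example is repeatable
shuffled = "".join(rng.choice(BASES) for _ in range(len(tandem)))

print("dominant repeat length:", dominant_repeat_length(tandem))
print("H (tandem):", shannon_diversity(power_spectrum(tandem)))
print("H (random):", shannon_diversity(power_spectrum(shuffled)))
```

In this toy setting the tandem window concentrates its power in a few frequencies and so has a lower Shannon index than the random window, which mirrors the abstract's observation that tandem repeat centromeres tend to sit in regions of low sequence diversity. A practical version would use a fast Fourier transform and slide the window along each chromosome.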
About the journal:
Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines.
In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.