{"title":"从盘古基因组多态性中识别转座元件家族。","authors":"Pío Sierra, Richard Durbin","doi":"10.1186/s13100-024-00323-y","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Transposable Elements (TEs) are segments of DNA, typically a few hundred base pairs up to several tens of thousands bases long, that have the ability to generate new copies of themselves in the genome. Most existing methods used to identify TEs in a newly sequenced genome are based on their repetitive character, together with detection based on homology and structural features. As new high quality assemblies become more common, including the availability of multiple independent assemblies from the same species, an alternative strategy for identification of TE families becomes possible in which we focus on the polymorphism at insertion sites caused by TE mobility.</p><p><strong>Results: </strong>We develop the idea of using the structural polymorphisms found in pangenomes to create a library of the TE families recently active in a species, or in a closely related group of species. We present a tool, pantera, that achieves this task, and illustrate its use both on species with well-curated libraries, and on new assemblies.</p><p><strong>Conclusions: </strong>Our results show that pantera is sensitive and accurate, tending to correctly identify complete elements with precise boundaries, and is particularly well suited to detect larger, low copy number TEs that are often undetected with existing de novo methods.</p>","PeriodicalId":18854,"journal":{"name":"Mobile DNA","volume":null,"pages":null},"PeriodicalIF":4.7000,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11202377/pdf/","citationCount":"0","resultStr":"{\"title\":\"Identification of transposable element families from pangenome polymorphisms.\",\"authors\":\"Pío Sierra, Richard Durbin\",\"doi\":\"10.1186/s13100-024-00323-y\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Transposable Elements (TEs) are segments of DNA, typically a few hundred base pairs up to several tens of thousands bases long, that have the ability to generate new copies of themselves in the genome. Most existing methods used to identify TEs in a newly sequenced genome are based on their repetitive character, together with detection based on homology and structural features. As new high quality assemblies become more common, including the availability of multiple independent assemblies from the same species, an alternative strategy for identification of TE families becomes possible in which we focus on the polymorphism at insertion sites caused by TE mobility.</p><p><strong>Results: </strong>We develop the idea of using the structural polymorphisms found in pangenomes to create a library of the TE families recently active in a species, or in a closely related group of species. We present a tool, pantera, that achieves this task, and illustrate its use both on species with well-curated libraries, and on new assemblies.</p><p><strong>Conclusions: </strong>Our results show that pantera is sensitive and accurate, tending to correctly identify complete elements with precise boundaries, and is particularly well suited to detect larger, low copy number TEs that are often undetected with existing de novo methods.</p>\",\"PeriodicalId\":18854,\"journal\":{\"name\":\"Mobile DNA\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.7000,\"publicationDate\":\"2024-06-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11202377/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Mobile DNA\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1186/s13100-024-00323-y\",\"RegionNum\":2,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"GENETICS & HEREDITY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Mobile DNA","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s13100-024-00323-y","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"GENETICS & HEREDITY","Score":null,"Total":0}
引用次数: 0
摘要
背景:可转座元件(Transposable Elements,TEs)是 DNA 片段,通常只有几百个碱基对到几万个碱基,能够在基因组中产生新的拷贝。在新测序的基因组中,现有的大多数用于识别TE的方法都是基于其重复性,以及基于同源性和结构特征的检测。随着新的高质量集合越来越常见,包括来自同一物种的多个独立集合的可用性,另一种识别 TE 家族的策略成为可能,我们将重点放在 TE 移动性引起的插入位点的多态性上:结果:我们提出了利用庞基因组中发现的结构多态性来创建一个最近在一个物种或密切相关的物种群中活跃的TE家族库的想法。我们介绍了一个实现这一任务的工具--pantera,并说明了它在具有良好整合库的物种和新的集合上的应用:我们的研究结果表明,pantera 灵敏而准确,能正确识别具有精确边界的完整元素,尤其适合检测较大的低拷贝数 TE,而现有的从头检测方法往往检测不到这些 TE。
Identification of transposable element families from pangenome polymorphisms.
Background: Transposable Elements (TEs) are segments of DNA, typically a few hundred base pairs up to several tens of thousands bases long, that have the ability to generate new copies of themselves in the genome. Most existing methods used to identify TEs in a newly sequenced genome are based on their repetitive character, together with detection based on homology and structural features. As new high quality assemblies become more common, including the availability of multiple independent assemblies from the same species, an alternative strategy for identification of TE families becomes possible in which we focus on the polymorphism at insertion sites caused by TE mobility.
Results: We develop the idea of using the structural polymorphisms found in pangenomes to create a library of the TE families recently active in a species, or in a closely related group of species. We present a tool, pantera, that achieves this task, and illustrate its use both on species with well-curated libraries, and on new assemblies.
Conclusions: Our results show that pantera is sensitive and accurate, tending to correctly identify complete elements with precise boundaries, and is particularly well suited to detect larger, low copy number TEs that are often undetected with existing de novo methods.
期刊介绍:
Mobile DNA is an online, peer-reviewed, open access journal that publishes articles providing novel insights into DNA rearrangements in all organisms, ranging from transposition and other types of recombination mechanisms to patterns and processes of mobile element and host genome evolution. In addition, the journal will consider articles on the utility of mobile genetic elements in biotechnological methods and protocols.