skandiver: a divergence-based analysis tool for identifying intercellular mobile genetic elements
Xiaolei Brian Zhang, Grace Oualline, Jim Shaw, Yun William Yu
arXiv - QuanBio - Genomics, 2024-06-17 (arXiv:2406.12064)
Abstract
Mobile genetic elements (MGEs) are as ubiquitous in nature as they are varied
in type, ranging from viral insertions to transposons to incorporated plasmids.
Horizontal transfer of MGEs across bacterial species may also pose a
significant threat to global health due to their capability to harbour
antibiotic resistance genes. However, despite cheap and rapid whole genome
sequencing, the varied nature of MGEs makes it difficult to fully characterize
them, and existing methods for detecting MGEs often don't agree on what should
count. In this manuscript, we first define and argue in favor of a
divergence-based characterization of mobile genetic elements. Using that
paradigm, we present skandiver, a tool designed to efficiently detect MGEs from
whole genome assemblies without the need for gene annotation or markers.
skandiver determines mobile elements via genome fragmentation, average
nucleotide identity (ANI), and divergence time. By building on the scalable
skani software for ANI computation, skandiver can query hundreds of complete
assemblies against >65,000 representative genomes in a few minutes using 19 GB of
memory, providing a scalable and efficient method for elucidating mobile element
profiles in incomplete, uncharacterized genomic sequences. For isolated and
integrated large plasmids (>10 kbp), skandiver's recall was 48% and 47%,
MobileElementFinder's was 59% and 17%, and geNomad's was 86% and 32%,
respectively. On isolated large plasmids, skandiver's recall is therefore lower
than that of the state-of-the-art reference-based methods geNomad and
MobileElementFinder. However, skandiver achieves higher recall on
integrated plasmids and, unlike those methods, does so without comparing against a
curated database, making it suitable for discovery of novel MGEs.

Availability: https://github.com/YoukaiFromAccounting/skandiver
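
The abstract names three ingredients (genome fragmentation, ANI, and divergence time) without giving details. The following is a minimal Python sketch of how such a divergence-based filter could look. It is not skandiver's actual implementation: the names FragmentHit, fragment_spans, and flag_candidates, the fragment size, and both thresholds are hypothetical, and the ANI and divergence-time values are assumed to be supplied by external tools rather than computed here.

```python
from dataclasses import dataclass


@dataclass
class FragmentHit:
    """One fragment of a query assembly matched against a reference genome.

    All fields are illustrative; values are assumed to come from external tools.
    """
    start: int             # fragment start coordinate in the query assembly
    end: int               # fragment end coordinate (exclusive)
    ani: float             # ANI of the fragment against the reference, in percent
    divergence_mya: float  # divergence time between the two species, in million years


def fragment_spans(genome_length, fragment_size):
    """Split [0, genome_length) into consecutive spans of at most fragment_size."""
    return [
        (start, min(start + fragment_size, genome_length))
        for start in range(0, genome_length, fragment_size)
    ]


def flag_candidates(hits, min_ani=98.0, min_divergence_mya=100.0):
    """Keep fragments that are near-identical to a distantly related genome.

    Both thresholds are hypothetical placeholders, not values from the paper.
    """
    return [
        h for h in hits
        if h.ani >= min_ani and h.divergence_mya >= min_divergence_mya
    ]


if __name__ == "__main__":
    print(fragment_spans(25_000, 10_000))
    # [(0, 10000), (10000, 20000), (20000, 25000)]

    hits = [
        FragmentHit(0, 10_000, ani=99.5, divergence_mya=500.0),      # candidate
        FragmentHit(10_000, 20_000, ani=99.5, divergence_mya=5.0),   # close relative
        FragmentHit(20_000, 25_000, ani=85.0, divergence_mya=500.0), # low identity
    ]
    for hit in flag_candidates(hits):
        print(hit.start, hit.end)
    # 0 10000
```

The intuition behind the sketch is that a fragment which is nearly identical to a genome from a species that diverged long ago is unlikely to have been inherited vertically, so it is a candidate for horizontal transfer. No curated marker database is consulted at any point, which matches the annotation-free framing in the abstract.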