Generating multiple alignments on a pangenomic scale
Jannik Olbrich, Thomas Büchler, Enno Ohlebusch
Bioinformatics (Oxford, England), published 2025-03-04
DOI: 10.1093/bioinformatics/btaf104 (https://doi.org/10.1093/bioinformatics/btaf104)
Abstract
Motivation: Novel long-read sequencing technologies allow de novo assembly of many individuals of a species, so high-quality assemblies are becoming widely available. For example, the recently published draft human pangenome reference was based on assemblies composed of contigs. There is an urgent need for a software tool that can generate a multiple alignment of genomes of the same species, because current multiple sequence alignment programs cannot deal with such a volume of data.
Results: We show that combining a well-known anchor-based method with the technique of prefix-free parsing yields an approach that can generate multiple alignments on a pangenomic scale, provided that large-scale structural variants are rare. Furthermore, experiments with real-world data show that our software tool PANgenomic Anchor-based Multiple Alignment (PANAMA) significantly outperforms current state-of-the-art programs.
Availability and implementation: Source code is available at: https://gitlab.com/qwerzuiop/panama, archived at swh:1:dir:e90c9f664995acca9063245cabdd97549cf39694.
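
The Results paragraph relies on prefix-free parsing, so a small illustration of the general idea may help. The following is a minimal Python sketch of a simplified prefix-free parse, not the PANAMA implementation: the function names, the window size `w`, the `modulus` parameter, the sentinel characters, and the use of `zlib.crc32` in place of a rolling Karp-Rabin hash are all assumptions made for illustration. The text is cut into overlapping phrases at "trigger" windows, and highly repetitive input (such as many genomes of one species) tends to produce few distinct phrases.

```python
import zlib


def is_trigger(window: str, modulus: int) -> bool:
    # Assumption: CRC32 stands in for the rolling hash a real tool would use.
    return zlib.crc32(window.encode()) % modulus == 0


def prefix_free_parse(text: str, w: int = 4, modulus: int = 5):
    """Split `text` into phrases that start and end at trigger windows.

    Consecutive phrases overlap by exactly `w` characters. Returns the sorted
    dictionary of distinct phrases and the parse as a list of phrase indices.
    """
    # Hypothetical sentinels; they must not occur in `text` itself.
    t = "\x01" + text + "\x02" * w
    n = len(t)
    phrases = []
    start = 0
    for i in range(1, n - w + 1):
        # The final window (all end sentinels) always closes the last phrase.
        if i == n - w or is_trigger(t[i:i + w], modulus):
            phrases.append(t[start:i + w])
            start = i
    dictionary = sorted(set(phrases))
    index = {phrase: k for k, phrase in enumerate(dictionary)}
    parse = [index[phrase] for phrase in phrases]
    return dictionary, parse


def reconstruct(dictionary, parse, w: int) -> str:
    """Invert prefix_free_parse by gluing phrases and dropping the overlaps."""
    out = dictionary[parse[0]]
    for k in parse[1:]:
        out += dictionary[k][w:]
    # Strip the one start sentinel and the w end sentinels.
    return out[1:-w]


if __name__ == "__main__":
    # Two near-identical toy "genomes" concatenated into one text.
    genomes = "ACGTACGTTGCA" * 20 + "ACGTACGATGCA" * 20
    dictionary, parse = prefix_free_parse(genomes, w=4, modulus=5)
    print(len(parse), "phrases,", len(dictionary), "distinct")
    print(reconstruct(dictionary, parse, w=4) == genomes)
```

In this sketch the dictionary and the parse together determine the input exactly, which is what lets downstream work operate on the (usually much smaller) pair instead of the raw sequences. How the anchor-based method is combined with this parse is described in the paper itself and is not reproduced here.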