Gabriel Siqueira, Alexsandro Oliveira Alexandrino, Andre Rodrigues Oliveira, Geraldine Jean, Guillaume Fertin, Zanoni Dias
Title: Partition Based Algorithms for Rearrangement Distances with Flexible Intergenic Regions
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Publication date: 2024-09-24
DOI: 10.1109/TCBB.2024.3467033 (https://doi.org/10.1109/TCBB.2024.3467033)
Citation count: 0
Abstract
Genome rearrangement distance problems are used in computational biology to estimate the evolutionary distance between genomes. These problems consist of minimizing the number of rearrangement events necessary to transform one genome into another. Two commonly used rearrangement events are reversal and transposition. The first problems studied either ignored nucleotides outside genes (the intergenic regions) or assumed that genomes have a single copy of each gene. Recent work has advanced on more general problems that consider both the number of nucleotides in intergenic regions and replicated genes. Nevertheless, genomes tend to have wildly different numbers of nucleotides in their intergenic regions, which poses a problem when these regions are compared exactly. To overcome this limitation, our work allows some flexibility when matching intergenic regions that do not have the same number of nucleotides. We propose new problems that seek the minimum number of reversals, or of reversals and transpositions, necessary to transform one genome into another while taking flexible intergenic region information into account. We show approximations for these problems by exploring their relationship with the Signed Minimum Common Flexible Intergenic String Partition problem. We also present different heuristics for the partition problem and conduct experimental tests on simulated genomes to assess the performance of our algorithms.
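For readers unfamiliar with the two rearrangement events named in the abstract, the following is a minimal sketch of their classical definitions on a signed gene sequence. It is an illustration only, not the paper's model: intergenic region sizes and the flexible matching are deliberately left out, and the function names `reversal` and `transposition`, the list-of-signed-integers encoding, and the 0-based index conventions are all assumptions made for this example.

```python
from typing import List


def reversal(genome: List[int], i: int, j: int) -> List[int]:
    """Reverse the segment genome[i..j] (inclusive, 0-based) and flip each sign.

    Assumed encoding: each gene is a nonzero integer whose sign is its orientation.
    """
    if not 0 <= i <= j < len(genome):
        raise ValueError("need 0 <= i <= j < len(genome)")
    flipped = [-gene for gene in reversed(genome[i:j + 1])]
    return genome[:i] + flipped + genome[j + 1:]


def transposition(genome: List[int], i: int, j: int, k: int) -> List[int]:
    """Exchange the adjacent blocks genome[i:j] and genome[j:k] (half-open)."""
    if not 0 <= i < j < k <= len(genome):
        raise ValueError("need 0 <= i < j < k <= len(genome)")
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]


# One reversal sorts this signed genome into the identity.
after_reversal = reversal([1, -3, -2, 4], 1, 2)

# One transposition moves the block [3, 4] past the block [2].
after_transposition = transposition([1, 3, 4, 2, 5], 1, 3, 4)
```

A rearrangement distance is then the length of a shortest sequence of such events transforming one genome into the other. The problems in the abstract additionally track the nucleotides in intergenic regions, which this sketch does not model.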
About the journal:
IEEE/ACM Transactions on Computational Biology and Bioinformatics emphasizes the algorithmic, mathematical, statistical and computational methods that are central in bioinformatics and computational biology; the development and testing of effective computer programs in bioinformatics; the development of biological databases; important biological results that are obtained from the use of these methods, programs and databases; and the emerging field of Systems Biology, where many forms of data are used to create a computer-based model of a complex biological system.