Sumayyea Salahuddin, M. Ashfaq, M. Sher, L. Hasan, N. Ahmad
Performance comparison & analysis of DivSufSort-based & SAIS-based FM-Index
2015 10th International Conference on Computer Science & Education (ICCSE), published 2015-07-22
DOI: 10.1109/ICCSE.2015.7250220
Abstract
The FM-Index and the suffix array are closely related, and both are extremely popular indexes for genomic sequences. They are used in several popular read alignment tools, including Bowtie, Bowtie 2, BWA, and GEM. Several Suffix Array Construction Algorithms (SACAs) exist in the literature. We consider two popular ones: DivSufSort and SAIS. Both techniques construct the suffix array in linear time. We have constructed an FM-Index using each of these SACAs. In this paper, we comprehensively describe our FM-Index construction approach and compare the construction time of the two indexes for the different string types in the dataset. Our results show that the DivSufSort-based FM-Index is 3.67% more time-efficient than the SAIS-based FM-Index on 10 of the 11 strings in the dataset.
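To make the relationship between the suffix array and the FM-Index concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a naive sorting-based suffix array as a stand-in for DivSufSort or SAIS, an uncompressed occurrence table, and a `$` sentinel that does not occur in the text. The function names (`build_suffix_array`, `build_fm_index`, `count_occurrences`) are hypothetical and chosen only for illustration.

```python
def build_suffix_array(text):
    # Naive construction by sorting suffixes; an assumption made for brevity.
    # A real index would call DivSufSort or SA-IS at this step instead.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_fm_index(text):
    # '$' is assumed to be absent from the text and smaller than every symbol.
    assert "$" not in text
    text += "$"
    sa = build_suffix_array(text)
    # BWT: the character preceding each suffix in sorted order
    # (text[-1] is '$' when the suffix starts at position 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # c_table[c] = number of characters in the text strictly smaller than c.
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    # occ[c][i] = number of occurrences of c in bwt[:i] (uncompressed table).
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, bwt, c_table, occ


def count_occurrences(pattern, fm):
    # Backward search: narrow the suffix array interval [lo, hi)
    # one pattern character at a time, from right to left.
    _, bwt, c_table, occ = fm
    lo, hi = 0, len(bwt)
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


fm = build_fm_index("banana")
print(fm[1])                         # BWT of "banana$"
print(count_occurrences("ana", fm))  # number of occurrences of "ana"
```

In this sketch only `build_suffix_array` depends on the choice of SACA; the BWT, the count table, and the occurrence table are all derived from the suffix array in the same way regardless of how it was built, which is why swapping the construction algorithm mainly affects construction time.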