{"title":"Linear Suffix Array Construction by Almost Pure Induced-Sorting","authors":"Ge Nong, Sen Zhang, W. H. Chan","doi":"10.1109/DCC.2009.42","DOIUrl":null,"url":null,"abstract":"We present a linear time and space suffix array (SA) construction algorithm called the SA-IS algorithm.The SA-IS algorithm is novel because of the LMS-substrings used for the problem reduction and the pure induced-sorting (specially coined for this algorithm)used to propagate the order of suffixes as well as that of LMS-substrings, which makes the algorithm almost purely relying on induced sorting at both its crucial steps.The pure induced-sorting renders the algorithm an elegant design and in turn a surprisingly compact implementation which consists of less than 100 lines of C code.The experimental results demonstrate that this newly proposed algorithm yields noticeably better time and space efficiencies than all the currently published linear time algorithms for SA construction.","PeriodicalId":377880,"journal":{"name":"2009 Data Compression Conference","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"160","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 Data Compression Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DCC.2009.42","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 160
Abstract
We present a linear time and space suffix array (SA) construction algorithm called the SA-IS algorithm.The SA-IS algorithm is novel because of the LMS-substrings used for the problem reduction and the pure induced-sorting (specially coined for this algorithm)used to propagate the order of suffixes as well as that of LMS-substrings, which makes the algorithm almost purely relying on induced sorting at both its crucial steps.The pure induced-sorting renders the algorithm an elegant design and in turn a surprisingly compact implementation which consists of less than 100 lines of C code.The experimental results demonstrate that this newly proposed algorithm yields noticeably better time and space efficiencies than all the currently published linear time algorithms for SA construction.