Parallel Approximate Multi-Pattern Matching on Heterogeneous Cluster Systems
Authors: Cheng Zhong, Zeng Fan, Defu Su
Published in: 2008 Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies
Publication date: 2008-12-01
DOI: 10.1109/PDCAT.2008.23 (https://doi.org/10.1109/PDCAT.2008.23)
Citations: 1
Abstract
Given multiple patterns and a text string, a perfect hash function is first constructed. The patterns are transformed in parallel by this function into unique pairs of integer values, which are stored in a global hash table, and a recurrence is proposed for computing the hash value of the signature of each substring of the text. Second, based on the divisible load principle, a linear programming model of the optimal text distribution strategy is formulated, and a parallel approximate multi-pattern matching algorithm allowing one error is presented for heterogeneous cluster systems whose processors differ in computing speed, communication capability, and memory size. The algorithm takes computation and communication startup times into account and uses an assigned processor distribution order. Experimental results on a cluster of heterogeneous personal computers show that the presented parallel algorithm is on average 25% faster than the variant using an even text distribution strategy, and that it achieves nearly linear speedup and good scalability.
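
The abstract does not give the hash function or its recurrence, so the following is only a minimal sketch of the general idea of hashing patterns into a table and updating a substring hash incrementally. It substitutes a standard Rabin-Karp style polynomial hash, which is not perfect and therefore needs a verification step. The names `BASE`, `MOD`, `hash_of`, and `find_patterns` are hypothetical, all patterns are assumed to have the same length, and only exact matching is shown (the one-error extension and the parallel text distribution are omitted).

```python
# Illustrative sketch: a Rabin-Karp style rolling hash, NOT the paper's
# perfect hash function. BASE and MOD are assumed values.
BASE = 256
MOD = 1_000_000_007


def hash_of(s: str) -> int:
    """Polynomial hash of a whole string."""
    h = 0
    for ch in s:
        h = (h * BASE + ord(ch)) % MOD
    return h


def find_patterns(text: str, patterns: list[str]) -> list[tuple[int, str]]:
    """Return (position, pattern) for every exact occurrence in text.

    Assumption: all patterns share one length m.
    """
    if not patterns:
        return []
    m = len(patterns[0])
    if any(len(p) != m for p in patterns):
        raise ValueError("this sketch assumes equal-length patterns")
    if m == 0 or len(text) < m:
        return []

    # Global table: hash value -> patterns that hash to it.
    table: dict[int, list[str]] = {}
    for p in patterns:
        table.setdefault(hash_of(p), []).append(p)

    high = pow(BASE, m - 1, MOD)  # weight of the character leaving the window
    h = hash_of(text[:m])
    hits = []
    for i in range(len(text) - m + 1):
        for p in table.get(h, ()):
            if text[i:i + m] == p:  # verify, since this hash can collide
                hits.append((i, p))
        if i + m < len(text):
            # Recurrence: drop text[i], shift by BASE, append text[i + m].
            h = ((h - ord(text[i]) * high) * BASE + ord(text[i + m])) % MOD
    return hits
```

Each window hash is derived from the previous one in constant time, so the scan costs time linear in the text length plus the verification work. In a parallel setting of the kind the abstract describes, each processor would presumably run such a scan over its own text segment, with neighbouring segments overlapping by m - 1 characters so that matches crossing a boundary are not lost; that overlap is an assumption here, not a detail taken from the paper.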