Title: An Improved Sequential Pattern mining Algorithm based on Large Dataset
Authors: Jia Wu, Bing Lv, Wei Cui
Venue: 2019 IEEE International Conference on Power Data Science (ICPDS)
Publication date: 2019-11-01
DOI: https://doi.org/10.1109/ICPDS47662.2019.9017198
Citations: 1
Abstract
Many classical algorithms exist for sequential pattern mining. Among them, PrefixSpan is one of the most widely used. It uses prefix projection to avoid generating candidate sequences, which improves mining efficiency to a certain extent. However, PrefixSpan must construct a large number of projected databases, and building them consumes a lot of memory and adds considerable scanning time. This paper therefore improves PrefixSpan and proposes the ISPA algorithm, which greatly reduces the number of projected databases that are built and thus improves the efficiency of sequential pattern mining. First, a comparison of the mining results of the two algorithms shows that ISPA finds the most important sequential patterns, so its results are satisfactory. Second, experiments are performed along three dimensions: different support thresholds, different types of data sets, and different data set sizes. The experiments verify that ISPA performs better than PrefixSpan.
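
The cost of prefix projection mentioned in the abstract can be illustrated with a minimal sketch of standard PrefixSpan. This is not the ISPA algorithm, whose details the abstract does not give. The sketch assumes each sequence is a list of single items (no itemsets), and the names `prefixspan`, `min_support`, and the `projections` counter are illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Mine frequent sequential patterns from lists of single items.

    Returns (patterns, projections_built), where patterns is a list of
    (pattern, support) pairs. Simplified sketch: no itemsets, no gaps.
    """
    patterns = []
    stats = {"projections": 0}  # hypothetical counter, for illustration only

    def mine(prefix, projected):
        # Count each item at most once per sequence in the projected database.
        support = defaultdict(int)
        for seq in projected:
            for item in set(seq):
                support[item] += 1

        for item in sorted(support):
            count = support[item]
            if count < min_support:
                continue
            new_prefix = prefix + [item]
            patterns.append((new_prefix, count))

            # Build the projected database for new_prefix: keep the suffix
            # that follows the first occurrence of item in each sequence.
            new_projected = []
            for seq in projected:
                if item in seq:
                    suffix = seq[seq.index(item) + 1:]
                    if suffix:
                        new_projected.append(suffix)
            stats["projections"] += 1
            mine(new_prefix, new_projected)

    mine([], sequences)
    return patterns, stats["projections"]


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
    found, built = prefixspan(db, min_support=2)
    for pattern, count in found:
        print(pattern, count)
    print("projected databases built:", built)
```

In this sketch one projected database is built for every frequent pattern found, including patterns whose projection turns out to be empty. That per-pattern construction is the memory and scanning overhead that the abstract says ISPA is designed to reduce.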