J. Pei, Jiawei Han, B. Mortazavi-Asl, Helen Pinto, Qiming Chen, U. Dayal, M. Hsu
Proceedings 17th International Conference on Data Engineering, 2001-04-02. DOI: 10.1109/ICDE.2001.914830 (https://doi.org/10.1109/ICDE.2001.914830)
PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth
Sequential pattern mining is an important data mining problem with broad applications. It is challenging since one may need to examine a combinatorially explosive number of possible subsequence patterns. Most of the previously developed sequential pattern mining methods follow the methodology of Apriori, which may substantially reduce the number of combinations to be examined. However, Apriori still encounters problems when a sequence database is large and/or when sequential patterns to be mined are numerous and/or long. In this paper, we propose a novel sequential pattern mining method, called PrefixSpan (i.e., Prefix-projected Sequential pattern mining), which explores prefix-projection in sequential pattern mining. PrefixSpan mines the complete set of patterns but greatly reduces the effort of candidate subsequence generation. Moreover, prefix-projection substantially reduces the size of projected databases and leads to efficient processing. Our performance study shows that PrefixSpan outperforms both the Apriori-based GSP algorithm and another recently proposed method, FreeSpan, in mining large sequence databases.
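The prefix-projection idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sequence is a plain list of single items (the paper's sequences consist of itemsets), and the names `prefixspan`, `mine`, and `min_support` are hypothetical, chosen only for this illustration. Each projected database is stored as (sequence index, offset) pairs rather than as copied suffixes.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for all frequent sequential patterns.

    Simplifying assumption: every sequence is a list of single items,
    not a list of itemsets.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start offset) pairs: the part of
        # each sequence that follows the first match of the current prefix.
        # For every item, record the offset just past its first occurrence
        # in each projected suffix; a sequence is counted once per item.
        next_offset = defaultdict(dict)
        for seq_idx, start in projected:
            seq = sequences[seq_idx]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if seq_idx not in next_offset[item]:
                    next_offset[item][seq_idx] = pos + 1

        for item, occurrences in next_offset.items():
            support = len(occurrences)
            if support >= min_support:
                # Grow the prefix by one frequent item and recurse on the
                # smaller database projected with respect to the new prefix.
                pattern = prefix + (item,)
                results[pattern] = support
                mine(pattern, list(occurrences.items()))

    # The empty prefix projects onto every full sequence.
    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Toy database, invented for illustration only.
db = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
patterns = prefixspan(db, min_support=2)
```

Only items that are frequent within a projected database are used to extend a prefix, so no candidate subsequence is generated and then tested against the whole database, and each recursion level scans only the suffixes that remain after the current prefix.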