Kenmogne Edith Belise, Nkambou Roger, Tadmon Calvin, E. Nguifo
Journal of Advanced Computer Science and Technology. Published 2017-06-04. DOI: 10.14419/jacst.v6i2.7011 (https://doi.org/10.14419/jacst.v6i2.7011)
A heuristic to predict the optimal pattern-growth direction for the pattern growth-based sequential pattern mining approach
Sequential pattern mining is an efficient technique for discovering recurring structures or patterns in very large datasets, and it has a wide range of applications. It aims to extract sets of attributes that are shared across time among a large number of objects in a given database. Previous studies have developed two major classes of sequential pattern mining methods: the candidate generation-and-test approach, based on either horizontal or vertical data formats and represented respectively by GSP and SPADE, and the pattern-growth approach, represented by FreeSpan, PrefixSpan and their further extensions. The performance of these algorithms depends on how patterns grow. We therefore introduce a heuristic to predict the optimal pattern-growth direction, i.e. the pattern-growth direction leading to the best performance in terms of runtime and memory usage. We then run a number of experiments on both real-life and synthetic datasets to test the heuristic. The performance analysis of these experiments shows that the heuristic prediction is reliable in general.
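To make the notion of a pattern-growth direction concrete, the following is a minimal sketch of a PrefixSpan-style miner, restricted to sequences of single items. It is an illustration under stated assumptions, not the authors' implementation: the names `prefixspan` and `mine` and the `direction` parameter are hypothetical, and left-wise growth is emulated by mining the reversed database, which is only one possible way to realise it. The heuristic itself is not described in the abstract and is not reproduced here.

```python
from collections import defaultdict


def prefixspan(db, min_support):
    """Mine frequent sequential patterns by right-wise pattern growth.

    db: list of sequences, each a list of hashable items (one item per position).
    Returns a dict mapping each frequent pattern (a tuple) to its support.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds one (sequence index, start position) pair per
        # sequence that contains the current prefix; the suffix starting at
        # that position is the projected sequence.
        first_pos = defaultdict(dict)
        for sid, start in projected:
            seq = db[sid]
            for pos in range(start, len(seq)):
                item = seq[pos]
                # Keep only the earliest occurrence of each item per sequence.
                if sid not in first_pos[item]:
                    first_pos[item][sid] = pos
        for item, occurrences in first_pos.items():
            support = len(occurrences)
            if support >= min_support:
                pattern = prefix + (item,)
                results[pattern] = support
                # Project on the extended pattern and keep growing to the right.
                grow(pattern, [(sid, pos + 1) for sid, pos in occurrences.items()])

    grow((), [(sid, 0) for sid in range(len(db))])
    return results


def mine(db, min_support, direction="right"):
    """Hypothetical wrapper: choose the growth direction (an assumption)."""
    if direction == "right":
        return prefixspan(db, min_support)
    if direction == "left":
        # Growing patterns to the left is emulated by mining the reversed
        # sequences and reversing each pattern that is found.
        reversed_db = [list(reversed(seq)) for seq in db]
        mined = prefixspan(reversed_db, min_support)
        return {tuple(reversed(p)): s for p, s in mined.items()}
    raise ValueError("direction must be 'left' or 'right'")
```

Both directions return the same set of frequent patterns, because a pattern occurs in a sequence exactly when its reversal occurs in the reversed sequence. What differs is the sequence of projected databases built along the way, and therefore the runtime and memory cost, which is the quantity a direction-predicting heuristic would try to anticipate.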