Incremental Mining of High Utility Sequential Patterns in Incremental Databases
Authors: Jun-Zhe Wang, Jiun-Long Huang
Published in: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
Publication date: 2016-10-24
DOI: 10.1145/2983323.2983691 (https://doi.org/10.1145/2983323.2983691)
Citations: 14
Abstract
High utility sequential pattern (HUSP) mining is an emerging topic in pattern mining, and only a few algorithms have been proposed to address it. In practice, most sequence databases grow over time, and it is inefficient for existing algorithms to mine HUSPs from scratch each time a database receives a small portion of updates. In view of this, we propose the IncUSP-Miner algorithm to mine HUSPs incrementally. Specifically, to avoid redundant computations, we propose a tighter upper bound on the utility of a sequence, called TSU, and then design a novel data structure, called the candidate pattern tree, to maintain the sequences whose TSU values are greater than or equal to the minimum utility threshold. To avoid keeping a huge amount of utility information for each sequence, a set of auxiliary utility information is stored in each tree node instead. Moreover, for those nodes whose utilities have to be updated, a strategy is proposed to reduce the amount of computation, thereby improving mining efficiency. Experimental results on three real datasets show that IncUSP-Miner mines HUSPs incrementally and efficiently.
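The incremental idea described above can be illustrated with a minimal Python sketch. It is not IncUSP-Miner itself: the abstract does not define TSU or the layout of the candidate pattern tree, so the sketch substitutes a looser, sequence-level upper bound (the total utility of every sequence that contains the pattern) and a flat dictionary in place of the tree. It also assumes that each sequence element holds a single item with a positive utility, that the database grows only by appending whole new sequences, and that the set of tracked patterns is fixed in advance. All names (`pattern_utility`, `CandidateStore`, and so on) are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# One (item, utility) pair per element: a simplification made for this sketch.
QSequence = List[Tuple[str, int]]
Pattern = Tuple[str, ...]


def pattern_utility(pattern: Pattern, seq: QSequence) -> Optional[int]:
    """Highest utility among all occurrences of `pattern` in `seq`, or None if absent."""
    # best[k] = best utility found so far for matching the first k pattern items.
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for item, util in seq:
        # Visit longer prefixes first so one element fills at most one position.
        for k in range(len(pattern), 0, -1):
            prev = best[k - 1]
            if pattern[k - 1] == item and prev is not None:
                cand = prev + util
                if best[k] is None or cand > best[k]:
                    best[k] = cand
    return best[-1] if pattern else None


class CandidateStore:
    """Hypothetical flat stand-in for the candidate pattern tree."""

    def __init__(self, patterns: Iterable[Pattern]) -> None:
        # pattern -> [accumulated utility, accumulated upper bound]
        self.stats: Dict[Pattern, List[int]] = {p: [0, 0] for p in patterns}

    def add_sequences(self, new_seqs: Iterable[QSequence]) -> None:
        """Fold appended sequences into the stored totals without rescanning old data."""
        for seq in new_seqs:
            total = sum(util for _, util in seq)
            for pattern, entry in self.stats.items():
                util = pattern_utility(pattern, seq)
                if util is not None:
                    entry[0] += util
                    entry[1] += total  # loose bound, NOT the paper's TSU

    def candidates(self, min_util: int) -> List[Pattern]:
        """Patterns whose upper bound still reaches the threshold."""
        return sorted(p for p, (_, bound) in self.stats.items() if bound >= min_util)

    def high_utility(self, min_util: int) -> Dict[Pattern, int]:
        """Patterns whose actual utility reaches the threshold."""
        return {p: u for p, (u, _) in self.stats.items() if u >= min_util}


store = CandidateStore([("a", "c"), ("b", "c"), ("a", "b")])
store.add_sequences([
    [("a", 2), ("b", 3), ("a", 5), ("c", 1)],
    [("b", 4), ("a", 1), ("c", 6)],
])
print(store.high_utility(14))

# The database grows by one sequence; only that sequence is scanned.
store.add_sequences([[("a", 3), ("b", 2), ("c", 4)]])
print(store.high_utility(14))
```

Because utilities and bounds are stored per pattern, an update touches only the appended sequences. A real miner would also have to discover patterns that were not tracked before the update, which is where a tight bound such as TSU matters: the tighter the bound, the fewer non-promising patterns need to be kept.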