Authors: N. Pham, Z. Oplatková, H. M. Huynh, Bay Vo
Venue: 2022 IEEE Workshop on Complexity in Engineering (COMPENG)
Published: 2022-07-18
DOI: 10.1109/COMPENG50184.2022.9905433 (https://doi.org/10.1109/COMPENG50184.2022.9905433)
Citations: 2
Mining Top-K High Utility Itemset Using Bio-Inspired Algorithms
Abstract
High utility itemset (HUI) mining is an important research problem in knowledge discovery and data mining. Many algorithms for Top-K HUI mining have been proposed. However, their principal drawback is that they must keep the potential top-k patterns in memory at all times, and they rely on the minimum utility threshold being raised automatically as HUIs are found. Consequently, the performance of existing exact algorithms for Top-K HUI mining tends to degrade as the database size and the number of distinct items grow. To address this issue, we propose a Binary Particle Swarm Optimization (BPSO) based algorithm for mining Top-K HUIs, namely TKO-BPSO (Top-K high utility itemset mining in One phase based on Binary Particle Swarm Optimization). TKO-BPSO combines a one-phase model with the strategy of Raising the threshold by the Utility of Candidates (RUC), which raises the border thresholds used to prune the search space, and it adopts the sigmoid function when updating the particles. This may significantly reduce the combinatorial problem that traditional HUIM faces as the database size and the number of distinct items grow. Consequently, it outperforms existing exact algorithms for mining Top-K HUIs because it efficiently copes with the vast number of candidates. Extensive experiments on several publicly available real and synthetic datasets show that the proposed algorithm achieves better runtime than existing state-of-the-art algorithms and can significantly reduce the combinatorial problem and memory usage.
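The two ideas named in the abstract, a threshold that rises as candidates are found and a sigmoid-based binary particle update, can be illustrated with a minimal Python sketch. This is not the authors' TKO-BPSO: the toy `TRANSACTIONS` data, the `TopK` class, the `bpso_top_k` function, and all parameter values (acceleration constant 2.0, `v_max`, swarm size) are assumptions made for illustration. The update rule is the textbook BPSO rule, and the min-heap only approximates the spirit of threshold raising, not the RUC strategy as specified in the paper.

```python
import heapq
import math
import random

# Hypothetical toy database: each transaction maps an item to its utility in
# that transaction (quantity times unit profit, already multiplied out).
TRANSACTIONS = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 5, "b": 2, "d": 4},
]
ITEMS = sorted({item for t in TRANSACTIONS for item in t})


def utility(itemset, transactions=TRANSACTIONS):
    """Sum the utility of `itemset` over the transactions that contain all of it."""
    if not itemset:
        return 0
    return sum(sum(t[i] for i in itemset)
               for t in transactions if all(i in t for i in itemset))


class TopK:
    """Keep the k best itemsets; the smallest kept utility is the current threshold."""

    def __init__(self, k):
        self.k = k
        self.heap = []     # min-heap of (utility, sorted item tuple)
        self.seen = set()  # itemsets currently stored in the heap

    @property
    def threshold(self):
        # The threshold stays at 0 until k candidates exist, then only rises.
        return self.heap[0][0] if len(self.heap) == self.k else 0

    def offer(self, itemset):
        key = tuple(sorted(itemset))
        if not key or key in self.seen:
            return
        u = utility(key)
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, (u, key))
            self.seen.add(key)
        elif u > self.threshold:
            _, evicted = heapq.heapreplace(self.heap, (u, key))
            self.seen.discard(evicted)
            self.seen.add(key)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode(bits):
    """Turn a 0/1 position vector into the list of selected items."""
    return [ITEMS[j] for j, bit in enumerate(bits) if bit]


def bpso_top_k(k=3, n_particles=8, iterations=30, seed=0, v_max=4.0):
    rng = random.Random(seed)
    n = len(ITEMS)
    pos = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    top = TopK(k)
    for p in pos:
        top.offer(decode(p))
    gbest = max(pbest, key=lambda b: utility(decode(b)))[:]
    for _ in range(iterations):
        for i in range(n_particles):
            for j in range(n):
                r1, r2 = rng.random(), rng.random()
                v = (vel[i][j] + 2.0 * r1 * (pbest[i][j] - pos[i][j])
                     + 2.0 * r2 * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-v_max, min(v_max, v))
                # Sigmoid maps the velocity to the probability that bit j is 1.
                pos[i][j] = 1 if rng.random() < sigmoid(vel[i][j]) else 0
            items = decode(pos[i])
            top.offer(items)
            u = utility(items)
            if u > utility(decode(pbest[i])):
                pbest[i] = pos[i][:]
            if u > utility(decode(gbest)):
                gbest = pos[i][:]
    return sorted(top.heap, reverse=True)
```

Calling `bpso_top_k(k=3)` returns up to three `(utility, itemset)` pairs in descending order of utility. Because the search is stochastic, the result is a heuristic approximation and is not guaranteed to be the exact top-k.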