Frequent itemset mining on graphics processors

International Workshop on Data Management on New Hardware. Pub Date: 2009-06-28. DOI: 10.1145/1565694.1565702

Wenbin Fang, Mian Lu, Xiangye Xiao, Bingsheng He, Qiong Luo


Citations: 133

Abstract

We present two efficient Apriori implementations of Frequent Itemset Mining (FIM) that utilize new-generation graphics processing units (GPUs). Our implementations take advantage of the GPU's massively multi-threaded SIMD (Single Instruction, Multiple Data) architecture. Both implementations employ a bitmap data structure to exploit the GPU's SIMD parallelism and to accelerate the frequency counting operation. One implementation runs entirely on the GPU and eliminates intermediate data transfer between the GPU memory and the CPU memory. The other implementation employs both the GPU and the CPU for processing. It represents itemsets in a trie, and uses the CPU for trie traversing and incremental maintenance. Our preliminary results show that both implementations achieve a speedup of up to two orders of magnitude over optimized CPU Apriori implementations on a PC with an NVIDIA GTX 280 GPU and a quad-core CPU.
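The bitmap-based frequency counting mentioned in the abstract can be illustrated with a small CPU-side sketch. This is not the authors' GPU code. It assumes a vertical bitmap layout (one bit vector per item, one bit per transaction), which the abstract does not specify, and it uses Python integers as bit vectors so that a bitwise AND followed by a population count plays the role of the data-parallel counting step. The function names `build_bitmaps`, `support`, and `apriori` and the sample transactions are hypothetical.

```python
from itertools import combinations


def build_bitmaps(transactions):
    """Map each item to an int whose bit t is set when transaction t contains it."""
    bitmaps = {}
    for t, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << t)
    return bitmaps


def support(itemset, bitmaps, n_transactions):
    """Count transactions containing every item: AND the bitmaps, then popcount."""
    acc = (1 << n_transactions) - 1  # start with every transaction as a match
    for item in itemset:
        acc &= bitmaps.get(item, 0)
    return bin(acc).count("1")


def apriori(transactions, min_support):
    """Level-wise Apriori: join frequent k-itemsets, prune, then count support."""
    bitmaps = build_bitmaps(transactions)
    n = len(transactions)
    level = [frozenset([i]) for i in sorted(bitmaps)
             if support([i], bitmaps, n) >= min_support]
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s, bitmaps, n)
        prev = set(level)
        # Join step: union pairs of frequent k-itemsets that differ in one item.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == k + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in prev
                             for sub in combinations(c, k))}
        level = [c for c in candidates
                 if support(c, bitmaps, n) >= min_support]
        k += 1
    return frequent


# Hypothetical toy database of five transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
result = apriori(transactions, min_support=3)
```

Each `support` call reads only the shared item bitmaps and is independent of the others, so the candidates of one level could in principle be counted in parallel. That independence is what makes a bitmap representation a natural fit for SIMD-style hardware.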

