A scalable bottom-up data mining algorithm for relational databases

Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243). Pub Date: 1998-07-01. DOI: 10.1109/SSDM.1998.688125

G. Giuffrida, Lee G. Cooper, W. Chu

Citations: 8

Abstract

Machine learning induction algorithms are difficult to scale to very large databases because they are memory-bound, and falling back on virtual memory degrades performance significantly. To overcome these shortcomings, we developed a classification rule induction algorithm for relational databases. Our algorithm uses a bottom-up rule generation strategy that is more effective for mining databases whose nominal variables have large cardinality. We have successfully used our algorithm to mine a retail grocery database containing more than 1.6 million records in about 5 hours on a dual Pentium processor PC.
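The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of bottom-up (specific-to-general) rule induction with the counting pushed into the database, not the authors' method. It seeds one maximally specific rule per distinct record, then repeatedly drops a single condition and keeps the shorter rule whenever its confidence stays at or above a threshold. The table `sales`, its columns, the toy rows, and the helpers `make_db`, `evaluate`, and `bottom_up` are all hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical schema: three nominal attributes and a binary class column.
ATTRS = ["store", "brand", "promo"]
TARGET = "sold_well"

# Toy data (invented for illustration only).
ROWS = [
    ("A", "x", "yes", 1),
    ("A", "x", "no", 1),
    ("A", "y", "yes", 1),
    ("B", "x", "yes", 0),
    ("B", "y", "no", 0),
    ("B", "y", "yes", 0),
]


def make_db():
    """Build an in-memory SQLite table holding the toy records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (store TEXT, brand TEXT, promo TEXT, sold_well INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", ROWS)
    return conn


def evaluate(conn, cond, label):
    """Return (support, confidence) of the rule `cond -> label`.

    The database does the counting, so no records are held in Python memory.
    `cond` must contain at least one attribute.
    """
    where = " AND ".join(f"{col} = ?" for col in cond)
    total, hits = conn.execute(
        f"SELECT COUNT(*), SUM({TARGET} = ?) FROM sales WHERE {where}",
        [label] + list(cond.values()),
    ).fetchone()
    return total, hits / total


def bottom_up(conn, min_conf=1.0):
    """Generalize maximally specific rules by dropping one condition at a time."""
    cols = ", ".join(ATTRS)
    seeds = conn.execute(f"SELECT DISTINCT {cols}, {TARGET} FROM sales").fetchall()
    # Each rule is (frozenset of (attribute, value) pairs, class label).
    # The toy data has no contradictory rows, so every seed is a perfect rule.
    frontier = {(frozenset(zip(ATTRS, row[:-1])), row[-1]) for row in seeds}
    final = set()
    while frontier:
        next_frontier = set()
        for cond, label in frontier:
            generalized = False
            if len(cond) > 1:  # never generalize down to an empty rule body
                for item in cond:
                    shorter = cond - {item}
                    _, conf = evaluate(conn, dict(shorter), label)
                    if conf >= min_conf:
                        next_frontier.add((shorter, label))
                        generalized = True
            if not generalized:
                final.add((cond, label))  # cannot be generalized any further
        frontier = next_frontier
    return final


if __name__ == "__main__":
    for cond, label in bottom_up(make_db()):
        print(dict(cond), "->", label)
```

Because each candidate is scored with an aggregate query, the working set in Python is just the current frontier of rules. That is one plausible reading of why a relational formulation sidesteps the memory-bound behavior the abstract describes. A production version would also need pruning and batched queries, which this sketch omits.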


