Interesting association rule mining with consistent and inconsistent rule detection from big sales data in distributed environment

Dinesh J. Prajapati, Sanjay Garg, N.C. Chauhan
Future Computing and Informatics Journal, Volume 2, Issue 1, Pages 19-30
Published: 2017-06-01
DOI: 10.1016/j.fcij.2017.04.003
Source: https://www.sciencedirect.com/science/article/pii/S2314728816300460
Citations: 56

Abstract

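Demand for mining interesting patterns from big data is growing, yet analyzing data at this scale with traditional methods is computationally expensive. This paper has two aims. First, it presents a novel approach for identifying consistent and inconsistent association rules in sales data held in a distributed environment. Second, it overcomes the main-memory bottleneck and computing-time overhead of a single machine by running the computation on a multi-node cluster. The proposed method first extracts frequent itemsets for each zone using existing distributed frequent pattern mining algorithms. The paper also compares the time efficiency of a MapReduce-based frequent pattern mining algorithm with that of the Count Distribution Algorithm (CDA) and Fast Distributed Mining (FDM) algorithms. The set of association rules generated from the frequent itemsets is too large to analyze easily. A MapReduce-based consistent and inconsistent rule detection (MR-CIRD) algorithm is therefore proposed to detect consistent and inconsistent rules in big data and to give domain experts useful, actionable knowledge. The pruned rules can also inform marketing strategy. The extracted consistent and inconsistent rules are evaluated and compared using several interestingness measures, and the experimental results lead to the final conclusions.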
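
The abstract does not define "consistent" and "inconsistent" rules precisely, so the sketch below rests on an assumed reading: a rule is consistent if it meets the minimum support and confidence thresholds in every zone, and inconsistent if it meets them in some zones but not in others. It is not the MR-CIRD algorithm itself and it runs on a single machine rather than on MapReduce. The zone names, the sample transactions, the thresholds, and the function names (support, rule_measures, classify_rule) are all hypothetical and chosen only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    sup_a = support(transactions, a)
    sup_c = support(transactions, c)
    sup_ac = support(transactions, a | c)
    confidence = sup_ac / sup_a if sup_a else 0.0
    lift = confidence / sup_c if sup_c else 0.0
    return sup_ac, confidence, lift


def classify_rule(zones, antecedent, consequent, min_sup, min_conf):
    """Label a rule by how many zones it holds in.

    ASSUMED definition (not taken from the paper):
      - "consistent":   passes both thresholds in every zone
      - "inconsistent": passes in at least one zone but not in all
      - "not frequent": passes in no zone
    Returns the label and the list of zones where the rule passes.
    """
    passing = []
    for name, transactions in zones.items():
        sup, conf, _ = rule_measures(transactions, antecedent, consequent)
        if sup >= min_sup and conf >= min_conf:
            passing.append(name)
    if len(passing) == len(zones):
        return "consistent", passing
    if passing:
        return "inconsistent", passing
    return "not frequent", passing


# Hypothetical sales transactions for two zones.
zones = {
    "north": [
        frozenset({"bread", "butter"}),
        frozenset({"bread", "butter", "milk"}),
        frozenset({"bread", "milk"}),
        frozenset({"milk"}),
    ],
    "south": [
        frozenset({"bread", "butter"}),
        frozenset({"bread", "butter"}),
        frozenset({"milk"}),
        frozenset({"bread", "butter", "milk"}),
    ],
}

for consequent in ("butter", "milk"):
    label, where = classify_rule(zones, {"bread"}, {consequent}, 0.5, 0.6)
    print(f"bread -> {consequent}: {label} (passes in {where})")
```

With these sample zones, bread -> butter passes the thresholds in both zones, while bread -> milk passes only in the north zone. In a distributed setting, the per-zone support counts would presumably come from a frequent itemset mining job on each zone's data, with only the rule-level comparison performed centrally.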