Decision tree construction for data mining on grid computing

Shun-Tzu Tsai, Chao-Tung Yang
Published in: IEEE International Conference on e-Technology, e-Commerce and e-Service, 2004 (EEE '04)
Publication date: 2004-03-28
DOI: 10.1109/EEE.2004.1287344 (https://doi.org/10.1109/EEE.2004.1287344)
Citations: 14

Abstract

Decision trees are among the most frequently used methods in data mining for finding predictive information. Because their characteristics suit parallelism, they have been widely adopted in the high-performance computing field, and various parallel decision tree algorithms have been developed to handle huge datasets and complex computation. Following developments in other technology fields, grid computing is regarded as an extension of the PC cluster, and its future research development is therefore highly valued. This new wave is the third generation of Internet applications, following traditional Internet and Web applications. We present a grid-based decision tree architecture and hope it can be applied to both parallel and sequential decision tree algorithms. Also, based on the scope and model of data mining applied in a grid environment, as well as the user-equivalent perspective, grid roles can be categorized into three types. We hope that, through these definitions, software developers can define clear system processes and differentiate the application scope of software applications. To realize our architecture, we first apply an existing parallel decision tree algorithm, SPRINT, in the grid environment. Performance and other differences are compared using datasets of different sizes. The experimental results serve as a reference for further development.
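
The abstract names SPRINT but does not describe it. As a rough illustration only, the sketch below shows the kind of split evaluation SPRINT is generally known for: scanning a presorted attribute list once while maintaining class histograms for the records below and above each candidate split, and scoring each candidate with the Gini index. The function names, the sample records, and the single-node setting are assumptions made for this example; nothing here is taken from the authors' grid implementation.

```python
from collections import Counter


def gini(counts, total):
    """Gini impurity of a class histogram holding `total` records."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_split(attribute_list):
    """Scan a presorted list of (value, label) pairs once.

    Returns (threshold, weighted_gini) for the best binary split of the
    form `value <= threshold`, or (None, inf) if no split is possible.
    """
    n = len(attribute_list)
    above = Counter(label for _, label in attribute_list)  # records not yet scanned
    below = Counter()                                      # records already scanned
    best = (None, float("inf"))
    for i in range(n - 1):
        value, label = attribute_list[i]
        below[label] += 1
        above[label] -= 1
        next_value = attribute_list[i + 1][0]
        if value == next_value:
            continue  # cannot split between two equal attribute values
        n_below = i + 1
        n_above = n - n_below
        score = (n_below / n) * gini(below, n_below) \
            + (n_above / n) * gini(above, n_above)
        if score < best[1]:
            best = ((value + next_value) / 2, score)
    return best


# Hypothetical (age, risk class) records, sorted once up front.
records = [(23, "high"), (17, "high"), (43, "high"),
           (68, "low"), (32, "low"), (20, "high")]
attribute_list = sorted(records)
threshold, score = best_split(attribute_list)
print(threshold, round(score, 4))
```

In a parallel or grid setting, each node would typically hold a portion of every attribute list and exchange histogram counts before the best split is chosen; that coordination step is omitted here.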