Map Reduce by K-Nearest Neighbor Joins

Srikanth Bethu, B. Babu, S. G. Rao, R. Florence
{"title":"Map Reduce by K-Nearest Neighbor Joins","authors":"Srikanth Bethu, B. Babu, S. G. Rao, R. Florence","doi":"10.1109/CYBERC.2018.00050","DOIUrl":null,"url":null,"abstract":"Knowledge discovery and Data mining plays a major role in computational intensive tasks with high range of applications. With the increase of volume and dimension of data, the distributed features perform operations in a reasonable period. MapReduce programming is suitable for distributed large scale data processing that provides different ways of solutions to the same problem, that (one) has particular constraints and properties. In this paper, we give comparative analysis and its approaches for computing KNN on MapReduce[1] theoretically and experimental evaluation. Load balancing, accuracy and complexity are analyzed on each step of data preprocessing, data partitioning and computation. The experiment results in this are produced by using variety of datasets. Time and Space complexity are analyzed periodically on each dataset and gives new advantages and short comings that are discussed for each algorithm. Finally this paper can be used as a reference material to handle KNN [2] based problems in the idea of Mapreducing in Big Data.","PeriodicalId":282903,"journal":{"name":"2018 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CYBERC.2018.00050","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Knowledge discovery and Data mining plays a major role in computational intensive tasks with high range of applications. With the increase of volume and dimension of data, the distributed features perform operations in a reasonable period. MapReduce programming is suitable for distributed large scale data processing that provides different ways of solutions to the same problem, that (one) has particular constraints and properties. In this paper, we give comparative analysis and its approaches for computing KNN on MapReduce[1] theoretically and experimental evaluation. Load balancing, accuracy and complexity are analyzed on each step of data preprocessing, data partitioning and computation. The experiment results in this are produced by using variety of datasets. Time and Space complexity are analyzed periodically on each dataset and gives new advantages and short comings that are discussed for each algorithm. Finally this paper can be used as a reference material to handle KNN [2] based problems in the idea of Mapreducing in Big Data.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
通过k近邻连接映射约简
知识发现和数据挖掘在应用范围广的计算密集型任务中起着重要作用。随着数据量和维数的增加,分布式特征在合理的周期内执行操作。MapReduce编程适用于分布式大规模数据处理,它为相同的问题提供了不同的解决方案,并且(一个)具有特定的约束和属性。本文对MapReduce[1]上的KNN计算方法进行了理论分析和实验评价。分析了数据预处理、数据划分和计算各步骤的负载均衡、精度和复杂度。实验结果是通过使用不同的数据集得出的。对每个数据集的时间和空间复杂度进行了周期性分析,并给出了每个算法的优点和缺点。最后,本文可以作为在大数据中mapreduce思想下处理基于KNN[2]的问题的参考材料。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Information Fusion VIA Optimized KECA with Application to Audio Emotion Recognition Application Research of YOLO v2 Combined with Color Identification Decentralized Cross-Layer Optimization for Energy-Efficient Resource Allocation in HetNets A Smart QoE Aware Network Selection Solution for IoT Systems in HetNets Based 5G Scenarios Improving Word Representation with Word Pair Distributional Asymmetry
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1