Big data: Mining of log file through hadoop

B. Kotiyal, Ankit Kumar, B. Pant, R. Goudar
2013 International Conference on Human Computer Interactions (ICHCI), published 2013-08-01. DOI: 10.1109/ICHCI-IEEE.2013.6887797
Citations: 18

Abstract

The steady growth of computing power over the past two decades has produced an enormous flow of data, now known as “big data”. Big data is data that cannot be processed with existing tools or techniques but that, once processed, can yield useful information, for example for analysing user behaviour or for business intelligence. This paper discusses the differences between traditional relational databases and big data, and describes the characteristics of big data. It also covers the distinct big data channels and processes, the challenges involved, and how big data offers a solution for organizations. Big data is concerned not only with storing and handling large volumes of data but also with analysing that data and extracting the correct information from it in less time. Finally, the paper discusses Hadoop, an open-source framework for the distributed processing of massive datasets on clusters of computers, and demonstrates it by extracting information from a log file in response to a user query.
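The abstract does not show how the log-file query is implemented, so the following is only a minimal sketch of the general map/shuffle/reduce pattern that Hadoop applies to such a job, run locally in plain Python rather than on a cluster. The log layout (Common Log Format), the function names (`map_line`, `reduce_counts`, `run_job`), and the sample query are all assumptions made for illustration and are not taken from the paper.

```python
import re
from itertools import groupby

# Assumed log layout (Common Log Format); the paper does not specify one.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)


def map_line(line, query_field, query_value, group_field):
    """Map step: emit (group key, 1) for every line that matches the query."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return  # skip malformed lines instead of failing the whole job
    record = match.groupdict()
    if record[query_field] == query_value:
        yield record[group_field], 1


def reduce_counts(key, values):
    """Reduce step: sum the counts emitted for one key."""
    return key, sum(values)


def run_job(lines, query_field, query_value, group_field):
    """Run map, shuffle (sort by key), and reduce in a single process."""
    pairs = []
    for line in lines:
        pairs.extend(map_line(line, query_field, query_value, group_field))
    pairs.sort(key=lambda kv: kv[0])  # stands in for Hadoop's shuffle phase
    return dict(
        reduce_counts(key, (count for _, count in group))
        for key, group in groupby(pairs, key=lambda kv: kv[0])
    )


# Hypothetical sample data, invented for this sketch.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Aug/2013:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [01/Aug/2013:10:00:05 +0000] "GET /missing HTTP/1.1" 404 0',
    '10.0.0.1 - - [01/Aug/2013:10:00:09 +0000] "GET /missing HTTP/1.1" 404 0',
    'not a log line',
    '10.0.0.2 - - [01/Aug/2013:10:01:00 +0000] "POST /login HTTP/1.1" 200 512',
]

if __name__ == "__main__":
    # Example user query: which paths returned status 404, and how often?
    print(run_job(SAMPLE_LOG, "status", "404", "path"))
```

On a real cluster the map and reduce functions would run as separate tasks over blocks of the log file stored in HDFS, and the framework would perform the sort between them; the single-process version here only mirrors that data flow.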