Ascendancy of MapReduce with Hadoop for Weather Data and Word Count Analytics

Sree Lakshmi K, Theertha Jayarajan N, Nitha L
DOI: 10.1109/ICOEI51242.2021.9452980
Venue: 2021 5th International Conference on Trends in Electronics and Informatics (ICOEI)
Published: 2021-06-03
URL: https://doi.org/10.1109/ICOEI51242.2021.9452980
Citations: 1

Abstract

Data arrives from many sources in structured, semi-structured, or unstructured form, and data of this kind is referred to as big data. Because of their large scale, rapid growth, and diverse formats, these datasets are difficult to manage with conventional tools and techniques. Big data analysis is a daunting activity because it requires large decentralized file systems that are adaptive, resilient, and responsive to faults. MapReduce is commonly used for the effective analysis of big data. Such analysis helps researchers, scholars, and business users extract value and knowledge. In the information age, huge amounts of data have become accessible to decision-makers, and because this data is growing rapidly, strategies for managing these datasets and obtaining value and knowledge from them must be studied and delivered. Moreover, decision-makers must be able to extract useful information from a dynamic and rapidly changing body of data that includes everything from daily transactions to customer contact and social media data. In this paper, we explore Hadoop's parallel processing power in two application areas. The first scenario computes the minimum and maximum temperature from a large volume of weather data collected from an open data source: the application analyses the entire weather station dataset and displays the minimum and maximum temperatures (in Fahrenheit) of each weather station. The second scenario is word count: it determines the frequency of each word in a given dataset, irrespective of the data volume.
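
The two scenarios in the abstract can be illustrated with a minimal sketch. It is not the authors' Hadoop implementation and does not use any Hadoop API; it simulates the map, shuffle, and reduce phases in a single Python process. The record layout "station_id,date,temperature_F", the sample data, and every function name below are assumptions made for illustration only.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Local stand-in for a MapReduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        # Map phase: each record may emit zero or more (key, value) pairs.
        for key, value in mapper(record):
            # Shuffle phase: collect all values that share a key.
            groups[key].append(value)
    # Reduce phase: collapse each key's values into a single result.
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


# Scenario 1: minimum and maximum temperature per weather station.
# Assumed (hypothetical) input format: "station_id,date,temperature_F".
def temperature_mapper(line):
    station, _date, temperature = line.strip().split(",")
    yield station, float(temperature)


def temperature_reducer(_station, temperatures):
    return min(temperatures), max(temperatures)


# Scenario 2: frequency of each word.
def word_mapper(line):
    for word in line.lower().split():
        yield word, 1


def word_reducer(_word, counts):
    return sum(counts)


# Made-up sample data, not taken from the paper.
weather_lines = [
    "ST001,2021-01-01,31.5",
    "ST001,2021-01-02,28.0",
    "ST002,2021-01-01,55.2",
    "ST001,2021-01-03,40.1",
    "ST002,2021-01-02,49.9",
]
text_lines = [
    "big data needs big tools",
    "data flows from many sources",
]

station_extremes = map_reduce(weather_lines, temperature_mapper, temperature_reducer)
word_counts = map_reduce(text_lines, word_mapper, word_reducer)

print(station_extremes)
print(word_counts)
```

Both jobs share the same driver and differ only in their mapper and reducer. On a real Hadoop cluster the map and reduce tasks would run in parallel across nodes, with the framework performing the grouping step between them.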