Research on Advanced Streaming Processing on Apache Spark

International Journal of Industrial Engineering and Production Research (JCR Q3, Decision Sciences) | Pub Date: 2021-01-10 | pp. 133-141 | DOI: 10.22068/IJIEPR.32.1.133

A.K.V.K Sasikanthr, K. Samatha, N. Deshai, B. Sekhar, S. Venkatramana


Citations: 0

Abstract

Computation in today's digital world is demanding: a wide variety of applications must process and store datasets of enormous size. Much of this data is unstructured, is generated at very high velocity, and keeps growing day by day. Over the last decade, many organizations have struggled to handle and process such massive volumes of data, which existing conventional technologies could not process efficiently because they had not been sufficiently enhanced. This paper addresses how to overcome these problems with open-source Hadoop, a widely used and powerful data processing tool whose core component, MapReduce, is subject to a few performance issues. The objective of the paper is to address and overcome the limitations and weaknesses of MapReduce by using Apache Spark.
