Job starvation avoidance with alleviation of data skewness in Big Data infrastructure

2017 2nd International Conference on Computing and Communications Technologies (ICCCT) Pub Date : 2017-02-01 DOI:10.1109/ICCCT2.2017.7972264

Sankari Subbiah, S. Mala, S. Nayagam


Citations: 3

Abstract

As demand for big data processing surges, Hadoop is a cloud-based platform that has been heavily promoted as a solution to the business world's big data problems. Jobs consisting of large data sets are executed in parallel through MapReduce in the Hadoop cluster. A job's completion time depends on its slowest-running task: if one particular task takes longer to finish, the entire job is extended by this delayer. Data skewness refers to an imbalance in the amount of data allocated to each individual task. An efficient dynamic data-splitting approach on Hadoop, called the Hybrid scheduler, monitors samples while batch jobs run and allocates resources to slave nodes depending on the complexity of the data and the time taken for processing. In this paper, the effectiveness of web swarming is showcased using Hadoop in scenarios of detecting and eliminating Distributed Denial of Service (DDoS) attacks on Web servers. In traditional Hadoop clusters, query processing is done through MapReduce; the proposed system replaces it with a Block chain query processing algorithm. This improves the execution time of the assigned tasks in the proposed system and mitigates data skewness. The main aim of this paper is to avoid job starvation, thereby minimizing response time during processing and mitigating the data skewness of the existing system.
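The abstract's point that a job finishes only when its slowest task does, and that uneven data allocation drives that delay, can be illustrated with a minimal sketch. This is not the paper's Hybrid scheduler. The function names, the split sizes, the fixed processing rate, and the even re-split are all hypothetical assumptions made for illustration.

```python
from typing import List


def job_completion_time(task_sizes: List[float], rate: float = 1.0) -> float:
    """Tasks run in parallel, so the job ends when the slowest task ends.

    Assumes every task processes data at the same fixed `rate`.
    """
    return max(size / rate for size in task_sizes)


def skew_ratio(task_sizes: List[float]) -> float:
    """Largest split divided by the mean split; 1.0 means perfectly balanced."""
    mean = sum(task_sizes) / len(task_sizes)
    return max(task_sizes) / mean


def rebalance(task_sizes: List[float]) -> List[float]:
    """Hypothetical even re-split of the same total data over the same task count."""
    share = sum(task_sizes) / len(task_sizes)
    return [share] * len(task_sizes)


# Illustrative split sizes (arbitrary units): one task receives most of the data.
skewed = [10.0, 10.0, 10.0, 70.0]
balanced = rebalance(skewed)

print(job_completion_time(skewed), skew_ratio(skewed))
print(job_completion_time(balanced), skew_ratio(balanced))
```

Under these assumptions, the total amount of data is unchanged by rebalancing, yet the completion time falls because no single task holds a disproportionate share.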


