Title: A predictive approach to task scheduling for Big Data in cloud environments using classification algorithms
Authors: Vidushi Vashishth, Anshuman Chhabra, A. Sood
Venue: 2017 7th International Conference on Cloud Computing, Data Science & Engineering - Confluence, pp. 188-192
Published: 2017-01-01
DOI: 10.1109/CONFLUENCE.2017.7943147
Citation count: 12
Abstract
Recent work has increasingly integrated the Cloud with the Internet of Things (IoT), which comprises up-and-coming technologies such as Smart Cities and Smart devices. This federation has directed research towards further integration of Big Data with the Cloud, since the IoT devices underlying such technologies generate a continuous stream of sensor data. In this paper, we present a predictive approach to task scheduling that aims to reduce the overhead incurred when Big Data is processed on the Cloud and, in turn, to increase both the efficiency and the reliability of the Cloud network while handling Big Data. We use Machine Learning classification as a tool for scheduling tasks and assigning them to Virtual Machines (VMs) in the Cloud environment. A comparative study is undertaken to observe which family of classifiers performs best in this scenario. Particle Swarm Optimization (PSO) is used to generate the dataset on which the classifiers are trained. A number of classification algorithms, such as Naive Bayes, Random Forest and K-Nearest Neighbor, are then used to predict the VM best suited to each task in the test dataset.
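The pipeline outlined in the abstract (build a labelled set of task-to-VM assignments, train a classifier on it, then predict a VM for unseen tasks) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the VM catalogue `VMS`, the task features (length and deadline), and the labelling rule `best_vm` are all hypothetical, and the labelling rule is a simple cost heuristic standing in for the PSO step, which the abstract does not describe in detail. Of the classifiers named, only a plain k-nearest-neighbor vote is shown.

```python
import math
import random
from collections import Counter

# Hypothetical VM catalogue: (speed in MIPS, cost per second). Illustrative values only.
VMS = [(500, 1.0), (1000, 2.5), (2000, 6.0)]


def best_vm(length, deadline):
    """Stand-in labelling rule (NOT PSO): pick the cheapest VM that finishes
    the task within its deadline; fall back to the fastest VM if none can."""
    feasible = [i for i, (mips, _) in enumerate(VMS) if length / mips <= deadline]
    if not feasible:
        return len(VMS) - 1
    # Total cost of a task on VM i = cost per second * execution time.
    return min(feasible, key=lambda i: VMS[i][1] * length / VMS[i][0])


def make_dataset(n, seed=0):
    """Generate n synthetic tasks as ((scaled_length, deadline), vm_index) pairs."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        length = rng.uniform(1000, 20000)   # task length in million instructions
        deadline = rng.uniform(1, 20)       # deadline in seconds
        # Scale length so both features span a similar range for distance-based voting.
        features = (length / 1000.0, deadline)
        data.append((features, best_vm(length, deadline)))
    return data


def knn_predict(train, x, k=3):
    """Predict a VM index for feature vector x by majority vote of the k nearest tasks."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    train = make_dataset(500, seed=1)
    test = make_dataset(200, seed=2)
    correct = sum(knn_predict(train, x) == y for x, y in test)
    print(f"accuracy on held-out tasks: {correct / len(test):.2f}")
```

In a comparative study like the one the abstract describes, `knn_predict` would be swapped for other classifiers (for example Naive Bayes or Random Forest) and the held-out accuracy compared across them on the same train and test split.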