Bejo: Behavior Based Job Classification for Resource Consumption Prediction in the Cloud
Authors: Lin Xu, Jiannong Cao, Yan Wang, Lei Yang, Jing Li
Published in: 2014 IEEE 6th International Conference on Cloud Computing Technology and Science
Publication date: 2014-12-15
DOI: 10.1109/CLOUDCOM.2014.48 (https://doi.org/10.1109/CLOUDCOM.2014.48)
Citations: 6
Abstract
Resource prediction (e.g., of CPU and memory utilization) for cloud computing jobs has attracted a substantial amount of attention. Existing work uses regression methods based on historical job information, under the impractical assumption that the job to be predicted belongs to the same class as the historical jobs. To address this problem, we propose taking the job category into account for effective resource prediction. Existing work on job classification either ignores the temporal variance of resource consumption during job execution or uses it in a naive way, resulting in unsatisfactory classification accuracy, slow speed, or both. In this paper, we introduce a new and efficient job classification approach called Bejo. Inspired by textual document classification methods, which use the distribution of words to describe and classify a document, Bejo treats a job as a document, assigns each collected resource consumption snapshot to a "resource word", and uses the distribution of these words to describe and classify the job. An ℓ1-norm minimization formulation is used to assign each resource snapshot to a resource word, specifically to address the challenges of high noise and tight time budgets in cloud job classification. We collect a comprehensive dataset for job classification and resource consumption prediction on cloud platforms and demonstrate the superior quality and efficiency of Bejo over state-of-the-art algorithms. Experiments also show that the relative error of resource consumption prediction can be dramatically reduced by adding a job classification step to existing regression methods.
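The "job as a document" idea can be illustrated with a minimal sketch. The abstract does not spell out the ℓ1-norm minimization formulation, so the code below substitutes a deliberately simpler stand-in: each snapshot is assigned to the nearest codebook entry under ℓ1 (Manhattan) distance. The codebook values, the two-dimensional (CPU, memory) snapshot layout, and all function names are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from collections import Counter


def l1_distance(a, b):
    """Sum of absolute coordinate differences between two snapshots."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign_word(snapshot, codebook):
    """Return the index of the codebook entry nearest to the snapshot.

    Assumption: nearest-neighbour assignment under l1 distance is a
    simplified stand-in, not the paper's l1-norm minimization formulation.
    """
    return min(range(len(codebook)),
               key=lambda k: l1_distance(snapshot, codebook[k]))


def job_histogram(snapshots, codebook):
    """Describe a job by the normalized distribution of its resource words."""
    if not snapshots:
        raise ValueError("a job needs at least one resource snapshot")
    counts = Counter(assign_word(s, codebook) for s in snapshots)
    return [counts[k] / len(snapshots) for k in range(len(codebook))]


# Hypothetical codebook of three "resource words" as (cpu, memory) utilization.
CODEBOOK = [(0.1, 0.1), (0.9, 0.2), (0.3, 0.8)]

# Hypothetical job: four snapshots collected during execution.
job = [(0.85, 0.25), (0.95, 0.15), (0.2, 0.1), (0.8, 0.3)]

hist = job_histogram(job, CODEBOOK)
print(hist)
```

The resulting histogram is a fixed-length feature vector, so jobs with different numbers of snapshots can be compared and passed to any standard classifier, in the same way that word-frequency vectors are used for text documents.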