Transfer learning for server behavior classification in small IT environments
Jasmina Bogojeska, Dorothea Wiesmann
NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium, pp. 1-9
Published: 2018-04-23
DOI: 10.1109/NOMS.2018.8406251 (https://doi.org/10.1109/NOMS.2018.8406251)
Citations: 3
Abstract
Technology refresh is an important component of data-center management, and it must be properly justified because of its high cost and the associated migration risk. This paper supports the technology refresh decision process for small target IT environments with a statistical learning method that automatically identifies and ranks servers with problematic behavior, based on incident ticket and server attribute data. Because IT environments are heterogeneous, in practice a separate model is trained for each of them. To address the small sample sizes available for many IT environments, we develop a random forest transfer learning solution that selectively leverages information from large IT environments. For each target IT environment, it trains a model using resampling weights derived so that the distribution of the pooled examples from the large IT environments matches the distribution of the small target IT environment. In this way, a tailored predictive model built on the information available from many large IT environments provides good-quality predictions for small IT environments. We demonstrate the superior prediction quality of our model on a large set of real data.
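
The resampling idea in the abstract can be illustrated with a small sketch. The abstract does not say how the weights are derived, so this example swaps in a common stand-in: a class-balanced logistic regression that separates target rows from pooled source rows, whose odds approximate the density ratio between the two distributions. All function names, the synthetic one-feature data, and the hyperparameters are assumptions made for illustration. The random forest itself is omitted; under this reading, each tree would be grown on a weighted bootstrap sample like the one drawn at the end.

```python
import math
import random


def fit_domain_classifier(source, target, epochs=1000, lr=0.1):
    """Logistic regression separating target rows (1) from source rows (0).

    Rows are class-balanced, so the fitted odds approximate the density
    ratio p_target(x) / p_source(x). This is an assumed stand-in, not the
    weight derivation used in the paper.
    """
    rows = [(x, 0.0, 0.5 / len(source)) for x in source]
    rows += [(x, 1.0, 0.5 / len(target)) for x in target]
    dim = len(source[0])
    coef, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_coef, grad_bias = [0.0] * dim, 0.0
        for x, y, row_weight in rows:
            z = bias + sum(c * v for c, v in zip(coef, x))
            err = row_weight * (1.0 / (1.0 + math.exp(-z)) - y)
            grad_bias += err
            for j, v in enumerate(x):
                grad_coef[j] += err * v
        # Plain batch gradient descent on the weighted log-loss.
        coef = [c - lr * g for c, g in zip(coef, grad_coef)]
        bias -= lr * grad_bias
    return coef, bias


def resampling_weights(source, target):
    """Return one normalized resampling weight per pooled source row."""
    coef, bias = fit_domain_classifier(source, target)
    odds = [math.exp(bias + sum(c * v for c, v in zip(coef, x))) for x in source]
    total = sum(odds)
    return [o / total for o in odds]


def weighted_bootstrap(rows, labels, weights, size, rng):
    """Draw a bootstrap sample of source rows according to the weights."""
    picks = rng.choices(range(len(rows)), weights=weights, k=size)
    return [rows[i] for i in picks], [labels[i] for i in picks]


# Synthetic stand-in data: one feature per server, two large source
# environments (centered at 0 and 3) and a small target centered at 3.
rng = random.Random(0)
source = [[rng.gauss(0.0, 1.0)] for _ in range(200)]
source += [[rng.gauss(3.0, 1.0)] for _ in range(200)]
source_labels = [rng.random() < 0.2 for _ in source]  # hypothetical "problematic" flags
target = [[rng.gauss(3.0, 1.0)] for _ in range(20)]

weights = resampling_weights(source, target)
sample_rows, sample_labels = weighted_bootstrap(
    source, source_labels, weights, 400, rng
)
```

In this toy setup, source rows that resemble the target environment receive larger weights, so the resampled training set shifts toward the target distribution while still drawing on the much larger source pool.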