Tanghui Wang, Ruochen Huang, Xin Wei, Fang Zhou
2016 International Computer Symposium (ICS), published 2016-12-01. DOI: 10.1109/ICS.2016.0142 (https://doi.org/10.1109/ICS.2016.0142)
Improving User's Quality of Experience in Imbalanced Dataset
Traditional algorithms currently perform poorly at predicting user complaints in imbalanced IPTV datasets. To address this problem, we combine status data from the set-top box with user complaint data and select an appropriate model to predict the user's quality of experience (QoE). Concretely, we first clean the data and select suitable attributes from the original dataset. We then apply random under-sampling and synthetic over-sampling to the preprocessed dataset. To obtain better performance, we improve the Synthetic Minority Over-sampling Technique (SMOTE) algorithm and combine it with the K-means algorithm to generate a new dataset. After these procedures, we apply the Naïve Bayes (NB) model to the user complaint dataset. Extensive experimental results show that this integrated algorithm outperforms the Borderline-SMOTE algorithm in predicting user complaints.
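The two resampling steps named in the abstract can be illustrated with a minimal sketch. It assumes the standard formulations only: random under-sampling keeps a random subset of the majority class, and SMOTE-style over-sampling creates synthetic minority rows by interpolating between a minority row and one of its k nearest minority neighbours. The abstract does not describe the authors' SMOTE improvement or how K-means is combined with it, so neither is reproduced here. The function names, the toy two-feature data, and all parameter values are hypothetical.

```python
import math
import random


def random_undersample(majority, n_keep, rng):
    """Keep a random subset of at most n_keep majority-class rows."""
    return rng.sample(majority, min(n_keep, len(majority)))


def smote(minority, n_new, k, rng):
    """Create n_new synthetic rows by standard SMOTE-style interpolation.

    Each synthetic row lies on the segment between a random minority row
    and one of its k nearest minority neighbours (Euclidean distance).
    Requires at least two minority rows.
    """
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append(
            tuple(b + gap * (t - b) for b, t in zip(base, target))
        )
    return synthetic


# Toy data (hypothetical): 200 "no complaint" rows, 10 "complaint" rows.
rng = random.Random(0)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
minority = [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(10)]

kept = random_undersample(majority, 50, rng)
new_rows = smote(minority, 40, k=3, rng=rng)
balanced_minority = minority + new_rows

print(len(kept), len(balanced_minority))  # prints: 50 50
```

In this sketch both classes end up with 50 rows, and the resampled set could then be passed to a classifier such as Naïve Bayes. Because every synthetic row is a convex combination of two real minority rows, the synthetic rows stay inside the region the minority class already occupies.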