Filtering spam in Weibo using ensemble imbalanced classification and knowledge expansion
Authors: Zhipeng Jin, Qiudan Li, D. Zeng, Lei Wang
Published in: 2015 IEEE International Conference on Intelligence and Security Informatics (ISI)
Publication date: 2015-05-27
DOI: 10.1109/ISI.2015.7165952 (https://doi.org/10.1109/ISI.2015.7165952)
Citations: 15
Abstract
Weibo has become an important information-sharing platform in daily life in China. Many applications use Weibo data to analyze hot topics and opinion evolution patterns in order to gain insight into user behavior. However, spam messages degrade the performance of these applications and therefore need to be filtered out. In this paper, we propose a unified spam detection approach that uses external knowledge sources to expand keyword features and applies an ensemble under-sampling strategy to handle the class-imbalance problem. Experimental results demonstrate the effectiveness and robustness of our approach on Weibo data.
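The general idea of ensemble under-sampling can be illustrated with a minimal sketch. The abstract does not state the base classifier, the number of subsets, or the sampling details, so everything below is an assumption made for illustration: the function names are hypothetical, the base learner is a simple nearest-centroid rule, and the ensemble combines members by majority vote. Each member is trained on the full minority (spam) class plus an equally sized random sample of the majority (non-spam) class.

```python
import random
from collections import Counter


def balanced_subsets(majority, minority, n_subsets, seed=0):
    """Pair the full minority class with n_subsets random under-samples
    of the majority class, each the same size as the minority class."""
    rng = random.Random(seed)
    k = min(len(minority), len(majority))
    return [rng.sample(majority, k) + list(minority) for _ in range(n_subsets)]


def train_centroid(samples):
    """Hypothetical base learner: store the mean feature vector per class."""
    sums, counts = {}, Counter()
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_centroid(model, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def train_ensemble(majority, minority, n_subsets=5, seed=0):
    """Train one base learner per balanced subset."""
    subsets = balanced_subsets(majority, minority, n_subsets, seed)
    return [train_centroid(subset) for subset in subsets]


def predict_ensemble(models, features):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_centroid(model, features) for model in models)
    return votes.most_common(1)[0][0]
```

Here each sample is a `(features, label)` pair, where `features` is a list of numbers (for example, keyword counts) and `label` is a string such as `"spam"` or `"ham"`. Because every member sees a balanced training set, no single model is dominated by the majority class, while the ensemble as a whole still draws on many different majority-class samples.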