SSDMV: Semi-Supervised Deep Social Spammer Detection by Multi-view Data Fusion

Chaozhuo Li, Senzhang Wang, Lifang He, Philip S. Yu, Yanbo Liang, Zhoujun Li
Published in: 2018 IEEE International Conference on Data Mining (ICDM), 2018-11-01
DOI: 10.1109/ICDM.2018.00040 (https://doi.org/10.1109/ICDM.2018.00040)
Citations: 27

Abstract

The explosive use of social media makes it a popular platform for malicious users, known as social spammers, to overwhelm legitimate users with unwanted content. Most existing social spammer detection approaches are supervised and need a large amount of manually labeled data for training, which is infeasible in practice. To address this issue, some semi-supervised models have been proposed that incorporate side information such as user profiles and posted tweets. However, these shallow models cannot effectively learn the user representations needed for spammer detection, and they usually couple the multi-view data loosely, without considering the correlations among views. In this paper, we propose a Semi-Supervised Deep social spammer detection model by Multi-View data fusion (SSDMV). The key idea is to learn task-relevant, discriminative user representations in order to address the challenge of annotation scarcity. Under a unified semi-supervised learning framework, we first design a deep multi-view feature learning module that fuses information from different views, and then propose a label inference module that predicts labels for users. The mutual refinement between the two modules enables SSDMV to both generate high-quality features and make accurate predictions. Empirically, we evaluate SSDMV on three tasks over two real social network datasets, and the results demonstrate that SSDMV significantly outperforms state-of-the-art methods.
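The abstract does not give the architecture or loss of SSDMV, so the following is only a generic illustration of the two ideas it names: fusing several views into one user representation, and training with both labeled and unlabeled users. Everything in it is an assumption made for illustration, not the authors' method: the two views, the concatenation-based fusion, the reconstruction term used for unlabeled users, and the names `TwoViewModel`, `fuse`, `loss`, and `alpha`.

```python
import math
import random


def linear(x, w, b):
    """Affine map: x is a vector, w is a list of rows (out_dim x in_dim)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def tanh_vec(v):
    return [math.tanh(a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def init(out_dim, in_dim, rng):
    """Small random weights and zero biases for one layer."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    return w, [0.0] * out_dim


class TwoViewModel:
    """Hypothetical two-view model: per-view encoders, fused code, two heads."""

    def __init__(self, dim_a, dim_b, hidden, seed=0):
        rng = random.Random(seed)
        self.enc_a = init(hidden, dim_a, rng)      # encoder for view A
        self.enc_b = init(hidden, dim_b, rng)      # encoder for view B
        self.dec_a = init(dim_a, 2 * hidden, rng)  # reconstructs view A
        self.dec_b = init(dim_b, 2 * hidden, rng)  # reconstructs view B
        self.clf = init(2, 2 * hidden, rng)        # spammer / legitimate head

    def fuse(self, xa, xb):
        """Encode each view, then fuse by concatenation (an assumed choice)."""
        ha = tanh_vec(linear(xa, *self.enc_a))
        hb = tanh_vec(linear(xb, *self.enc_b))
        return ha + hb  # list concatenation

    def loss(self, xa, xb, label=None, alpha=1.0):
        """Reconstruction loss for every user; add cross-entropy if labeled."""
        z = self.fuse(xa, xb)
        recon_a = linear(z, *self.dec_a)
        recon_b = linear(z, *self.dec_b)
        recon = sum((p - q) ** 2 for p, q in zip(recon_a + recon_b, xa + xb))
        if label is None:
            return recon  # unlabeled user: unsupervised term only
        probs = softmax(linear(z, *self.clf))
        return recon + alpha * -math.log(probs[label])
```

Calling `loss(xa, xb)` without a label yields only the unsupervised term, while `loss(xa, xb, label=1)` adds the supervised term weighted by `alpha`. Because both terms depend on the same fused representation, minimizing their sum over all users is one simple way to let feature learning and label prediction inform each other. The sketch computes the loss only; gradient-based training is omitted.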