CALEB: A Conditional Adversarial Learning Framework to enhance bot detection
Ilias Dimitriadis, George Dialektakis, Athena Vakali
Data & Knowledge Engineering, Volume 149, Article 102245. Published 2023-11-14.
DOI: 10.1016/j.datak.2023.102245
https://www.sciencedirect.com/science/article/pii/S0169023X23001052
Abstract
The rapid growth of Online Social Networks (OSNs) over the last few years has allowed automated accounts, known as social bots, to gain ground. As other researchers have highlighted, many of these bots have malicious purposes and tend to mimic human behavior, posing high-level security threats to OSN platforms. Moreover, recent studies have shown that social bots evolve over time, developing unforeseen and sophisticated characteristics that allow them to evade current state-of-the-art machine learning bot detection systems. This work is motivated by the critical need for adaptive bot detection methods that proactively capture unseen, evolved bots and so support healthier OSN interactions. In contrast with most earlier supervised ML approaches, which cannot effectively detect new types of bots, this paper proposes CALEB, a robust end-to-end proactive framework based on the Conditional Generative Adversarial Network (CGAN) and its extension, the Auxiliary Classifier GAN (AC-GAN), which simulates bot evolution by creating realistic synthetic instances of different bot types. These simulated evolved bots augment existing bot datasets and therefore enhance the detection of emerging generations of bots before they even appear. Furthermore, we show that our augmentation approach outperforms earlier augmentation techniques, which fail to simulate evolving bots. Extensive experimentation on well-established public bot datasets shows that our approach offers a performance boost of up to 10% in the detection of new, unseen bots. Finally, the AC-GAN Discriminator, used as a bot detector, outperforms former ML approaches, showcasing the efficiency of our end-to-end framework.
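The abstract names the AC-GAN but does not state its training objective. As a rough illustration only, the sketch below computes the two log-likelihood terms of the AC-GAN objective as generally described in the literature: a source term (real versus synthetic) and a class term (here, the bot type). The function name `acgan_losses`, the tuple layout, and the toy probabilities are assumptions made for this example and are not taken from CALEB; the paper's actual losses, architecture, and features may differ.

```python
import math


def acgan_losses(real_batch, fake_batch):
    """Compute the generic AC-GAN objective terms (hypothetical helper).

    Each batch is a list of (p_real, class_probs, true_class) tuples:
      p_real      -- discriminator's probability that the sample is real
      class_probs -- auxiliary classifier's distribution over classes
      true_class  -- index of the sample's true (or conditioned) class
    """
    eps = 1e-12  # guards against log(0)

    def mean(values):
        return sum(values) / len(values)

    # Source term L_S: log-likelihood of the correct source (real or fake).
    l_s = mean([math.log(p + eps) for p, _, _ in real_batch]) + mean(
        [math.log(1.0 - p + eps) for p, _, _ in fake_batch]
    )

    # Class term L_C: log-likelihood of the correct class on both batches.
    l_c = mean([math.log(probs[c] + eps) for _, probs, c in real_batch]) + mean(
        [math.log(probs[c] + eps) for _, probs, c in fake_batch]
    )

    return {
        "L_S": l_s,
        "L_C": l_c,
        "discriminator_objective": l_s + l_c,  # discriminator maximizes L_S + L_C
        "generator_objective": l_c - l_s,      # generator maximizes L_C - L_S
    }


if __name__ == "__main__":
    # Toy values: one real sample of class 0 and one synthetic sample of class 1.
    real = [(0.9, [0.8, 0.2], 0)]
    fake = [(0.2, [0.3, 0.7], 1)]
    print(acgan_losses(real, fake))
```

Because the discriminator is trained to predict the class as well as the source, its class head can in principle be reused as a classifier after training, which is consistent with the abstract's use of the AC-GAN Discriminator as a bot detector.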
About the journal:
Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between the two related fields of data engineering and knowledge engineering. DKE reaches a worldwide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of data and knowledge systems.