Deeply fused flow and topology features for botnet detection based on a pretrained GCN

IF 4.3 | CAS Zone 3 (Computer Science) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Computer Communications | Pub Date: 2025-03-01 | Epub Date: 2025-01-27 | DOI: 10.1016/j.comcom.2025.108084
Xiaoyuan Meng, Bo Lang, Yuhao Yan, Yanxi Liu
Computer Communications, Volume 233, Article 108084. Journal Article. https://www.sciencedirect.com/science/article/pii/S0140366425000416
Citations: 0

Abstract

The characteristics of botnets are reflected mainly in their network behaviors and in the communication relationships among their bots. Existing botnet detection methods typically use only one kind of feature, either flow features or topological features; relying on one type ignores the information carried by the other, which limits model performance. In this paper, we propose, for the first time, a botnet detection model that uses a graph convolutional network (GCN) to deeply fuse flow features and topological features. We construct communication graphs from network traffic and represent node attributes with flow features. The extreme sample imbalance in existing public traffic datasets makes training a GCN model on them impractical. To address this problem, we propose a pretrained GCN framework: the GCN model is pretrained on a public, balanced, artificial communication graph dataset, and the output of its last hidden layer, which contains both flow and topology information, is fed into an Extra Tree classification model. Furthermore, our model can effectively detect both command-and-control (C2) and peer-to-peer (P2P) botnets by simply adjusting the number of GCN layers. Experimental results on public datasets demonstrate that our approach outperforms current state-of-the-art botnet detection models. In addition, our model performs well in real-world botnet detection scenarios.
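The pipeline described in the abstract (communication graph, flow features as node attributes, GCN hidden-layer output as the fused representation) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the flow record layout, the two per-node features (mean bytes and mean packets per flow), the undirected edges, the placeholder weight matrices standing in for pretrained weights, and the use of the standard GCN propagation rule ReLU(D^{-1/2}(A + I)D^{-1/2} H W). The abstract does not specify the authors' feature set or architecture, so this is not their implementation.

```python
import math
from collections import defaultdict

# Hypothetical flow records: (src_ip, dst_ip, bytes, packets). Illustrative only.
FLOWS = [
    ("10.0.0.1", "10.0.0.2", 1200, 10),
    ("10.0.0.1", "10.0.0.3", 300, 4),
    ("10.0.0.2", "10.0.0.3", 800, 6),
    ("10.0.0.4", "10.0.0.1", 100, 2),
]

# Placeholder weights standing in for a pretrained model (assumed values).
W1 = [[0.001, -0.002, 0.0005], [0.1, 0.05, -0.02]]  # 2 input features -> 3 hidden
W2 = [[0.5, -0.1], [0.2, 0.3], [-0.4, 0.6]]         # 3 hidden -> 2 hidden


def build_graph(flows):
    """Return (nodes, adjacency matrix, node feature matrix) from flow records."""
    nodes = sorted({ip for src, dst, _, _ in flows for ip in (src, dst)})
    index = {ip: i for i, ip in enumerate(nodes)}
    n = len(nodes)
    adj = [[0.0] * n for _ in range(n)]
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # bytes, packets, flow count
    for src, dst, nbytes, npkts in flows:
        i, j = index[src], index[dst]
        adj[i][j] = adj[j][i] = 1.0  # undirected communication edge
        for ip in (src, dst):
            totals[ip][0] += nbytes
            totals[ip][1] += npkts
            totals[ip][2] += 1
    # Node attributes: mean bytes and mean packets over flows touching the node.
    feats = [[totals[ip][0] / totals[ip][2], totals[ip][1] / totals[ip][2]]
             for ip in nodes]
    return nodes, adj, feats


def normalize_adjacency(adj):
    """Compute D^{-1/2} (A + I) D^{-1/2}, where D is the degree matrix of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(a_norm, h, w):
    """One GCN layer: ReLU(a_norm @ h @ w)."""
    z = matmul(matmul(a_norm, h), w)
    return [[max(0.0, v) for v in row] for row in z]


def embed(flows, weights):
    """Stack one GCN layer per weight matrix and return per-node embeddings."""
    nodes, adj, feats = build_graph(flows)
    a_norm = normalize_adjacency(adj)
    h = feats
    for w in weights:
        h = gcn_layer(a_norm, h, w)
    return nodes, h


if __name__ == "__main__":
    for ip, vec in zip(*embed(FLOWS, [W1, W2])):
        print(ip, vec)
```

Each stacked layer mixes in information from one more hop of neighbors, which is one plausible reading of why changing the layer count could matter for differently shaped botnets. In a fuller version, the embeddings returned by `embed` would be passed to a tree-ensemble classifier such as scikit-learn's `ExtraTreesClassifier`; that step is omitted here to keep the sketch dependency-free.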
Source journal: Computer Communications (Engineering & Technology: Telecommunications)
CiteScore: 14.10
Self-citation rate: 5.00%
Articles published: 397
Review time: 66 days
About the journal: Computer and communication networks are key infrastructures of the information society with high socio-economic value, as they contribute to the correct operation of many critical services (from healthcare to finance and transportation). The Internet is the core of today's computer-communication infrastructures. It has been transformed from a robust network for data transfer between computers into a global, content-rich communication and information system in which content is increasingly generated by users and distributed according to human social relations. Next-generation network technologies, architectures, and protocols are therefore required to overcome the limitations of the legacy Internet and to add new capabilities and services. The future Internet should be ubiquitous, secure, resilient, and closer to human communication paradigms. Computer Communications is a peer-reviewed international journal that publishes high-quality scientific articles (both theory and practice) and survey papers covering all aspects of future computer communication networks (on all layers except the physical layer), with special attention to the evolution of the Internet architecture, protocols, services, and applications.