MDD-FedGNN: A vertical federated graph learning framework for malicious domain detection
Sanfeng Zhang, Qingyu Hao, Zijian Gong, Fengzhou Zhu, Yan Wang, Wang Yang
Computers & Security, Volume 147, Article 104093. Published 2024-08-31.
DOI: 10.1016/j.cose.2024.104093
URL: https://www.sciencedirect.com/science/article/pii/S0167404824003985
Impact factor: 4.8. JCR: Q1 (Computer Science, Information Systems).
Citations: 0
Abstract
The domain name system (DNS) is a fundamental component of the Internet infrastructure, but attackers also exploit it in various cyber-crimes, which makes malicious domain detection (MDD) important. Recent work shows that graph-based models can infer malicious domains and achieve superior performance. However, acquiring large-scale, high-quality graph datasets for MDD is difficult for an individual security institute. A promising research direction is therefore to employ a vertical federated graph learning scheme that unites diverse security institutes and enhances their local datasets, resulting in more robust and powerful detection models. Nonetheless, directly applying vertical federated graph neural networks to MDD runs into noisy labels and noisy edges among security institutes, which ultimately diminish detection performance. This paper introduces a novel vertical federated learning framework, called MDD-FedGNN, that applies contrastive learning with two different encoders to deal with noisy labels and employs a new loss function based on information bottleneck theory to handle noisy edges. Comparative experiments on a publicly available DNS dataset evaluate how effectively MDD-FedGNN addresses noisy labels and edges in vertical federated graph learning. The results demonstrate that MDD-FedGNN outperforms baseline methods, confirming the feasibility of training more powerful malicious domain detection models through data sharing and vertical federated learning among different security agencies.
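The abstract does not state the form of the contrastive objective, so the following is only a generic illustration of how agreement between two encoders can be scored. It assumes an InfoNCE-style loss with cosine similarity, which may differ from what MDD-FedGNN actually uses; the function names, the temperature value, and the toy embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss over nodes.

    Row i of view_a and row i of view_b are embeddings of the same node
    produced by two different encoders (the positive pair); every other
    row of view_b acts as a negative for that node.
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Hypothetical embeddings of three domain nodes from two encoders.
encoder_1 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
encoder_2_agree = [[0.9, 0.1], [0.1, 0.9], [-0.9, -0.1]]
# Same vectors, but assigned to the wrong nodes.
encoder_2_shuffled = [encoder_2_agree[1], encoder_2_agree[2], encoder_2_agree[0]]

loss_agree = info_nce(encoder_1, encoder_2_agree)
loss_shuffled = info_nce(encoder_1, encoder_2_shuffled)
print(loss_agree < loss_shuffled)
```

In this sketch the loss is lower when the two encoders place the same node close together than when their outputs are mismatched. Under this assumed setup, that gives a training signal that does not depend on the possibly noisy labels.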
About the journal:
Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world.
Computers & Security provides you with a unique blend of leading-edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors: industry, commerce and academia. Recognized worldwide as the primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.