Integrating an Attention Mechanism and Deep Neural Network for Detection of DGA Domain Names

2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI) Pub Date : 2019-11-01 DOI:10.1109/ICTAI.2019.00121

Fangli Ren, Zhengwei Jiang, Jian Liu

{"title":"Integrating an Attention Mechanism and Deep Neural Network for Detection of DGA Domain Names","authors":"Fangli Ren, Zhengwei Jiang, Jian Liu","doi":"10.1109/ICTAI.2019.00121","DOIUrl":null,"url":null,"abstract":"Domain generation algorithms (DGA) are employed by malware to generate domain names as a common practice, with which to confirm rendezvous points to their command-and-control (C2) servers. The detection of DGA domain names is one of the important technologies for command and control communication detection. Considering the randomness of the DGA domain names, recent work in DGA detection employed machine learning methods based on features extracting and deep learning architectures to classify domain names. However, these methods perform poorly on wordlistbased DGA families, which generate domain names by randomly concatenating dictionary words. In this paper, we proposed the ATT-CNN-BiLSTM model to detect and classify DGA domain names. Firstly, the Convolutional Neural Network (CNN) and bidirectional Long Short-Term Memory (BiLSTM) neural network layer was used to extract the features of the domain sequences information; secondly, the attention layer was used to allocate the corresponding weight of the extracted domain deep information. Finally, the domain feature messages of different weights were put into the output layer to complete the tasks of detection and classification. The experiment results demonstrate the effectiveness of the proposed model both on regular DGA domain names and wordlist-based ones. To be precise, we got a F1 score of 98.92% for the detection and macro average F1 score of 81% for the classification task of DGA domain names.","PeriodicalId":346657,"journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTAI.2019.00121","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 7

Abstract

Domain generation algorithms (DGA) are employed by malware to generate domain names as a common practice, with which to confirm rendezvous points to their command-and-control (C2) servers. The detection of DGA domain names is one of the important technologies for command and control communication detection. Considering the randomness of the DGA domain names, recent work in DGA detection employed machine learning methods based on features extracting and deep learning architectures to classify domain names. However, these methods perform poorly on wordlistbased DGA families, which generate domain names by randomly concatenating dictionary words. In this paper, we proposed the ATT-CNN-BiLSTM model to detect and classify DGA domain names. Firstly, the Convolutional Neural Network (CNN) and bidirectional Long Short-Term Memory (BiLSTM) neural network layer was used to extract the features of the domain sequences information; secondly, the attention layer was used to allocate the corresponding weight of the extracted domain deep information. Finally, the domain feature messages of different weights were put into the output layer to complete the tasks of detection and classification. The experiment results demonstrate the effectiveness of the proposed model both on regular DGA domain names and wordlist-based ones. To be precise, we got a F1 score of 98.92% for the detection and macro average F1 score of 81% for the classification task of DGA domain names.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

基于注意力机制和深度神经网络的DGA域名检测

域名生成算法(DGA)被恶意软件用来生成域名，作为一种常见的做法，用它来确认与他们的指挥和控制(C2)服务器的会合点。DGA域名检测是指挥控制通信检测的重要技术之一。考虑到DGA域名的随机性，最近的DGA检测工作采用基于特征提取和深度学习架构的机器学习方法对域名进行分类。然而，这些方法在基于wordlist的DGA族上表现不佳，这些DGA族通过随机连接字典中的单词来生成域名。本文提出了ATT-CNN-BiLSTM模型对DGA域名进行检测和分类。首先，利用卷积神经网络(CNN)和双向长短期记忆(BiLSTM)神经网络层提取域序列特征信息;其次，利用关注层对提取的领域深度信息进行权重分配;最后，将不同权重的域特征信息放入输出层，完成检测和分类任务。实验结果表明，该模型对常规DGA域名和基于词表的域名都是有效的。准确地说，我们对DGA域名的检测得到了98.92%的F1分数，对DGA域名的分类任务得到了81%的宏观平均F1分数。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)

自引率

0.00%

发文量

期刊最新文献

Monaural Music Source Separation using a ResNet Latent Separator Network Graph-Based Attention Networks for Aspect Level Sentiment Analysis A Multi-channel Neural Network for Imbalanced Emotion Recognition Scaling up Prediction of Psychosis by Natural Language Processing Improving Bandit-Based Recommendations with Spatial Context Reasoning: An Online Evaluation