Authors: Ba-Dung Le, Guanhua Wang, Mehwish Nasim, M. Babar
Venue: 2019 International Conference on Cyberworlds (CW)
DOI: https://doi.org/10.1109/CW.2019.00058
Published: 2019-07-03
Citations: 24
Gathering Cyber Threat Intelligence from Twitter Using Novelty Classification
Protecting organizations from cyber exploits requires timely intelligence about cyber vulnerabilities and attacks, collectively referred to as threats. Cyber threat intelligence can be extracted from various sources, including social media platforms where users publish threat information in real time. Gathering this intelligence from social media sites is a time-consuming task for security analysts and can delay the response to emerging threats. We propose a framework for automatically gathering cyber threat intelligence from Twitter using a novelty detection model. Our model learns the features of cyber threat intelligence from threat descriptions published in public repositories such as Common Vulnerabilities and Exposures (CVE) and classifies each previously unseen tweet as either normal or anomalous with respect to cyber threat intelligence. We evaluate our framework on a purpose-built data set of tweets collected from 50 influential cyber security-related accounts over twelve months in 2018. Our classifier achieves an F1-score of 0.643 for classifying cyber threat tweets and outperforms several baselines, including binary classification models. Analysis of the classification results suggests that threat-relevant tweets on Twitter seldom include the CVE identifier of the related threat. Hence, it would be valuable to collect these tweets and associate them with the related CVE identifiers for cyber security applications.
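The abstract does not say which novelty detection algorithm or text features the framework uses, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes a bag-of-words representation and a simple centroid-similarity detector that is fitted on threat descriptions alone and then labels a tweet "normal" when it resembles them and "anomalous" otherwise. The class name `CentroidNoveltyDetector`, the `slack` parameter, the sample descriptions, and the sample tweets are all hypothetical. A small F1 helper is included because that is the metric the abstract reports.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b.get(term, 0) for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CentroidNoveltyDetector:
    """Hypothetical stand-in: learns only from threat descriptions."""

    def __init__(self, slack=0.5):
        # slack < 1 loosens the boundary; 0.5 is an arbitrary illustrative value.
        self.slack = slack

    def fit(self, descriptions):
        vectors = [Counter(tokenize(d)) for d in descriptions]
        self.centroid = Counter()
        for vector in vectors:
            self.centroid.update(vector)
        # Boundary: a fraction of the lowest similarity seen in training.
        lowest = min(cosine(v, self.centroid) for v in vectors)
        self.threshold = self.slack * lowest
        return self

    def predict(self, text):
        similarity = cosine(Counter(tokenize(text)), self.centroid)
        return "normal" if similarity >= self.threshold else "anomalous"


def f1_score(y_true, y_pred, positive="normal"):
    """Harmonic mean of precision and recall for the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up texts written in the style of CVE descriptions (not real entries).
CVE_LIKE_DESCRIPTIONS = [
    "Buffer overflow vulnerability in the web server allows remote "
    "attackers to execute arbitrary code",
    "SQL injection vulnerability in the login module allows remote "
    "attackers to execute arbitrary SQL commands",
    "Cross-site scripting vulnerability in the admin panel allows remote "
    "attackers to inject arbitrary web script",
]

detector = CentroidNoveltyDetector(slack=0.5).fit(CVE_LIKE_DESCRIPTIONS)
THREAT_TWEET = ("New remote code execution vulnerability in web server "
                "lets attackers execute arbitrary code")
OTHER_TWEET = "Great coffee and lovely weather at the conference this morning"
print(detector.predict(THREAT_TWEET))
print(detector.predict(OTHER_TWEET))
```

Because the detector is fitted only on threat descriptions, it needs no labeled non-threat tweets for training, which is the property that distinguishes novelty detection from the binary classification baselines mentioned above. A practical system would likely use weighted features and a tuned decision boundary rather than raw term counts and a fixed slack factor.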