CANAL - Cyber Activity News Alerting Language Model : Empirical Approach vs. Expensive LLMs

Urjitkumar Patel, Fang-Chun Yeh, Chinmay Gondhalekar
Published in: 2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC), pp. 1-12
Publication date: 2024-02-07
DOI: 10.1109/ICAIC60265.2024.10433839 (https://doi.org/10.1109/ICAIC60265.2024.10433839)

Abstract

In today’s digital landscape, where cyber attacks have become the norm, detecting cyber attacks and threats is imperative across diverse domains. Our research presents a new empirical framework for cyber threat modeling that parses and categorizes cyber-related information from news articles, enhancing real-time vigilance for market stakeholders. At the core of this framework is a fine-tuned BERT model, which we call CANAL (Cyber Activity News Alerting Language Model), tailored for cyber categorization using a novel silver labeling approach powered by Random Forest. We benchmark CANAL against larger, costlier LLMs, including GPT-4, LLaMA, and Zephyr, evaluating them in zero-shot to few-shot settings on cyber news classification. CANAL outperforms all of these LLM counterparts in both accuracy and cost-effectiveness. Furthermore, we introduce the Cyber Signal Discovery module, a strategic component designed to efficiently detect emerging cyber signals from news articles. Collectively, CANAL and the Cyber Signal Discovery module equip our framework to provide a robust and cost-effective solution for businesses that require agile responses to cyber intelligence.
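
The abstract names a silver labeling step but does not describe how it works. One common pattern, assumed here purely for illustration, is to let a cheaper classifier score unlabeled articles and keep only its high-confidence predictions as training labels for the larger model. The sketch below shows that filtering step in Python. The function names, the 0.8 threshold, the two category names, and the keyword-based stand-in classifier are all hypothetical; in the paper the probabilities would come from a Random Forest, which is not reproduced here.

```python
from typing import Callable, Dict, List, Tuple

# Maps a category name to its predicted probability.
# The category names used below are hypothetical; the abstract does not list the label set.
Probabilities = Dict[str, float]


def silver_label(
    articles: List[str],
    predict_proba: Callable[[str], Probabilities],
    threshold: float = 0.8,  # assumed cutoff, not taken from the paper
) -> List[Tuple[str, str, float]]:
    """Keep only articles whose top predicted class meets the confidence threshold."""
    labeled = []
    for text in articles:
        probs = predict_proba(text)
        label, confidence = max(probs.items(), key=lambda item: item[1])
        if confidence >= threshold:
            labeled.append((text, label, confidence))
    return labeled


def keyword_stub(text: str) -> Probabilities:
    """Stand-in for a trained classifier; a real pipeline would call the model here."""
    cyber_terms = ("ransomware", "breach", "phishing", "malware")
    hits = sum(term in text.lower() for term in cyber_terms)
    p_cyber = min(0.5 + 0.25 * hits, 1.0) if hits else 0.1
    return {"cyber": p_cyber, "non_cyber": 1.0 - p_cyber}


if __name__ == "__main__":
    sample = [
        "Ransomware breach hits retailer",
        "Quarterly earnings beat estimates",
        "Phishing campaign reported",
    ]
    for text, label, confidence in silver_label(sample, keyword_stub):
        print(f"{label:10s} {confidence:.2f}  {text}")
```

Articles that fall below the threshold are simply left out of the silver set, which trades coverage for label quality. The retained pairs could then serve as training data for fine-tuning a model such as BERT.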