Albert Calvo, Santiago Escuder, Nil Ortiz, Josep Escrig, Maxime Compastié
RBD24: A labelled dataset with risk activities using log application data
Computers & Security, Volume 150, Article 104290. Published 2024-12-20.
DOI: 10.1016/j.cose.2024.104290
https://www.sciencedirect.com/science/article/pii/S0167404824005960
Journal impact factor: 4.8 (JCR Q1, Computer Science, Information Systems)
Citation count: 0
Abstract
This paper introduces the Risk Activities Dataset 2024 (RBD24), an open-source dataset designed to facilitate the identification and analysis of risk activities within the cybersecurity domain. The RBD24 dataset is derived from multimodal application logs collected over a two-week period at a Spanish state university, and it identifies activities aligned with the early stages of an attack scenario. This dataset paves the way for novel User and Entity Behaviour Analytics (UEBA) and risk assessment frameworks within the cybersecurity domain. In detail, the dataset offers a fully user-centric approach by providing ground-truth data for various risk behaviours, including cryptocurrency activities, outdated software usage, P2P file sharing, and phishing incidents. These ground-truth data, identified through intrusion detection systems (IDS) and experimental campaigns, are represented as a set of indicators extracted from DNS, HTTP, SSL, and SMTP protocol logs. This dataset is expected to be a valuable resource for developing and benchmarking cybersecurity models, particularly in the realm of risk behaviour assessment.
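To make the user-centric framing concrete, the following is a minimal sketch of how per-user indicator counts might be aggregated from protocol log records. It is not the authors' pipeline: the record fields (`user`, `protocol`, `indicator`), the indicator names, and the function `aggregate_indicators` are all hypothetical and are not taken from the RBD24 schema.

```python
from collections import defaultdict

# Hypothetical log records. The field names and indicator labels are
# illustrative assumptions, not the actual RBD24 schema.
records = [
    {"user": "u1", "protocol": "dns", "indicator": "crypto_domain"},
    {"user": "u1", "protocol": "http", "indicator": "outdated_user_agent"},
    {"user": "u2", "protocol": "dns", "indicator": "crypto_domain"},
    {"user": "u1", "protocol": "dns", "indicator": "crypto_domain"},
]


def aggregate_indicators(records):
    """Count occurrences of each (protocol, indicator) pair per user."""
    per_user = defaultdict(lambda: defaultdict(int))
    for rec in records:
        key = (rec["protocol"], rec["indicator"])
        per_user[rec["user"]][key] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {user: dict(counts) for user, counts in per_user.items()}


features = aggregate_indicators(records)
```

Under these assumptions, each user maps to a small dictionary of indicator counts, which could then be joined with a per-user risk label for model training or benchmarking.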
About the journal:
Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world.
Computers & Security provides you with a unique blend of leading-edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors: industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.