Advancing Cyber Incident Timeline Analysis Through Rule Based AI and Large Language Models
Authors: Fatma Yasmine Loumachi, Mohamed Chahine Ghanem
Published: 2024-09-04, arXiv - CS - Emerging Technologies
DOI: https://doi.org/arxiv-2409.02572
Citations: 0
Abstract
Timeline Analysis (TA) is a key part of Timeline Forensics (TF) in Digital
Forensics (DF), focusing primarily on examining and analysing temporal digital
artefacts such as timestamps, derived from event logs, file metadata, and other
related data to correlate events resulting from cyber incidents and reconstruct
their chronological timeline. Traditional tools often struggle to efficiently
process the vast volume and variety of data acquired during DF investigations
and Incident Response (IR) processes. This paper presents a novel framework,
GenDFIR, that combines Rule-Based Artificial Intelligence (R-BAI) algorithms
with Large Language Models (LLMs) to advance and automate the TA process. Our
approach consists of two main stages: (1) R-BAI identifies and selects
anomalous digital artefacts based on predefined rules; (2) the selected
artefacts are converted into embeddings for processing by an LLM with the
help of a Retrieval-Augmented Generation (RAG) agent. The LLM then
performs automated TA on the artefacts and predicts
potential incident scenarios. To validate our framework, we evaluate GenDFIR's
performance, efficiency, and reliability using various metrics across synthetic
cyber incident simulation scenarios. This paper presents a proof of concept
whose findings demonstrate the significant potential of integrating R-BAI
and LLMs for TA. This novel approach highlights the power of Generative AI
(GenAI), specifically LLMs, and opens new avenues for advanced threat detection
and incident reconstruction, representing a significant step forward in the
field.
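
The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the GenDFIR implementation: the `Event` record, the three rules in `RULES`, the sample events, and the bag-of-words `embed` function (a toy stand-in for a real embedding model) are all assumptions made for illustration, and the LLM call itself is left out. Stage 1 flags events that match at least one predefined rule; stage 2 ranks the flagged events against a question and assembles them into a prompt, as a RAG pipeline might.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    timestamp: datetime
    source: str
    message: str


# Hypothetical rules (not taken from the paper): label -> predicate on an event.
RULES = {
    "failed_login": lambda e: "failed login" in e.message.lower(),
    "off_hours": lambda e: e.timestamp.hour < 6 or e.timestamp.hour >= 22,
    "log_cleared": lambda e: "log cleared" in e.message.lower(),
}


def select_anomalous(events):
    """Stage 1: keep events matching at least one rule, in chronological order."""
    flagged = []
    for event in sorted(events, key=lambda e: e.timestamp):
        labels = [name for name, rule in RULES.items() if rule(event)]
        if labels:
            flagged.append((event, labels))
    return flagged


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, flagged, k=2):
    """Stage 2 (retrieval only): rank flagged events by similarity to the query."""
    q = embed(query)
    ranked = sorted(
        flagged, key=lambda item: cosine(q, embed(item[0].message)), reverse=True
    )
    return ranked[:k]


def build_prompt(query, retrieved):
    """Assemble chronological context for an LLM; the model call is out of scope."""
    ordered = sorted(retrieved, key=lambda item: item[0].timestamp)
    lines = [
        f"{e.timestamp.isoformat()} [{e.source}] {e.message} (rules: {', '.join(labels)})"
        for e, labels in ordered
    ]
    return "Context:\n" + "\n".join(lines) + f"\n\nQuestion: {query}"


# Synthetic sample events, invented for this sketch.
EVENTS = [
    Event(datetime(2024, 9, 4, 9, 15), "auth", "User alice logged in"),
    Event(datetime(2024, 9, 4, 23, 2), "auth", "Failed login for admin from 10.0.0.7"),
    Event(datetime(2024, 9, 4, 23, 5), "auth", "Failed login for admin from 10.0.0.7"),
    Event(datetime(2024, 9, 4, 23, 40), "system", "Security log cleared by admin"),
    Event(datetime(2024, 9, 4, 14, 0), "fs", "File report.docx modified"),
]

if __name__ == "__main__":
    flagged = select_anomalous(EVENTS)
    question = "who cleared the security log"
    print(build_prompt(question, retrieve(question, flagged, k=1)))
```

Filtering before retrieval keeps the context passed to the model small, which is one plausible reading of why the abstract places the rule-based stage first; a real system would replace `embed` with a learned embedding model and send the prompt to an LLM.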