Kexiong Fei , Jiang Zhou , Yucan Zhou , Xiaoyan Gu , Haihui Fan , Bo Li , Weiping Wang , Yong Chen
{"title":"LaAeb: A comprehensive log-text analysis based approach for insider threat detection","authors":"Kexiong Fei , Jiang Zhou , Yucan Zhou , Xiaoyan Gu , Haihui Fan , Bo Li , Weiping Wang , Yong Chen","doi":"10.1016/j.cose.2024.104126","DOIUrl":null,"url":null,"abstract":"<div><div>Insider threats have increasingly become a critical issue that modern enterprises and organizations faced. They are mainly initiated by insider attackers, which may cause disastrous impacts. Numerous research studies have been conducted for insider threat detection. However, most of them are limited due to a small number of malicious samples. Moreover, as existing methods often concentrate on feature information or statistical characteristics for anomaly detection, they still lack effective use of comprehensive textual content information contained in logs and thus will affect detection efficiency.</div><div>We propose <span>LaAeb</span>, a novel unsupervised insider threat detection framework that leverages rich linguistic information in log contents to enable conventional methods, such as an Isolation Forest-based anomaly detection, to better detect insider threats besides using various features and statistical information. To find malicious acts under different scenarios, we consider three patterns of insider threats, including <em>attention</em>, <em>emotion</em>, and <em>behavior anomaly</em>. The attention anomaly detection analyzes textual contents of operation objects (e.g., emails and web pages) in logs to detect threats, where the textual information reflects the areas that employees focus on. When the attention seriously deviates from daily work, an employee may involve malicious acts. The emotion anomaly detection analyzes all dialogs between every two employees’ daily communicated texts and uses the degree of negative to find potential psychological problems. The behavior anomaly detection analyzes the operations of logs to detect threats. It utilizes information acquired from attention and emotion anomalies as ancillary features, integrating them with features and statistics extracted from log operations to create log embeddings. With these log embeddings, <span>LaAeb</span> employs anomaly detection algorithm like Isolation Forest to analyze an employee’s malicious operations, and further detects the employee’s behavior anomaly by considering all employees’ acts in the same department. Finally, <span>LaAeb</span> consolidates detection results of three patterns indicative of insider threats in a comprehensive manner.</div><div>We implement the prototype of <span>LaAeb</span> and test it on CERT and LANL datasets. Our evaluations demonstrate that compared with state-of-the-art unsupervised methods, <span>LaAeb</span> reduces FPR by 50% to reach 0.05 on CERT dataset under the same AUC <span><math><mrow><mo>(</mo><mn>0</mn><mo>.</mo><mn>93</mn><mo>)</mo></mrow></math></span>, and gets the best AUC <span><math><mrow><mo>(</mo><mn>0</mn><mo>.</mo><mn>97</mn><mo>)</mo></mrow></math></span> with 0.06 higher value on LANL dataset.</div></div>","PeriodicalId":51004,"journal":{"name":"Computers & Security","volume":"148 ","pages":"Article 104126"},"PeriodicalIF":4.8000,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Security","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167404824004310","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Insider threats have increasingly become a critical issue that modern enterprises and organizations faced. They are mainly initiated by insider attackers, which may cause disastrous impacts. Numerous research studies have been conducted for insider threat detection. However, most of them are limited due to a small number of malicious samples. Moreover, as existing methods often concentrate on feature information or statistical characteristics for anomaly detection, they still lack effective use of comprehensive textual content information contained in logs and thus will affect detection efficiency.
We propose LaAeb, a novel unsupervised insider threat detection framework that leverages rich linguistic information in log contents to enable conventional methods, such as an Isolation Forest-based anomaly detection, to better detect insider threats besides using various features and statistical information. To find malicious acts under different scenarios, we consider three patterns of insider threats, including attention, emotion, and behavior anomaly. The attention anomaly detection analyzes textual contents of operation objects (e.g., emails and web pages) in logs to detect threats, where the textual information reflects the areas that employees focus on. When the attention seriously deviates from daily work, an employee may involve malicious acts. The emotion anomaly detection analyzes all dialogs between every two employees’ daily communicated texts and uses the degree of negative to find potential psychological problems. The behavior anomaly detection analyzes the operations of logs to detect threats. It utilizes information acquired from attention and emotion anomalies as ancillary features, integrating them with features and statistics extracted from log operations to create log embeddings. With these log embeddings, LaAeb employs anomaly detection algorithm like Isolation Forest to analyze an employee’s malicious operations, and further detects the employee’s behavior anomaly by considering all employees’ acts in the same department. Finally, LaAeb consolidates detection results of three patterns indicative of insider threats in a comprehensive manner.
We implement the prototype of LaAeb and test it on CERT and LANL datasets. Our evaluations demonstrate that compared with state-of-the-art unsupervised methods, LaAeb reduces FPR by 50% to reach 0.05 on CERT dataset under the same AUC , and gets the best AUC with 0.06 higher value on LANL dataset.
期刊介绍:
Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world.
Computers & Security provides you with a unique blend of leading edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors - industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise it is your first step to fully secure systems.