A deep learning approach based on multi-view consensus for SQL injection detection

IF 2.4; Zone 4 (Computer Science); Q3 (COMPUTER SCIENCE, INFORMATION SYSTEMS); International Journal of Information Security; Pub Date: 2024-01-09; DOI: 10.1007/s10207-023-00791-y

Arzu Gorgulu Kakisim



Abstract

SQL injection (SQLi) attacks are among the oldest and most serious security threats, consistently ranking among the top ten critical web security risks. Traditional defenses against SQL injection predominantly rely on blacklists that disallow common injection characters or terms. The major challenge for these systems is building a comprehensive list of potential SQLi characters, terms, and multi-term expressions that covers the various types of SQLi attacks (time-based, error-based, etc.) across different database systems (such as MySQL, Oracle, and NoSQL). Recently, several studies have focused on learning features from SQL queries by applying well-known deep architectures to detect SQLi attacks. Motivated by a similar objective, this research introduces a novel deep learning-based SQLi detection system named “Bidirectional LSTM-CNN based on Multi-View Consensus” (MVC-BiCNN). The proposed method implements a pre-processing step that generates multiple views of the SQL data by semantically encoding SQL statements into their corresponding SQL tags. Using two main layer types, bidirectional long short-term memory (LSTM) and convolutional neural network (CNN), the method learns a joint latent space from the multi-view representations. In the detection phase, it yields a separate prediction for each representation and decides whether the query constitutes an SQLi attack based on the output of a consensus function. Moreover, Local Interpretable Model-Agnostic Explanations (LIME), an Explainable Artificial Intelligence (XAI) method, is employed to interpret the model’s results and analyze the SQLi inputs. The experimental results demonstrate that MVC-BiCNN outperforms the baseline methods, achieving a 99.96% detection rate.
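
To make the pre-processing and consensus steps more concrete, the following is a minimal Python sketch of how a query might be turned into two views (raw tokens and SQL tags) and how per-view scores might be combined. It is not the paper's implementation: the keyword list, the tag names (`KEYWORD`, `NUMBER`, `STRING`, `COMMENT`, `IDENT`, `SYMBOL`), the tokenizer, and the mean-score consensus rule are all assumptions made for illustration, and the trained BiLSTM-CNN that would produce the per-view scores is left out entirely.

```python
import re
from typing import Dict, List

# Hypothetical keyword list for illustration; not the paper's tag vocabulary.
SQL_KEYWORDS = {"select", "from", "where", "union", "or", "and",
                "insert", "update", "delete", "drop", "sleep"}

# Quoted strings, comment markers, numbers, words, then any other single symbol.
TOKEN_RE = re.compile(r"'[^']*'|--|\d+|\w+|[^\s\w]")


def tokenize(query: str) -> List[str]:
    """Split a query into coarse tokens (assumed tokenizer, not the paper's)."""
    return TOKEN_RE.findall(query)


def to_tag(token: str) -> str:
    """Map one token to a coarse, hypothetical SQL tag."""
    if token.lower() in SQL_KEYWORDS:
        return "KEYWORD"
    if token.isdigit():
        return "NUMBER"
    if len(token) >= 2 and token[0] == "'" and token[-1] == "'":
        return "STRING"
    if token == "--":
        return "COMMENT"
    if re.fullmatch(r"\w+", token):
        return "IDENT"
    return "SYMBOL"


def make_views(query: str) -> Dict[str, List[str]]:
    """Build two views of the same query: lowercased tokens and their tags."""
    tokens = tokenize(query)
    return {
        "raw": [t.lower() for t in tokens],
        "tags": [to_tag(t) for t in tokens],
    }


def consensus(scores: List[float], threshold: float = 0.5) -> bool:
    """Assumed consensus rule: flag the query if the mean per-view score
    reaches the threshold. The paper's actual function may differ."""
    return sum(scores) / len(scores) >= threshold


views = make_views("SELECT * FROM users WHERE id = 1 OR 1=1 --")
print(views["tags"])
```

In a full system, each view would be fed to the trained model to obtain a per-view attack score, and those scores would be passed to `consensus`. The tag view abstracts away literal values, so `1=1` and `2=2` map to the same `NUMBER SYMBOL NUMBER` pattern, which is the kind of generalization a fixed blacklist struggles to provide.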


Source journal

International Journal of Information Security (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore

6.30

Self-citation rate

3.10%

Articles published

Review time

12 months

Journal description: The International Journal of Information Security is an English-language periodical on research in information security that offers prompt publication of important technical work, whether theoretical, applicable, or related to implementation. Coverage includes system security: intrusion detection, secure end systems, secure operating systems, database security, security infrastructures, security evaluation; network security: Internet security, firewalls, mobile security, security agents, protocols, anti-virus and anti-hacker measures; content protection: watermarking, software protection, tamper-resistant software; applications: electronic commerce, government, health, telecommunications, mobility.