Analyzing SQL payloads using logistic regression in a big data environment

Journal of Intelligent Systems (IF 2.1, JCR Q3, Computer Science, Artificial Intelligence) Pub Date: 2023-01-01 DOI: 10.1515/jisys-2023-0063

O. Shareef, Rehab Flaih Hasan, Ammar Hatem Farhan



Abstract

Protecting the big data held by large organizations from attack is essential because such data are vital to both organizations and individuals. These data are put at risk when attackers gain unauthorized access to information and use it illegally. One of the most common such attacks is the structured query language injection attack (SQLIA). This attack exploits a vulnerability that lets attackers gain illegal access to a database quickly and easily by manipulating structured query language (SQL) queries, especially in a big data environment. To address these risks, this study aims to build an approach that acts as a middle protection layer between the client and database server layers and reduces the time needed to classify the SQL payload sent from the user layer. The proposed method trains a logistic regression model, a machine learning (ML) technique, using the Spark ML library, which is designed to handle big data. An experiment was conducted using the SQLI dataset. Results show that the proposed approach achieved an accuracy of 99.04%, a precision of 98.87%, a recall of 99.89%, and an F-score of 99.04%. The time taken to identify and prevent an SQLIA is 0.05 s. The approach protects the data by means of the middle layer. Moreover, using the Spark ML library with ML algorithms gives better accuracy and shortens the time required to determine the type of request sent from the user layer.
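The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: a logistic regression classifier that labels an SQL payload as benign or injection, placed in front of the database as a middle layer. It is a pure-Python stand-in and does not use the Spark ML library named in the abstract. The toy payloads, the hand-picked features, and every function name (`features`, `train`, `predict`, `metrics`, `middle_layer`) are assumptions made for illustration; they are not the SQLI dataset or the authors' feature set.

```python
import math
import re

# Hypothetical toy payloads (1 = injection, 0 = benign); NOT the SQLI dataset.
TRAIN = [
    ("' OR '1'='1' --", 1),
    ("1; DROP TABLE users", 1),
    ("' UNION SELECT username, password FROM users --", 1),
    ("admin' --", 1),
    ("john.smith", 0),
    ("o'brien", 0),
    ("select a size and color", 0),
    ("42", 0),
]

KEYWORDS = re.compile(r"\b(union|select|drop|or|and|insert|delete)\b")


def features(payload):
    """Hand-picked features (an assumption, not the paper's feature set)."""
    text = payload.lower()
    return [
        1.0,                                               # bias term
        float(text.count("'")),                            # single quotes
        float(any(t in text for t in ("--", "#", "/*"))),  # comment marker
        float(";" in text),                                # statement separator
        float(len(KEYWORDS.findall(text))),                # SQL keywords
    ]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def probability(weights, payload):
    """Estimated probability that the payload is an injection."""
    return sigmoid(sum(w * x for w, x in zip(weights, features(payload))))


def train(data, epochs=1000, lr=0.1):
    """Fit weights by stochastic gradient ascent on the log-likelihood."""
    weights = [0.0] * len(features(""))
    for _ in range(epochs):
        for payload, label in data:
            error = label - probability(weights, payload)
            for i, x in enumerate(features(payload)):
                weights[i] += lr * error * x
    return weights


def predict(weights, payload, threshold=0.5):
    return 1 if probability(weights, payload) >= threshold else 0


def metrics(y_true, y_pred):
    """Accuracy, precision, recall and F-score from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f_score": ratio(2 * precision * recall, precision + recall),
    }


def middle_layer(weights, payload):
    """Forward benign payloads; block those classified as injections."""
    return "blocked" if predict(weights, payload) == 1 else "forwarded"


if __name__ == "__main__":
    weights = train(TRAIN)
    labels = [y for _, y in TRAIN]
    preds = [predict(weights, p) for p, _ in TRAIN]
    print(metrics(labels, preds))  # training-set scores only
    print(middle_layer(weights, "' OR 1=1 --"))
```

The scores printed here are computed on the toy training set only and say nothing about the figures reported in the abstract. In a big data setting the same structure would presumably map onto a distributed feature pipeline and a library-provided logistic regression estimator rather than this hand-written loop.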




Source journal

Journal of Intelligent Systems (Computer Science, Artificial Intelligence)

CiteScore: 5.90

Self-citation rate: 3.30%

Review time: 51 weeks

About the journal: The Journal of Intelligent Systems publishes research and review papers, as well as Brief Communications, at an interdisciplinary level, with the field of intelligent systems as the focal point. This field includes areas such as artificial intelligence; models and computational theories of human cognition, perception and motivation; brain models; artificial neural nets; and neural computing. It covers contributions from the social, human and computer sciences to the analysis and application of information technology.