Validation of machine learning ridge regression models using Monte Carlo, bootstrap, and variations in cross-validation
Robbie T. Nakatsu. Journal of Intelligent Systems, 2023. DOI: 10.1515/jisys-2022-0224

Abstract: In recent years, there have been several calls by practitioners of machine learning to provide more guidelines on how to use its methods and techniques. For example, the current literature on resampling methods is confusing and sometimes contradictory; worse, there are sometimes no practical guidelines offered at all. To address this shortcoming, a simulation study was conducted that evaluated ridge regression models fitted on five real-world datasets. The study compared the performance of four resampling methods: Monte Carlo resampling, bootstrap, k-fold cross-validation, and repeated k-fold cross-validation. The goal was to find the λ (regularization) parameter that would minimize mean squared error, using nine variations of these resampling methods. For each of the nine variations, 1,000 runs were performed to see how often a good-fit, average-fit, or poor-fit λ value would be chosen. The resampling method that chose good-fit values the greatest number of times was deemed the best method. Based on the results of the investigation, three general recommendations are made: (1) repeated k-fold cross-validation is the best choice as a general-purpose resampling method; (2) k = 10 folds is a good choice in k-fold cross-validation; (3) Monte Carlo and bootstrap are underperformers, so they are not recommended as general-purpose resampling methods. At the same time, no resampling method was found to be uniformly better than the others.

HWCD: A hybrid approach for image compression using wavelet, encryption using confusion, and decryption using diffusion scheme
H. R. Latha, Alagarswamy Ramaprasath. Journal of Intelligent Systems, 2023. DOI: 10.1515/jisys-2022-9056

Abstract: Image data play an important role in various real-time online and offline applications. The biomedical field has adopted imaging systems to detect, diagnose, and prevent several types of diseases and abnormalities. Biomedical imaging data contain a great deal of information and therefore require substantial storage space. Moreover, telemedicine and IoT-based remote health monitoring systems, in which data are transmitted from one place to another, are now widely deployed. Transmitting such large volumes of data consumes considerable bandwidth, and during transmission attackers can compromise the communication channel and obtain important and secret information. Hence, biomedical image compression and encryption are considered the solution to these issues. Several techniques have been presented, but achieving the desired performance for a combined module remains challenging. In this work, a novel combined approach for image compression and encryption is developed. First, an image compression scheme using the wavelet transform is presented; a cryptography scheme is then presented using confusion and diffusion schemes. The outcome of the proposed approach is compared with various existing techniques. The experimental analysis shows that the proposed approach achieves better performance in terms of autocorrelation, histogram, information entropy, PSNR, MSE, and SSIM.

An intelligent algorithm for fast machine translation of long English sentences
Hengheng He. Journal of Intelligent Systems, 2023. DOI: 10.1515/jisys-2022-0257

Abstract: Translation of long English sentences is a complex problem in machine translation. This work briefly introduced the basic framework of an intelligent machine translation algorithm and improved the long short-term memory (LSTM)-based algorithm by introducing a long sentence segmentation module and a reordering module. Simulation experiments were conducted using a public corpus and a local corpus containing self-collected linguistic data. The improved algorithm was compared with machine translation algorithms based on a recurrent neural network and on LSTM. The results suggested that the LSTM-based machine translation algorithm augmented with the long sentence segmentation and reordering modules effectively segmented long sentences, translated long English sentences more accurately, and produced more grammatically correct translations.

Multi-sensor remote sensing image alignment based on fast algorithms
Tao Shu. Journal of Intelligent Systems, 2023. DOI: 10.1515/jisys-2022-0289

Abstract: Remote sensing imagery of the ground provides important guidance in disaster assessment and emergency rescue deployment. To realize fast automatic registration of multi-sensor remote sensing images, the idea of block registration is introduced, and image reconstruction is performed using the conjugate gradient descent (CGD) method. The scale-invariant feature transform (SIFT) algorithm is improved and optimized by combining it with a function-fitting method, which improves the registration accuracy and efficiency of multi-sensor remote sensing images. The results show that the average peak signal-to-noise ratio of images processed by the CGD method is 25.428, the average root mean square value is 17.442, and the average image processing time is 6.093 s; these indicators are better than those of the passive filter algorithm and the gradient descent method. The average registration accuracy of the improved SIFT method is 96.37%, and the average registration time is 2.14 s; these indicators are significantly better than those of the traditional SIFT algorithm and the speeded-up robust features algorithm. The improved SIFT registration method thus effectively addresses the low accuracy and long runtimes of traditional fast registration methods for multi-sensor remote sensing images. While maintaining high registration accuracy, it improves registration speed, providing technical support for rapid disaster assessment after major disasters such as earthquakes and floods, and it is valuable for the development of efficient post-disaster rescue deployment.

Development and research of deep neural network fusion computer vision technology
Jiangtao Wang. Journal of Intelligent Systems, 2023. DOI: 10.1515/jisys-2022-0264

Abstract: Deep learning (DL) has revolutionized advanced digital image processing, enabling significant advancements in computer vision (CV). However, older CV techniques, developed prior to the emergence of DL, still hold value and relevance. Particularly in the realm of more complex, three-dimensional (3D) data such as video and 3D models, CV and multimedia retrieval remain at the forefront of technological advancement. We provide critical insights into the progress made in developing higher-dimensional features through the application of DL and discuss the advantages and strategies employed in DL. With the widespread use of 3D sensor data and 3D modeling, analysis and representation of the world in three dimensions have become commonplace. This progress has been facilitated by the development of additional sensors, driven by advancements in areas such as 3D gaming and self-driving vehicles. These advancements have enabled researchers to create feature description models that surpass traditional two-dimensional approaches. This study surveys the current state of advanced digital image processing, highlighting the role of DL in pushing the boundaries of CV and multimedia retrieval in handling complex, 3D data.

Intelligent financial decision support system based on big data
Danna Tong, Guixian Tian. Journal of Intelligent Systems, 2023. DOI: 10.1515/jisys-2022-0320

Abstract: In the era of big data, the volume of data has exploded, and all walks of life are affected. Big data makes intelligent financial analysis of enterprises possible. At present, most enterprises' financial analysis, and the decision-making based on its results, rely mainly on human effort, with little automation and evident problems in efficiency and error rates. To help senior management conduct scientific and effective management, this study uses big data web crawler technology and ETL technology to process data and builds an intelligent financial decision support system that integrates big data with an Internet Plus platform. J Group in S Province is taken as an example to study the effect before and after the application of the system. The results show that crawler technology can monitor basic data and industry big data in real time and improve the accuracy of decision-making. Through the system, core indexes such as profit, return on net assets, and accounts receivable can be clearly displayed, and the causes of financial changes hidden behind the financial data can be queried. The system shows that the asset-liability ratio, current assets growth rate, operating income growth rate, and financial expenses of J Group are 55.27%, 10.38%, 20.28%, and 1,974 million RMB, respectively. The growth rate of J Group's real sales income is 0.63%, which is 31.27 percentage points below the industry benchmark of 31.90%. After adopting the system, the number of monthly financial statements increases significantly while the monthly report analysis time decreases: the Group receives up to 332 financial statements per month, with a processing time of only 2 h. These results indicate that an intelligent financial decision support system integrating big data can effectively improve the financial management level of enterprises, improve the usefulness of financial decision-making, and make practical contributions to the field of corporate financial decision-making.

A deep neural network model for paternity testing based on 15-loci STR for Iraqi families
Donya A. Khalid, Nasser Nafea. Journal of Intelligent Systems, 2023. DOI: 10.1515/jisys-2023-0041

Abstract: Paternity testing using a deoxyribonucleic acid (DNA) profile is an essential branch of forensic science, and DNA short tandem repeats (STRs) are usually used for this purpose. In third-world countries, the conventional kinship analysis techniques used in forensic investigations yield inadequate accuracy, especially when dealing with large human STR datasets: profiles are compared manually, so the number of samples is limited by the human effort and time required. By utilizing automation made possible by AI, forensic investigations can be conducted more efficiently, saving both time and cost. In this article, we propose a new algorithm for predicting paternity based on 15-loci STR-DNA datasets using a deep neural network (DNN), in which comparisons among many human profiles can be performed without limiting the number of samples. For the purpose of paternity testing, familial data are artificially created from the real data of individual Iraqi people from Al-Najaf province; this helps to overcome the shortage of Iraqi data due to restrictive policies and the secrecy of familial datasets. About 53,530 records are used to train and test the proposed DNN model. The Python-based Keras library is used to implement and test the proposed system, and the confusion matrix and receiver operating characteristic curve are used for system evaluation. The system achieves an accuracy of 99.6% in paternity tests, the highest accuracy compared with existing works, demonstrating a promising application of artificial intelligence to paternity testing.

Design model-free adaptive PID controller based on lazy learning algorithm
Hongcheng Zhou. Journal of Intelligent Systems, 2023. DOI: 10.1515/jisys-2022-0279

Abstract: Nonlinear systems are difficult to control effectively with a traditional proportional-integral-derivative (PID) or linear controller. First, this study presents an improved lazy learning algorithm based on k-vector nearest neighbors, which considers not only the matching of input and output data but also the consistency of the model. Using an optimization index with an additional penalty function, the optimal solution of the lazy learning model is obtained by an iterative least-squares method. Second, based on the improved lazy learning, an adaptive PID control algorithm is proposed. Finally, the control effect under conditions of complete and incomplete data is compared in simulation experiments.

A systematic literature review of undiscovered vulnerabilities and tools in smart contract technology
Oualid Zaazaa, Hanan El Bakkali. Journal of Intelligent Systems, 2023. DOI: 10.1515/jisys-2023-0038

Abstract: In recent years, smart contract technology has garnered significant attention due to its ability to address trust issues that traditional technologies have long struggled with. However, like any evolving technology, smart contracts are not immune to vulnerabilities, and some remain underexplored, often eluding detection by existing vulnerability assessment tools. In this article, we performed a systematic literature review of the scientific research published between 2016 and 2021. The main objective of this work is to identify which vulnerabilities and smart contract technologies have not been well studied. We also list the datasets used by previous researchers, which can help in building more efficient machine-learning models in the future, and we compare smart contract analysis tools across various features. Finally, we discuss future directions in the field of smart contracts that can help researchers set the course for future research in this domain.

Analyzing SQL payloads using logistic regression in a big data environment
O. Shareef, Rehab Flaih Hasan, Ammar Hatem Farhan. Journal of Intelligent Systems, 2023. DOI: 10.1515/jisys-2023-0063

Abstract: Protecting big data from attacks on large organizations is essential because of how vital such data are to organizations and individuals. Such data can be put at risk when attackers gain unauthorized access to information and use it in illegal ways. One of the most common such attacks is the structured query language injection attack (SQLIA), a vulnerability attack that allows attackers to gain illegal access to a database quickly and easily by manipulating structured query language (SQL) queries, especially in a big data environment. To address these risks, this study builds an approach that acts as a middle protection layer between the client and database server layers and reduces the time consumed to classify the SQL payload sent from the user layer. The proposed method trains a logistic regression model using the Spark ML machine learning library, which handles big data. An experiment was conducted using the SQLI dataset. Results show that the proposed approach achieved an accuracy of 99.04%, a precision of 98.87%, a recall of 99.89%, and an F-score of 99.04%. The time taken to identify and prevent SQLIA is 0.05 s. Our approach protects the data through the middle layer; moreover, using the Spark ML library with ML algorithms gives better accuracy and shortens the time required to determine the type of request sent from the user layer.