Low-Cost Multisensory Robot for Optimized Path Planning in Diverse Environments
Pub Date : 2023-12-01 DOI: 10.3390/computers12120250
Rohit Mittal, Geeta Rani, Vibhakar Pathak, Sonam Chhikara, V. Dhaka, E. Vocaturo, Ester Zumpano
The automation industry faces the challenge of avoiding interference with obstacles, estimating the next move of a robot, and optimizing its path in various environments. Although researchers have predicted the next move of a robot in linear and non-linear environments, there is a lack of precise estimation of sectorial error probability while moving a robot on a curvy path. Additionally, existing approaches use visual sensors, incur high costs for robot design, and are ineffective in achieving motion stability on various surfaces. To address these issues, the authors of this manuscript propose a low-cost, multisensory robot capable of moving on an optimized path in diverse environments with eight degrees of freedom. The authors use the extended Kalman filter and unscented Kalman filter for localization and position estimation of the robot. They also compare the sectorial path prediction error at different angles from 0° to 180° and demonstrate the mathematical modeling of various operations involved in navigating the robot. The minimum deviation of 1.125 cm between the actual and predicted path demonstrates the effectiveness of the robot in a real-life environment.
{"title":"Low-Cost Multisensory Robot for Optimized Path Planning in Diverse Environments","authors":"Rohit Mittal, Geeta Rani, Vibhakar Pathak, Sonam Chhikara, V. Dhaka, E. Vocaturo, Ester Zumpano","doi":"10.3390/computers12120250","DOIUrl":"https://doi.org/10.3390/computers12120250","url":null,"abstract":"The automation industry faces the challenge of avoiding interference with obstacles, estimating the next move of a robot, and optimizing its path in various environments. Although researchers have predicted the next move of a robot in linear and non-linear environments, there is a lack of precise estimation of sectorial error probability while moving a robot on a curvy path. Additionally, existing approaches use visual sensors, incur high costs for robot design, and ineffective in achieving motion stability on various surfaces. To address these issues, the authors in this manuscript propose a low-cost and multisensory robot capable of moving on an optimized path in diverse environments with eight degrees of freedom. The authors use the extended Kalman filter and unscented Kalman filter for localization and position estimation of the robot. They also compare the sectorial path prediction error at different angles from 0° to 180° and demonstrate the mathematical modeling of various operations involved in navigating the robot. The minimum deviation of 1.125 cm between the actual and predicted path proves the effectiveness of the robot in a real-life environment.","PeriodicalId":46292,"journal":{"name":"Computers","volume":"53 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138626317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
B-PSA: A Binary Pendulum Search Algorithm for the Feature Selection Problem
Pub Date : 2023-11-29 DOI: 10.3390/computers12120249
Broderick Crawford, Felipe Cisternas-Caneo, Katherine Sepúlveda, Ricardo Soto, Álex Paz, Alvaro Peña, Claudio León de la Barra, E. Rodriguez-Tello, Gino Astorga, Carlos Castro, Franklin Johnson, Giovanni Giachetti
The digitization of information and technological advancements have enabled us to gather vast amounts of data from various domains, including but not limited to medicine, commerce, and mining. Machine learning techniques use this information to improve decision-making, but they have a major weakness: they are highly sensitive to data variation, so the data must be cleaned to remove irrelevant and redundant information. This removal of information is known as the Feature Selection Problem. This work presents the Pendulum Search Algorithm applied to solve the Feature Selection Problem. As the Pendulum Search Algorithm is a metaheuristic designed for continuous optimization problems, a binarization process is performed using the Two-Step Technique. Preliminary results indicate that our proposal obtains competitive results when compared to other metaheuristics from the literature, solving well-known benchmarks.
{"title":"B-PSA: A Binary Pendulum Search Algorithm for the Feature Selection Problem","authors":"Broderick Crawford, Felipe Cisternas-Caneo, Katherine Sepúlveda, Ricardo Soto, Álex Paz, Alvaro Peña, Claudio León de la Barra, E. Rodriguez-Tello, Gino Astorga, Carlos Castro, Franklin Johnson, Giovanni Giachetti","doi":"10.3390/computers12120249","DOIUrl":"https://doi.org/10.3390/computers12120249","url":null,"abstract":"The digitization of information and technological advancements have enabled us to gather vast amounts of data from various domains, including but not limited to medicine, commerce, and mining. Machine learning techniques use this information to improve decision-making, but they have a big problem: they are very sensitive to data variation, so it is necessary to clean them to remove irrelevant and redundant information. This removal of information is known as the Feature Selection Problem. This work presents the Pendulum Search Algorithm applied to solve the Feature Selection Problem. As the Pendulum Search Algorithm is a metaheuristic designed for continuous optimization problems, a binarization process is performed using the Two-Step Technique. Preliminary results indicate that our proposal obtains competitive results when compared to other metaheuristics extracted from the literature, solving well-known benchmarks.","PeriodicalId":46292,"journal":{"name":"Computers","volume":"19 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2023-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139209721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Credit Risk Prediction Based on Psychometric Data
Pub Date : 2023-11-28 DOI: 10.3390/computers12120248
Eren Duman, Mehmet S. Aktas, Ezgi Yahsi
In today’s financial landscape, traditional banking institutions rely extensively on customers’ historical financial data to evaluate their eligibility for loan approvals. While these decision support systems offer predictive accuracy for established customers, they overlook a crucial demographic: individuals without a financial history. To address this gap, our study presents a methodology for a decision support system that is intended to assist in determining credit risk. Rather than solely focusing on past financial records, our methodology assesses customer credibility by generating credit risk scores derived from psychometric test results. Utilizing machine learning algorithms, we model customer credibility through multidimensional metrics such as character traits and attitudes toward money management. Preliminary results from our prototype testing indicate that this innovative approach holds promise for accurate risk assessment.
{"title":"Credit Risk Prediction Based on Psychometric Data","authors":"Eren Duman, Mehmet S. Aktas, Ezgi Yahsi","doi":"10.3390/computers12120248","DOIUrl":"https://doi.org/10.3390/computers12120248","url":null,"abstract":"In today’s financial landscape, traditional banking institutions rely extensively on customers’ historical financial data to evaluate their eligibility for loan approvals. While these decision support systems offer predictive accuracy for established customers, they overlook a crucial demographic: individuals without a financial history. To address this gap, our study presents a methodology for a decision support system that is intended to assist in determining credit risk. Rather than solely focusing on past financial records, our methodology assesses customer credibility by generating credit risk scores derived from psychometric test results. Utilizing machine learning algorithms, we model customer credibility through multidimensional metrics such as character traits and attitudes toward money management. Preliminary results from our prototype testing indicate that this innovative approach holds promise for accurate risk assessment.","PeriodicalId":46292,"journal":{"name":"Computers","volume":"13 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139220012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Design and Implement an Accurate Automated Static Analysis Checker to Detect Insecure Use of SecurityManager
Pub Date : 2023-11-28 DOI: 10.3390/computers12120247
Midya Alqaradaghi, Muhammad Zafar Iqbal Nazir, Tamás Kozsik
Static analysis is a software testing technique that analyzes code without executing it. It is widely used to detect vulnerabilities, errors, and other issues during software development. Many tools are available for static analysis of Java code, including SpotBugs. Methods that perform a security check must be declared private or final; otherwise, they can be compromised when a malicious subclass overrides the methods and omits the checks. In Java, security checks can be performed using the SecurityManager class. This paper addresses the aforementioned problem by building a new automated checker that raises an issue when this rule is violated. The checker is built on top of the SpotBugs static analysis tool. We evaluated our approach on both custom test cases and real-world software, and the results revealed that the checker successfully detected the related bugs in both, with optimal metric values.
{"title":"Design and Implement an Accurate Automated Static Analysis Checker to Detect Insecure Use of SecurityManager","authors":"Midya Alqaradaghi, Muhammad Zafar Iqbal Nazir, Tamás Kozsik","doi":"10.3390/computers12120247","DOIUrl":"https://doi.org/10.3390/computers12120247","url":null,"abstract":"Static analysis is a software testing technique that analyzes the code without executing it. It is widely used to detect vulnerabilities, errors, and other issues during software development. Many tools are available for static analysis of Java code, including SpotBugs. Methods that perform a security check must be declared private or final; otherwise, they can be compromised when a malicious subclass overrides the methods and omits the checks. In Java, security checks can be performed using the SecurityManager class. This paper addresses the aforementioned problem by building a new automated checker that raises an issue when this rule is violated. The checker is built under the SpotBugs static analysis tool. We evaluated our approach on both custom test cases and real-world software, and the results revealed that the checker successfully detected related bugs in both with optimal metrics values.","PeriodicalId":46292,"journal":{"name":"Computers","volume":"46 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139216075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimizing Intrusion Detection Systems in Three Phases on the CSE-CIC-IDS-2018 Dataset
Pub Date : 2023-11-24 DOI: 10.3390/computers12120245
Surasit Songma, Theera Sathuphan, Thanakorn Pamutha
This article examines intrusion detection systems in depth using the CSE-CIC-IDS-2018 dataset. The investigation is divided into three stages. First, data cleaning, exploratory data analysis, and data normalization procedures (min-max and Z-score) are used to prepare the data for use with various classifiers. Second, to improve processing speed and reduce model complexity, a combination of principal component analysis (PCA) and random forest (RF) is used to discard non-significant features, with the reduced sets compared against the full dataset. Finally, machine learning methods (XGBoost, CART, DT, KNN, MLP, RF, LR, and Bayes) are applied to the selected features and preprocessing procedures, with the XGBoost, DT, and RF models outperforming the others in terms of both ROC values and CPU runtime. The evaluation concludes with the identification of an optimal configuration, which includes PCA and RF feature selection.
{"title":"Optimizing Intrusion Detection Systems in Three Phases on the CSE-CIC-IDS-2018 Dataset","authors":"Surasit Songma, Theera Sathuphan, Thanakorn Pamutha","doi":"10.3390/computers12120245","DOIUrl":"https://doi.org/10.3390/computers12120245","url":null,"abstract":"This article examines intrusion detection systems in depth using the CSE-CIC-IDS-2018 dataset. The investigation is divided into three stages: to begin, data cleaning, exploratory data analysis, and data normalization procedures (min-max and Z-score) are used to prepare data for use with various classifiers; second, in order to improve processing speed and reduce model complexity, a combination of principal component analysis (PCA) and random forest (RF) is used to reduce non-significant features by comparing them to the full dataset; finally, machine learning methods (XGBoost, CART, DT, KNN, MLP, RF, LR, and Bayes) are applied to specific features and preprocessing procedures, with the XGBoost, DT, and RF models outperforming the others in terms of both ROC values and CPU runtime. The evaluation concludes with the discovery of an optimal set, which includes PCA and RF feature selection.","PeriodicalId":46292,"journal":{"name":"Computers","volume":"62 37","pages":""},"PeriodicalIF":2.8,"publicationDate":"2023-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139240036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Hard-Timeliness Blockchain-Based Contract Signing Protocol
Pub Date : 2023-11-24 DOI: 10.3390/computers12120246
J. Ferrer-Gomila, M. F. Hinarejos
In this article, we present the first proposal for contract signing based on blockchain that meets the requirements of fairness, hard-timeliness, and bc-optimism. Thanks to the use of blockchain, the proposal does not require trusted third parties (TTPs), thus avoiding a single point of failure and the problem of signatories having to agree on a TTP that both trust. The presented protocol is fair because it is designed such that no honest signatory can be placed at a disadvantage. It meets the hard-timeliness requirement because both signatories can end the execution of the protocol at any time they wish. Finally, the proposal is bc-optimistic because blockchain functions are executed only in case of exception (and not in each execution of the protocol), with consequent savings when working with public blockchains. No previous proposal simultaneously met these three requirements. In addition, this article clarifies the concept of timeliness, which has previously been defined in a confusing way (starting with the authors who first used the term). We conducted a security review that allowed us to verify that our proposal meets the desired requirements. Furthermore, we provide the specification of a smart contract designed for the Ethereum blockchain family and verified the economic feasibility of the proposal, ensuring it can be aligned with the financial requirements of different scenarios.
{"title":"A Hard-Timeliness Blockchain-Based Contract Signing Protocol","authors":"J. Ferrer-Gomila, M. F. Hinarejos","doi":"10.3390/computers12120246","DOIUrl":"https://doi.org/10.3390/computers12120246","url":null,"abstract":"In this article, we present the first proposal for contract signing based on blockchain that meets the requirements of fairness, hard-timeliness, and bc-optimism. The proposal, thanks to the use of blockchain, does not require the use of trusted third parties (TTPs), thus avoiding a point of failure and the problem of signatories having to agree on a TTP that is trusted by both. The presented protocol is fair because it is designed such that no honest signatory can be placed at a disadvantage. It meets the hard-timeliness requirement because both signatories can end the execution of the protocol at any time they wish. Finally, the proposal is bc-optimistic because blockchain functions are only executed in case of exception (and not in each execution of the protocol), with consequent savings when working with public blockchains. No previous proposal simultaneously met these three requirements. In addition to the above, this article clarifies the concept of timeliness, which previously has been defined in a confusing way (starting with the authors who used the term for the first time). We conducted a security review that allowed us to verify that our proposal meets the desired requirements. Furthermore, we provide the specifications of a smart contract designed for the Ethereum blockchain family and verified the economic feasibility of the proposal, ensuring it can be aligned with the financial requirements of different scenarios.","PeriodicalId":46292,"journal":{"name":"Computers","volume":"402 ","pages":""},"PeriodicalIF":2.8,"publicationDate":"2023-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139240316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Specification and Description Language Models Automatic Execution in a High-Performance Environment
Pub Date : 2023-11-22 DOI: 10.3390/computers12120244
Pau Fonseca i Casas, I. Romanowska, Joan Garcia i Subirana
Specification and Description Language (SDL) is a language that can represent the behavior and structure of a model completely and unambiguously. It allows the creation of frameworks that can run a model without the need to code it in a specific programming language. This automatic process simplifies the key phases of model building: validation and verification. SDLPS is a simulator that enables the definition and execution of models using SDL. In this paper, we present a new library that enables the execution of SDL models defined on the SDLPS infrastructure on an HPC platform, such as a supercomputer, thus significantly speeding up simulation runtime. Moreover, we apply the SDL language to a social science use case, thus opening a new avenue for bringing HPC power to new groups of users. The tools presented here have the potential to increase the robustness of modeling software by improving the documentation, verification, and validation of the models.
{"title":"Specification and Description Language Models Automatic Execution in a High-Performance Environment","authors":"Pau Fonseca i Casas, I. Romanowska, Joan Garcia i Subirana","doi":"10.3390/computers12120244","DOIUrl":"https://doi.org/10.3390/computers12120244","url":null,"abstract":"Specification and Description Language (SDL) is a language that can represent the behavior and structure of a model completely and unambiguously. It allows the creation of frameworks that can run a model without the need to code it in a specific programming language. This automatic process simplifies the key phases of model building: validation and verification. SDLPS is a simulator that enables the definition and execution of models using SDL. In this paper, we present a new library that enables the execution of SDL models defined on SDLPS infrastructure on a HPC platform, such as a supercomputer, thus significantly speeding up simulation runtime. Moreover, we apply the SDL language to a social science use case, thus opening a new avenue for facilitating the use of HPC power to new groups of users. The tools presented here have the potential to increase the robustness of modeling software by improving the documentation, verification, and validation of the models.","PeriodicalId":46292,"journal":{"name":"Computers","volume":"32 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2023-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139249200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Revealing People’s Sentiment in Natural Italian Language Sentences
Pub Date : 2023-11-21 DOI: 10.3390/computers12120241
Andrea Calvagna, E. Tramontana, Gabriella Verga
Social network systems are constantly fed with text messages. While this enables rapid communication and global awareness, some messages can be crafted to hurt or mislead. Automatically identifying meaningful parts of a sentence, such as positive or negative sentiments in a phrase, would give valuable support for automatically flagging hateful messages, propaganda, etc. Many existing machine-learning approaches to studying people’s opinions, attitudes, and emotions require an extensive labelled dataset, and in many circumstances they provide results that are not very decisive, due to the complexity of the language structure and the fuzziness inherent in most of the adopted techniques. This paper proposes a deterministic approach that automatically identifies people’s sentiments at the sentence level. The approach is based on text analysis rules manually derived from the workings of Italian grammar. These rules are embedded in finite-state automata and then expressed in a way that facilitates checking unstructured Italian text. A few grammar rules suffice to analyse an ample amount of correctly formed text. We developed a tool that validated the proposed approach by analysing several hundred sentences gathered from social media, i.e., actual comments written by users. The tool exploits parallel execution, enabling it to process many thousands of sentences in a fraction of a second. Our approach outperforms a well-known previous approach in terms of precision.
{"title":"Revealing People’s Sentiment in Natural Italian Language Sentences","authors":"Andrea Calvagna, E. Tramontana, Gabriella Verga","doi":"10.3390/computers12120241","DOIUrl":"https://doi.org/10.3390/computers12120241","url":null,"abstract":"Social network systems are constantly fed with text messages. While this enables rapid communication and global awareness, some messages could be aptly made to hurt or mislead. Automatically identifying meaningful parts of a sentence, such as, e.g., positive or negative sentiments in a phrase, would give valuable support for automatically flagging hateful messages, propaganda, etc. Many existing approaches concerned with the study of people’s opinions, attitudes and emotions and based on machine learning require an extensive labelled dataset and provide results that are not very decisive in many circumstances due to the complexity of the language structure and the fuzziness inherent in most of the techniques adopted. This paper proposes a deterministic approach that automatically identifies people’s sentiments at the sentence level. The approach is based on text analysis rules that are manually derived from the way Italian grammar works. Such rules are embedded in finite-state automata and then expressed in a way that facilitates checking unstructured Italian text. A few grammar rules suffice to analyse an ample amount of correctly formed text. We have developed a tool that has validated the proposed approach by analysing several hundreds of sentences gathered from social media: hence, they are actual comments given by users. Such a tool exploits parallel execution to make it ready to process many thousands of sentences in a fraction of a second. Our approach outperforms a well-known previous approach in terms of precision.","PeriodicalId":46292,"journal":{"name":"Computers","volume":"6 2","pages":""},"PeriodicalIF":2.8,"publicationDate":"2023-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139254491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improvement of Malicious Software Detection Accuracy through Genetic Programming Symbolic Classifier with Application of Dataset Oversampling Techniques
Pub Date : 2023-11-21 DOI: 10.3390/computers12120242
N. Anđelić, Sandi Baressi Šegota, Z. Car
Malware detection using hybrid features, combining binary and hexadecimal analysis with DLL calls, is crucial for leveraging the strengths of both static and dynamic analysis methods. Artificial intelligence (AI) enhances this process by enabling automated pattern recognition, anomaly detection, and continuous learning, allowing security systems to adapt to evolving threats and identify complex, polymorphic malware that may exhibit varied behaviors. This synergy of hybrid features with AI empowers malware detection systems to efficiently and proactively identify and respond to sophisticated cyber threats in real time. In this paper, the genetic programming symbolic classifier (GPSC) algorithm was applied to a publicly available dataset to obtain symbolic expressions (SEs) that can detect malware with high classification performance. The initial problem with the dataset was a high imbalance between class samples, so various oversampling techniques were utilized to obtain balanced dataset variations on which GPSC was applied. To find the optimal combination of GPSC hyperparameter values, the random hyperparameter value search method (RHVS) was developed and applied to obtain SEs with high classification accuracy. The GPSC was trained with five-fold cross-validation (5FCV) to obtain a robust set of SEs on each dataset variation. To choose the best SEs, several evaluation metrics were used: the length and depth of the SEs, accuracy score (ACC), area under the receiver operating characteristic curve (AUC), precision, recall, F1-score, and the confusion matrix. The best SEs were then applied to the original imbalanced dataset to check whether the classification performance matched that obtained on the balanced dataset variations. The results of the investigation showed that the proposed method generated SEs with high classification accuracy (0.9962) in malware detection.
{"title":"Improvement of Malicious Software Detection Accuracy through Genetic Programming Symbolic Classifier with Application of Dataset Oversampling Techniques","authors":"N. Anđelić, Sandi Baressi Baressi Šegota, Z. Car","doi":"10.3390/computers12120242","DOIUrl":"https://doi.org/10.3390/computers12120242","url":null,"abstract":"Malware detection using hybrid features, combining binary and hexadecimal analysis with DLL calls, is crucial for leveraging the strengths of both static and dynamic analysis methods. Artificial intelligence (AI) enhances this process by enabling automated pattern recognition, anomaly detection, and continuous learning, allowing security systems to adapt to evolving threats and identify complex, polymorphic malware that may exhibit varied behaviors. This synergy of hybrid features with AI empowers malware detection systems to efficiently and proactively identify and respond to sophisticated cyber threats in real time. In this paper, the genetic programming symbolic classifier (GPSC) algorithm was applied to the publicly available dataset to obtain symbolic expressions (SEs) that could detect the malware software with high classification performance. The initial problem with the dataset was a high imbalance between class samples, so various oversampling techniques were utilized to obtain balanced dataset variations on which GPSC was applied. To find the optimal combination of GPSC hyperparameter values, the random hyperparameter value search method (RHVS) was developed and applied to obtain SEs with high classification accuracy. The GPSC was trained with five-fold cross-validation (5FCV) to obtain a robust set of SEs on each dataset variation. To choose the best SEs, several evaluation metrics were used, i.e., the length and depth of SEs, accuracy score (ACC), area under receiver operating characteristic curve (AUC), precision, recall, f1-score, and confusion matrix. The best-obtained SEs are applied on the original imbalanced dataset to see if the classification performance is the same as it was on balanced dataset variations. The results of the investigation showed that the proposed method generated SEs with high classification accuracy (0.9962) in malware software detection.","PeriodicalId":46292,"journal":{"name":"Computers","volume":"46 10","pages":""},"PeriodicalIF":2.8,"publicationDate":"2023-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139252986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Meshfree Interpolation of Multidimensional Time-Varying Scattered Data
Pub Date : 2023-11-21 DOI: 10.3390/computers12120243
Vaclav Skala, Eliska Mourycová
Interpolating and approximating scattered scalar and vector data is fundamental in resolving numerous engineering challenges. These methodologies predominantly rely on establishing a triangulated structure within the data domain, typically constrained to two or three dimensions. Subsequently, an interpolation or approximation technique is employed to yield a smooth and coherent outcome. This contribution introduces a meshless methodology founded upon radial basis functions (RBFs). The approach is nearly independent of the problem’s dimensionality, facilitating the interpolation of data evolving over time. Specifically, it enables the interpolation of dispersed spatio-temporally varying data, allowing for interpolation within the space-time domain devoid of the conventional “time-frames”. Meshless methodologies tailored for scattered spatio-temporal data hold applicability across a spectrum of domains, encompassing the interpolation, approximation, and assessment of data originating from various sources, such as buoys, sensor networks, tsunami monitoring instruments, chemical and radiation detectors, vessel and submarine detection systems, and weather forecasting models, as well as the compression and visualization of 3D vector fields.
{"title":"Meshfree Interpolation of Multidimensional Time-Varying Scattered Data","authors":"Vaclav Skala, Eliska Mourycová","doi":"10.3390/computers12120243","DOIUrl":"https://doi.org/10.3390/computers12120243","url":null,"abstract":"Interpolating and approximating scattered scalar and vector data is fundamental in resolving numerous engineering challenges. These methodologies predominantly rely on establishing a triangulated structure within the data domain, typically constrained to the dimensions of 2D or 3D. Subsequently, an interpolation or approximation technique is employed to yield a smooth and coherent outcome. This contribution introduces a meshless methodology founded upon radial basis functions (RBFs). This approach exhibits a nearly dimensionless character, facilitating the interpolation of data evolving over time. Specifically, it enables the interpolation of dispersed spatio-temporally varying data, allowing for interpolation within the space-time domain devoid of the conventional “time-frames”. Meshless methodologies tailored for scattered spatio-temporal data hold applicability across a spectrum of domains, encompassing the interpolation, approximation, and assessment of data originating from various sources, such as buoys, sensor networks, tsunami monitoring instruments, chemical and radiation detectors, vessel and submarine detection systems, weather forecasting models, as well as the compression and visualization of 3D vector fields, among others.","PeriodicalId":46292,"journal":{"name":"Computers","volume":"34 6","pages":""},"PeriodicalIF":2.8,"publicationDate":"2023-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139252052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}