Pub Date: 2022-07-13 DOI: 10.1109/cits55221.2022.9832982
K. Arivarasan, M. Obaidat
With the advancement of internet technologies comes the need for systems that can ensure the security of a network. An Intrusion Detection System (IDS) can detect, and sometimes take action against, malicious network traffic. IDSs can be classified in several ways; by detection method, for example, an IDS is signature-based, anomaly-based, or hybrid. In this work, multiple models are trained with various machine learning algorithms on the NSL-KDD dataset to build an efficient anomaly-based IDS that detects malicious traffic with the highest possible accuracy. The supervised learning algorithms used are Logistic Regression, Decision Tree, K-Nearest Neighbour (KNN), XGBoost, Random Forest, and Multilayer Perceptron (MLP). Finally, the Hard Voting technique is employed to combine the models and increase efficiency.
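The hard voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each trained model has already produced one label per sample; the function name `hard_vote`, the label strings, and the tie-breaking rule are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
from typing import Sequence


def hard_vote(predictions: Sequence[Sequence[str]]) -> list[str]:
    """Combine per-model label predictions by majority vote.

    predictions[m][i] is the label that model m assigns to sample i.
    Ties go to the tied label that appears first in model order
    (an assumption made for this sketch).
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        # max() returns the first key with the highest count, and Counter
        # preserves insertion order, so ties resolve by model order.
        voted.append(max(counts, key=counts.get))
    return voted


# Hypothetical outputs of three classifiers on four traffic records.
model_a = ["normal", "attack", "attack", "normal"]
model_b = ["normal", "attack", "normal", "attack"]
model_c = ["attack", "attack", "normal", "normal"]

ensemble = hard_vote([model_a, model_b, model_c])
print(ensemble)
```

Using an odd number of models on a two-class problem, as in this example, avoids ties altogether.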
Intrusion Detection System using Aggregation of Machine Learning Algorithms. 2022 International Conference on Computer, Information and Telecommunication Systems (CITS). https://doi.org/10.1109/cits55221.2022.9832982
Pub Date: 2022-07-13 DOI: 10.1109/cits55221.2022.9832992
CITS 2022 Cover Page. 2022 International Conference on Computer, Information and Telecommunication Systems (CITS). https://doi.org/10.1109/cits55221.2022.9832992
Pub Date: 2022-07-13 DOI: 10.1109/cits55221.2022.9832986
Georgios L. Stavrinides, H. Karatza
In hybrid and multi-tier distributed architectures, where data may have different security requirements and typically must be processed in a pipeline fashion, resource allocation has become particularly challenging. In such environments, it is crucial to use effective, security-aware resource allocation techniques to ensure that the workload is processed securely and with a satisfactory Quality of Service (QoS). To this end, this paper examines the performance of security-aware resource allocation strategies for linear workflow (LW) jobs in an environment of distributed resources. Only a subset of the resources is considered secure and thus suitable for processing high-risk LW jobs. Low-risk LW jobs may be executed on either secure or non-secure resources. Two commonly used routing techniques are adapted to incorporate security awareness, and their performance is evaluated through simulation. Several scenarios are investigated, with different sizes of the secure subset and different probabilities that an LW job is considered high risk. The simulation results provide useful insights into how the percentage of high-risk LW jobs affects performance in each of the examined cases of secure resources.
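The eligibility rule described above (high-risk jobs only on secure resources, low-risk jobs anywhere) can be sketched as follows. The abstract does not name the two routing techniques, so this sketch assumes shortest-queue routing purely for illustration; `Resource`, `route`, and the job identifiers are hypothetical, and each LW job is treated as a single unit rather than a pipeline of tasks.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    secure: bool
    queue: list[str] = field(default_factory=list)


def route(job_id: str, high_risk: bool, resources: list[Resource]) -> Resource:
    """Send a job to the eligible resource with the shortest queue.

    High-risk jobs are restricted to secure resources; low-risk jobs
    may use any resource. Ties go to the first resource in list order.
    """
    eligible = [r for r in resources if r.secure] if high_risk else resources
    if not eligible:
        raise ValueError("no eligible resource for this job")
    target = min(eligible, key=lambda r: len(r.queue))
    target.queue.append(job_id)
    return target


# One secure and two non-secure resources (hypothetical setup).
pool = [Resource("s1", True), Resource("n1", False), Resource("n2", False)]

placements = [
    route("j1", True, pool).name,   # high risk: only s1 is eligible
    route("j2", False, pool).name,  # low risk: n1 has the shortest queue
    route("j3", True, pool).name,   # high risk: s1 again, despite its backlog
    route("j4", False, pool).name,  # low risk: n2 is now the least loaded
]
print(placements)
```

The third placement shows the trade-off the abstract studies: as the share of high-risk jobs grows, load concentrates on the secure subset even while other resources sit idle.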
Security-Aware Orchestration of Linear Workflows on Distributed Resources. 2022 International Conference on Computer, Information and Telecommunication Systems (CITS). https://doi.org/10.1109/cits55221.2022.9832986
Pub Date: 2022-05-19 DOI: 10.1109/CITS55221.2022.9832981
Theofanis P. Raptis, A. Passarella
Apache Kafka addresses the general problem of delivering extremely high-volume event data to diverse consumers via a publish-subscribe messaging system. It uses partitions to scale a topic across many brokers, so that producers can write data in parallel and consumers can read it in parallel. Even though Apache Kafka provides some out-of-the-box optimizations, it does not strictly define how each topic should be efficiently distributed into partitions. The well-formulated fine-tuning needed to improve the performance of an Apache Kafka cluster is still an open research problem. In this paper, we first model the Apache Kafka topic partitioning process for a given topic. Then, given the set of brokers, constraints, and application requirements on throughput, OS load, replication latency, and unavailability, we formulate the optimization problem of finding how many partitions are needed and show that it is computationally intractable, being an integer program. Furthermore, we propose two simple yet efficient heuristics to solve the problem: the first tries to minimize and the second to maximize the number of brokers used in the cluster. Finally, we evaluate their performance via large-scale simulations, using as benchmarks some Apache Kafka cluster configuration recommendations provided by Microsoft and Confluent. We demonstrate that, unlike the recommendations, the proposed heuristics respect the hard constraints on replication latency and perform better with respect to unavailability time and OS load, using the system resources more prudently.
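The contrast between the two heuristics' goals (using as few brokers as possible versus as many as possible) can be illustrated with a toy placement sketch. This is not the paper's formulation: it ignores replication, throughput, latency, and unavailability, and it assumes a single hypothetical per-broker `capacity` limit. The function names are illustrative only.

```python
def place_min_brokers(n_partitions: int, n_brokers: int, capacity: int) -> list[list[int]]:
    """Pack partitions onto as few brokers as possible (fill one, then the next)."""
    if n_partitions > n_brokers * capacity:
        raise ValueError("not enough total broker capacity")
    assignment: list[list[int]] = [[] for _ in range(n_brokers)]
    for p in range(n_partitions):
        # Broker 0 takes the first `capacity` partitions, broker 1 the next, etc.
        assignment[p // capacity].append(p)
    return assignment


def place_max_brokers(n_partitions: int, n_brokers: int, capacity: int) -> list[list[int]]:
    """Spread partitions over as many brokers as possible (round-robin)."""
    if n_partitions > n_brokers * capacity:
        raise ValueError("not enough total broker capacity")
    assignment: list[list[int]] = [[] for _ in range(n_brokers)]
    for p in range(n_partitions):
        assignment[p % n_brokers].append(p)
    return assignment


def brokers_used(assignment: list[list[int]]) -> int:
    """Count brokers that host at least one partition."""
    return sum(1 for parts in assignment if parts)


# Six partitions, four brokers, at most three partitions per broker (hypothetical).
packed = place_min_brokers(6, 4, 3)
spread = place_max_brokers(6, 4, 3)
print(brokers_used(packed), packed)
print(brokers_used(spread), spread)
```

Packing leaves brokers free for other work but concentrates load, while spreading lowers per-broker load at the cost of occupying more of the cluster.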
On Efficiently Partitioning a Topic in Apache Kafka. 2022 International Conference on Computer, Information and Telecommunication Systems (CITS). https://doi.org/10.1109/CITS55221.2022.9832981