Vulnerability Analysis of the Exposed Public IPs in a Higher Education Institution
Agustín Chancusi, Paúl Diestra, Damián Nicolalde
Proceedings of the 2020 10th International Conference on Communication and Network Security (2020-11-27). doi:10.1145/3442520.3442523

The public IP addresses of a private or public higher education institution receive large amounts of network traffic, leaving the data network exposed to security attacks. This study develops a practical case based on the use of the Advanced IP Scanner and Shodan software tools, following a methodology that consists of discovering an education institution's IP network, scanning its hosts of interest, and then finding the security vulnerabilities of the main network addresses. From a statistical universe consisting of the entire range of IP addresses in the institution's network, a group of hosts of interest was defined as a sample set for further examination. On that basis, the aim of this study is to analyze and classify the obtained vulnerability information by severity for each discovered host, in order to obtain statistics on the vulnerabilities by severity and quantity at both the host level and the whole-network level. It is concluded that most of the hosts have vulnerabilities in their Apache HTTP server daemons, which cause a high percentage of them to have Critical-level vulnerabilities.
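Severity classification of the kind this study performs is typically based on CVSS v3 base scores. A minimal sketch of the standard score-to-rating mapping and a per-host tally (the host addresses and scores below are hypothetical, not data from the paper):

```python
from collections import Counter

def severity_bucket(cvss_score):
    """Map a CVSS v3.x base score (0.0-10.0) to its qualitative rating."""
    if cvss_score == 0.0:
        return "None"
    if cvss_score < 4.0:
        return "Low"
    if cvss_score < 7.0:
        return "Medium"
    if cvss_score < 9.0:
        return "High"
    return "Critical"

def severity_stats(host_findings):
    """host_findings: {host_ip: [cvss_score, ...]}.
    Returns per-host and network-wide vulnerability counts by severity."""
    per_host = {host: Counter(severity_bucket(s) for s in scores)
                for host, scores in host_findings.items()}
    network = sum(per_host.values(), Counter())
    return per_host, network
```

For example, a host whose Apache httpd reported scores of 9.8, 7.5, and 5.3 would count one Critical, one High, and one Medium finding toward both its own and the network-level totals.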
Slow Scan Attack Detection Based on Communication Behavior
Tomoya Yamashita, Daisuke Miyamoto, Y. Sekiya, Hiroshi Nakamura
doi:10.1145/3442520.3442525

We present a novel method for detecting slow scan attacks. Attackers collect information about vulnerabilities in hosts through scan attacks and then penetrate systems based on the collected information; detecting scan attacks therefore helps prevent the attacks that follow. Intrusion detection systems (IDSs) have been proposed for detecting scan attacks, but they cannot detect slow scans executed gradually over a long period. In this paper, we introduce novel features that distinguish the communication behavior of scanning hosts from that of benign hosts, and we propose a detection method based on these features. Through experiments, we confirm the effectiveness of our method for detecting a slow scan attack.
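The abstract does not spell out the paper's feature set; as an illustration only, behavioural features of the kind that separate scanners from benign hosts can be computed per source over a long observation window, e.g. the number of distinct destinations contacted and the fraction of unanswered connection attempts, both of which stay high for a scanner even when it probes slowly:

```python
from collections import defaultdict

def behaviour_features(flows):
    """flows: list of (src, dst, answered) connection attempts observed
    over a long window. Returns, per source host, a pair of features:
    (number of distinct destinations, fraction of unanswered attempts)."""
    dests = defaultdict(set)
    tries = defaultdict(int)
    fails = defaultdict(int)
    for src, dst, answered in flows:
        dests[src].add(dst)
        tries[src] += 1
        if not answered:
            fails[src] += 1
    return {src: (len(dests[src]), fails[src] / tries[src]) for src in tries}
```

A slow scanner probing one new host per hour still accumulates many distinct, mostly unanswered destinations over the window, while a benign client repeatedly contacts a few servers that answer.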
DIDarknet: A Contemporary Approach to Detect and Characterize the Darknet Traffic using Deep Image Learning
Arash Habibi Lashkari, Gurdip Kaur, Abir Rahali
doi:10.1145/3442520.3442521

Darknet traffic classification is significantly important for categorizing real-time applications. Although there are notable efforts to classify darknet traffic, they rely heavily on existing datasets and conventional machine learning classifiers; extremely few efforts detect and characterize darknet traffic using deep learning. This work proposes a novel approach, named DeepImage, which uses feature selection to pick the most important features, creates a gray image from them, and feeds it to a two-dimensional convolutional neural network to detect and characterize darknet traffic. Two encrypted traffic datasets are merged to create a darknet dataset for evaluating the proposed approach, which successfully characterizes darknet traffic with 86% accuracy.
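The first stage of the pipeline described above (selected features → gray image) can be sketched as follows; the min-max normalization and 8x8 grid size are assumptions for illustration, not details taken from the paper:

```python
def features_to_gray_image(features, side=8):
    """Min-max scale selected flow features to 0-255 pixel intensities
    and pack them row-major into a side x side grid, zero-padding
    unused cells. The resulting 2-D array can be fed to a 2-D CNN."""
    lo, hi = min(features), max(features)
    span = (hi - lo) or 1.0          # avoid division by zero for constant input
    pixels = [round(255 * (v - lo) / span) for v in features]
    pixels += [0] * (side * side - len(pixels))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]
```

Encoding feature vectors as images lets standard image-classification architectures be reused unchanged for traffic characterization.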
DIDroid: Android Malware Classification and Characterization Using Deep Image Learning
Abir Rahali, Arash Habibi Lashkari, Gurdip Kaur, Laya Taheri, F. Gagnon, Frédéric Massicotte
doi:10.1145/3442520.3442522

The unrivaled threat of Android malware is the root cause of various security problems on the Internet. Although there are remarkable efforts in detecting and classifying Android malware with machine learning techniques, few attempts have been made to classify and characterize it using deep learning. Detecting Android malware on smartphones is an essential goal for the cyber community in order to get rid of menacing malware samples. This paper proposes an image-based deep neural network method to classify and characterize Android malware samples taken from a large malware dataset with 12 prominent malware categories and 191 eminent malware families. This work successfully demonstrates the use of deep image learning to classify and characterize Android malware with an accuracy of 93.36% and a log loss of less than 0.20 on the training and testing sets.
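The log-loss figure quoted above is the standard multiclass cross-entropy; a stdlib sketch of how such a number is computed from a model's predicted class probabilities:

```python
import math

def log_loss(true_labels, predicted_probs, eps=1e-15):
    """Average multiclass cross-entropy: for each sample, the negative
    log of the probability the model assigned to the true class,
    clipped by eps so a zero probability does not produce infinity."""
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        total -= math.log(max(probs[label], eps))
    return total / len(true_labels)
```

A log loss below 0.20 means the model assigns, on average, more than e^-0.20 ≈ 0.82 probability to the correct malware category.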
Proof of Network Security Services: Enforcement of Security SLA through Outsourced Network Testing
Sultan Alasmari, Weichao Wang, Yu Wang
doi:10.1145/3442520.3442533

Many companies outsource their network security functionality to third-party service providers. To guarantee the quality of such services, a Security Service Level Agreement (SSLA) between the two parties often needs to be signed and enforced, and mechanisms to verify its execution must be designed. In this paper, we propose a mechanism that allows a disinterested third party to help end customers verify the SSLA. Specifically, an end customer can carefully craft network traffic and conduct spontaneous, configurable verification of the SSLA with the help of a group of testers. While the basic idea is straightforward, multiple methods must be designed to safeguard the testing procedure; for example, we need to prevent the testing sites from being abused for network attacks. We describe our approaches in detail. Our analysis and quantitative results show that our approach can effectively help end customers verify the execution of a network security SLA.
Outsourced Secure ID3 Decision Tree Algorithm over Horizontally Partitioned Datasets with Consortium Blockchain
Ming Yang, Xuexian Hu, Jianghong Wei, Qihui Zhang, Wenfen Liu
doi:10.1145/3442520.3442534

Thanks to its capacity for storing massive data and providing huge computing resources, cloud computing has become a desirable platform to assist machine learning in multiple-data-owner scenarios. However, the issue of data privacy is far from well solved and remains a general concern in cloud-assisted machine learning. For example, in existing cloud-assisted decision tree classification algorithms, it is very hard to guarantee data privacy, since all data owners have to aggregate their data on the cloud platform for model training. In this paper, we investigate the possibility of training a decision tree when the distributed data are stored locally by each data owner, so that the privacy of the original data can be guaranteed in a more intuitive way. Specifically, we present a privacy-preserving ID3 training scheme using the Gini index over datasets horizontally partitioned among multiple data owners. Since a data owner cannot directly divide its local dataset according to the selected best attributes, a consortium blockchain and a homomorphic encryption algorithm are employed to ensure the privacy and usability of the distributed data. Security analysis indicates that our scheme preserves the privacy of both the original data and the intermediate values. Moreover, extensive experiments show that our scheme achieves the same result as the original ID3 decision tree algorithm while additionally preserving data privacy, and that the calculation and communication time overheads on data owners decrease greatly.
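The scheme's attribute selection relies on the Gini index; the plaintext computation that each training round must reproduce under homomorphic encryption can be sketched as:

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a label multiset: 1 - sum_c p_c^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(partitions):
    """Weighted Gini impurity after splitting on an attribute, where
    partitions is a list of label lists, one per attribute value.
    The attribute minimizing this value is selected for the node."""
    n = sum(len(p) for p in partitions)
    return sum(len(p) / n * gini(p) for p in partitions)
```

In the distributed setting, the class counts feeding these formulas are aggregated across owners as ciphertexts, so no owner's local counts are revealed.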
Analysis on Entropy Sources based on Smartphone Sensors
Na Lv, Tianyu Chen, Yuan Ma
doi:10.1145/3442520.3442528

Random number generators (RNGs) are basic primitives in cryptography. The randomness of the numbers they generate underpins the security of the various cryptosystems used in networks and communications. With the popularization of smart mobile devices such as smartphones and the surge in demand for cryptographic applications on them, research on providing random number services for mobile devices has attracted growing attention. As important components of smartphones, sensors collect data from user behavior and the environment, and some of these data sources have non-deterministic properties. Some current work focuses on designing sensor-based RNGs for smartphones, since this method requires no additional hardware. It is critical to evaluate the quality of the entropy sources that provide the main source of randomness for such RNGs. However, as far as we know, no work has systematically analyzed the feasibility of using raw sensor data to generate random sequences or quantified how much entropy the data contain. In this paper, we aim to provide an analysis method for quantifying the entropy in the raw data captured by smartphone sensors and to study the feasibility of generating random numbers from these data. We establish several data collection models for typical sensors under different scenarios and data sampling frequencies. Furthermore, we propose a universal entropy estimation scheme for multivariate data to quantify the entropy of the sensor data, and apply it to a type of Android smartphone. The experiments demonstrate that the raw data collected by the sensors contain a considerable amount of entropy, and that a sensor's ability to provide entropy depends on the usage scenario of the smartphone and the sampling frequency of the sensor data. In particular, in a static scenario with a sampling frequency of 50 Hz, we obtain a conservative min-entropy-based estimate for our test smartphones of about 189 bits/s for the accelerometer, 13 bits/s for the gyroscope, and 254 bits/s for the magnetometer. The randomness of sensor data increases in dynamic scenarios compared to static ones, because the environment and the way the user handles the smartphone differ each time, and parts of these differences are unknowable to an attacker.
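The conservative per-second figures quoted above are min-entropy estimates. A minimal most-common-value estimator in the spirit of (but far simpler than) the paper's multivariate scheme:

```python
import math
from collections import Counter

def min_entropy_per_sample(samples):
    """Min-entropy lower bound from the empirical most-common value:
    H_min = -log2(max_i p_i), in bits per sample."""
    counts = Counter(samples)
    p_max = max(counts.values()) / len(samples)
    return -math.log2(p_max)

def entropy_rate(samples, sampling_hz):
    """Entropy throughput in bits per second at a given sampling rate."""
    return min_entropy_per_sample(samples) * sampling_hz
```

A sensor whose quantized readings were uniform over 256 values and sampled at 50 Hz would yield 8 x 50 = 400 bits/s; the measured figures sit well below such ideals because real sensor data are biased and correlated.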
Locust: Highly Concurrent DHT Experimentation Framework for Security Evaluations
Florian Adamsky, Daniel Kaiser, M. Steglich, T. Engel
doi:10.1145/3442520.3442531

Distributed Hash Table (DHT) protocols such as Kademlia provide a decentralized key-value lookup that is nowadays integrated into a wide variety of applications, such as Ethereum, the InterPlanetary File System (IPFS), and BitTorrent. However, many security issues in DHT protocols remain unsolved. DHT networks are typically evaluated using mathematical models or simulations, which often abstract away artefacts that can be relevant for security and performance, while experiments that do capture these artefacts are typically run with too few nodes. In this paper, we present Locust, a novel highly concurrent DHT experimentation framework written in Elixir and designed for security evaluations. The framework allows running experiments with a full DHT implementation and around 4,000 nodes on a single machine, including an adjustable churn rate, thus yielding a favourable trade-off between the number of analysed nodes and realism. We evaluate our framework in terms of memory consumption, processing power, and network traffic.
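At the heart of the Kademlia nodes such a framework spawns by the thousands is the XOR metric; a short sketch of the distance function and the k-bucket index it induces (Locust itself is written in Elixir; Python is used here only for illustration):

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance between two node IDs: bitwise XOR."""
    return a ^ b

def bucket_index(own_id: int, other_id: int) -> int:
    """Index of the k-bucket holding other_id from own_id's viewpoint:
    the position of the most significant differing bit of the IDs."""
    return xor_distance(own_id, other_id).bit_length() - 1
```

Because the metric is symmetric and unidirectional, lookups converge toward a key in O(log n) routing steps, which is what makes thousands of lightweight nodes on one machine a meaningful experiment.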
The analysis method of security vulnerability based on the knowledge graph
Yongfu Wang, Ying Zhou, Xiaohai Zou, Quanqiang Miao, Wei Wang
doi:10.1145/3442520.3442535

Given today's increasingly prominent network security issues, it is of great significance to analyze in depth the vulnerability of software and hardware resources in cyberspace. Although the existing Common Vulnerabilities and Exposures (CVE) security vulnerability database contains a wealth of vulnerability information, that information is poorly readable, its potential correlations are difficult to express intuitively, and its degree of visualization is insufficient. To solve these problems, a method of constructing a knowledge graph of CVE security vulnerabilities is proposed. Through raw data acquisition, ontology modeling, and data extraction and import, the knowledge graph is loaded into the Neo4j graph database, completing the construction of the CVE knowledge graph. Based on the knowledge graph, in-depth analysis is performed along the cause, time, and association dimensions, and the results are displayed visually. Experiments show that this analysis method can intuitively and effectively mine the intrinsic value of CVE security vulnerability data.
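Before a CVE record can be imported into a graph store such as Neo4j, it must be mapped onto the ontology's nodes and relationships. A sketch with hypothetical field and relation names (the paper's actual ontology is not given in the abstract) covering the three analysis dimensions:

```python
def cve_to_triples(record):
    """Flatten one CVE record into (head, relation, tail) triples
    spanning the cause, time, and association dimensions; each triple
    becomes a relationship between two nodes in the graph database.
    Field and relation names here are illustrative assumptions."""
    cve_id = record["id"]
    triples = [
        (cve_id, "HAS_CAUSE", record["cwe"]),           # cause dimension
        (cve_id, "PUBLISHED_IN", str(record["year"])),  # time dimension
    ]
    # association dimension: CVEs affecting the same product share a node
    triples += [(cve_id, "AFFECTS", p) for p in record["products"]]
    return triples
```

Once loaded, association queries reduce to graph traversals, e.g. finding all CVEs two hops from a given product through a shared weakness node.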
TLS Encrypted Application Classification Using Machine Learning with Flow Feature Engineering
Onur Barut, Rebecca S. Zhu, Yan Luo, Tong Zhang
doi:10.1145/3442520.3442529

Network traffic classification has become increasingly important as the number of devices connected to the Internet grows rapidly. Proportionally, the amount of encrypted traffic is also increasing, rendering payload-based classification methods obsolete; machine learning approaches have consequently become crucial where user privacy is concerned. For this purpose, we propose an accurate, fast, and privacy-preserving encrypted traffic classification approach built on engineered flow feature extraction and appropriate feature selection. The proposed scheme achieves a 0.92899 macro-average F1 score and a 0.88313 macro-average mAP score for classifying encrypted traffic of the Audio, Email, Chat, and Video classes derived from the non-vpn2016 dataset. Further experiments on a mixed non-encrypted and encrypted flow dataset, using the Synthetic Minority Over-sampling Technique for data augmentation, are conducted, and the results are discussed for TLS-encrypted and mixed flows.
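Because the payload is opaque under TLS, such a classifier consumes only flow-level statistics. A flavour of the engineered per-flow features (an illustrative subset, not the paper's exact feature set):

```python
def flow_features(packets):
    """packets: list of (timestamp_s, size_bytes) for one flow.
    Returns a few per-flow statistics usable as classifier input,
    computed without inspecting any packet payload."""
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    duration = max(times) - min(times) or 1e-9   # guard single-packet flows
    return {
        "duration_s": duration,
        "mean_pkt_size": sum(sizes) / len(sizes),
        "byte_rate": sum(sizes) / duration,
    }
```

Statistics like these separate application classes by their traffic shape, e.g. Video flows show sustained high byte rates while Chat flows are short, bursty, and small.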