Pub Date : 2020-12-01DOI: 10.1109/TrustCom50675.2020.00261
Monika Di Angelo, G. Salzer
Like most programs, smart contracts offer their functionality via entry points that constitute the interface. Interface standards, e.g. for tokens contracts, foster interoperability. Ethereum is the most prominent platform for smart contracts. The number of contract deployments approaches 30 million, corresponding to roughly 300 000 distinct contract codes. In view of these numbers, it is necessary to develop automated methods for classifying contracts regarding their purpose, if one aims at a qualitative and quantitative understanding of what blockchain applications are used for at large. We approach the task by considering contracts as similar if their interfaces are. We encode interfaces and their interrelationships as graphs and explore several algorithms regarding their ability to find clusters of functionally similar contracts. Our evaluation of the quality of clustering relies on a ground truth of token and wallet contracts identified in earlier work. Our analysis is based on the bytecodes deployed on the main chain of Ethereum up to block 10.5 million, mined on July 21, 2020.
{"title":"Assessing the Similarity of Smart Contracts by Clustering their Interfaces","authors":"Monika Di Angelo, G. Salzer","doi":"10.1109/TrustCom50675.2020.00261","DOIUrl":"https://doi.org/10.1109/TrustCom50675.2020.00261","url":null,"abstract":"Like most programs, smart contracts offer their functionality via entry points that constitute the interface. Interface standards, e.g. for tokens contracts, foster interoperability. Ethereum is the most prominent platform for smart contracts. The number of contract deployments approaches 30 million, corresponding to roughly 300 000 distinct contract codes. In view of these numbers, it is necessary to develop automated methods for classifying contracts regarding their purpose, if one aims at a qualitative and quantitative understanding of what blockchain applications are used for at large. We approach the task by considering contracts as similar if their interfaces are. We encode interfaces and their interrelationships as graphs and explore several algorithms regarding their ability to find clusters of functionally similar contracts. Our evaluation of the quality of clustering relies on a ground truth of token and wallet contracts identified in earlier work. Our analysis is based on the bytecodes deployed on the main chain of Ethereum up to block 10.5 million, mined on July 21, 2020.","PeriodicalId":221956,"journal":{"name":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122685216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-12-01DOI: 10.1109/TrustCom50675.2020.00259
R. Cents, Nhien-An Le-Khac
Today traditional communication methods, such as SMS or phone calls, are used less often and are replaced by the use of chat applications. WhatsApp is one of the most popular chat applications nowadays. WhatsApp offers different ways of communicating, which include sending text messages and making phone calls. The implementation of encryption makes WhatsApp more challenging for law enforcement agencies to identify when a suspect is sending or receiving messages via this chat application. Most research in literature focused on the analysis of WhatsApp data by obtaining information from a physical device, such as a seized mobile device. However, it is not always possible to extract the data needed from a mobile device for the analysis of the WhatsApp data because of the encryption, or no devices have been seized yet. In addition, the current techniques for real time analysis of WhatsApp messages show that there is a high risk of detection by the suspect. Alternative methods are needed to understand the communication patterns of a suspect and criminal organizations. In this paper, we focused on identifying when a suspect is receiving or sending WhatsApp messages using only wiretap data. Therefore, no seized devices are needed. The pattern analysis has been used to identify patterns of data sent to and received from the WhatsApp servers. The identified patterns were tested against a large dataset created with different mobile devices to determine if the patterns are consistent. By using the technique described in this paper, investigators will obtain more information if and with whom a suspect is communicating.
{"title":"Towards A New Approach to Identify WhatsApp Messages","authors":"R. Cents, Nhien-An Le-Khac","doi":"10.1109/TrustCom50675.2020.00259","DOIUrl":"https://doi.org/10.1109/TrustCom50675.2020.00259","url":null,"abstract":"Today traditional communication methods, such as SMS or phone calls, are used less often and are replaced by the use of chat applications. WhatsApp is one of the most popular chat applications nowadays. WhatsApp offers different ways of communicating, which include sending text messages and making phone calls. The implementation of encryption makes WhatsApp more challenging for law enforcement agencies to identify when a suspect is sending or receiving messages via this chat application. Most research in literature focused on the analysis of WhatsApp data by obtaining information from a physical device, such as a seized mobile device. However, it is not always possible to extract the data needed from a mobile device for the analysis of the WhatsApp data because of the encryption, or no devices have been seized yet. In addition, the current techniques for real time analysis of WhatsApp messages show that there is a high risk of detection by the suspect. Alternative methods are needed to understand the communication patterns of a suspect and criminal organizations. In this paper, we focused on identifying when a suspect is receiving or sending WhatsApp messages using only wiretap data. Therefore, no seized devices are needed. The pattern analysis has been used to identify patterns of data sent to and received from the WhatsApp servers. The identified patterns were tested against a large dataset created with different mobile devices to determine if the patterns are consistent. By using the technique described in this paper, investigators will obtain more information if and with whom a suspect is communicating.","PeriodicalId":221956,"journal":{"name":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122167703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-12-01DOI: 10.1109/TrustCom50675.2020.00107
Yuqing Pan, Xi Xiao, Guangwu Hu, Bin Zhang, Qing Li, Haitao Zheng
Automatic fault localization plays a significant role in assisting developers to fix software bugs efficiently. Although existing approaches, e.g., static methods and dynamic ones, have greatly alleviated this problem by analyzing static features in source code and diagnosing dynamic behaviors in software running state respectively, the fault localization accuracy still does not meet user requirements. To improve the fault locating ability with statement granularity, this paper proposes ALBFL, a novel neural ranking model that involves the attention mechanism and the LambdaRank model, which can integrate the static and dynamic features and achieve very high accuracy for identifying software faults. ALBFL first introduces a transformer encoder to learn the semantic features from software source code. Also, it leverages other static statistical features and dynamic features, i.e., eleven Spectrum-Based Fault Localization (SBFL) features, three mutation features, to evaluate software together. Specially, the two types of features are integrated through a self-attention layer, and fed into the LambdaRank model so as to rank a list of possible fault statements. Finally, thorough experiments are conducted on 5 open-source projects with 357 faulty programs in Defects4J. The results show that ALBFL outperforms 11 traditional SBFL methods (by three times) and 2 state-of-the-art approaches (by 13%) on ranking faulty statements in the first position.
{"title":"ALBFL: A Novel Neural Ranking Model for Software Fault Localization via Combining Static and Dynamic Features","authors":"Yuqing Pan, Xi Xiao, Guangwu Hu, Bin Zhang, Qing Li, Haitao Zheng","doi":"10.1109/TrustCom50675.2020.00107","DOIUrl":"https://doi.org/10.1109/TrustCom50675.2020.00107","url":null,"abstract":"Automatic fault localization plays a significant role in assisting developers to fix software bugs efficiently. Although existing approaches, e.g., static methods and dynamic ones, have greatly alleviated this problem by analyzing static features in source code and diagnosing dynamic behaviors in software running state respectively, the fault localization accuracy still does not meet user requirements. To improve the fault locating ability with statement granularity, this paper proposes ALBFL, a novel neural ranking model that involves the attention mechanism and the LambdaRank model, which can integrate the static and dynamic features and achieve very high accuracy for identifying software faults. ALBFL first introduces a transformer encoder to learn the semantic features from software source code. Also, it leverages other static statistical features and dynamic features, i.e., eleven Spectrum-Based Fault Localization (SBFL) features, three mutation features, to evaluate software together. Specially, the two types of features are integrated through a self-attention layer, and fed into the LambdaRank model so as to rank a list of possible fault statements. Finally, thorough experiments are conducted on 5 open-source projects with 357 faulty programs in Defects4J. The results show that ALBFL outperforms 11 traditional SBFL methods (by three times) and 2 state-of-the-art approaches (by 13%) on ranking faulty statements in the first position.","PeriodicalId":221956,"journal":{"name":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125813441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-12-01DOI: 10.1109/TrustCom50675.2020.00073
Nan Messe, Vanea Chiprianov, Nicolas Belloir, Jamal El Hachem, Régis Fleurquin, Salah Sadou
Threat modeling is recognized as one of the most important activities in software security. It helps to address security issues in software development. Several threat modeling processes are widely used in the industry such as the one of Microsoft SDL. In threat modeling, it is essential to first identify assets before enumerating threats, in order to diagnose the threat targets and spot the protection mechanisms. Asset identification and threat enumeration are collaborative activities involving many actors such as security experts and software architects. These activities are traditionally carried out in brainstorming sessions. Due to the lack of guidance, the lack of a sufficiently formalized process, the high dependence on actors' knowledge, and the variety of actors' background, these actors often have difficulties collaborating with each other. Brainstorming sessions are thus often conducted sub-optimally and require significant effort. To address this problem, we aim at structuring the asset identification phase by proposing a systematic asset identification process, which is based on a reference model. This process structures and identifies relevant assets, facilitating the threat enumeration during brainstorming. We illustrate the proposed process with a case study and show the usefulness of our process in supporting threat enumeration and improving existing threat modeling processes such as the Microsoft SDL one.
{"title":"Asset-Oriented Threat Modeling","authors":"Nan Messe, Vanea Chiprianov, Nicolas Belloir, Jamal El Hachem, Régis Fleurquin, Salah Sadou","doi":"10.1109/TrustCom50675.2020.00073","DOIUrl":"https://doi.org/10.1109/TrustCom50675.2020.00073","url":null,"abstract":"Threat modeling is recognized as one of the most important activities in software security. It helps to address security issues in software development. Several threat modeling processes are widely used in the industry such as the one of Microsoft SDL. In threat modeling, it is essential to first identify assets before enumerating threats, in order to diagnose the threat targets and spot the protection mechanisms. Asset identification and threat enumeration are collaborative activities involving many actors such as security experts and software architects. These activities are traditionally carried out in brainstorming sessions. Due to the lack of guidance, the lack of a sufficiently formalized process, the high dependence on actors' knowledge, and the variety of actors' background, these actors often have difficulties collaborating with each other. Brainstorming sessions are thus often conducted sub-optimally and require significant effort. To address this problem, we aim at structuring the asset identification phase by proposing a systematic asset identification process, which is based on a reference model. This process structures and identifies relevant assets, facilitating the threat enumeration during brainstorming. We illustrate the proposed process with a case study and show the usefulness of our process in supporting threat enumeration and improving existing threat modeling processes such as the Microsoft SDL one.","PeriodicalId":221956,"journal":{"name":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130043712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-12-01DOI: 10.1109/TrustCom50675.2020.00055
T. Gasiba, U. Lechner, M. Pinto-Albuquerque, Daniel Méndez Fernández
Software needs to be secure, in particular, when deployed to critical infrastructures. Secure coding guidelines capture practices in industrial software engineering to ensure the security of code. This study aims to assess the level of awareness of secure coding in industrial software engineering, the skills of software developers to spot weaknesses in software code, avoid them, and the organizational support to adhere to coding guidelines. The approach draws on well-established theories of policy compliance, neutralization theory, and security-related stress and the authors' many years of experience in industrial software engineering and on lessons identified from training secure coding in the industry. The paper presents the questionnaire design for the online survey and the first analysis of data from the pilot study.
{"title":"Awareness of Secure Coding Guidelines in the Industry - A first data analysis","authors":"T. Gasiba, U. Lechner, M. Pinto-Albuquerque, Daniel Méndez Fernández","doi":"10.1109/TrustCom50675.2020.00055","DOIUrl":"https://doi.org/10.1109/TrustCom50675.2020.00055","url":null,"abstract":"Software needs to be secure, in particular, when deployed to critical infrastructures. Secure coding guidelines capture practices in industrial software engineering to ensure the security of code. This study aims to assess the level of awareness of secure coding in industrial software engineering, the skills of software developers to spot weaknesses in software code, avoid them, and the organizational support to adhere to coding guidelines. The approach draws on well-established theories of policy compliance, neutralization theory, and security-related stress and the authors' many years of experience in industrial software engineering and on lessons identified from training secure coding in the industry. The paper presents the questionnaire design for the online survey and the first analysis of data from the pilot study.","PeriodicalId":221956,"journal":{"name":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130088768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-12-01DOI: 10.1109/TrustCom50675.2020.00189
Syed Ghazanfar Abbas, M. Husnain, U. U. Fayyaz, F. Shahzad, G. Shah, K. Zafar
In this research we propose a framework that will strengthen the IoT devices security from dual perspectives; avoid devices to become attack target as well as a source of an attack. Unlike traditional devices, IoT devices are equipped with insufficient host-based defense system and a continuous internet connection. All time internet enabled devices with insufficient security allures the attackers to use such devices and carry out their attacks on rest of internet. When plethora of vulnerable devices become source of an attack, intensity of such attacks increases exponentially. Mirai was one of the first well-known attack that exploited large number of vulnerable IoT devices, that bring down a large part of Internet. To strengthen the IoT devices from dual security perspective, we propose a two step framework. Firstly, confine the communication boundary of IoT devices; IoT-Sphere. A sphere of IPs that are allowed to communicate with a device. Any communication that violates the sphere will be blocked at the gateway level. Secondly, only allowed communication will be evaluated for potential attacks and anomalies using advance detection engines. To show the effectiveness of our proposed framework, we perform couple of attacks on IoT devices; camera and google home and show the feasibility of IoT-Sphere.
{"title":"IoT-Sphere: A Framework To Secure IoT Devices From Becoming Attack Target And Attack Source","authors":"Syed Ghazanfar Abbas, M. Husnain, U. U. Fayyaz, F. Shahzad, G. Shah, K. Zafar","doi":"10.1109/TrustCom50675.2020.00189","DOIUrl":"https://doi.org/10.1109/TrustCom50675.2020.00189","url":null,"abstract":"In this research we propose a framework that will strengthen the IoT devices security from dual perspectives; avoid devices to become attack target as well as a source of an attack. Unlike traditional devices, IoT devices are equipped with insufficient host-based defense system and a continuous internet connection. All time internet enabled devices with insufficient security allures the attackers to use such devices and carry out their attacks on rest of internet. When plethora of vulnerable devices become source of an attack, intensity of such attacks increases exponentially. Mirai was one of the first well-known attack that exploited large number of vulnerable IoT devices, that bring down a large part of Internet. To strengthen the IoT devices from dual security perspective, we propose a two step framework. Firstly, confine the communication boundary of IoT devices; IoT-Sphere. A sphere of IPs that are allowed to communicate with a device. Any communication that violates the sphere will be blocked at the gateway level. Secondly, only allowed communication will be evaluated for potential attacks and anomalies using advance detection engines. To show the effectiveness of our proposed framework, we perform couple of attacks on IoT devices; camera and google home and show the feasibility of IoT-Sphere.","PeriodicalId":221956,"journal":{"name":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127802465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Named entity recognition is an important and challenging problem in Natural language processing. Although the past decade has witnessed major advances in entity recognition in many fields, such successes have been slow to network security field, not only because of the data in the network security field is very professional, but also due to the sensitive information in the data. To advance named entity recognition research in network security field, we introduce a large-scale Dataset for Named Entity Recognition in Threat Intelligence (DNRTI). To this end, we collect more than 300 pieces of threat intelligence. The data in DNRTI is all annotated by experts in threat intelligence interpretation using 13 object categories. The fully annotated DNRTI contains 175220 words. To build a baseline for named entity recognition in the threat intelligence field, we evaluate some deep learning model on DNRTI. Experiments demonstrate that DNRTI well represents the key information in threat intelligence and are quite challenging.
{"title":"DNRTI: A Large-scale Dataset for Named Entity Recognition in Threat Intelligence","authors":"Xuren Wang, Xinpei Liu, Shengqin Ao, Ning Li, Zhengwei Jiang, Zongyi Xu, Zihan Xiong, Mengbo Xiong, Xiaoqing Zhang","doi":"10.1109/TrustCom50675.2020.00252","DOIUrl":"https://doi.org/10.1109/TrustCom50675.2020.00252","url":null,"abstract":"Named entity recognition is an important and challenging problem in Natural language processing. Although the past decade has witnessed major advances in entity recognition in many fields, such successes have been slow to network security field, not only because of the data in the network security field is very professional, but also due to the sensitive information in the data. To advance named entity recognition research in network security field, we introduce a large-scale Dataset for Named Entity Recognition in Threat Intelligence (DNRTI). To this end, we collect more than 300 pieces of threat intelligence. The data in DNRTI is all annotated by experts in threat intelligence interpretation using 13 object categories. The fully annotated DNRTI contains 175220 words. To build a baseline for named entity recognition in the threat intelligence field, we evaluate some deep learning model on DNRTI. Experiments demonstrate that DNRTI well represents the key information in threat intelligence and are quite challenging.","PeriodicalId":221956,"journal":{"name":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132488293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-12-01DOI: 10.1109/TrustCom50675.2020.00026
Ming Liu, Zhi Xue, Xiangjian He
The host-based intrusion detection system (HIDS) is an essential research domain of cybersecurity. HIDS examines log data of hosts to identify intrusive behaviors. The detection efficiency is a significant factor of HIDS. Traditionally, HIDS is often installed with a standalone mode. Training detection engines with a large amount of data on a single physical computer with limited computing resources may be time-consuming. Therefore, this paper offers a unified HIDS framework based on Spark and deployed in the Google cloud. The framework includes a unified machine learning pipeline to implement scalable and efficient HIDS.
{"title":"A Unified Host-based Intrusion Detection Framework using Spark in Cloud","authors":"Ming Liu, Zhi Xue, Xiangjian He","doi":"10.1109/TrustCom50675.2020.00026","DOIUrl":"https://doi.org/10.1109/TrustCom50675.2020.00026","url":null,"abstract":"The host-based intrusion detection system (HIDS) is an essential research domain of cybersecurity. HIDS examines log data of hosts to identify intrusive behaviors. The detection efficiency is a significant factor of HIDS. Traditionally, HIDS is often installed with a standalone mode. Training detection engines with a large amount of data on a single physical computer with limited computing resources may be time-consuming. Therefore, this paper offers a unified HIDS framework based on Spark and deployed in the Google cloud. The framework includes a unified machine learning pipeline to implement scalable and efficient HIDS.","PeriodicalId":221956,"journal":{"name":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","volume":"225 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122361248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Although machine learning (ML) based solutions are ever-evolving for the attack defending paradigm, signatures of malicious network traffic are vital resources for intrusion detection systems (IDSs) and network forensic procedure, covering the lack of interpretability and stability for ML models. However, signature extraction is still a time and labor consuming task nowadays, resulting in possible increase of the attackers' dwell time. Existing automatic solutions rely too much on sequence similarity based and heuristic based methods, encountering performance degradation in large scale and dynamic network environment. In this paper, we present a novel method, called Clustering and Model Inference-based Rule Generation (CMIRGen), automatically generating token-set based signature rules for malicious traffic payloads to be inspected. CMIRGen leverages both optimized sequence similarity based and black-box model inference based methods to extract patterns from homogeneous and heterogeneous payloads respectively. Experimental evaluations have been conducted on several datasets and show the CMIRGen framework can extract discriminative signatures, presenting high recall rate and low false positive rate at the same time for malicious content recognition.
{"title":"CMIRGen: Automatic Signature Generation Algorithm for Malicious Network Traffic","authors":"Runzi Zhang, Mingkai Tong, Lei Chen, Jianxin Xue, Wenmao Liu, Feng Xie","doi":"10.1109/TrustCom50675.2020.00101","DOIUrl":"https://doi.org/10.1109/TrustCom50675.2020.00101","url":null,"abstract":"Although machine learning (ML) based solutions are ever-evolving for the attack defending paradigm, signatures of malicious network traffic are vital resources for intrusion detection systems (IDSs) and network forensic procedure, covering the lack of interpretability and stability for ML models. However, signature extraction is still a time and labor consuming task nowadays, resulting in possible increase of the attackers' dwell time. Existing automatic solutions rely too much on sequence similarity based and heuristic based methods, encountering performance degradation in large scale and dynamic network environment. In this paper, we present a novel method, called Clustering and Model Inference-based Rule Generation (CMIRGen), automatically generating token-set based signature rules for malicious traffic payloads to be inspected. CMIRGen leverages both optimized sequence similarity based and black-box model inference based methods to extract patterns from homogeneous and heterogeneous payloads respectively. Experimental evaluations have been conducted on several datasets and show the CMIRGen framework can extract discriminative signatures, presenting high recall rate and low false positive rate at the same time for malicious content recognition.","PeriodicalId":221956,"journal":{"name":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123028409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-12-01DOI: 10.1109/TrustCom50675.2020.00234
Xianze Liu, Jihong Liu, B. Jiang, Haozhen Jiang, Zhi Yang
Currently, SM9 algorithm has received more and more attention as a new cryptographic product. The SM9 algorithm encryption and decryption principle relies on the mapping relationship on the elliptic curve. Although this mapping relationship improves the security, it will slightly reduce the efficiency. The goal of this article is to improve the efficiency of the SM9 algorithm. Different from the traditional assembly line acceleration method, we decided to start with the basic operation of the algorithm itself. There is a bilinear pairing operation on the elliptic curve, which completes the point to point on the elliptic curve. The calculation complexity directly determines the SM9 algorithm. For this reason, we propose two new bilinear pair processing methods. The former uses the properties of isomorphic mapping to transfer the operations involved in the calculation of bilinear pairs from a large feature domain to a small feature domain, reducing the number of operations on the feature domain. The latter is for special operations in the bilinear pairing process, adding intermediate variables to convert them into low-time-consuming multiplication operations. According to the traditional Miller algorithm, the calculation of bilinear pairs requires 900 multiplication time units. Our solution can reduce this value to 700 and 800 multiplication time units respectively. In addition, the two algorithms have not changed the mapping relationship of the bilinear pair. On the premise of ensuring the correct mapping relationship, the efficiency of the SM9 algorithm is improved.
{"title":"More efficient SM9 algorithm based on bilinear pair optimization processing","authors":"Xianze Liu, Jihong Liu, B. Jiang, Haozhen Jiang, Zhi Yang","doi":"10.1109/TrustCom50675.2020.00234","DOIUrl":"https://doi.org/10.1109/TrustCom50675.2020.00234","url":null,"abstract":"Currently, SM9 algorithm has received more and more attention as a new cryptographic product. The SM9 algorithm encryption and decryption principle relies on the mapping relationship on the elliptic curve. Although this mapping relationship improves the security, it will slightly reduce the efficiency. The goal of this article is to improve the efficiency of the SM9 algorithm. Different from the traditional assembly line acceleration method, we decided to start with the basic operation of the algorithm itself. There is a bilinear pairing operation on the elliptic curve, which completes the point to point on the elliptic curve. The calculation complexity directly determines the SM9 algorithm. For this reason, we propose two new bilinear pair processing methods. The former uses the properties of isomorphic mapping to transfer the operations involved in the calculation of bilinear pairs from a large feature domain to a small feature domain, reducing the number of operations on the feature domain. The latter is for special operations in the bilinear pairing process, adding intermediate variables to convert them into low-time-consuming multiplication operations. According to the traditional Miller algorithm, the calculation of bilinear pairs requires 900 multiplication time units. Our solution can reduce this value to 700 and 800 multiplication time units respectively. In addition, the two algorithms have not changed the mapping relationship of the bilinear pair. On the premise of ensuring the correct mapping relationship, the efficiency of the SM9 algorithm is improved.","PeriodicalId":221956,"journal":{"name":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128789067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}