Pub Date : 2018-01-04 DOI: 10.4108/eai.4-1-2018.153528
Lei Xu, Lin Chen, Zhimin Gao, Shouhuai Xu, W. Shi
Public blockchains provide a decentralized method for storing transaction data and have many applications in different sectors. A simple way for users to track transactions is to keep a local copy of the entire public ledger. Because the ledger keeps growing, this method becomes increasingly impractical, especially for lightweight users such as IoT devices and smartphones. Several solutions have been proposed to reduce the storage burden. However, existing solutions either achieve a limited storage reduction (e.g., simple payment verification) or rely on a strong security assumption (e.g., the use of a trusted server). In this paper, we propose a new approach to solving the problem. Specifically, we propose an efficient verification protocol for public blockchains, or EPBC for short. EPBC is particularly suitable for lightweight users, who only need to store a small amount of data that is independent of the size of the blockchain. We analyze EPBC's performance and security, and discuss its integration with existing public ledger systems. Experimental results confirm that EPBC is practical for lightweight users.
{"title":"Efficient Public Blockchain Client for Lightweight Users","authors":"Lei Xu, Lin Chen, Zhimin Gao, Shouhuai Xu, W. Shi","doi":"10.4108/eai.4-1-2018.153528","DOIUrl":"https://doi.org/10.4108/eai.4-1-2018.153528","PeriodicalId":335727,"journal":{"name":"EAI Endorsed Trans. Security Safety","volume":"68 1","pages":"0"},"publicationDate":"2018-01-04","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"116380284"}
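The EPBC abstract names simple payment verification (SPV) as an existing approach with limited storage savings. The sketch below illustrates the Merkle inclusion check that SPV-style clients rely on; it is not the EPBC protocol itself. The function names (`h`, `merkle_root`, `merkle_proof`, `verify`) and the sample transactions are hypothetical, and the odd-node duplication rule follows the Bitcoin convention.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin-style Merkle trees."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves):
    """Return the Merkle root of a non-empty list of leaf hashes."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:          # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1         # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from a leaf and its proof; True if it matches."""
    node = leaf
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# Hypothetical block with five transactions.
txs = [h(f"tx-{i}".encode()) for i in range(5)]
root = merkle_root(txs)
proof = merkle_proof(txs, 3)
```

A client holding only `root` can call `verify(txs[3], proof, root)` to confirm inclusion. It still needs one root per block, so its storage grows with chain length, which is the limitation the abstract contrasts EPBC against.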
Pub Date : 2018-01-04 DOI: 10.4108/eai.4-1-2018.153526
Xiaoyan Sun, Jun Dai, A. Singhal, Peng Liu
Cloud computing, with the paradigm of computing as a utility, has the potential to significantly transform the IT industry. Attracted by the high efficiency, low cost, and great flexibility of the cloud, enterprises have begun to migrate large parts of their networks into it. The cloud becomes a public space where multiple “tenants” reside. Except for some public services, the enterprise networks in the cloud should be absolutely isolated from each other. However, some “stealthy bridges” could be established to break such isolation due to two features of the public cloud: virtual machine image sharing and virtual machine co-residency. This paper proposes to use cross-layer Bayesian networks to infer the stealthy bridges existing between enterprise network islands. Cloud-level attack graphs are first built to capture the potential attacks enabled by stealthy bridges and to reveal hidden possible attack paths. Cross-layer Bayesian networks are then constructed to infer the probability of stealthy bridge existence. The experimental results show that the cross-layer Bayesian networks are capable of inferring the existence of stealthy bridges given supporting evidence from other intrusion steps in a multi-step attack. Received on 25 December 2017; accepted on 26 December 2017; published on 4 January 2018
{"title":"Probabilistic Inference of the Stealthy Bridges between Enterprise Networks in Cloud","authors":"Xiaoyan Sun, Jun Dai, A. Singhal, Peng Liu","doi":"10.4108/eai.4-1-2018.153526","DOIUrl":"https://doi.org/10.4108/eai.4-1-2018.153526","PeriodicalId":335727,"journal":{"name":"EAI Endorsed Trans. Security Safety","volume":"13 1","pages":"0"},"publicationDate":"2018-01-04","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"133566958"}
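The stealthy-bridge abstract describes inferring the probability that a bridge exists from evidence gathered at other intrusion steps. The sketch below shows only the core Bayes-rule update for a single binary hypothesis; it is not the paper's cross-layer Bayesian network. The prior and the likelihood pairs are made-up illustrative numbers, and the evidence items are assumed to be conditionally independent given the hypothesis.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a binary hypothesis given one piece of evidence."""
    numerator = prior * likelihood_if_true
    return numerator / (numerator + (1.0 - prior) * likelihood_if_false)


# Hypothetical prior belief that a stealthy bridge exists.
p_bridge = 0.05

# Hypothetical evidence as (P(evidence | bridge), P(evidence | no bridge)).
# Treating items as conditionally independent is a simplifying assumption.
evidence = [(0.7, 0.1), (0.6, 0.2)]

beliefs = [p_bridge]
for p_e_bridge, p_e_no_bridge in evidence:
    beliefs.append(posterior(beliefs[-1], p_e_bridge, p_e_no_bridge))
```

With these numbers the belief rises from 0.05 to 0.525 after both observations, since the combined likelihood ratio is 7 × 3 = 21. A full Bayesian network replaces the independence assumption with explicit conditional probability tables along the attack graph.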
Pub Date : 2017-12-28 DOI: 10.4108/eai.28-12-2017.153516
Qi Dong, Zekun Yang, Yu Chen, Xiaohua Li, K. Zeng
Cognitive radio networks (CRNs) have been recognized as a promising technology that allows secondary users (SUs) to extensively explore spectrum resource usage efficiency without introducing interference to licensed users. Because the wireless network environment is unregulated, CRNs are susceptible to various malicious entities, so it is critical to detect anomalies in the first place. However, given the intrinsic features of CRNs, there is hardly any universally applicable anomaly detection scheme. Singular Spectrum Analysis (SSA) has been theoretically proven to be an optimal approach for accurate and quick detection of changes in the characteristics of a running (random) process. In addition, SSA is a model-free method: no parametric models have to be assumed for different types of anomalies, which makes it a universal anomaly detection scheme. In this paper, we introduce an adaptive parameter and component selection mechanism based on coherence for the basic SSA method, upon which we build a sliding-window online anomaly detector for CRNs. Our experimental results indicate high accuracy of the SSA-based anomaly detector for multiple anomalies.
{"title":"Exploration of Singular Spectrum Analysis for Online Anomaly Detection in CRNs","authors":"Qi Dong, Zekun Yang, Yu Chen, Xiaohua Li, K. Zeng","doi":"10.4108/eai.28-12-2017.153516","DOIUrl":"https://doi.org/10.4108/eai.28-12-2017.153516","PeriodicalId":335727,"journal":{"name":"EAI Endorsed Trans. Security Safety","volume":"140 1","pages":"0"},"publicationDate":"2017-12-28","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"131672224"}
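The SSA abstract describes a sliding-window detector built on the basic SSA method. The sketch below shows one common form of basic SSA change detection, assuming NumPy is available: learn a low-rank signal subspace from a base window, then score how poorly the lagged vectors of the following test window fit it. It does not reproduce the paper's adaptive, coherence-based parameter and component selection. The window lengths, lag, rank, and the synthetic signal are all hypothetical choices.

```python
import numpy as np


def trajectory_matrix(x, lag):
    """Stack lagged vectors of length `lag` as columns (a Hankel matrix)."""
    k = len(x) - lag + 1
    return np.column_stack([x[i:i + lag] for i in range(k)])


def ssa_scores(x, base_len=100, test_len=30, lag=20, rank=2):
    """Score each position by how poorly recent lagged vectors fit the
    signal subspace learned from the preceding base window."""
    x = np.asarray(x, dtype=float)
    scores = np.zeros(len(x))
    for t in range(base_len + test_len, len(x) + 1):
        base = x[t - test_len - base_len:t - test_len]
        test = x[t - test_len:t]
        u, _, _ = np.linalg.svd(trajectory_matrix(base, lag),
                                full_matrices=False)
        basis = u[:, :rank]                      # leading left singular vectors
        z = trajectory_matrix(test, lag)
        residual = z - basis @ (basis.T @ z)     # part outside the subspace
        scores[t - 1] = np.sum(residual ** 2) / np.sum(z ** 2)
    return scores


# Synthetic signal: a noisy sinusoid whose period changes at sample 400.
rng = np.random.default_rng(0)
n = np.arange(600)
signal = np.sin(2 * np.pi * n / 25) + 0.05 * rng.standard_normal(600)
signal[400:] = np.sin(2 * np.pi * n[400:] / 9) + 0.05 * rng.standard_normal(200)
scores = ssa_scores(signal)
```

The score is the fraction of test-window energy that lies outside the learned subspace, so it stays near zero while the process is stable and rises toward one after the change. An online detector would raise an alarm when the score crosses a threshold.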
Pub Date : 2017-12-28 DOI: 10.4108/eai.28-12-2017.153518
C. Leca
This paper presents a study of wireless network security and statistics in Romania aimed at raising public awareness of security issues and highlighting the prevalence of known vulnerabilities in commercial equipment. The data used for the study consist of wireless network broadcast data acquired by war-driving. To ensure a thorough overview, the data collected include more than 100,000 unique wireless networks gathered in Bucharest, major urban areas and the surrounding rural areas. The results of the study cover security protocol usage, the percentage of wireless networks in which known vulnerabilities are still present, and statistics regarding channel and band usage, common SSIDs in Romania, top equipment manufacturers and the situation of provider wireless access points. The study also shows that provider wireless access points on average offer better security than private networks. Received on 28 January 2017; accepted on 20 April 2017; published on 28 December 2017
{"title":"Overview of Romania 802.11 Wireless Security & Statistics","authors":"C. Leca","doi":"10.4108/eai.28-12-2017.153518","DOIUrl":"https://doi.org/10.4108/eai.28-12-2017.153518","PeriodicalId":335727,"journal":{"name":"EAI Endorsed Trans. Security Safety","volume":"2472 1","pages":"0"},"publicationDate":"2017-12-28","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"131087078"}
Pub Date : 2017-12-13 DOI: 10.4108/eai.28-12-2017.153517
Leixiao Cheng, Quanshui Wu, Yunlei Zhao
Lossy trapdoor functions (LTDF) and all-but-one trapdoor functions (ABO-TDF) are fundamental cryptographic primitives. Given the recent advances in quantum computing, it is highly desirable to develop new and improved lattice-based LTDF and ABO-TDF. In this work, we provide more compact constructions of LTDF and ABO-TDF based on the learning with errors (LWE) problem. In addition, our LWE-based ABO-TDF allows smaller system parameters to support super-polynomially many injective branches in the construction of CCA-secure public key encryption. As a core building tool, we provide a more compact homomorphic symmetric encryption scheme based on LWE, which might be of independent interest. To further optimize the ABO-TDF construction, we employ the full rank difference encoding technique. As a consequence, the results presented in this work can substantially improve the performance of all the previous LWE-based cryptographic constructions based upon LTDF and ABO-TDF.
{"title":"Compact lossy and all-but-one trapdoor functions from lattice","authors":"Leixiao Cheng, Quanshui Wu, Yunlei Zhao","doi":"10.4108/eai.28-12-2017.153517","DOIUrl":"https://doi.org/10.4108/eai.28-12-2017.153517","PeriodicalId":335727,"journal":{"name":"EAI Endorsed Trans. Security Safety","volume":"114 1","pages":"0"},"publicationDate":"2017-12-13","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"127967898"}
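The trapdoor-function abstract mentions a homomorphic symmetric encryption scheme based on LWE as its core building tool. The toy sketch below shows the general textbook idea of such a scheme for single bits and its additive homomorphism. It is not the compact construction from the paper, and the parameters `Q`, `N`, and `NOISE` are hypothetical values chosen for readability rather than security.

```python
import random

Q = 3329          # modulus (toy choice)
N = 16            # secret dimension (far too small to be secure)
NOISE = 2         # errors are drawn uniformly from [-NOISE, NOISE]


def keygen(rng):
    """Sample a uniformly random secret vector in Z_Q^N."""
    return [rng.randrange(Q) for _ in range(N)]


def encrypt(secret, bit, rng):
    """Ciphertext (a, b) with b = <a, s> + e + bit * floor(Q/2) mod Q."""
    a = [rng.randrange(Q) for _ in range(N)]
    e = rng.randint(-NOISE, NOISE)
    b = (sum(x * s for x, s in zip(a, secret)) + e + bit * (Q // 2)) % Q
    return a, b


def decrypt(secret, ct):
    """Remove <a, s>, then decide whether the remainder is nearer 0 or Q/2."""
    a, b = ct
    v = (b - sum(x * s for x, s in zip(a, secret))) % Q
    return 1 if Q // 4 < v < 3 * Q // 4 else 0


def add(ct1, ct2):
    """Component-wise addition; decrypts to the XOR of the two bits."""
    (a1, b1), (a2, b2) = ct1, ct2
    return [(x + y) % Q for x, y in zip(a1, a2)], (b1 + b2) % Q


# `random` is not a cryptographic generator; it is used here for illustration.
rng = random.Random(1)
s = keygen(rng)
c0, c1 = encrypt(s, 0, rng), encrypt(s, 1, rng)
```

Adding two ciphertexts adds their error terms as well, so only a bounded number of additions decrypt correctly before the noise crosses the Q/4 decision boundary. With the parameters above, a single addition is always safe.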
Pub Date : 2017-12-07 DOI: 10.4108/eai.7-12-2017.153394
Yongfeng Li, Jinbin Ouyang, Bing Mao, Kai Ma, Shanqing Guo
Smartphones carry a large quantity of sensitive information to satisfy people’s various requirements, but how that information is used matters for protecting users’ privacy. Apps misuse sensitive information in two ways. On the one hand, careless programmers may leak the data by accident. On the other hand, attackers develop malware to collect sensitive data intentionally. Many researchers apply data flow analysis to detect data leakages of an app. However, data flow analysis on the Android platform differs considerably from analysis of desktop programs. Many researchers have solved some problems of data flow analysis on the Android platform, such as Activity lifecycle, callback methods, and inter-component communication. We find that Fragment’s lifecycle also has an effect on the data flow analysis of Android apps. Some data leaks go undetected if Fragment’s lifecycle is not taken into consideration when performing data flow analysis on Android apps. In this paper, we therefore propose an approach to model Fragment’s lifecycle and its relationship with Activity’s lifecycle, and then introduce a tool called FragDroid based on FlowDroid [7]. We conduct experiments to evaluate the effectiveness of our tool, and the results show that 8% of apps in our data set use Fragment; among popular apps, the figure is 50.8%. We also evaluate the performance of FragDroid when analyzing Android apps; the results show an average overhead of 17%.
{"title":"Data Flow Analysis on Android Platform with Fragment Lifecycle Modeling and Callbacks","authors":"Yongfeng Li, Jinbin Ouyang, Bing Mao, Kai Ma, Shanqing Guo","doi":"10.4108/eai.7-12-2017.153394","DOIUrl":"https://doi.org/10.4108/eai.7-12-2017.153394","PeriodicalId":335727,"journal":{"name":"EAI Endorsed Trans. Security Safety","volume":"108 1","pages":"0"},"publicationDate":"2017-12-07","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"122546735"}
Pub Date : 2017-12-07 DOI: 10.4108/eai.7-12-2017.153395
Nicolas Van Balen, C. Ball, Haining Wang
Gender is one of the essential characteristics of personal identity that is often misused by online impostors for malicious purposes. This paper proposes a naturalistic approach for identity protection with a specific focus on using mouse biometrics to ensure accurate gender identification. Our underpinning rationale lies in the fact that men and women differ in their natural aiming movements of a hand-held object in two-dimensional space due to anthropometric, biomechanical, and perceptual-motor control differences between the genders. Although some research has been done on classifying users by gender using biometrics, to the best of our knowledge, no research has provided a comprehensive list of which metrics (features) of movements are actually relevant to gender classification, or a method by which these metrics may be chosen. This can lead researchers to make unguided decisions on which metrics to extract from the data, choosing them for convenience or personal preference. Such choices can reduce the accuracy of the model by including metrics with little relevance to the problem and excluding metrics of high relevance. In this paper, we outline a method for choosing metrics based on empirical evidence of natural differences between the genders, and make recommendations on the choice of metrics. The efficacy of our method is then tested through the use of a logistic regression model. Received on 29 November 2017; accepted on 02 December 2017; published on 07 December 2017
{"title":"Analysis of Targeted Mouse Movements for Gender Classification","authors":"Nicolas Van Balen, C. Ball, Haining Wang","doi":"10.4108/eai.7-12-2017.153395","DOIUrl":"https://doi.org/10.4108/eai.7-12-2017.153395","PeriodicalId":335727,"journal":{"name":"EAI Endorsed Trans. Security Safety","volume":"333 1","pages":"0"},"publicationDate":"2017-12-07","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"124697415"}
Pub Date : 2017-12-07 DOI: 10.4108/EAI.7-12-2017.153397
Nicolae Paladi, C. Gehrmann
Software-Defined Networking (SDN) is a novel architectural model for cloud network infrastructure, improving resource utilization, scalability and administration. SDN deployments increasingly rely on virtual switches executing on commodity operating systems with large code bases, which are prime targets for adversaries attacking the network infrastructure. We describe and implement TruSDN, a framework for bootstrapping trust in SDN infrastructure using Intel Software Guard Extensions (SGX), which makes it possible to securely deploy SDN components and protect communication between network endpoints. We introduce ephemeral flow-specific preshared keys and propose a novel defense against cuckoo attacks on SGX enclaves. TruSDN is secure under a powerful adversary model, with a minor performance overhead.
{"title":"Bootstrapping trust in software defined networks","authors":"Nicolae Paladi, C. Gehrmann","doi":"10.4108/EAI.7-12-2017.153397","DOIUrl":"https://doi.org/10.4108/EAI.7-12-2017.153397","PeriodicalId":335727,"journal":{"name":"EAI Endorsed Trans. Security Safety","volume":"101 1","pages":"0"},"publicationDate":"2017-12-07","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"133633892"}
Pub Date : 2016-12-08 DOI: 10.4108/eai.8-12-2016.151725
N. Rowe
For digital forensics, eliminating the uninteresting is often more critical than finding the interesting since there is so much more of it. Published software-file hash values like those of the National Software Reference Library (NSRL) have limited scope. We discuss methods based on analysis of file context using the metadata of a large corpus. Tests were done with an international corpus of 262.7 million files obtained from 4018 drives. For malware investigations, we identify clues to malware in context, and show that using a Bayesian ranking formula on metadata can increase recall by 5.1 while increasing precision by 1.7 times over inspecting executables alone. For more general investigations, we show that using two of nine criteria for uninteresting files together, with exceptions for some special interesting files, can exclude 77.4% of our corpus instead of the 23.8% excluded by NSRL. For a test set of 19,784 randomly selected files from our corpus that were manually inspected, false positives after file exclusion (interesting files identified as uninteresting) were 0.18% and false negatives (uninteresting files identified as interesting) were 29.31% using our methods. The generality of the methods was confirmed by separately testing two halves of our corpus. Few of our excluded files were matched in two commercial hash sets. This work provides both new uninteresting hash values and programs for finding more.
{"title":"Identifying forensically uninteresting files in a large corpus","authors":"N. Rowe","doi":"10.4108/eai.8-12-2016.151725","DOIUrl":"https://doi.org/10.4108/eai.8-12-2016.151725","PeriodicalId":335727,"journal":{"name":"EAI Endorsed Trans. Security Safety","volume":"18 1","pages":"0"},"publicationDate":"2016-12-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"130305831"}