Stylometric Anonymity: Is Imitation the Best Strategy?
Pub Date: 2015-08-20 | DOI: 10.1109/Trustcom.2015.472
Mahmoud Khonji, Y. Iraqi
Stylometric analysis of electronic texts can extract information about their authors from the stylistic choices those authors make when writing. Such extracted information could be the identity of suspect authors or their profile attributes, such as gender, age group, or ethnicity. Therefore, when preserving the anonymity of an author is critical, as for a whistleblower, it is important to ensure the stylistic anonymity of the conveyed text itself in addition to anonymizing the communication channels (e.g. Tor, or the minimization of application fingerprints). Currently, only two stylistic anonymization strategies are known, namely imitation and obfuscation attacks. A long-term objective is to find automated methods that reliably transform given input texts such that the output texts maximize author anonymity while reasonably preserving the semantics of the input texts. Before proceeding with such a long-term objective, it is important to first identify effective strategies that maximize stylistic anonymity. The current state of the literature implies that imitation attacks are better at preserving the anonymity of authors than obfuscation. However, we argue that such evaluations are limited and should not be generalized to stylistic anonymity, as they were executed only against authorship attribution (AA) solvers, a closed-set problem. In this study, we extend such evaluations to state-of-the-art authorship verification (AV) solvers, an open-set problem. Our results show that imitation attacks degrade the classification accuracy of AV solvers more aggressively than that of AA solvers. We argue that this reduction in accuracy below random-chance guessing renders imitation attacks inferior strategies relative to obfuscation attacks. Furthermore, as we present a general formal notation of stylometry problems, we conjecture that the same observations apply to all stylometry problems (AA, AV, AP, SI).
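As a hedged illustration of the closed-set versus open-set distinction the abstract draws (this is not the authors' implementation), the sketch below contrasts an AA solver, which must always name one of the known candidate authors, with an AV solver, which can also answer "none of them" via a similarity threshold. The character-trigram features, the cosine similarity, and the threshold value are all assumptions made for the example.

```python
# Toy contrast between closed-set AA and open-set AV; features (character
# trigrams), cosine similarity, and the 0.7 threshold are illustrative choices.
from collections import Counter
import math

def trigram_profile(text):
    """Relative frequencies of character trigrams, a common stylometric feature."""
    grams = [text[i:i + 3] for i in range(len(text) - 2)]
    return {g: c / len(grams) for g, c in Counter(grams).items()}

def cosine(p, q):
    dot = sum(v * q.get(g, 0.0) for g, v in p.items())
    norm = math.sqrt(sum(v * v for v in p.values())) * \
           math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

def solve_aa(doc, candidate_profiles):
    """Closed-set AA: must return one of the known candidate authors."""
    p = trigram_profile(doc)
    return max(candidate_profiles, key=lambda a: cosine(p, candidate_profiles[a]))

def solve_av(doc, author_profile, threshold=0.7):
    """Open-set AV: may also reject, i.e. decide the author did NOT write doc."""
    return cosine(trigram_profile(doc), author_profile) >= threshold
```

Under this framing, an imitation attack aims squarely at the AV threshold test: by pulling a document's profile toward another author's, it can drive the true author's similarity score below the threshold, which is consistent with the sharper accuracy drop the paper reports for AV solvers.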
{"title":"Stylometric Anonymity: Is Imitation the Best Strategy?","authors":"Mahmoud Khonji, Y. Iraqi","doi":"10.1109/Trustcom.2015.472","DOIUrl":"https://doi.org/10.1109/Trustcom.2015.472","url":null,"abstract":"Stylometry analysis of given electronic texts can allow for the extraction of information about their authors by analyzing the stylistic choices the authors make to write their texts. Such extracted information could be the identity of suspect authors or their profile attributes such as their gender, age group, ethnicity group, etc. Therefore, when preserving the anonymity of an author is critical, such as that of a whistle blower, it is important to ensure the stylistic anonymity of the conveyed text itself in addition to anonymizing communication channels (e.g. Tor, or the minimization of application fingerprints). Currently, only two stylistic anonymization strategies are known, namely: imitation and obfuscation attacks. A long-term objective is to find automated methods that reliably transform given input texts such that the output texts maximize author anonymity while, reasonably, preserving the semantics of the input texts. Before one proceeds with such long-term objective, it is important to first identify effective strategies that maximize stylistic anonymity. The current state of the literature implies that imitation attacks are better at preserving the anonymity of authors than obfuscation. However, we argue that such evaluations are limited and should not generalize to stylistic anonymity as they were only executed against AA solvers, a closed-set problem. In this study, we extend such evaluations against state-of-the-art AV solvers, an open-set problem. Our results show that imitation attacks degrade the classification accuracy of AV solvers more aggressively than that of AA solvers. We argue that such reduction in accuracy below random chance guessing renders imitation attacks as inferior strategies relative to obfuscation attacks. Furthermore, as we present a general formal notation of stylometry problems, we conjecture that the same observations apply to all stylometry problems (AA, AV, AP, SI).","PeriodicalId":277092,"journal":{"name":"2015 IEEE Trustcom/BigDataSE/ISPA","volume":"156 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133549900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accelerating Phylogenetic Inference on Heterogeneous OpenCL Platforms
Pub Date: 2015-08-20 | DOI: 10.1109/Trustcom.2015.635
Lidia Kuan, L. Sousa, P. Tomás
MrBayes is a popular software package for Bayesian phylogenetic inference that is used to derive an evolutionary tree for a collection of species whose DNA sequences are known. Given the pace at which biological data has been accumulating over the years, the computational demands of this type of application have grown enormously. To overcome this issue, researchers have turned to parallel computing to speed up execution, for instance by using Graphics Processing Units (GPUs). At the same time, the GPU architectures of different manufacturers have evolved, offering ever more computing power. Additionally, parallel programming frameworks have matured, providing programmers with more features for exploiting parallelism within GPUs. In this work, we parallelized MrBayes 3.2 using the Open Computing Language (OpenCL) programming framework in order to reduce its execution time. Furthermore, we studied the performance of MrBayes on different computing platforms and on different GPU architectures from both NVIDIA and AMD to determine the best architecture for this application. Results showed that even between GPUs of similar computing power, NVIDIA's obtained better performance than AMD's, with the latter delivering unexpectedly low performance. Moreover, the results also showed that, for this particular application, NVIDIA's architectural advances over the years provide only limited performance improvement.
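The authors' kernels are not given in the abstract, so the following is only a minimal pyopencl sketch (an assumption, not the paper's code) of the data-parallel pattern OpenCL enables for likelihood evaluation: one work-item per alignment site, with the same host code running on NVIDIA, AMD, or CPU devices.

```python
import numpy as np
import pyopencl as cl

KERNEL = """
__kernel void site_log(__global const float *site_likelihood,
                       __global float *site_loglik) {
    int gid = get_global_id(0);                /* one work-item per site */
    site_loglik[gid] = log(site_likelihood[gid]);
}
"""

ctx = cl.create_some_context()        # picks any OpenCL device: NVIDIA, AMD, CPU
queue = cl.CommandQueue(ctx)
prg = cl.Program(ctx, KERNEL).build()

# Stand-in data: in MrBayes these per-site values come from tree pruning.
site_like = np.random.rand(1_000_000).astype(np.float32) + 1e-6
mf = cl.mem_flags
in_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=site_like)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, site_like.nbytes)

prg.site_log(queue, site_like.shape, None, in_buf, out_buf)
out = np.empty_like(site_like)
cl.enqueue_copy(queue, out, out_buf)
tree_loglik = out.sum()               # tree log-likelihood = sum over sites
```

Because OpenCL abstracts the device, the same host program can be timed against NVIDIA and AMD GPUs simply by selecting a different context, which is essentially the cross-vendor comparison the paper carries out.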
{"title":"Accelerating Phylogenetic Inference on Heterogeneous OpenCL Platforms","authors":"Lidia Kuan, L. Sousa, P. Tomás","doi":"10.1109/Trustcom.2015.635","DOIUrl":"https://doi.org/10.1109/Trustcom.2015.635","url":null,"abstract":"MrBayes is a popular software package for Bayesian phylogenetic inference that is used to derive an evolutionary tree for a collection of species whose DNA sequences are known. At the high pace which biological data has been accumulating over the years, there has been a huge growth in the computational challenges required by this type of applications. To overcome this issue, researchers turned to parallel computing to speedup execution, for instance by using Graphics Processing Units (GPUs). At the same time, GPUs architectures of different manufacturers evolved, presenting more and more computing power. Additionally, parallel programming frameworks became more mature providing more features to programmers to exploit parallelism within GPUs. In this work, we parallelized the MrBayes 3.2 in order to accelerate and reduce the execution time using the Open Computing Language (OpenCL) programming framework. Furthermore, we studied the performance of MrBayes execution using different computing platforms and different GPUs architectures of both NVIDIA and AMD vendors to determine the best architecture for this application. Results showed that even with GPUs with similar computing power NVIDIA's obtained better performance when compared to AMD's, with the later providing an unexpected low performance. Moreover, results also showed that for this particular application, NVIDIA architectural advances over the years provide limited performance improvement.","PeriodicalId":277092,"journal":{"name":"2015 IEEE Trustcom/BigDataSE/ISPA","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115539540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive Data Communication Interface: A User-Centric Visual Data Interpretation Framework
Pub Date: 2015-08-20 | DOI: 10.2139/ssrn.2828007
G. Figueredo, Christian Wagner, J. Garibaldi, U. Aickelin
In this position paper, we present ideas for creating a next-generation framework for an adaptive interface for data communication and visualisation systems. Our objective is to develop a system that accepts large data sets as inputs and provides user-centric, meaningful visual information to assist owners in making sense of their data collections. The proposed framework comprises four stages: (i) knowledge base compilation, where we search for and collect existing state-of-the-art visualisation techniques per domain and user preferences; (ii) development of the learning and inference system, where we apply artificial intelligence techniques to learn, predict, and recommend new graphic interpretations; (iii) results evaluation; and (iv) reinforcement and adaptation, where valid outputs are stored in our knowledge base and the system is iteratively tuned to address new demands. These stages, as well as our overall vision, limitations, and possible challenges, are introduced in this article. We also discuss further extensions of this framework to other knowledge discovery tasks.
{"title":"Adaptive Data Communication Interface: A User-Centric Visual Data Interpretation Framework","authors":"G. Figueredo, Christian Wagner, J. Garibaldi, U. Aickelin","doi":"10.2139/ssrn.2828007","DOIUrl":"https://doi.org/10.2139/ssrn.2828007","url":null,"abstract":"In this position paper, we present ideas about creating a next generation framework towards an adaptive interface for data communication and visualisation systems. Our objective is to develop a system that accepts large data sets as inputs and provides user-centric, meaningful visual information to assist owners in making sense of their data collection. The proposed framework comprises four stages: (i) the knowledge base compilation, where we search and collect existing state-of-the-art visualisation techniques per domain and user preferences, (ii) the development of the learning and inference system, where we apply artificial intelligence techniques to learn, predict and recommend new graphic interpretations (iii) results evaluation, and (iv) reinforcement and adaptation, where valid outputs are stored in our knowledge base and the system is iteratively tuned to address new demands. These stages, as well as our overall vision, limitations and possible challenges are introduced in this article. We also discuss further extensions of this framework for other knowledge discovery tasks.","PeriodicalId":277092,"journal":{"name":"2015 IEEE Trustcom/BigDataSE/ISPA","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124357819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aligning the Conflicting Needs of Privacy, Malware Detection and Network Protection
Pub Date: 2015-08-20 | DOI: 10.1109/Trustcom.2015.418
Ian Oliver, S. Holtmanns
Surveillance is seen as a key tool for detecting terrorist activities and for counteracting attacks on critical communication infrastructure. In such systems, everybody is to some degree under suspicion; the principle of innocent until proven guilty does not seem to apply to modern surveillance technology usage. On the other hand, criminals would easily gain the upper hand in communication networks that are left unprotected and unmonitored for attacks. This poses quite a problem for the technical implementation and handling of network communication traffic. How can a communication network provider protect user data against malicious activities without screening traffic and sacrificing the human right to privacy? This article provides a classification system for data usage, privacy sensitivity, and risk. With these theoretical tools, we illustrate on a concrete example how to provide user privacy while still enabling protection against criminals and unauthorized intruders.
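The abstract names a three-axis classification (data usage, privacy sensitivity, risk) but gives no schema, so the sketch below is one hypothetical encoding of such a scheme; every category name and the sample policy rule are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum

class Usage(Enum):
    BILLING = 1
    TROUBLESHOOTING = 2
    MALWARE_DETECTION = 3

class Sensitivity(Enum):      # how identifying the data item is
    ANONYMOUS = 1             # e.g. aggregate packet counts
    PSEUDONYMOUS = 2          # e.g. hashed subscriber ID
    IDENTIFYING = 3           # e.g. subscriber number, message content

class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3

@dataclass
class DataField:
    name: str
    usage: Usage
    sensitivity: Sensitivity
    risk: Risk

def may_inspect(field: DataField) -> bool:
    """Toy policy: inspect only what malware detection needs, and never
    identifying data, so privacy and protection can coexist."""
    return (field.usage is Usage.MALWARE_DETECTION
            and field.sensitivity is not Sensitivity.IDENTIFYING)
```

The point of gating inspection on both purpose and sensitivity is the one the article argues for: malware detection can proceed without screening data that identifies the user.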
{"title":"Aligning the Conflicting Needs of Privacy, Malware Detection and Network Protection","authors":"Ian Oliver, S. Holtmanns","doi":"10.1109/Trustcom.2015.418","DOIUrl":"https://doi.org/10.1109/Trustcom.2015.418","url":null,"abstract":"Surveillance is seen as a key tool to detect terrorist activities or to counteract attacks on critical communication infrastructure. Everybody is in those systems to some degree under suspicion. The principle of innocent till proven guilty does not seem to apply to modern surveillance technology usage. On the other hand, criminals would gain easily upper hand in communication networks that are not protected and on the outlook for attacks. This poses quite a problem for the technical implementation and handling of network communication traffic. How can a communication network provider protect user data against malicious activities without screening and loss of the human right of privacy? This article provides a classification system for data usage, privacy sensitivity and risk. With those theoretical tools, we will illustrate on a concrete example how to provide user privacy, while still enable protection against criminals or unauthorized intruders.","PeriodicalId":277092,"journal":{"name":"2015 IEEE Trustcom/BigDataSE/ISPA","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114615242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
RPAH: Random Port and Address Hopping for Thwarting Internal and External Adversaries
Pub Date: 2015-08-20 | DOI: 10.1109/Trustcom.2015.383
Yue Luo, Baosheng Wang, Xiaofeng Wang, Xiaofeng Hu, Gui-lin Cai, Hao Sun
Network servers and applications commonly use static IP addresses and communication ports, making themselves easy targets for network reconnaissance and attacks. Port and address hopping is a novel and effective moving target defense (MTD) which hides network servers and applications by constantly changing IP addresses and ports. In this paper, we develop a novel port and address hopping mechanism called Random Port and Address Hopping (RPAH), which constantly and unpredictably mutates IP addresses and communication ports at a high rate based on source identity, service identity, and time. RPAH thus provides a stronger and more effective MTD mechanism with three hopping frequencies: source hopping, service hopping, and temporal hopping. In RPAH networks, the real IPs (rIPs) and real ports (rPorts) remain untouched, and packets are routed based on dynamic and temporary virtual IPs (vIPs) of servers. Therefore, messages from adversaries using static, invalid, or inactive IP addresses/ports will be detected and denied. Our experiments and evaluation show that RPAH is effective in defending against various internal and external threats such as network scanning, SYN flooding attacks, and worm propagation, while introducing acceptable operational overhead.
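The abstract specifies the inputs to RPAH's mutation (source identity, service identity, time) but not the function itself, so the following is a hedged sketch of one standard way to realize such hopping, a keyed HMAC over the three inputs; the key, the 10.0.0.0/16 virtual pool, and the slot length are assumptions rather than the paper's design.

```python
import hashlib
import hmac
import struct
import time

SECRET_KEY = b"shared-hopping-key"   # hypothetical key shared with the gateway
SLOT_SECONDS = 10                    # hypothetical temporal hopping interval

def virtual_endpoint(source_id, service_id, now=None):
    """Derive the current virtual IP (from 10.0.0.0/16) and port for a flow."""
    t = time.time() if now is None else now
    slot = int(t // SLOT_SECONDS)                      # temporal hopping
    msg = f"{source_id}|{service_id}|{slot}".encode()  # source + service hopping
    digest = hmac.new(SECRET_KEY, msg, hashlib.sha256).digest()
    host = struct.unpack(">H", digest[:2])[0]          # 16 virtual host bits
    vip = f"10.0.{host >> 8}.{host & 0xff}"
    vport = 1024 + struct.unpack(">H", digest[2:4])[0] % 64511
    return vip, vport

# Gateway and authorized client derive the same (vIP, vPort) each slot; packets
# sent to stale or guessed endpoints expose scanners and are dropped.
print(virtual_endpoint("client-42", "ssh"))
```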
{"title":"RPAH: Random Port and Address Hopping for Thwarting Internal and External Adversaries","authors":"Yue Luo, Baosheng Wang, Xiaofeng Wang, Xiaofeng Hu, Gui-lin Cai, Hao Sun","doi":"10.1109/Trustcom.2015.383","DOIUrl":"https://doi.org/10.1109/Trustcom.2015.383","url":null,"abstract":"Network servers and applications commonly use static IP addresses and communication ports, making themselves easy targets for network reconnaissances and attacks. Port and address hopping is a novel and effective moving target defense (MTD) which hides network servers and applications by constantly changing IP addresses and ports. In this paper, we develop a novel port and address hopping mechanism called Random Port and Address Hopping (RPAH), which constantly and unpredictably mutates IP addresses and communication ports based on source identity, service identity as well as time with high rate. RPAH provides us a more strength and effective MTD mechanism with three hopping frequency, i.e., source hopping, service hopping and temporal hopping. In RPAH networks, the real IPs (rIPs) and real ports (rPorts) remain untouched and packets are routed based on dynamic and temporary virtual IPs (vIPs) of servers. Therefore, messages from adversaries using static, invalid or inactive IP addresses/ports will be detected and denied. Our experiments and evaluation show that RPAH is effective in defense against various internal and external threats such as network scanning, SYN flooding attack and worm propagation, while introducing an acceptable operation overhead.","PeriodicalId":277092,"journal":{"name":"2015 IEEE Trustcom/BigDataSE/ISPA","volume":"27 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123638966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Distributed Entropy Minimization Discretizer for Big Data Analysis under Apache Spark
Pub Date: 2015-08-20 | DOI: 10.1109/Trustcom.2015.559
S. Ramírez-Gallego, S. García, Héctor Mouriño-Talín, David Martínez-Rego
The astonishing rate of data generation on the Internet has rendered many classical knowledge extraction techniques obsolete. Data reduction techniques are required to lower the computational complexity these techniques must handle. Among reduction techniques, discretization is one of the most important tasks in the data mining process, aimed at simplifying and reducing continuous-valued data in large datasets. In spite of the great interest in this reduction mechanism, only a few simple discretization techniques have been implemented in the literature for Big Data. We therefore propose a distributed implementation of the entropy minimization discretizer of Fayyad and Irani on the Apache Spark platform. Our solution goes beyond a simple parallelization, transforming the iterative computation of the original proposal into a single-step computation. Experimental results on two large-scale datasets show that our solution is able to improve classification accuracy as well as boost the underlying learning process.
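As a single-machine illustration of the criterion at the heart of the Fayyad-Irani discretizer (not the authors' distributed Spark implementation), the sketch below selects the boundary that minimizes the weighted class entropy of the induced partition; the full method recurses on both halves and stops via an MDL test.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_cut(values, labels):
    """Pick the cut point minimizing weighted class entropy (Fayyad-Irani core)."""
    pairs = sorted(zip(values, labels))
    best = (float("inf"), None)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # candidate cuts lie only between distinct values
        left = [l for _, l in pairs[:i]]
        right = [l for _, l in pairs[i:]]
        w = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        best = min(best, (w, cut))
    return best[1]

# e.g. best_cut([1.0, 1.2, 3.5, 3.7], ["a", "a", "b", "b"]) -> 2.35
```

Each candidate cut needs class counts on both sides, which is what makes the computation amenable to the single-pass distributed aggregation the paper proposes.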
{"title":"Distributed Entropy Minimization Discretizer for Big Data Analysis under Apache Spark","authors":"S. Ramírez-Gallego, S. García, Héctor Mouriño-Talín, David Martínez-Rego","doi":"10.1109/Trustcom.2015.559","DOIUrl":"https://doi.org/10.1109/Trustcom.2015.559","url":null,"abstract":"The astonishing rate of data generation on the Internet nowadays has caused that many classical knowledge extraction techniques have become obsolete. Data reduction techniques are required in order to reduce the complexity order held by these techniques. Among reduction techniques, discretization is one of the most important tasks in data mining process, aimed at simplifying and reducing continuous-valued data in large datasets. In spite of the great interest in this reduction mechanism, only a few simple discretization techniques have been implemented in the literature for Big Data. Thereby we propose a distributed implementation of the entropy minimization discretizer proposed by Fayyad and Irani using Apache Spark platform. Our solution goes beyond a simple parallelization, transforming the iterativity yielded by the original proposal in a single-step computation. Experimental results on two large-scale datasets show that our solution is able to improve the classification accuracy as well as boosting the underlying learning process.","PeriodicalId":277092,"journal":{"name":"2015 IEEE Trustcom/BigDataSE/ISPA","volume":"132 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117121570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Identifying Unknown Android Malware with Feature Extractions and Classification Techniques
Pub Date: 2015-08-20 | DOI: 10.1109/Trustcom.2015.373
L. Apvrille, A. Apvrille
Android malware unfortunately has little difficulty sneaking into marketplaces. While known malware and its variants are nowadays detected quite well by antivirus scanners, new, unknown malware that is fundamentally different from earlier samples (e.g. "0-day") remains an issue. To discover such new malware, the SherlockDroid framework filters masses of applications and keeps only those most likely to be malicious for later inspection by antivirus teams. Apart from crawling applications from marketplaces, SherlockDroid extracts code-level features and then classifies unknown applications with Alligator. Alligator is a classification tool that efficiently and automatically combines several classification algorithms. To demonstrate the efficiency of our approach, we extracted properties from and classified over 600,000 applications during two crawling campaigns in July 2014 and October 2014, detecting one new malware, Android/Odpa.A!tr.spy, and two new riskware. Together with other findings, this brings SherlockDroid's "Hall of Shame" to nine previously unknown malware and potentially unwanted applications.
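Alligator's actual combination scheme is not detailed in the abstract, so the sketch below stands in with a generic soft-voting ensemble over hypothetical code-level features; the feature names, model choices, and labels are all illustrative assumptions, not the tool's internals.

```python
# Hedged sketch: generic classifier combination over per-app features,
# standing in for (not reproducing) Alligator's algorithm mix.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

# Hypothetical code-level features per app, e.g.
# [n_permissions, uses_crypto, n_dynamic_loads, n_suspicious_apis]
X = np.array([[3, 0, 0, 1], [25, 1, 4, 9], [5, 0, 1, 0], [19, 1, 3, 7]])
y = np.array([0, 1, 0, 1])  # 0 = likely clean, 1 = flag for analysts

ensemble = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100)),
                ("lr", LogisticRegression(max_iter=1000)),
                ("nb", GaussianNB())],
    voting="soft")  # average predicted probabilities across algorithms
ensemble.fit(X, y)

# Rank unknown apps by maliciousness score; only the top ones go to humans.
scores = ensemble.predict_proba(np.array([[22, 1, 5, 8]]))[:, 1]
```

The filtering role matters more than any single model: with hundreds of thousands of crawled apps, only the highest-scoring handful can realistically be inspected by antivirus teams.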
{"title":"Identifying Unknown Android Malware with Feature Extractions and Classification Techniques","authors":"L. Apvrille, A. Apvrille","doi":"10.1109/Trustcom.2015.373","DOIUrl":"https://doi.org/10.1109/Trustcom.2015.373","url":null,"abstract":"Android malware unfortunately have little difficulty to sneak in marketplaces. While known malware and their variants are nowadays quite well detected by antivirus scanners, new unknown malware, which are fundamentally different from others (e.g. \"0-day\"), remain an issue. To discover such new malware, the SherlockDroid framework filters masses of applications and only keeps the most likely to be malicious for future inspection by antivirus teams. Apart from crawling applications from marketplaces, SherlockDroid extracts code-level features, and then classifies unknown applications with Alligator. Alligator is a classification tool that efficiently and automatically combines several classification algorithms. To demonstrate the efficiency of our approach, we have extracted properties and classified over 600,000 applications during two crawling campaigns in July 2014 and October 2014, with the detection of one new malware, Android/Odpa.A!tr.spy, and two new riskware. With other findings, this increases SherlockDroid's \"Hall of Shame\" to 9 totally unknown malware and potentially unwanted applications.","PeriodicalId":277092,"journal":{"name":"2015 IEEE Trustcom/BigDataSE/ISPA","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125750116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
RLTE: A Reinforcement Learning Based Trust Establishment Model
Pub Date: 2015-08-20 | DOI: 10.1109/Trustcom.2015.436
Abdullah Aref, T. Tran
Trust is a complex, multifaceted concept that includes more than just evaluating others' honesty. Many trust evaluation models have been proposed and implemented in different areas. Most of them focus on creating algorithms that let trusters model the honesty of trustees so as to make effective decisions about which trustees to select, the assumption being that a rational truster interacts with the trustworthy ones. If interactions are based on trust, trustworthy trustees will have a greater impact on the outcomes of interactions. Consequently, building high trust can be an advantage for rational trustees. This work describes a Reinforcement Learning based Trust Establishment model (RLTE) that goes beyond trust evaluation to outline actions that direct trustees (instead of trusters). RLTE uses truster retention and reinforcement learning to model trusters' behaviors. A trustee uses reinforcement learning to adjust the utility gain it provides when interacting with each truster, based on the average number of transactions carried out by that truster relative to the mean number of transactions performed by all trusters interacting with this trustee. The trustee accelerates or decelerates the adjustment of the utility gain as the average retention rate of all trusters in the society increases or decreases, respectively. The proposed model depends neither on direct feedback nor on the current reputation of trustees in the environment. Simulation results indicate that trustees empowered with the proposed model are selected more often by trusters.
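The abstract names the signals RLTE uses (per-truster transaction counts relative to the society mean, and the trend of the average retention rate) but not its equations, so the update rule and base learning rate below are assumptions sketching how those signals could drive the utility-gain adjustment.

```python
class Trustee:
    """Hedged sketch of the RLTE idea: a trustee tunes the utility gain it
    offers each truster from retention signals. The update rule and the
    0.1 base learning rate are hypothetical, not the paper's equations."""

    def __init__(self, base_rate=0.1):
        self.base_rate = base_rate
        self.gain = {}            # truster id -> offered utility gain

    def update(self, truster, txs_with_truster, mean_txs_all, retention_delta):
        # Reward signal: does this truster transact more or less than average?
        reward = txs_with_truster / mean_txs_all - 1.0
        # A rising society-wide retention rate accelerates the adjustment,
        # a falling one decelerates it.
        rate = self.base_rate * (1.0 + retention_delta)
        g = self.gain.get(truster, 0.5)
        self.gain[truster] = min(1.0, max(0.0, g + rate * reward))

t = Trustee()
t.update("truster-7", txs_with_truster=12, mean_txs_all=8.0, retention_delta=0.05)
```

Note that nothing here consults feedback ratings or reputation scores; consistent with the abstract, the trustee learns only from observed transaction and retention behavior.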
{"title":"RLTE: A Reinforcement Learning Based Trust Establishment Model","authors":"Abdullah Aref, T. Tran","doi":"10.1109/Trustcom.2015.436","DOIUrl":"https://doi.org/10.1109/Trustcom.2015.436","url":null,"abstract":"Trust is a complex, multifaceted concept that includes more than just evaluating others' honesty. Many trust evaluation models have been proposed and implemented in different areas, most of them focused on creating algorithms for trusters to model the honesty of trustees in order to make effective decisions about which trustees to select, where a rational truster is supposed to interact with the trustworthy ones. If interactions are based on trust, trustworthy trustees will have a greater impact on the results of interactions' results. Consequently, building a high trust may be an advantage for rational trustees. This work describes a Reinforcement Learning based Trust Establishment model (RLTE) that goes beyond trust evaluation to outline actions to direct trustees (instead of trusters). RLTE uses the retention of trusters and reinforcement learning to model trustors' behaviors. A trustee uses reinforcement learning to adjust the utility gain it provides when interacting with each truster. The trustee depends on the average number of transactions carried out by that truster, relative to the mean number of transactions performed by all trusters interacting with this trustee. The trustee accelerates or decelerates the adjustment of the utility gain based on the increase or decrease of the average retention rate of all trusters in the society, respectively. The proposed model does not depend on direct feedback, nor does it depend on the current reputation of trustees in the environment. Simulation results indicate that trustees empowered with the proposed model can be selected more by trusters.","PeriodicalId":277092,"journal":{"name":"2015 IEEE Trustcom/BigDataSE/ISPA","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128927526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Review of Free Cloud-Based Anti-Malware Apps for Android
Pub Date: 2015-08-20 | DOI: 10.1109/Trustcom.2015.482
J. Walls, Kim-Kwang Raymond Choo
The traditional way of protecting a system against malicious threats and loss of personal data, using locally installed anti-malware software, is unlikely to work on mobile devices due to the changing threat landscape and mobile devices' resource limitations (e.g. storage and battery life). A number of anti-malware providers have therefore migrated to the cloud, where the computationally demanding task of analyzing malware is conducted by cloud-based servers. However, the effectiveness of these mobile anti-malware apps has not been studied. Therefore, in this paper, we evaluate the effectiveness of ten popular free cloud-based anti-malware apps using a known Android malware dataset. We hope that this research will contribute towards a better understanding of the effectiveness of Android cloud-based anti-malware apps.
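The evaluation arithmetic implied here reduces to per-app detection rates over a labelled dataset; the short sketch below shows that computation, with the app names and scan verdicts being hypothetical stand-ins.

```python
# Hedged sketch of the evaluation arithmetic: detection rate per anti-malware
# app over a labelled malware dataset. Names and verdicts are hypothetical.
scan_results = {
    "AppA": {"sample1": True, "sample2": False, "sample3": True},
    "AppB": {"sample1": True, "sample2": True, "sample3": True},
}

for app, verdicts in scan_results.items():
    detected = sum(verdicts.values())
    rate = 100.0 * detected / len(verdicts)
    print(f"{app}: {detected}/{len(verdicts)} detected ({rate:.1f}%)")
```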
{"title":"A Review of Free Cloud-Based Anti-Malware Apps for Android","authors":"J. Walls, Kim-Kwang Raymond Choo","doi":"10.1109/Trustcom.2015.482","DOIUrl":"https://doi.org/10.1109/Trustcom.2015.482","url":null,"abstract":"The traditional way of protecting a system against malicious threats and loss of personal data by using locally installed anti-malware software is unlikely to work on mobile devices due to the changing threat landscape and the mobile device resource limitations (e.g. storage and battery life). A number of anti-malware providers have migrated to the cloud where the computationally demanding tasks of analyzing malware is conducted by cloud-based server. However, the effectiveness of these anti-mobile apps has not been studied. Therefore, in this paper, we evaluate the effectiveness of ten popular free cloud-based anti-malware apps using a known Android malware dataset. We hope that this research will contribute towards a better understanding of the effectiveness of Android cloud-based anti-malware apps.","PeriodicalId":277092,"journal":{"name":"2015 IEEE Trustcom/BigDataSE/ISPA","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131064689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detecting Data Semantic: A Data Leakage Prevention Approach
Pub Date: 2015-08-20 | DOI: 10.1109/Trustcom.2015.464
Sultan Alneyadi, E. Sithirasenan, V. Muthukkumarasamy
Data leakage prevention systems (DLPSs) are increasingly being implemented by organizations. Unlike standard security mechanisms such as firewalls and intrusion detection systems, DLPSs are dedicated systems used to protect data in use, at rest, and in transit. DLPSs analytically use the content and surrounding context of confidential data to detect and prevent unauthorized access to it. DLPSs that use content analysis techniques largely depend on data fingerprinting, regular expressions, and statistical analysis to detect data leaks. Given that data is susceptible to change, data fingerprinting and regular expressions suffer from shortcomings in detecting the semantics of evolved confidential data. Statistical analysis, however, can manage data that is fuzzy in nature or exhibits other variations; thus, DLPSs with statistical analysis capabilities can approximate the presence of data semantics. In this paper, a statistical data leakage prevention (DLP) model is presented that classifies data on the basis of semantics. This study contributes to the data leakage prevention field by using statistical analysis of data to detect evolved confidential data. The approach is based on the well-known information retrieval function Term Frequency-Inverse Document Frequency (TF-IDF) to classify documents under certain topics. A Singular Value Decomposition (SVD) matrix was also used to visualize the classification results. The results showed that the proposed statistical DLP approach could correctly classify documents even in cases of extreme modification. It also achieved high precision and recall scores.
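The abstract names the building blocks (TF-IDF features, topic classification, an SVD projection for visualization) without giving code, so the following sketch wires them together in one plausible way; the corpus, labels, and choice of a naive Bayes classifier are assumptions, not the authors' exact pipeline.

```python
# Hedged sketch of a TF-IDF topic classifier with an SVD projection for
# visualization; corpus, labels, and model choice are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.naive_bayes import MultinomialNB

docs = ["quarterly revenue and profit figures",
        "merger agreement draft, strictly confidential",
        "cafeteria menu for next week",
        "press release about the product launch"]
labels = ["confidential", "confidential", "public", "public"]

vectorizer = TfidfVectorizer(sublinear_tf=True, stop_words="english")
X = vectorizer.fit_transform(docs)

clf = MultinomialNB().fit(X, labels)            # topic/sensitivity classifier

# Even a paraphrased ("evolved") document keeps similar term statistics.
test = vectorizer.transform(["draft of the confidential merger contract"])
print(clf.predict(test))                        # expected: ['confidential']

# 2-D SVD projection of the TF-IDF matrix, usable for visualization.
coords = TruncatedSVD(n_components=2).fit_transform(X)
```

This is the property the paper relies on: term-frequency statistics survive rewording far better than fingerprints or regular expressions, so an evolved confidential document still lands near its original topic.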
{"title":"Detecting Data Semantic: A Data Leakage Prevention Approach","authors":"Sultan Alneyadi, E. Sithirasenan, V. Muthukkumarasamy","doi":"10.1109/Trustcom.2015.464","DOIUrl":"https://doi.org/10.1109/Trustcom.2015.464","url":null,"abstract":"Data leakage prevention systems (DLPSs) are increasingly being implemented by organizations. Unlike standard security mechanisms such as firewalls and intrusion detection systems, DLPSs are designated systems used to protect in use, at rest and in transit data. DLPSs analytically use the content and surrounding context of confidential data to detect and prevent unauthorized access to confidential data. DLPSs that use content analysis techniques are largely dependent upon data fingerprinting, regular expressions, and statistical analysis to detect data leaks. Given that data is susceptible to change, data fingerprinting and regular expressions suffer from shortcomings in detecting the semantics of evolved confidential data. However, statistical analysis can manage any data that appears fuzzy in nature or has other variations. Thus, DLPSs with statistical analysis capabilities can approximate the presence of data semantics. In this paper, a statistical data leakage prevention (DLP) model is presented to classify data on the basis of semantics. This study contributes to the data leakage prevention field by using data statistical analysis to detect evolved confidential data. The approach was based on using the well-known information retrieval function Term Frequency-Inverse Document Frequency (TF-IDF) to classify documents under certain topics. A Singular Value Decomposition (SVD) matrix was also used to visualize the classification results. The results showed that the proposed statistical DLP approach could correctly classify documents even in cases of extreme modification. It also had a high level of precision and recall scores.","PeriodicalId":277092,"journal":{"name":"2015 IEEE Trustcom/BigDataSE/ISPA","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131005035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}