We propose a tool to capture application requirements with respect to the enforcement of network security policies in an object-oriented design language. Once a design captures clear, concise, easily understood network requirements, new technologies become possible, including network transactions and user-driven policies to remove rarely used network permissions until needed, creating a least-privilege-in-time policy. Existing security enforcement policies represent a model of all allowable behavior. Modeling only allowable behavior requires that any entity that may need a permission be granted it permanently. Refining the model to distinguish between common behavior and rare behavior will increase security. The increased security comes with costs, such as requiring users to strongly authenticate more often. This paper discusses those costs and the complexity of increasing security enforcement models.
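The idea of removing rarely used permissions until they are needed can be illustrated with a small sketch. This is not the NEUTRON design; the class `TimedPermissions`, the `rare_ttl` lifetime, and the permission strings are hypothetical names chosen for illustration. Common permissions are held permanently, while a rare permission is active only for a limited time after strong authentication.

```python
from dataclasses import dataclass, field


@dataclass
class TimedPermissions:
    """Hypothetical store: common permissions are permanent, rare ones expire."""
    common: set = field(default_factory=set)
    rare_ttl: float = 300.0  # assumed lifetime of a rare grant, in seconds
    _rare_expiry: dict = field(default_factory=dict, init=False)

    def grant_rare(self, permission: str, now: float, strongly_authenticated: bool) -> bool:
        """Activate a rarely used permission, but only after strong authentication."""
        if not strongly_authenticated:
            return False
        self._rare_expiry[permission] = now + self.rare_ttl
        return True

    def is_allowed(self, permission: str, now: float) -> bool:
        """Common permissions always pass; rare ones pass only while the grant is live."""
        if permission in self.common:
            return True
        expiry = self._rare_expiry.get(permission)
        if expiry is None or now >= expiry:
            self._rare_expiry.pop(permission, None)  # let the grant lapse
            return False
        return True


perms = TimedPermissions(common={"tcp:443"})
perms.grant_rare("tcp:22", now=0.0, strongly_authenticated=True)
print(perms.is_allowed("tcp:22", now=10.0))   # within the grant window
print(perms.is_allowed("tcp:22", now=400.0))  # after the grant has lapsed
```

The cost mentioned in the abstract shows up directly: every use of a lapsed rare permission requires another strong authentication before `grant_rare` succeeds.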
{"title":"Network Policy Enforcement Using Transactions: The NEUTRON Approach","authors":"D. Thomsen, E. Bertino","doi":"10.1145/3205977.3206000","DOIUrl":"https://doi.org/10.1145/3205977.3206000","url":null,"abstract":"We propose a tool to capture applications requirements with respect to the enforcement of network security policies in an object-oriented design language. Once a design captures clear, concise, easily understood network requirements new technologies become possible, including network transactions and user-driven policies to remove rarely used network permissions until needed, creating a least privilege in time policy. Existing security enforcement policies represent a model of all allowable behavior. Only modeling allowable behavior requires that any entity that may need a permission, be granted it permanently. Refining the modeling to distinguish between common behavior and rare behavior will increase security. The increased security comes with costs, such as requiring users to strongly authenticate more often. This paper discusses those costs and the complexity of increasing security enforcement models.","PeriodicalId":423087,"journal":{"name":"Proceedings of the 23nd ACM on Symposium on Access Control Models and Technologies","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116599794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Many access control patterns, both positive and negative, have been identified in the past. However, there is little research describing how to leverage those patterns for the detection of access control bugs in code. Many software bug detection models and frameworks for access control exist; however, most of these approaches and tools are process-based and suffer from many limitations. We propose a framework to detect access control bugs based on code pattern detection. Our framework will mine and generate bug patterns, detect those patterns in code, and calculate a vulnerability measure of software. To the best of our knowledge, ours is the first pattern-based model for the detection and measurement of bugs in software. As a proof of concept, we perform a case study of the relational database access control pattern "Improper Authorization".
{"title":"Toward A Code Pattern Based Vulnerability Measurement Model","authors":"John Heaps, Rocky Slavin, Xiaoyin Wang","doi":"10.1145/3205977.3208948","DOIUrl":"https://doi.org/10.1145/3205977.3208948","url":null,"abstract":"Many access control patterns, both positive and negative, have been identified in the past. However, there is little research describing how to leverage those patterns for the detection of access control bugs in code. Many software bug detection models and frameworks for access control exist, however most of these approaches and tools are process-based and suffer from many limitations. We propose a framework to detect access control bugs based on code pattern detection. Our framework will mine and generate bug patterns, detect those patterns in code, and calculate a vulnerability measure of software. Based on our knowledge we are the first pattern-based model for the detection and measurement of bugs in software. As a proof of concept, we perform a case study of the relational database access control pattern \"Improper Authorization''.","PeriodicalId":423087,"journal":{"name":"Proceedings of the 23nd ACM on Symposium on Access Control Models and Technologies","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125647862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Internet of Things has become a predominant phenomenon in every sphere of smart life. Connected Cars and the Vehicular Internet of Things, which involve communication and data exchange between vehicles, traffic infrastructure, or other entities, are pivotal to realizing the vision of the smart city and intelligent transportation. Vehicular Cloud offers a promising architecture wherein the storage and processing capabilities of smart objects are utilized to provide an on-the-fly fog platform. Researchers have demonstrated vulnerabilities in this emerging vehicular IoT ecosystem, where data has been stolen from critical sensors and smart vehicles have been controlled remotely. Security and privacy are important in the Internet of Vehicles (IoV), where access to electronic control units, applications, and data in connected cars should be authorized only to legitimate users, sensors, or vehicles. In this paper, we propose an authorization framework to secure this dynamic system, where interactions among entities are not pre-defined. We provide an extended access control oriented (E-ACO) architecture relevant to IoV and discuss the need for vehicular clouds in this time- and location-sensitive environment. We outline approaches to different access control models that can be enforced at various layers of the E-ACO architecture and in the authorization framework. Finally, we discuss use cases to illustrate access control requirements in our vision of cloud-assisted connected cars and vehicular IoT, and discuss possible research directions.
{"title":"Authorization Framework for Secure Cloud Assisted Connected Cars and Vehicular Internet of Things","authors":"Maanak Gupta, R. Sandhu","doi":"10.1145/3205977.3205994","DOIUrl":"https://doi.org/10.1145/3205977.3205994","url":null,"abstract":"Internet of Things has become a predominant phenomenon in every sphere of smart life. Connected Cars and Vehicular Internet of Things, which involves communication and data exchange between vehicles, traffic infrastructure or other entities are pivotal to realize the vision of smart city and intelligent transportation. Vehicular Cloud offers a promising architecture wherein storage and processing capabilities of smart objects are utilized to provide on-the-fly fog platform. Researchers have demonstrated vulnerabilities in this emerging vehicular IoT ecosystem, where data has been stolen from critical sensors and smart vehicles controlled remotely. Security and privacy is important in Internet of Vehicles (IoV) where access to electronic control units, applications and data in connected cars should only be authorized to legitimate users, sensors or vehicles. In this paper, we propose an authorization framework to secure this dynamic system where interactions among entities is not pre-defined. We provide an extended access control oriented (E-ACO) architecture relevant to IoV and discuss the need of vehicular clouds in this time and location sensitive environment. We outline approaches to different access control models which can be enforced at various layers of E-ACO architecture and in the authorization framework. 
Finally, we discuss use cases to illustrate access control requirements in our vision of cloud assisted connected cars and vehicular IoT, and discuss possible research directions.","PeriodicalId":423087,"journal":{"name":"Proceedings of the 23nd ACM on Symposium on Access Control Models and Technologies","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130775969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Calo, D. Verma, Supriyo Chakraborty, E. Bertino, Emil C. Lupu, G. Cirincione
Access control for information has primarily focused on access statically granted to subjects by administrators, usually in the context of a specific system. Even if mechanisms are available for access revocation, revocations must still be executed manually by an administrator. However, as physical devices become increasingly embedded and interconnected, access control needs to become an integral part of the resource being protected and be generated dynamically by resources depending on the context in which the resource is being used. In this paper, we discuss a set of scenarios for access control needed in current and future systems and use them to argue that resources need an approach for generating and managing their own access control policies dynamically. We discuss some approaches for generating such access control policies that may address the requirements of the scenarios.
{"title":"Self-Generation of Access Control Policies","authors":"S. Calo, D. Verma, Supriyo Chakraborty, E. Bertino, Emil C. Lupu, G. Cirincione","doi":"10.1145/3205977.3205995","DOIUrl":"https://doi.org/10.1145/3205977.3205995","url":null,"abstract":"Access control for information has primarily focused on access statically granted to subjects by administrators usually in the context of a specific system. Even if mechanisms are available for access revocation, revocations must still be executed manually by an administrator. However, as physical devices become increasingly embedded and interconnected, access control needs to become an integral part of the resource being protected and be generated dynamically by resources depending on the context in which the resource is being used. In this paper, we discuss a set of scenarios for access control needed in current and future systems and use that to argue that an approach for resources to generate and manage their access control policies dynamically on their own is needed. We discuss some approaches for generating such access control policies that may address the requirements of the scenarios.","PeriodicalId":423087,"journal":{"name":"Proceedings of the 23nd ACM on Symposium on Access Control Models and Technologies","volume":"547 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133901678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Despite decades of research on Internet security, we constantly hear about mega data breaches and malware infections affecting hundreds of millions of hosts. The key reason is that the current threat model of the Internet relies on two assumptions that no longer hold true: (1) Web servers hosting the content are secure, and (2) each Internet connection starts from the original content provider and terminates at the content consumer. Internet security today is merely patched on top of the TCP/IP protocol stack. In order to achieve comprehensive security for the Internet, we believe that a clean-slate approach must be adopted in which a content-based security model is employed. Named Data Networking (NDN), envisioned as the next-generation Internet architecture based on a content-centric communication model, is a step in this direction. NDN is currently being designed with security as a key requirement, and thus to support content integrity, authenticity, confidentiality, and privacy. However, in order to meet such a requirement, one needs to overcome several challenges, especially in large operational environments or resource-constrained networks. In this paper, we explore the security challenges in achieving comprehensive content security in NDN and propose a research agenda to address some of the challenges.
{"title":"Securing Named Data Networks: Challenges and the Way Forward","authors":"E. Bertino, Mohamed Nabeel","doi":"10.1145/3205977.3205996","DOIUrl":"https://doi.org/10.1145/3205977.3205996","url":null,"abstract":"Despite decades of research on the Internet security, we constantly hear about mega data breaches and malware infections affecting hundreds of millions of hosts. The key reason is that the current threat model of the Internet relies on two assumptions that no longer hold true: (1) Web servers, hosting the content, are secure, (2) each Internet connection starts from the original content provider and terminates at the content consumer. Internet security is today merely patched on top of the TCP/IP protocol stack. In order to achieve comprehensive security for the Internet, we believe that a clean-slate approach must be adopted where a content based security model is employed. Named Data Networking (NDN) is a step in this direction which is envisioned to be the next generation Internet architecture based on a content centric communication model. NDN is currently being designed with security as a key requirement, and thus to support content integrity, authenticity, confidentiality and privacy. However, in order to meet such a requirement, one needs to overcome several challenges, especially in either large operational environments or resource constrained networks. 
In this paper, we explore the security challenges in achieving comprehensive content security in NDN and propose a research agenda to address some of the challenges.","PeriodicalId":423087,"journal":{"name":"Proceedings of the 23nd ACM on Symposium on Access Control Models and Technologies","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130602255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In light of mobile and ubiquitous computing, sharing sensitive information across different computer systems has become an increasingly prominent practice. This development entails a demand for access control measures that can protect data even after it has been transferred to a remote computer system. In order to address this problem, sophisticated usage control models have been developed. These models include a client-side reference monitor (CRM) that continuously enforces protection policies on foreign data. However, it is still unclear how such a CRM can be properly protected in a hostile environment. The user of the data on the client system can influence the client's state and has physical access to the system. Hence, technical measures are required to protect the CRM on a system that is legitimately used by potential attackers. Existing solutions utilize Trusted Platform Modules (TPMs) to solve this problem by establishing an attestable trust anchor on the client. However, the resulting protocols have several drawbacks that make them infeasible for practical use. This work proposes a reference monitor implementation that establishes trust by using TPMs along with Intel SGX enclaves. First, we show how SGX enclaves can realize a subset of the existing usage control requirements. Then we add a TPM to establish and protect a powerful enforcement component on the client. Ultimately, this allows us to technically enforce usage control policies on an untrusted remote system.
{"title":"Distributed Usage Control Enforcement through Trusted Platform Modules and SGX Enclaves","authors":"P. Wagner, Pascal Birnstill, J. Beyerer","doi":"10.1145/3205977.3205990","DOIUrl":"https://doi.org/10.1145/3205977.3205990","url":null,"abstract":"In the light of mobile and ubiquitous computing, sharing sensitive information across different computer systems has become an increasingly prominent practice. This development entails a demand of access control measures that can protect data even after it has been transferred to a remote computer system. In order to address this problem, sophisticated usage control models have been developed. These models include a client side reference monitor (CRM) that continuously enforces protection policies on foreign data. However, it is still unclear how such a CRM can be properly protected in a hostile environment. The user of the data on the client system can influence the client's state and has physical access to the system. Hence technical measures are required to protect the CRM on a system, which is legitimately used by potential attackers. Existing solutions utilize Trusted Platform Modules (TPMs) to solve this problem by establishing an attestable trust anchor on the client. However, the resulting protocols have several drawbacks that make them infeasible for practical use. This work proposes a reference monitor implementation that establishes trust by using TPMs along with Intel SGX enclaves. First we show how SGX enclaves can realize a subset of the existing usage control requirements. Then we add a TPM to establish and protect a powerful enforcement component on the client. 
Ultimately this allows us to technically enforce usage control policies on an untrusted remote system.","PeriodicalId":423087,"journal":{"name":"Proceedings of the 23nd ACM on Symposium on Access Control Models and Technologies","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129053664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the past decade, many organizations have adopted a Role-Based Access Control model (RBAC) to reduce their administration costs and increase security. The migration to RBAC requires a role engineering phase aimed at generating "good" initial roles starting from direct assignments of permissions to users. For an RBAC approach to be effective, however, it is also necessary to update roles and keep them compliant with the dynamic nature of the business processes; moreover, errors and misalignments between the current RBAC state and reality need to be promptly detected and fixed. In this paper, we propose a new maintenance process to fix and refine an RBAC state when "exceptions" are detected. Exceptions are permissions that some users find they lack, that are instrumental to their job, and that should be granted as soon as possible. They are caught by a monitoring system as unexpected "access denied" conditions and then validated by the RBAC administrator. The fix we produce aims at balancing two conflicting objectives, i.e., (i) simplifying the current RBAC state, and (ii) reducing the transition cost. Our approach is based on a Max-SAT formalization of this trade-off, and it exploits incomplete solvers that quickly provide approximations of optimal solutions. Experiments show good performance on real-world benchmarks.
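To make the Max-SAT framing concrete, here is a minimal sketch of weighted partial Max-SAT: hard clauses must hold, and the solver maximizes the total weight of satisfied soft clauses. The toy encoding is an assumption made for illustration and is not the paper's formalization: variable 1 stands for "add the missing permission to an existing role", variable 2 for "create a new role for it". The brute-force search is exact and only practical for tiny instances, whereas the abstract refers to incomplete solvers that approximate the optimum.

```python
from itertools import product


def solve_max_sat(num_vars, hard, soft):
    """Brute-force weighted partial Max-SAT.

    Clauses are lists of non-zero ints: k means variable k is true, -k means false.
    `soft` is a list of (weight, clause) pairs.
    Returns (best_weight, assignment), or None if the hard clauses are unsatisfiable.
    """
    def satisfied(clause, assignment):
        return any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)

    best = None
    for assignment in product([False, True], repeat=num_vars):
        if not all(satisfied(c, assignment) for c in hard):
            continue  # hard constraints are non-negotiable
        weight = sum(w for w, c in soft if satisfied(c, assignment))
        if best is None or weight > best[0]:
            best = (weight, assignment)
    return best


# Hard: the exception must be fixed one way or the other (x1 OR x2).
hard = [[1, 2]]
# Soft: prefer not to add a role (simplicity, weight 3),
#       prefer not to modify an existing role (transition cost, weight 2).
soft = [(3, [-2]), (2, [-1])]
print(solve_max_sat(2, hard, soft))  # (3, (True, False))
```

With these assumed weights the search keeps the role set small and accepts the cost of modifying an existing role; changing the weights shifts the balance between the two objectives.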
{"title":"Parametric RBAC Maintenance via Max-SAT","authors":"Marco Benedetti, Marco Mori","doi":"10.1145/3205977.3205987","DOIUrl":"https://doi.org/10.1145/3205977.3205987","url":null,"abstract":"In the past decade, many organizations have adopted a Role-Based Access Control model (RBAC) to reduce their administration costs and increase security. The migration to RBAC requires a role engineering phase aimed at generating \"good\" initial roles starting from direct assignments of permissions to users. For an RBAC approach to be effective, however, it is also necessary to update roles and keep them compliant with the dynamic nature of the business processes; not only this, but errors and misalignments between the current RBAC state and reality need to be promptly detected and fixed. In this paper, we propose a new maintenance process to fix and refine an RBAC state when \"exceptions\" are detected. Exceptions are permissions some users realize they miss that are instrumental to their job and should be granted as soon as possible. They are catched by a monitoring system as unexpected \"access denied\" conditions and then validated by the RBAC administrator. The fix we produce aims at balancing two conflicting objectives, i.e., (i) simplifying the current RBAC state, and (ii) reducing the transition cost. Our approach is based on a Max-SAT formalization of this trade-off and it exploits incomplete solvers that quickly provide approximations of optimal solutions. 
Experiments show good performance on real-world benchmarks.","PeriodicalId":423087,"journal":{"name":"Proceedings of the 23nd ACM on Symposium on Access Control Models and Technologies","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115902953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The National Institute of Standards and Technology (NIST) has identified natural language policies as the preferred expression of policy and implicitly called for an automated translation of ABAC natural language access control policies (NLACPs) to a machine-readable form. An essential step towards this automation is to automate the extraction of ABAC attributes from NLACPs, which is the focus of this paper. We therefore ask: how can we automate the task of attribute extraction from natural language documents? Our proposed solution to this question builds upon recent advancements in natural language processing and machine learning techniques. For such a solution, the lack of appropriate data often poses a bottleneck. Therefore, we divide the primary contributions of this work into: (1) developing a practical framework to extract ABAC attributes from natural language artifacts, and (2) generating a set of realistic synthetic NLACPs to evaluate the proposed framework. The experimental results are promising with regard to the potential automation of the task of interest. Using a convolutional neural network (CNN), we achieved, on average, an F1-score of 0.96 when extracting the attributes of subjects and 0.91 when extracting the attributes of objects from natural language access control policies.
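For readers unfamiliar with the metric reported above, the F1-score is the harmonic mean of precision and recall, which simplifies to 2·TP / (2·TP + FP + FN). The sketch below computes it from raw counts; the counts are made up for illustration and are not taken from the paper's experiments.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0  # convention when there are no positives at all
    return 2 * true_positives / denominator


# Hypothetical counts: 8 attributes extracted correctly, 2 spurious, 2 missed.
print(f1_score(true_positives=8, false_positives=2, false_negatives=2))  # 0.8
```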
{"title":"A Deep Learning Approach for Extracting Attributes of ABAC Policies","authors":"Manar Alohaly, Hassan Takabi, Eduardo Blanco","doi":"10.1145/3205977.3205984","DOIUrl":"https://doi.org/10.1145/3205977.3205984","url":null,"abstract":"The National Institute of Standards and Technology (NIST) has identified natural language policies as the preferred expression of policy and implicitly called for an automated translation of ABAC natural language access control policy (NLACP) to a machine-readable form. An essential step towards this automation is to automate the extraction of ABAC attributes from NLACPs, which is the focus of this paper. We, therefore, raise the question of: how can we automate the task of attributes extraction from natural language documents? Our proposed solution to this question is built upon the recent advancements in natural language processing and machine learning techniques. For such a solution, the lack of appropriate data often poses a bottleneck. Therefore, we decouple the primary contributions of this work into: (1) developing a practical framework to extract ABAC attributes from natural language artifacts, and (2) generating a set of realistic synthetic natural language access control policies (NLACPs) to evaluate the proposed framework. The experimental results are promising with regard to the potential automation of the task of interest. 
Using a convolutional neural network (CNN), we achieved - in average - an F1-score of 0.96 when extracting the attributes of subjects, and 0.91 when extracting the objects' attributes from natural language access control policies.","PeriodicalId":423087,"journal":{"name":"Proceedings of the 23nd ACM on Symposium on Access Control Models and Technologies","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130215843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
B. Thuraisingham, Murat Kantarcioglu, E. Bertino, J. Bakdash, M. Fernández
Massive amounts of data are being collected, stored, and analyzed for various business and marketing purposes. While such data analysis is critical for many applications, it could also violate the privacy of individuals. This paper describes the issues involved in designing a privacy-aware data management framework for collecting, storing, and analyzing the data. We also discuss behavioral aspects of data sharing as well as aspects of a formal framework based on rewriting rules that encompasses the privacy-aware data management framework.
{"title":"Towards a Privacy-Aware Quantified Self Data Management Framework","authors":"B. Thuraisingham, Murat Kantarcioglu, E. Bertino, J. Bakdash, M. Fernández","doi":"10.1145/3205977.3205997","DOIUrl":"https://doi.org/10.1145/3205977.3205997","url":null,"abstract":"Massive amounts of data are being collected, stored, and analyzed for various business and marketing purposes. While such data analysis is critical for many applications, it could also violate the privacy of individuals. This paper describes the issues involved in designing a privacy aware data management framework for collecting, storing, and analyzing the data. We also discuss behavioral aspects of data sharing as well as aspects of a formal framework based on rewriting rules that encompasses the privacy aware data management framework.","PeriodicalId":423087,"journal":{"name":"Proceedings of the 23nd ACM on Symposium on Access Control Models and Technologies","volume":"55 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120988112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Phishing websites remain a persistent security threat. Thus far, machine learning approaches appear to have the best potential as defenses. However, there are two main concerns with existing machine learning approaches for phishing detection. The first is the large number of training features used and the lack of validating arguments for these feature choices. The second is that the datasets used in the literature are inadvertently biased with respect to features based on the website URL or content. To address these concerns, we put forward the intuition that the domain name of a phishing website is the tell-tale sign of phishing and holds the key to successful phishing detection. Accordingly, we design features that model the relationships, visual as well as statistical, of the domain name to the key elements of a phishing website, which are used to snare end users. The main value of our feature design is that, to bypass detection, an attacker will find it very difficult to tamper with the visual content of the phishing website without arousing the suspicion of the end user. Our feature set ensures that there is minimal or no bias with respect to a dataset. Our learning model trains with only seven features and achieves a true positive rate of 98% and a classification accuracy of 97% on a sample dataset. Compared to the state-of-the-art work, our per-instance classification is 4 times faster for legitimate websites and 10 times faster for phishing websites. Importantly, we demonstrate the shortcomings of using features based on URLs, as they are likely to be biased towards specific datasets. We show the robustness of our learning algorithm by testing on unknown live phishing URLs and achieve a high detection accuracy of 99.7%.
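The two headline metrics above are derived from a confusion matrix: the true positive rate is the share of phishing sites that are flagged, and accuracy is the share of all sites classified correctly. The following sketch shows the arithmetic; the counts are invented for illustration and are not the paper's data.

```python
def detection_metrics(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (true_positive_rate, accuracy) from confusion-matrix counts."""
    true_positive_rate = tp / (tp + fn)       # phishing sites correctly flagged
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # all sites correctly classified
    return true_positive_rate, accuracy


# Hypothetical test set: 100 phishing sites and 100 legitimate sites.
tpr, acc = detection_metrics(tp=90, fn=10, tn=85, fp=15)
print(f"TPR={tpr:.3f} accuracy={acc:.3f}")  # TPR=0.900 accuracy=0.875
```

A classifier can score well on one metric and poorly on the other when the classes are imbalanced, which is one reason the abstract reports both.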
{"title":"\"Kn0w Thy Doma1n Name\": Unbiased Phishing Detection Using Domain Name Based Features","authors":"H. Shirazi, Bruhadeshwar Bezawada, I. Ray","doi":"10.1145/3205977.3205992","DOIUrl":"https://doi.org/10.1145/3205977.3205992","url":null,"abstract":"Phishing websites remain a persistent security threat. Thus far, machine learning approaches appear to have the best potential as defenses. But, there are two main concerns with existing machine learning approaches for phishing detection. The first is the large number of training features used and the lack of validating arguments for these feature choices. The second concern is the type of datasets used in the literature that are inadvertently biased with respect to the features based on the website URL or content. To address these concerns, we put forward the intuition that the domain name of phishing websites is the tell-tale sign of phishing and holds the key to successful phishing detection. Accordingly, we design features that model the relationships, visual as well as statistical, of the domain name to the key elements of a phishing website, which are used to snare the end-users. The main value of our feature design is that, to bypass detection, an attacker will find it very difficult to tamper with the visual content of the phishing website without arousing the suspicion of the end user. Our feature set ensures that there is minimal or no bias with respect to a dataset. Our learning model trains with only seven features and achieves a true positive rate of 98% and a classification accuracy of 97%, on sample dataset. Compared to the state-of-the-art work, our per data instance classification is 4 times faster for legitimate websites and 10 times faster for phishing websites. Importantly, we demonstrate the shortcomings of using features based on URLs as they are likely to be biased towards specific datasets. 
We show the robustness of our learning algorithm by testing on unknown live phishing URLs and achieve a high detection accuracy of $99.7%$.","PeriodicalId":423087,"journal":{"name":"Proceedings of the 23nd ACM on Symposium on Access Control Models and Technologies","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123203276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}