ConDySTA: Context-Aware Dynamic Supplement to Static Taint Analysis
Pub Date: 2021-05-01 | DOI: 10.1109/SP40001.2021.00040 | Pages: 796-812
Xueling Zhang, Xiaoyin Wang, Rocky Slavin, Jianwei Niu
Static taint analyses are widely applied techniques to detect taint flows in software systems. Although they are theoretically conservative and designed to detect all possible taint flows, static taint analyses almost always exhibit false negatives due to a variety of implementation limitations. Dynamic programming-language features, inaccessible code, and the use of multiple programming languages in a software project are some of the major causes. To alleviate this problem, we developed a novel approach, DySTA, which uses dynamic taint analysis results as additional sources for static taint analysis. However, naïvely adding sources causes the static analysis to lose context sensitivity and thus produce false positives. We therefore developed a hybrid context matching algorithm and a corresponding tool, ConDySTA, to preserve context sensitivity in DySTA. We applied ReproDroid [1], a comprehensive benchmarking framework for Android analysis tools, to evaluate ConDySTA. The results show that across 28 apps, (1) ConDySTA detected 12 out of 28 taint flows that were not detected by any of the six state-of-the-art static taint analyses considered in ReproDroid, and (2) ConDySTA reported no false positives, whereas nine were reported by DySTA alone. We further applied ConDySTA and FlowDroid to the 100 top Android apps from Google Play, and ConDySTA detected 39 additional taint flows (beyond the 281 taint flows found by FlowDroid) while preserving the context sensitivity of FlowDroid.
Using Selective Memoization to Defeat Regular Expression Denial of Service (ReDoS)
Pub Date: 2021-05-01 | DOI: 10.1109/SP40001.2021.00032 | Pages: 1-17
James C. Davis, Francisco Servant, Dongyoon Lee
Regular expressions (regexes) are a denial of service vector in most mainstream programming languages. Recent empirical work has demonstrated that up to 10% of regexes have super-linear worst-case behavior in typical regex engines. It is therefore not surprising that many web services are reportedly vulnerable to regex denial of service (ReDoS). If the time complexity of a regex engine can be reduced transparently, ReDoS vulnerabilities can be eliminated at no cost to application developers. Unfortunately, existing ReDoS defenses — replacing the regex engine, optimizing it, or replacing regexes piecemeal — struggle with soundness and compatibility. Full memoization is sound and compatible, but its space costs are too high. No effective ReDoS defense has been adopted in practice. We present techniques to provably eliminate super-linear regex behavior with low space costs for typical regexes. We propose selective memoization schemes with varying space/time tradeoffs. We then describe an encoding scheme that leverages insights about regex engine semantics to further reduce the space cost of memoization. We also consider how to safely handle extended regex features. We implemented our proposals and evaluated them on a corpus of real-world regexes. We found that selective memoization lowers the space cost of memoization by an order of magnitude for the median regex, and that run-length encoding further lowers the space cost to constant for 90% of regexes. "Those who cannot remember the past are condemned to repeat it." –George Santayana
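The effect of memoization on backtracking is easy to see on a toy matcher. The sketch below is a simplification for illustration, not the paper's selective or encoded schemes: it hard-codes a small NFA for the ambiguous pattern (a|a)*b and remembers (state, position) pairs that have already failed, so each pair is explored at most once instead of exponentially often. The paper's contribution is choosing which states to memoize and encoding the table cheaply, which this sketch does not attempt.

```python
# Toy backtracking NFA with memoization of failed (state, position) pairs.
# The state numbering encodes (a|a)*b and is invented for this example.
NFA = {
    0: [(None, 1), (None, 4)],   # epsilon: enter the loop body or the tail
    1: [("a", 2), ("a", 3)],     # two ambiguous branches consuming the same 'a'
    2: [(None, 0)],
    3: [(None, 0)],
    4: [("b", 5)],
    5: [],                       # accepting state
}
ACCEPT = 5

def match(text, memoize=True):
    failed = set()               # memo table of (state, pos) pairs known to fail

    def run(state, pos):
        if memoize and (state, pos) in failed:
            return False
        if state == ACCEPT and pos == len(text):
            return True
        for label, nxt in NFA[state]:
            if label is None:
                if run(nxt, pos):
                    return True
            elif pos < len(text) and text[pos] == label:
                if run(nxt, pos + 1):
                    return True
        if memoize:
            failed.add((state, pos))
        return False

    return run(0, 0)

# A non-matching pump input: with memoize=False the search revisits the same
# (state, pos) pairs exponentially often; with memoization it returns at once.
print(match("a" * 25, memoize=True))   # False, and fast
```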
{"title":"Using Selective Memoization to Defeat Regular Expression Denial of Service (ReDoS)","authors":"James C. Davis, Francisco Servant, Dongyoon Lee","doi":"10.1109/SP40001.2021.00032","DOIUrl":"https://doi.org/10.1109/SP40001.2021.00032","url":null,"abstract":"Regular expressions (regexes) are a denial of service vector in most mainstream programming languages. Recent empirical work has demonstrated that up to 10% of regexes have super-linear worst-case behavior in typical regex engines. It is therefore not surprising that many web services are reportedly vulnerable to regex denial of service (ReDoS). If the time complexity of a regex engine can be reduced transparently, ReDoS vulnerabilities can be eliminated at no cost to application developers. Unfortunately, existing ReDoS defenses — replacing the regex engine, optimizing it, or replacing regexes piecemeal — struggle with soundness and compatibility. Full memoization is sound and compatible, but its space costs are too high. No effective ReDoS defense has been adopted in practice. We present techniques to provably eliminate super-linear regex behavior with low space costs for typical regexes. We propose selective memoization schemes with varying space/time tradeoffs. We then describe an encoding scheme that leverages insights about regex engine semantics to further reduce the space cost of memoization. We also consider how to safely handle extended regex features. We implemented our proposals and evaluated them on a corpus of real-world regexes. We found that selective memoization lowers the space cost of memoization by an order of magnitude for the median regex, and that run-length encoding further lowers the space cost to constant for 90% of regexes. \"Those who cannot remember the past are condemned to repeat it.\" –George Santayana","PeriodicalId":6786,"journal":{"name":"2021 IEEE Symposium on Security and Privacy (SP)","volume":"200 1","pages":"1-17"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76971723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An I/O Separation Model for Formal Verification of Kernel Implementations
Pub Date: 2021-05-01 | DOI: 10.1109/SP40001.2021.00101 | Pages: 572-589
Miao Yu, V. Gligor, Limin Jia
Commodity I/O hardware often fails to separate the I/O transfers of isolated OS and application code. Even when using the best I/O hardware, commodity systems sometimes trade off separation assurance for increased performance. Remarkably, device firmware need not be malicious. Instead, any malicious driver, even if isolated in its own execution domain, can manipulate its device to breach I/O separation. To prevent such vulnerabilities with high assurance, a formal I/O separation model and its use in the automatic generation of secure I/O kernel code are necessary. This paper presents a formal I/O separation model, which defines a separation policy based on the authorization of I/O transfers and is hardware agnostic. The model, its refinement, and its instantiation in the Wimpy kernel design are formally specified and verified in Dafny. We then specify the kernel implementation and automatically generate verified-correct assembly code that enforces the I/O separation policies. Our formal modeling enables the discovery of heretofore unknown design and implementation vulnerabilities in the original Wimpy kernel. Finally, we outline how the model can be applied to other I/O kernels and conclude with the key lessons learned.
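As a rough, hardware-agnostic illustration of the policy idea (a toy in Python rather than the paper's Dafny model, with all names invented), the sketch below permits an I/O transfer only when its driver, device, and direction triple has been explicitly authorized.

```python
# Toy authorization-based separation policy: a transfer is allowed only if its
# (driver, device, direction) triple was explicitly authorized beforehand.
from dataclasses import dataclass

@dataclass(frozen=True)
class Transfer:
    driver: str
    device: str
    direction: str   # "read" or "write"

class SeparationPolicy:
    def __init__(self):
        self._authorized = set()

    def authorize(self, driver, device, direction):
        self._authorized.add((driver, device, direction))

    def permits(self, t: Transfer) -> bool:
        return (t.driver, t.device, t.direction) in self._authorized

policy = SeparationPolicy()
policy.authorize("net_driver", "nic0", "write")
print(policy.permits(Transfer("net_driver", "nic0", "write")))    # True
print(policy.permits(Transfer("rogue_driver", "nic0", "write")))  # False: separated
```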
{"title":"An I/O Separation Model for Formal Verification of Kernel Implementations","authors":"Miao Yu, V. Gligor, Limin Jia","doi":"10.1109/SP40001.2021.00101","DOIUrl":"https://doi.org/10.1109/SP40001.2021.00101","url":null,"abstract":"Commodity I/O hardware often fails to separate I/O transfers of isolated OS and applications code. Even when using the best I/O hardware, commodity systems sometimes trade off separation assurance for increased performance. Remarkably, device firmware need not be malicious. Instead, any malicious driver, even if isolated in its own execution domain, can manipulate its device to breach I/O separation. To prevent such vulnerabilities with high assurance, a formal I/O separation model and its use in automatic generation of secure I/O kernel code is necessary.This paper presents a formal I/O separation model, which defines a separation policy based on authorization of I/O transfers and is hardware agnostic. The model, its refinement, and instantiation in the Wimpy kernel design, are formally specified and verified in Dafny. We then specify the kernel implementation and automatically generate verified-correct assembly code that enforces the I/O separation policies. Our formal modeling enables the discovery of heretofore unknown design and implementation vulnerabilities of the original Wimpy kernel. Finally, we outline how the model can be applied to other I/O kernels and conclude with the key lessons learned.","PeriodicalId":6786,"journal":{"name":"2021 IEEE Symposium on Security and Privacy (SP)","volume":"16 1","pages":"572-589"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73116106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detecting Filter List Evasion with Event-Loop-Turn Granularity JavaScript Signatures
Pub Date: 2021-05-01 | DOI: 10.1109/SP40001.2021.00007 | Pages: 1715-1729
Quan Chen, Peter Snyder, B. Livshits, A. Kapravelos
Content blocking is an important part of a performant, user-serving, privacy-respecting web. Current content blockers work by building trust labels over URLs. While useful, this approach has many well-understood shortcomings. Attackers may avoid detection by changing URLs or domains, bundling unwanted code with benign code, or inlining code in pages. The common flaw in existing approaches is that they evaluate code based on its delivery mechanism, not its behavior. In this work we address this problem by building a system for generating signatures of the privacy-and-security relevant behavior of executed JavaScript. Our system uses as the unit of analysis each script’s behavior during each turn of the JavaScript event loop. Focusing on event-loop turns allows us to build highly identifying signatures for JavaScript code that are robust against code obfuscation, code bundling, URL modification, and other common evasions, and that handle unique aspects of web applications. This work makes the following contributions to the problem of measuring and improving content blocking on the web. First, we design and implement a novel system that builds per-event-loop-turn signatures of JavaScript behavior through deep instrumentation of the Blink and V8 runtimes. Second, we apply these signatures to measure how much privacy-and-security harming code is missed by current content blockers, using EasyList and EasyPrivacy as ground truth and finding scripts that exhibit the same privacy- and security-harming patterns. We build 1,995,444 signatures of privacy-and-security relevant behaviors from 11,212 unique scripts blocked by filter lists, and find 3,589 unique scripts hosting known harmful code but missed by filter lists, affecting 12.48% of websites measured. Third, we provide a taxonomy of the ways scripts avoid detection and quantify the occurrence of each. Finally, we present defenses against these evasions, in the form of filter list additions where possible and through a proposed signature-based system in other cases. As part of this work, we share the implementation of our signature-generation system, the data gathered by applying that system to the Alexa 100K, and 586 AdBlock Plus compatible filter list rules to block instances of currently blocked code being moved to new URLs.
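To make the per-turn signature idea concrete, here is a small sketch of one plausible signature shape (an illustration only; the paper's signatures come from deep instrumentation inside Blink and V8, and the API names and arguments below are invented): a turn's signature is a digest over the ordered privacy-relevant calls observed during that event-loop turn.

```python
import hashlib

# Illustrative only: summarize one event-loop turn as a digest of the ordered,
# privacy-relevant API calls observed in it. Real signatures require browser
# instrumentation; this merely shows why such a summary resists URL changes
# and code bundling, since it depends on behavior rather than delivery.
def turn_signature(api_calls):
    normalized = "|".join(f"{name}({','.join(map(str, args))})" for name, args in api_calls)
    return hashlib.sha256(normalized.encode()).hexdigest()

# A made-up trace of one turn of a tracking script.
turn = [
    ("Document.cookie.get", []),
    ("HTMLCanvasElement.toDataURL", ["image/png"]),
    ("XMLHttpRequest.send", ["https://tracker.example/pixel"]),
]
print(turn_signature(turn))
```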
{"title":"Detecting Filter List Evasion with Event-Loop-Turn Granularity JavaScript Signatures","authors":"Quan Chen, Peter Snyder, B. Livshits, A. Kapravelos","doi":"10.1109/SP40001.2021.00007","DOIUrl":"https://doi.org/10.1109/SP40001.2021.00007","url":null,"abstract":"Content blocking is an important part of a per-formant, user-serving, privacy respecting web. Current content blockers work by building trust labels over URLs. While useful, this approach has many well understood shortcomings. Attackers may avoid detection by changing URLs or domains, bundling unwanted code with benign code, or inlining code in pages.The common flaw in existing approaches is that they evaluate code based on its delivery mechanism, not its behavior. In this work we address this problem by building a system for generating signatures of the privacy-and-security relevant behavior of executed JavaScript. Our system uses as the unit of analysis each script’s behavior during each turn on the JavaScript event loop. Focusing on event loop turns allows us to build highly identifying signatures for JavaScript code that are robust against code obfuscation, code bundling, URL modification, and other common evasions, as well as handle unique aspects of web applications.This work makes the following contributions to the problem of measuring and improving content blocking on the web: First, we design and implement a novel system to build per-event-loop-turn signatures of JavaScript behavior through deep instrumentation of the Blink and V8 runtimes. Second, we apply these signatures to measure how much privacy-and-security harming code is missed by current content blockers, by using EasyList and EasyPrivacy as ground truth and finding scripts that have the same privacy and security harming patterns. We build 1,995,444 signatures of privacy-and-security relevant behaviors from 11,212 unique scripts blocked by filter lists, and find 3,589 unique scripts hosting known harmful code, but missed by filter lists, affecting 12.48% of websites measured. Third, we provide a taxonomy of ways scripts avoid detection and quantify the occurrence of each. Finally, we present defenses against these evasions, in the form of filter list additions where possible, and through a proposed, signature based system in other cases.As part of this work, we share the implementation of our signature-generation system, the data gathered by applying that system to the Alexa 100K, and 586 AdBlock Plus compatible filter list rules to block instances of currently blocked code being moved to new URLs.","PeriodicalId":6786,"journal":{"name":"2021 IEEE Symposium on Security and Privacy (SP)","volume":"16 1","pages":"1715-1729"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75799009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DP-Sniper: Black-Box Discovery of Differential Privacy Violations using Classifiers
Pub Date: 2021-05-01 | DOI: 10.1109/SP40001.2021.00081 | Pages: 391-409
Benjamin Bichsel, Samuel Steffen, Ilija Bogunovic, Martin T. Vechev
We present DP-Sniper, a practical black-box method that automatically finds violations of differential privacy. DP-Sniper is based on two key ideas: (i) training a classifier to predict whether an observed output was likely generated from one of two possible inputs, and (ii) transforming this classifier into an approximately optimal attack on differential privacy. Our experimental evaluation demonstrates that DP-Sniper obtains up to 12.4 times stronger guarantees than the state of the art while being 15.5 times faster. Further, we show that DP-Sniper is effective in exploiting floating-point vulnerabilities of naively implemented algorithms: it detects that a supposedly 0.1-differentially private implementation of the Laplace mechanism actually does not satisfy even 0.25-differential privacy.
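The classifier-to-attack pipeline can be sketched in a few lines. This is a simplified illustration, not the authors' tool: the mechanism under test, the sample sizes, and the confidence threshold are arbitrary choices, and scikit-learn's logistic regression stands in for the paper's classifier. The sketch samples a mechanism on two neighboring inputs, trains a classifier to tell the outputs apart, takes its confident region as the attack set S, and estimates a lower bound on epsilon as ln(Pr[M(a) in S] / Pr[M(a') in S]) from fresh samples.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def mechanism(x, scale=1.0):
    # Stand-in mechanism under test; here simply the Laplace mechanism.
    return x + rng.laplace(scale=scale)

a, a_prime, n = 0.0, 1.0, 20000
out_a = np.array([mechanism(a) for _ in range(n)])
out_b = np.array([mechanism(a_prime) for _ in range(n)])

# Train a classifier to guess which input produced each output.
X = np.concatenate([out_a, out_b]).reshape(-1, 1)
y = np.concatenate([np.zeros(n), np.ones(n)])     # 0: from a, 1: from a'
clf = LogisticRegression().fit(X, y)

# Attack set S: outputs the classifier confidently attributes to input a.
def in_S(outputs):
    return clf.predict_proba(np.reshape(outputs, (-1, 1)))[:, 0] > 0.75

fresh_a = np.array([mechanism(a) for _ in range(n)])
fresh_b = np.array([mechanism(a_prime) for _ in range(n)])
p, q = in_S(fresh_a).mean(), in_S(fresh_b).mean()
print("estimated lower bound on epsilon:", np.log(p / q))   # about 1 for this mechanism
```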
{"title":"DP-Sniper: Black-Box Discovery of Differential Privacy Violations using Classifiers","authors":"Benjamin Bichsel, Samuel Steffen, Ilija Bogunovic, Martin T. Vechev","doi":"10.1109/SP40001.2021.00081","DOIUrl":"https://doi.org/10.1109/SP40001.2021.00081","url":null,"abstract":"We present DP-Sniper, a practical black-box method that automatically finds violations of differential privacy.DP-Sniper is based on two key ideas: (i) training a classifier to predict if an observed output was likely generated from one of two possible inputs, and (ii) transforming this classifier into an approximately optimal attack on differential privacy.Our experimental evaluation demonstrates that DP-Sniper obtains up to 12.4 times stronger guarantees than state-of-the-art, while being 15.5 times faster. Further, we show that DP-Sniper is effective in exploiting floating-point vulnerabilities of naively implemented algorithms: it detects that a supposedly 0.1-differentially private implementation of the Laplace mechanism actually does not satisfy even 0.25-differential privacy.","PeriodicalId":6786,"journal":{"name":"2021 IEEE Symposium on Security and Privacy (SP)","volume":"27 1","pages":"391-409"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81378507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Many-out-of-Many Proofs and Applications to Anonymous Zether
Pub Date: 2021-05-01 | DOI: 10.1109/SP40001.2021.00026 | Pages: 1800-1817
Benjamin E. Diamond
Anonymous Zether, proposed by Bünz, Agrawal, Zamani, and Boneh (FC’20), is a private payment design whose wallets demand little bandwidth and need not remain online; this unique property makes it a compelling choice for resource-constrained devices. In this work, we describe an efficient construction of Anonymous Zether. Our protocol features proofs which grow only logarithmically in the size of the "anonymity sets" used, improving upon the linear growth attained by prior efforts. It also features competitive transaction sizes in practice (on the order of 3 kilobytes). Our central tool is a new family of extensions to Groth and Kohlweiss’s one-out-of-many proofs (Eurocrypt 2015), which efficiently prove statements about many messages among a list of commitments. These extensions prove knowledge of a secret subset of a public list, and assert that the commitments in the subset satisfy certain properties (expressed as linear equations). Remarkably, our communication remains logarithmic; our computation increases only by a logarithmic multiplicative factor. This technique is likely to be of independent interest. We present an open-source, Ethereum-based implementation of our Anonymous Zether construction.
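One ingredient mentioned above, proving linear equations over committed messages, can be pictured with plain Pedersen commitments. The sketch below is a toy with small, insecure, made-up parameters and none of the zero-knowledge one-out-of-many machinery the paper builds; it only shows that additively homomorphic commitments let a verifier check a public linear relation over hidden messages given an aggregate opening of the randomness.

```python
# Toy Pedersen commitments; parameters are insecure and for illustration only.
p = 2**127 - 1            # a Mersenne prime; we work in the multiplicative group mod p
g, h = 5, 7               # "independent" generators, chosen arbitrarily for the example

def commit(m, r):
    """Pedersen commitment Com(m, r) = g^m * h^r mod p (additively homomorphic)."""
    return (pow(g, m, p) * pow(h, r, p)) % p

# Prover commits to two hidden messages.
m1, r1 = 42, 1111
m2, r2 = 58, 2222
c1, c2 = commit(m1, r1), commit(m2, r2)

# Public claim: 2*m1 + 3*m2 == 258. The prover reveals only the aggregated
# randomness R; the verifier recombines the commitments homomorphically.
coeffs, target = (2, 3), 258
R = coeffs[0] * r1 + coeffs[1] * r2
lhs = (pow(c1, coeffs[0], p) * pow(c2, coeffs[1], p)) % p
assert lhs == commit(target, R)
print("linear relation over committed values verified")
```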
{"title":"Many-out-of-Many Proofs and Applications to Anonymous Zether","authors":"Benjamin E. Diamond","doi":"10.1109/SP40001.2021.00026","DOIUrl":"https://doi.org/10.1109/SP40001.2021.00026","url":null,"abstract":"Anonymous Zether, proposed by Bünz, Agrawal, Zamani, and Boneh (FC’20), is a private payment design whose wallets demand little bandwidth and need not remain online; this unique property makes it a compelling choice for resource-constrained devices. In this work, we describe an efficient construction of Anonymous Zether. Our protocol features proofs which grow only logarithmically in the size of the \"anonymity sets\" used, improving upon the linear growth attained by prior efforts. It also features competitive transaction sizes in practice (on the order of 3 kilobytes).Our central tool is a new family of extensions to Groth and Kohlweiss’s one-out-of-many proofs (Eurocrypt 2015), which efficiently prove statements about many messages among a list of commitments. These extensions prove knowledge of a secret subset of a public list, and assert that the commitments in the subset satisfy certain properties (expressed as linear equations). Remarkably, our communication remains logarithmic; our computation increases only by a logarithmic multiplicative factor. This technique is likely to be of independent interest.We present an open-source, Ethereum-based implementation of our Anonymous Zether construction.","PeriodicalId":6786,"journal":{"name":"2021 IEEE Symposium on Security and Privacy (SP)","volume":"12 1","pages":"1800-1817"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79589997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Defensive Technology Use by Political Activists During the Sudanese Revolution
Pub Date: 2021-05-01 | DOI: 10.1109/SP40001.2021.00055 | Pages: 372-390
Alaa Daffalla, Lucy Simko, Tadayoshi Kohno, Alexandru G. Bardas
Political activism is a worldwide force in geopolitical change and has historically helped advance justice and equality and stop human rights abuses. A modern revolution—an extreme form of political activism—pits activists, who rely on technology for critical operational tasks, against a resource-rich government that controls the very telecommunications network they must use to operate, putting the technology they use under extreme stress. Our work presents insights about activists’ technological defense strategies from interviews with 13 political activists who were active during the 2018-2019 Sudanese revolution. We find that politics and society are driving factors of security and privacy behavior and app adoption. Moreover, a social media blockade can trigger a series of anti-censorship approaches at scale, while a complete internet blackout can cripple activists’ use of technology. Even though the activists’ technological defenses against the threats of surveillance, arrest, and physical device seizure were low-tech, they were largely sufficient against their adversary. Through these results, we surface key design principles, but we observe that the generalization of design recommendations often runs into fundamental tensions between the security and usability needs of different user groups. Thus, we provide a set of structured questions in an attempt to turn these tensions into opportunities for technology designers and policy makers.
{"title":"Defensive Technology Use by Political Activists During the Sudanese Revolution","authors":"Alaa Daffalla, Lucy Simko, Tadayoshi Kohno, Alexandru G. Bardas","doi":"10.1109/SP40001.2021.00055","DOIUrl":"https://doi.org/10.1109/SP40001.2021.00055","url":null,"abstract":"Political activism is a worldwide force in geopolitical change and has, historically, helped lead to greater justice, equality, and stopping human rights abuses. A modern revolution—an extreme form of political activism—pits activists, who rely on technology for critical operational tasks, against a resource-rich government that controls the very telecommunications network they must use to operationalize, putting the technology they use under extreme stress. Our work presents insights about activists’ technological defense strategies from interviews with 13 political activists who were active during the 2018-2019 Sudanese revolution. We find that politics and society are driving factors of security and privacy behavior and app adoption. Moreover, a social media blockade can trigger a series of anti-censorship approaches at scale, while a complete internet blackout can cripple activists’ use of technology. Even though the activists’ technological defenses against the threats of surveillance, arrest and physical device seizure were low tech, they were largely sufficient against their adversary. Through these results, we surface key design principles, but we observe that the generalization of design recommendations often runs into fundamental tensions between the security and usability needs of different user groups. Thus, we provide a set of structured questions in an attempt to turn these tensions into opportunities for technology designers and policy makers.","PeriodicalId":6786,"journal":{"name":"2021 IEEE Symposium on Security and Privacy (SP)","volume":"144 1","pages":"372-390"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86199437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Response-Hiding Encrypted Ranges: Revisiting Security via Parametrized Leakage-Abuse Attacks
Pub Date: 2021-05-01 | DOI: 10.1109/SP40001.2021.00044 | Pages: 1502-1519
Evgenios M. Kornaropoulos, Charalampos Papamanthou, R. Tamassia
Despite a growing body of work on leakage-abuse attacks for encrypted databases, attacks on practical response-hiding constructions are yet to appear. Response-hiding constructions are superior in that they nullify access-pattern-based attacks by revealing only the search token and the result size of each query. Response-hiding schemes are vulnerable to existing volume attacks, which are, however, based on strong assumptions such as the uniform query assumption or the dense database assumption. More crucially, these attacks only apply to schemes that cannot be deployed in practice (ones with quadratic storage and increased leakage), while practical response-hiding schemes (Demertzis et al. [SIGMOD’16] and Faber et al. [ESORICS’15]) have linear storage and less leakage. Due to these shortcomings, the value of existing volume attacks on response-hiding schemes is unclear. In this work, we close the aforementioned gap by introducing a parametrized leakage-abuse attack that applies to practical response-hiding structured encryption schemes. The use of non-parametric estimation techniques makes our attack agnostic to both the data and the query distribution. At the very core of our technique lies the newly defined concept of a counting function with respect to a range scheme. We propose a two-phase framework to approximate the counting function for any range scheme. By simply switching one counting function for another, i.e., the so-called "parameter" of our modular attack, an adversary can attack different encrypted range schemes. We propose a constrained optimization formulation for the attack algorithm that is based on the counting functions. We demonstrate the effectiveness of our leakage-abuse attack on synthetic and real-world data under various scenarios.
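As a very rough picture of what a counting function looks like for a range scheme (a toy with invented data, not the paper's attack, which estimates this non-parametrically without plaintext knowledge): for every range query over a small domain, record the response volume it would produce; the counting function then says how many distinct ranges map to each observable volume, which is what lets observed volumes be matched back to candidate queries.

```python
from collections import Counter

# Toy example: plaintext values over the domain 1..8 (made up for illustration).
records = [1, 2, 2, 3, 5, 5, 5, 8]
domain = range(1, 9)

def volume(lo, hi):
    """Response volume of the range query [lo, hi] over the toy records."""
    return sum(lo <= x <= hi for x in records)

# counting[v] = number of distinct ranges whose response contains exactly v records.
counting = Counter(volume(lo, hi) for lo in domain for hi in domain if lo <= hi)
print(dict(counting))
# An attacker who observes only volumes can use this function to rank which
# ranges are consistent with each observation; the paper replaces the exact
# plaintext knowledge assumed here with non-parametric estimation.
```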
{"title":"Response-Hiding Encrypted Ranges: Revisiting Security via Parametrized Leakage-Abuse Attacks","authors":"Evgenios M. Kornaropoulos, Charalampos Papamanthou, R. Tamassia","doi":"10.1109/SP40001.2021.00044","DOIUrl":"https://doi.org/10.1109/SP40001.2021.00044","url":null,"abstract":"Despite a growing body of work on leakage-abuse attacks for encrypted databases, attacks on practical response-hiding constructions are yet to appear. Response-hiding constructions are superior in that they nullify access-pattern based attacks by revealing only the search token and the result size of each query. Response-hiding schemes are vulnerable to existing volume attacks, which are, however, based on strong assumptions such as the uniform query assumption or the dense database assumption. More crucially, these attacks only apply to schemes that cannot be deployed in practice (ones with quadratic storage and increased leakage) while practical response-hiding schemes (Demertzis et al. [SIGMOD’16] and Faber et al. [ESORICS’15]) have linear storage and less leakage. Due to these shortcomings, the value of existing volume attacks on response-hiding schemes is unclear.In this work, we close the aforementioned gap by introducing a parametrized leakage-abuse attack that applies to practical response-hiding structured encryption schemes. The use of non-parametric estimation techniques makes our attack agnostic to both the data and the query distribution. At the very core of our technique lies the newly defined concept of a counting function with respect to a range scheme. We propose a two-phase framework to approximate the counting function for any range scheme. By simply switching one counting function for another, i.e., the so-called \"parameter\" of our modular attack, an adversary can attack different encrypted range schemes. We propose a constrained optimization formulation for the attack algorithm that is based on the counting functions. We demonstrate the effectiveness of our leakage-abuse attack on synthetic and real-world data under various scenarios.","PeriodicalId":6786,"journal":{"name":"2021 IEEE Symposium on Security and Privacy (SP)","volume":"1 1","pages":"1502-1519"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91530904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Revealer: Detecting and Exploiting Regular Expression Denial-of-Service Vulnerabilities
Pub Date: 2021-05-01 | DOI: 10.1109/SP40001.2021.00062 | Pages: 1468-1484
Yinxi Liu, Mingxue Zhang, W. Meng
Regular expression Denial-of-Service (ReDoS) is a class of algorithmic complexity attacks. Attackers can craft particular strings to trigger the worst-case super-linear matching time of some vulnerable regular expressions (regexes) with extended features that are commonly supported by popular programming languages. ReDoS attacks can severely degrade the performance of web applications, which extensively employ regexes in their server-side logic. Nevertheless, the characteristics of vulnerable regexes with extended features remain understudied, making it difficult to mitigate or even detect such vulnerabilities. In this paper, we aim to model vulnerable regex patterns generated by popular regex engines and craft attack strings accordingly. Our characterization fully supports the analysis of regexes with any extended feature. We develop Revealer to detect vulnerable structures present in any given regex and to generate attack strings that exploit the corresponding vulnerabilities. Revealer takes a hybrid approach. It first statically locates potential vulnerable structures of a regex, then dynamically verifies whether the vulnerabilities can be triggered, and finally crafts attack strings that lead to recursive backtracking. By combining static and dynamic analysis, Revealer can accurately and efficiently generate exploits in a limited amount of time. It can further offer mitigation suggestions based on the structural information it identifies. We implemented a prototype of Revealer for Java. We evaluated Revealer on a dataset of 29,088 regexes and compared it with three state-of-the-art tools. The evaluation shows that Revealer considerably outperformed all the existing tools: Revealer can detect all 237 vulnerabilities that can be detected by any other tool, finds 213 new vulnerabilities, and beats the best tool by 140.64%. We further demonstrate that Revealer successfully detected 45 vulnerable regexes in popular real-world applications. Our evaluation demonstrates that Revealer is both effective and efficient in detecting and exploiting ReDoS vulnerabilities.
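A drastically simplified version of the static-then-dynamic workflow reads as follows (an illustration, not Revealer, which models many more structures and extended features; the detection pattern, pump length, and time budget are arbitrary example choices): statically flag one classic vulnerable shape, a quantifier applied to a group whose body is itself quantified, then dynamically confirm by timing a crafted non-matching input.

```python
import re
import time

# Static heuristic: flag a quantified group whose body ends in a quantifier,
# e.g. "(a+)+" or "(a*)+". This covers only one vulnerable shape.
NESTED_QUANTIFIER = re.compile(r"\((?:[^()\\]|\\.)*[+*]\)[+*]")

def looks_vulnerable(pattern):
    return bool(NESTED_QUANTIFIER.search(pattern))

def confirm_redos(pattern, pump="a", suffix="!", n=28, budget=0.5):
    """Dynamic check: time the engine on a long pump followed by a failing suffix."""
    attack = pump * n + suffix
    start = time.perf_counter()
    re.match(pattern, attack)
    return time.perf_counter() - start > budget

pattern = r"(a+)+$"
if looks_vulnerable(pattern):
    # The vulnerable case may take several seconds due to exponential backtracking.
    print("statically suspicious; dynamically confirmed:", confirm_redos(pattern))
```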
{"title":"Revealer: Detecting and Exploiting Regular Expression Denial-of-Service Vulnerabilities","authors":"Yinxi Liu, Mingxue Zhang, W. Meng","doi":"10.1109/SP40001.2021.00062","DOIUrl":"https://doi.org/10.1109/SP40001.2021.00062","url":null,"abstract":"Regular expression Denial-of-Service (ReDoS) is a class of algorithmic complexity attacks. Attackers can craft particular strings to trigger the worst-case super-linear matching time of some vulnerable regular expressions (regex) with extended features that are commonly supported by popular programming languages. ReDoS attacks can severely degrade the performance of web applications, which extensively employ regexes in their server-side logic. Nevertheless, the characteristics of vulnerable regexes with extended features remain understudied, making it difficult to mitigate or even detect such vulnerabilities.In this paper, we aim to model vulnerable regex patterns generated by popular regex engines and craft attack strings accordingly. Our characterization fully supports the analysis of regexes with any extended feature. We develop Revealer to detect vulnerable structures presented in any given regex and generate attack strings to exploit the corresponding vulnerabilities. Revealer takes a hybrid approach. It first statically locates potential vulnerable structures of a regex, then dynamically verifies whether the vulnerabilities can be triggered or not, and finally crafts attack strings that can lead to recursive backtracking. By combining both static analysis and dynamic analysis, Revealer can accurately and efficiently generate exploits in a limited amount of time. It can further offer mitigation suggestions based on the structural information it identifies.We implemented a prototype of Revealer for Java. We evaluated Revealer over a dataset with 29,088 regexes, and compared it with three state-of-the-art tools. The evaluation shows that Revealer considerably outperformed all the existing tools—Revealer can detect all 237 vulnerabilities that can be detected by any other tool, find 213 new vulnerabilities, and beat the best tool by 140.64%. We further demonstrate that Revealer successfully detected 45 vulnerable regexes in popular real-world applications. Our evaluation demonstrates that Revealer is both effective and efficient in detecting and exploiting ReDoS vulnerabilities.","PeriodicalId":6786,"journal":{"name":"2021 IEEE Symposium on Security and Privacy (SP)","volume":"69 1","pages":"1468-1484"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84354086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
One Engine to Fuzz ’em All: Generic Language Processor Testing with Semantic Validation
Pub Date: 2021-05-01 | DOI: 10.1109/SP40001.2021.00071 | Pages: 642-658
Yongheng Chen, Rui Zhong, Hong Hu, Hangfan Zhang, Yupeng Yang, Dinghao Wu, Wenke Lee
Language processors, such as compilers and interpreters, are indispensable in building modern software. Errors in language processors can lead to severe consequences, such as incorrect functionality or even malicious attacks. However, it is not trivial to automatically test language processors to find bugs. Existing testing methods (or fuzzers) either fail to generate high-quality (i.e., semantically correct) test cases or only support limited programming languages. In this paper, we propose POLYGLOT, a generic fuzzing framework that generates high-quality test cases for exploring processors of different programming languages. To achieve generic applicability, POLYGLOT neutralizes the differences in syntax and semantics across programming languages with a uniform intermediate representation (IR). To improve language validity, POLYGLOT performs constrained mutation and semantic validation to preserve syntactic correctness and fix semantic errors. We have applied POLYGLOT to 21 popular language processors for 9 programming languages and identified 173 new bugs, 113 of which have been fixed, with 18 CVEs assigned. Our experiments show that POLYGLOT can support a wide range of programming languages and outperforms existing fuzzers with up to 30× improvement in code coverage.
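The constrained-mutation-plus-validation loop can be sketched on a deliberately tiny stand-in "IR" (an invented toy, not POLYGLOT's IR or code): a test case is a list of variable definitions, mutation may introduce a use of a name that is not defined, and a validation pass rewrites such uses to variables already defined earlier so the mutant stays semantically well-formed.

```python
import random

random.seed(7)

# Each statement of the toy IR defines a variable from an expression over
# earlier variables: (defined_var, expression).
seed_case = [("v0", "1"), ("v1", "v0 + 2"), ("v2", "v1 * v0")]

def mutate(case):
    """Constrained mutation: replace one statement's expression, possibly
    introducing a reference to a variable that is not defined."""
    i = random.randrange(len(case))
    var, _ = case[i]
    new = case[:]
    new[i] = (var, f"v{random.randrange(6)} - 1")
    return new

def validate(case):
    """Semantic validation: remap any undefined use to an already-defined
    variable (or a constant) so the test case remains well-formed."""
    defined, fixed = [], []
    for var, expr in case:
        for tok in {t for t in expr.split() if t.startswith("v")}:
            if tok not in defined:
                expr = expr.replace(tok, defined[-1] if defined else "0")
        fixed.append((var, expr))
        defined.append(var)
    return fixed

mutant = validate(mutate(seed_case))
print("\n".join(f"{v} = {e}" for v, e in mutant))
```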
{"title":"One Engine to Fuzz ’em All: Generic Language Processor Testing with Semantic Validation","authors":"Yongheng Chen, Rui Zhong, Hong Hu, Hangfan Zhang, Yupeng Yang, Dinghao Wu, Wenke Lee","doi":"10.1109/SP40001.2021.00071","DOIUrl":"https://doi.org/10.1109/SP40001.2021.00071","url":null,"abstract":"Language processors, such as compilers and interpreters, are indispensable in building modern software. Errors in language processors can lead to severe consequences, like incorrect functionalities or even malicious attacks. However, it is not trivial to automatically test language processors to find bugs. Existing testing methods (or fuzzers) either fail to generate high-quality (i.e., semantically correct) test cases, or only support limited programming languages.In this paper, we propose POLYGLOT, a generic fuzzing framework that generates high-quality test cases for exploring processors of different programming languages. To achieve the generic applicability, POLYGLOT neutralizes the difference in syntax and semantics of programming languages with a uniform intermediate representation (IR). To improve the language validity, POLYGLOT performs constrained mutation and semantic validation to preserve syntactic correctness and fix semantic errors. We have applied POLYGLOT on 21 popular language processors of 9 programming languages, and identified 173 new bugs, 113 of which are fixed with 18 CVEs assigned. Our experiments show that POLYGLOT can support a wide range of programming languages, and outperforms existing fuzzers with up to 30× improvement in code coverage.","PeriodicalId":6786,"journal":{"name":"2021 IEEE Symposium on Security and Privacy (SP)","volume":"39 1","pages":"642-658"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85487216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}