Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00009
S. Dinesh, N. Burow, Dongyan Xu, Mathias Payer
Analyzing the security of closed source binaries is currently impractical for end-users, or even developers who rely on third-party libraries. Such analysis relies on automatic vulnerability discovery techniques, most notably fuzzing with sanitizers enabled. The current state of the art for applying fuzzing or sanitization to binaries is dynamic binary translation, which has prohibitive performance overhead. The alternate technique, static binary rewriting, cannot fully recover symbolization information and hence has difficulty modifying binaries to track code coverage for fuzzing or to add security checks for sanitizers.The ideal solution for binary security analysis would be a static rewriter that can intelligently add the required instrumentation as if it were inserted at compile time. Such instrumentation requires an analysis to statically disambiguate between references and scalars, a problem known to be undecidable in the general case. We show that recovering this information is possible in practice for the most common class of software and libraries: 64-bit, position independent code. Based on this observation, we develop RetroWrite, a binary-rewriting instrumentation to support American Fuzzy Lop (AFL) and Address Sanitizer (ASan), and show that it can achieve compiler-level performance while retaining precision. Binaries rewritten for coverage-guided fuzzing using RetroWrite are identical in performance to compiler-instrumented binaries and outperform the default QEMU-based instrumentation by 4.5x while triggering more bugs. Our implementation of binary-only Address Sanitizer is 3x faster than Valgrind’s memcheck, the state-of-the-art binary-only memory checker, and detects 80% more bugs in our evaluation.
{"title":"RetroWrite: Statically Instrumenting COTS Binaries for Fuzzing and Sanitization","authors":"S. Dinesh, N. Burow, Dongyan Xu, Mathias Payer","doi":"10.1109/SP40000.2020.00009","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00009","url":null,"abstract":"Analyzing the security of closed source binaries is currently impractical for end-users, or even developers who rely on third-party libraries. Such analysis relies on automatic vulnerability discovery techniques, most notably fuzzing with sanitizers enabled. The current state of the art for applying fuzzing or sanitization to binaries is dynamic binary translation, which has prohibitive performance overhead. The alternate technique, static binary rewriting, cannot fully recover symbolization information and hence has difficulty modifying binaries to track code coverage for fuzzing or to add security checks for sanitizers.The ideal solution for binary security analysis would be a static rewriter that can intelligently add the required instrumentation as if it were inserted at compile time. Such instrumentation requires an analysis to statically disambiguate between references and scalars, a problem known to be undecidable in the general case. We show that recovering this information is possible in practice for the most common class of software and libraries: 64-bit, position independent code. Based on this observation, we develop RetroWrite, a binary-rewriting instrumentation to support American Fuzzy Lop (AFL) and Address Sanitizer (ASan), and show that it can achieve compiler-level performance while retaining precision. Binaries rewritten for coverage-guided fuzzing using RetroWrite are identical in performance to compiler-instrumented binaries and outperform the default QEMU-based instrumentation by 4.5x while triggering more bugs. Our implementation of binary-only Address Sanitizer is 3x faster than Valgrind’s memcheck, the state-of-the-art binary-only memory checker, and detects 80% more bugs in our evaluation.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"52 1","pages":"1497-1511"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84651798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00087
Zhe Wang, Chenggang Wu, Mengyao Xie, Yinqian Zhang, Kangjie Lu, Xiaofeng Zhang, Yuanming Lai, Yan Kang, Min Yang
Memory-corruption attacks such as code-reuse attacks and data-only attacks have been a key threat to systems security. To counter these threats, researchers have proposed a variety of defenses, including control-flow integrity (CFI), code-pointer integrity (CPI), and code (re-)randomization. All of them, to be effective, require a security primitive—intra-process protection of confidentiality and/or integrity for sensitive data (such as CFI’s shadow stack and CPI’s safe region).In this paper, we propose SEIMI, a highly efficient intra-process memory isolation technique for memory-corruption defenses to protect their sensitive data. The core of SEIMI is to use the efficient Supervisor-mode Access Prevention (SMAP), a hardware feature that is originally used for preventing the kernel from accessing the user space, to achieve intra-process memory isolation. To leverage SMAP, SEIMI creatively executes the user code in the privileged mode. In addition to enabling the new design of the SMAP-based memory isolation, we further develop multiple new techniques to ensure secure escalation of user code, e.g., using the descriptor caches to capture the potential segment operations and configuring the Virtual Machine Control Structure (VMCS) to invalidate the execution result of the control registers related operations. Extensive experimental results show that SEIMI outperforms existing isolation mechanisms, including both the Memory Protection Keys (MPK) based scheme and the Memory Protection Extensions (MPX) based scheme, while providing secure memory isolation.
{"title":"SEIMI: Efficient and Secure SMAP-Enabled Intra-process Memory Isolation","authors":"Zhe Wang, Chenggang Wu, Mengyao Xie, Yinqian Zhang, Kangjie Lu, Xiaofeng Zhang, Yuanming Lai, Yan Kang, Min Yang","doi":"10.1109/SP40000.2020.00087","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00087","url":null,"abstract":"Memory-corruption attacks such as code-reuse attacks and data-only attacks have been a key threat to systems security. To counter these threats, researchers have proposed a variety of defenses, including control-flow integrity (CFI), code-pointer integrity (CPI), and code (re-)randomization. All of them, to be effective, require a security primitive—intra-process protection of confidentiality and/or integrity for sensitive data (such as CFI’s shadow stack and CPI’s safe region).In this paper, we propose SEIMI, a highly efficient intra-process memory isolation technique for memory-corruption defenses to protect their sensitive data. The core of SEIMI is to use the efficient Supervisor-mode Access Prevention (SMAP), a hardware feature that is originally used for preventing the kernel from accessing the user space, to achieve intra-process memory isolation. To leverage SMAP, SEIMI creatively executes the user code in the privileged mode. In addition to enabling the new design of the SMAP-based memory isolation, we further develop multiple new techniques to ensure secure escalation of user code, e.g., using the descriptor caches to capture the potential segment operations and configuring the Virtual Machine Control Structure (VMCS) to invalidate the execution result of the control registers related operations. Extensive experimental results show that SEIMI outperforms existing isolation mechanisms, including both the Memory Protection Keys (MPK) based scheme and the Memory Protection Extensions (MPX) based scheme, while providing secure memory isolation.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"11 1","pages":"592-607"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81797179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00065
Sebastian Angel, Sampath Kannan, Zachary B. Ratliff
This paper introduces a new cryptographic primitive called a private resource allocator (PRA) that can be used to allocate resources (e.g., network bandwidth, CPUs) to a set of clients without revealing to the clients whether any other clients received resources. We give several constructions of PRAs that provide guarantees ranging from information-theoretic to differential privacy. PRAs are useful in preventing a new class of attacks that we call allocation-based side-channel attacks. These attacks can be used, for example, to break the privacy guarantees of anonymous messaging systems that were designed specifically to defend against side-channel and traffic analysis attacks. Our implementation of PRAs in Alpenhorn, which is a recent anonymous messaging system, shows that PRAs increase the network resources required to start a conversation by up to 16× (can be made as low as 4× in some cases), but add no overhead once the conversation has been established.
{"title":"Private resource allocators and their applications","authors":"Sebastian Angel, Sampath Kannan, Zachary B. Ratliff","doi":"10.1109/SP40000.2020.00065","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00065","url":null,"abstract":"This paper introduces a new cryptographic primitive called a private resource allocator (PRA) that can be used to allocate resources (e.g., network bandwidth, CPUs) to a set of clients without revealing to the clients whether any other clients received resources. We give several constructions of PRAs that provide guarantees ranging from information-theoretic to differential privacy. PRAs are useful in preventing a new class of attacks that we call allocation-based side-channel attacks. These attacks can be used, for example, to break the privacy guarantees of anonymous messaging systems that were designed specifically to defend against side-channel and traffic analysis attacks. Our implementation of PRAs in Alpenhorn, which is a recent anonymous messaging system, shows that PRAs increase the network resources required to start a conversation by up to 16× (can be made as low as 4× in some cases), but add no overhead once the conversation has been established.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"9 1","pages":"372-391"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81825494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00059
Alin Tomescu, Robert Chen, Yiming Zheng, Ittai Abraham, Benny Pinkas, Guy Golan-Gueta, S. Devadas
The resurging interest in Byzantine fault tolerant systems will demand more scalable threshold cryptosystems. Unfortunately, current systems scale poorly, requiring time quadratic in the number of participants. In this paper, we present techniques that help scale threshold signature schemes (TSS), verifiable secret sharing (VSS) and distributed key generation (DKG) protocols to hundreds of thousands of participants and beyond. First, we use efficient algorithms for evaluating polynomials at multiple points to speed up computing Lagrange coefficients when aggregating threshold signatures. As a result, we can aggregate a 130,000 out of 260,000 BLS threshold signature in just 6 seconds (down from 30 minutes). Second, we show how "authenticating" such multipoint evaluations can speed up proving polynomial evaluations, a key step in communication-efficient VSS and DKG protocols. As a result, we reduce the asymptotic (and concrete) computational complexity of VSS and DKG protocols from quadratic time to quasilinear time, at a small increase in communication complexity. For example, using our DKG protocol, we can securely generate a key for the BLS scheme above in 2.3 hours (down from 8 days). Our techniques improve performance for thresholds as small as 255 and generalize to any Lagrange-based threshold scheme, not just threshold signatures. Our work has certain limitations: we require a trusted setup, we focus on synchronous VSS and DKG protocols and we do not address the worst-case complaint overhead in DKGs. Nonetheless, we hope it will spark new interest in designing large-scale distributed systems.
{"title":"Towards Scalable Threshold Cryptosystems","authors":"Alin Tomescu, Robert Chen, Yiming Zheng, Ittai Abraham, Benny Pinkas, Guy Golan-Gueta, S. Devadas","doi":"10.1109/SP40000.2020.00059","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00059","url":null,"abstract":"The resurging interest in Byzantine fault tolerant systems will demand more scalable threshold cryptosystems. Unfortunately, current systems scale poorly, requiring time quadratic in the number of participants. In this paper, we present techniques that help scale threshold signature schemes (TSS), verifiable secret sharing (VSS) and distributed key generation (DKG) protocols to hundreds of thousands of participants and beyond. First, we use efficient algorithms for evaluating polynomials at multiple points to speed up computing Lagrange coefficients when aggregating threshold signatures. As a result, we can aggregate a 130,000 out of 260,000 BLS threshold signature in just 6 seconds (down from 30 minutes). Second, we show how \"authenticating\" such multipoint evaluations can speed up proving polynomial evaluations, a key step in communication-efficient VSS and DKG protocols. As a result, we reduce the asymptotic (and concrete) computational complexity of VSS and DKG protocols from quadratic time to quasilinear time, at a small increase in communication complexity. For example, using our DKG protocol, we can securely generate a key for the BLS scheme above in 2.3 hours (down from 8 days). Our techniques improve performance for thresholds as small as 255 and generalize to any Lagrange-based threshold scheme, not just threshold signatures. Our work has certain limitations: we require a trusted setup, we focus on synchronous VSS and DKG protocols and we do not address the worst-case complaint overhead in DKGs. Nonetheless, we hope it will spark new interest in designing large-scale distributed systems.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"83 1","pages":"877-893"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84026311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00024
Anton Permenev, Dimitar Dimitrov, Petar Tsankov, Dana Drachsler-Cohen, Martin T. Vechev
We present VerX, the first automated verifier able to prove functional properties of Ethereum smart contracts. VerX addresses an important problem as all real-world contracts must satisfy custom functional specifications.VerX is based on a careful combination of three techniques, enabling it to automatically verify temporal properties of infinite- state smart contracts: (i) reduction of temporal property verification to reachability checking, (ii) a new symbolic execution engine for the Ethereum Virtual Machine that is precise and efficient for a practical fragment of Ethereum contracts, and (iii) delayed predicate abstraction which uses symbolic execution during transactions and abstraction at transaction boundaries.Our extensive experimental evaluation on 83 temporal properties and 12 real-world projects, including popular crowdsales and libraries, demonstrates that VerX is practically effective.
{"title":"VerX: Safety Verification of Smart Contracts","authors":"Anton Permenev, Dimitar Dimitrov, Petar Tsankov, Dana Drachsler-Cohen, Martin T. Vechev","doi":"10.1109/SP40000.2020.00024","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00024","url":null,"abstract":"We present VerX, the first automated verifier able to prove functional properties of Ethereum smart contracts. VerX addresses an important problem as all real-world contracts must satisfy custom functional specifications.VerX is based on a careful combination of three techniques, enabling it to automatically verify temporal properties of infinite- state smart contracts: (i) reduction of temporal property verification to reachability checking, (ii) a new symbolic execution engine for the Ethereum Virtual Machine that is precise and efficient for a practical fragment of Ethereum contracts, and (iii) delayed predicate abstraction which uses symbolic execution during transactions and abstraction at transaction boundaries.Our extensive experimental evaluation on 83 temporal properties and 12 real-world projects, including popular crowdsales and libraries, demonstrates that VerX is practically effective.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"121 5 1","pages":"1661-1677"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84084673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00097
Rakibul Hasan, David J. Crandall, Mario Fritz, Apu Kapadia
Photographs taken in public places often contain bystanders - people who are not the main subject of a photo. These photos, when shared online, can reach a large number of viewers and potentially undermine the bystanders’ privacy. Furthermore, recent developments in computer vision and machine learning can be used by online platforms to identify and track individuals. To combat this problem, researchers have proposed technical solutions that require bystanders to be proactive and use specific devices or applications to broadcast their privacy policy and identifying information to locate them in an image.We explore the prospect of a different approach – identifying bystanders solely based on the visual information present in an image. Through an online user study, we catalog the rationale humans use to classify subjects and bystanders in an image, and systematically validate a set of intuitive concepts (such as intentionally posing for a photo) that can be used to automatically identify bystanders. Using image data, we infer those concepts and then use them to train several classifier models. We extensively evaluate the models and compare them with human raters. On our initial dataset, with a 10-fold cross validation, our best model achieves a mean detection accuracy of 93% for images when human raters have 100% agreement on the class label and 80% when the agreement is only 67%. We validate this model on a completely different dataset and achieve similar results, demonstrating that our model generalizes well.
{"title":"Automatically Detecting Bystanders in Photos to Reduce Privacy Risks","authors":"Rakibul Hasan, David J. Crandall, Mario Fritz, Apu Kapadia","doi":"10.1109/SP40000.2020.00097","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00097","url":null,"abstract":"Photographs taken in public places often contain bystanders - people who are not the main subject of a photo. These photos, when shared online, can reach a large number of viewers and potentially undermine the bystanders’ privacy. Furthermore, recent developments in computer vision and machine learning can be used by online platforms to identify and track individuals. To combat this problem, researchers have proposed technical solutions that require bystanders to be proactive and use specific devices or applications to broadcast their privacy policy and identifying information to locate them in an image.We explore the prospect of a different approach – identifying bystanders solely based on the visual information present in an image. Through an online user study, we catalog the rationale humans use to classify subjects and bystanders in an image, and systematically validate a set of intuitive concepts (such as intentionally posing for a photo) that can be used to automatically identify bystanders. Using image data, we infer those concepts and then use them to train several classifier models. We extensively evaluate the models and compare them with human raters. On our initial dataset, with a 10-fold cross validation, our best model achieves a mean detection accuracy of 93% for images when human raters have 100% agreement on the class label and 80% when the agreement is only 67%. We validate this model on a completely different dataset and achieve similar results, demonstrating that our model generalizes well.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"112 1","pages":"318-335"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82469993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00094
I. Pustogarov, Qian Wu, D. Lie
The ability to execute and analyze code makes many security tasks such as exploit development, reverse engineering, and vulnerability detection much easier. However, on embedded devices such as Android smartphones, executing code in-vivo, on the device, for analysis is limited by the need to acquire such devices, the speed of the device, and in some cases the need to flash custom code onto the devices. The other option is to execute the code ex-vivo, off the device, but this approach either requires porting or complex hardware emulation.In this paper, we take advantage of the observation that many execution paths in drivers are only superficially dependent on both the hardware and kernel on which the driver executes, to create an ex-vivo dynamic driver analysis framework for Android devices that requires neither porting nor emulation. We achieve this by developing a generic evasion framework that enables driver initialization by evading hardware and kernel dependencies instead of precisely emulating them, and then developing a novel Ex-vivo AnalySIs framEwoRk (EASIER) that enables off-device analysis with the initialized driver state. Compared to on-device analysis, our approach enables the use of userspace tools and scales with the number of available commodity CPU’s, not the number of smartphones.We demonstrate the usefulness of our framework by targeting privilege escalation vulnerabilities in system call handlers in platform device drivers. We find it can load 48/62 (77%) drivers from three different Android kernels: MSM, Xiaomi, and Huawei. We then confirm that it is able to reach and detect 21 known vulnerabilities. Finally, we have discovered 12 new bugs which we have reported and confirmed.
{"title":"Ex-vivo dynamic analysis framework for Android device drivers","authors":"I. Pustogarov, Qian Wu, D. Lie","doi":"10.1109/SP40000.2020.00094","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00094","url":null,"abstract":"The ability to execute and analyze code makes many security tasks such as exploit development, reverse engineering, and vulnerability detection much easier. However, on embedded devices such as Android smartphones, executing code in-vivo, on the device, for analysis is limited by the need to acquire such devices, the speed of the device, and in some cases the need to flash custom code onto the devices. The other option is to execute the code ex-vivo, off the device, but this approach either requires porting or complex hardware emulation.In this paper, we take advantage of the observation that many execution paths in drivers are only superficially dependent on both the hardware and kernel on which the driver executes, to create an ex-vivo dynamic driver analysis framework for Android devices that requires neither porting nor emulation. We achieve this by developing a generic evasion framework that enables driver initialization by evading hardware and kernel dependencies instead of precisely emulating them, and then developing a novel Ex-vivo AnalySIs framEwoRk (EASIER) that enables off-device analysis with the initialized driver state. Compared to on-device analysis, our approach enables the use of userspace tools and scales with the number of available commodity CPU’s, not the number of smartphones.We demonstrate the usefulness of our framework by targeting privilege escalation vulnerabilities in system call handlers in platform device drivers. We find it can load 48/62 (77%) drivers from three different Android kernels: MSM, Xiaomi, and Huawei. We then confirm that it is able to reach and detect 21 known vulnerabilities. Finally, we have discovered 12 new bugs which we have reported and confirmed.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"9 1","pages":"1088-1105"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76188359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00018
Clemens Deusser, Steffen Passmann, T. Strufe
Cross domain tracking has become the rule, rather than the exception, and scripts that collect behavioral data from visitors across sites have become ubiquitous on the Web. The collections form comprehensive profiles of browsing patterns and contain personal, sensitive information. This data can easily be linked back to the tracked individuals, most of whom are likely unaware of this information’s mere existence, let alone its perpetual storage and processing. As public pressure has increased, tracking companies like Google, Facebook, or Baidu now claim to anonymize their datasets, thus limiting or eliminating the possibility of linking it back to data subjects.In cooperation with Europe’s largest audience measurement association we use access to a comprehensive tracking dataset to assess both identifiability and the possibility of convincingly anonymizing browsing data. Our results show that anonymization through generalization does not sufficiently protect anonymity. Reducing unicity of browsing data to negligible levels would necessitate removal of all client and web domain information as well as click timings. In tangible adversary scenarios, supposedly anonymized datasets are highly vulnerable to dataset enrichment and shoulder surfing adversaries, with almost half of all browsing sessions being identified by just two observations. We conclude that while it may be possible to store single coarsened clicks anonymously, any collection of higher complexity will contain large amounts of pseudonymous data.
{"title":"Browsing Unicity: On the Limits of Anonymizing Web Tracking Data","authors":"Clemens Deusser, Steffen Passmann, T. Strufe","doi":"10.1109/SP40000.2020.00018","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00018","url":null,"abstract":"Cross domain tracking has become the rule, rather than the exception, and scripts that collect behavioral data from visitors across sites have become ubiquitous on the Web. The collections form comprehensive profiles of browsing patterns and contain personal, sensitive information. This data can easily be linked back to the tracked individuals, most of whom are likely unaware of this information’s mere existence, let alone its perpetual storage and processing. As public pressure has increased, tracking companies like Google, Facebook, or Baidu now claim to anonymize their datasets, thus limiting or eliminating the possibility of linking it back to data subjects.In cooperation with Europe’s largest audience measurement association we use access to a comprehensive tracking dataset to assess both identifiability and the possibility of convincingly anonymizing browsing data. Our results show that anonymization through generalization does not sufficiently protect anonymity. Reducing unicity of browsing data to negligible levels would necessitate removal of all client and web domain information as well as click timings. In tangible adversary scenarios, supposedly anonymized datasets are highly vulnerable to dataset enrichment and shoulder surfing adversaries, with almost half of all browsing sessions being identified by just two observations. We conclude that while it may be possible to store single coarsened clicks anonymously, any collection of higher complexity will contain large amounts of pseudonymous data.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"28 1","pages":"777-790"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90332025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00091
Marco Cominelli, F. Gringoli, P. Patras, Margus Lind, G. Noubir
Bluetooth Classic (BT) remains the de facto connectivity technology in car stereo systems, wireless headsets, laptops, and a plethora of wearables, especially for applications that require high data rates, such as audio streaming, voice calling, tethering, etc. Unlike in Bluetooth Low Energy (BLE), where address randomization is a feature available to manufactures, BT addresses are not randomized because they are largely believed to be immune to tracking attacks. We analyze the design of BT and devise a robust de-anonymization technique that hinges on the apparently benign information leaking from frame encoding, to infer a piconet’s clock, hopping sequence, and ultimately the Upper Address Part (UAP) of the master device’s physical address, which are never exchanged in clear. Used together with the Lower Address Part (LAP), which is present in all frames transmitted, this enables tracking of the piconet master, thereby debunking the privacy guarantees of BT. We validate this attack by developing the first Software-defined Radio (SDR) based sniffer that allows full BT spectrum analysis (79 MHz) and implements the proposed de-anonymization technique. We study the feasibility of privacy attacks with multiple testbeds, considering different numbers of devices, traffic regimes, and communication ranges. We demonstrate that it is possible to track BT devices up to 85 meters from the sniffer, and achieve more than 80% device identification accuracy within less than 1 second of sniffing and 100% detection within less than 4 seconds. Lastly, we study the identified privacy attack in the wild, capturing BT traffic at a road junction over 5 days, demonstrating that our system can re-identify hundreds of users and infer their commuting patterns.
{"title":"Even Black Cats Cannot Stay Hidden in the Dark: Full-band De-anonymization of Bluetooth Classic Devices","authors":"Marco Cominelli, F. Gringoli, P. Patras, Margus Lind, G. Noubir","doi":"10.1109/SP40000.2020.00091","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00091","url":null,"abstract":"Bluetooth Classic (BT) remains the de facto connectivity technology in car stereo systems, wireless headsets, laptops, and a plethora of wearables, especially for applications that require high data rates, such as audio streaming, voice calling, tethering, etc. Unlike in Bluetooth Low Energy (BLE), where address randomization is a feature available to manufactures, BT addresses are not randomized because they are largely believed to be immune to tracking attacks. We analyze the design of BT and devise a robust de-anonymization technique that hinges on the apparently benign information leaking from frame encoding, to infer a piconet’s clock, hopping sequence, and ultimately the Upper Address Part (UAP) of the master device’s physical address, which are never exchanged in clear. Used together with the Lower Address Part (LAP), which is present in all frames transmitted, this enables tracking of the piconet master, thereby debunking the privacy guarantees of BT. We validate this attack by developing the first Software-defined Radio (SDR) based sniffer that allows full BT spectrum analysis (79 MHz) and implements the proposed de-anonymization technique. We study the feasibility of privacy attacks with multiple testbeds, considering different numbers of devices, traffic regimes, and communication ranges. We demonstrate that it is possible to track BT devices up to 85 meters from the sniffer, and achieve more than 80% device identification accuracy within less than 1 second of sniffing and 100% detection within less than 4 seconds. Lastly, we study the identified privacy attack in the wild, capturing BT traffic at a road junction over 5 days, demonstrating that our system can re-identify hundreds of users and infer their commuting patterns.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"1 1","pages":"534-548"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85991854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}