Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00004
Zhengxiong Li, Fenglong Ma, Aditya Singh Rathore, Zhuolin Yang, Baicheng Chen, Lu Su, Wenyao Xu
Digital screens, such as liquid crystal displays (LCDs), are vulnerable to attacks (e.g., "shoulder surfing") that can bypass security protection services (e.g., firewall) to steal confidential information from intended victims. The conventional practice to mitigate these threats is isolation. An isolated zone, without accessibility, proximity, and line-of-sight, seems to bring personal devices to a truly secure place.In this paper, we revisit this historical topic and re-examine the security risk of screen attacks in an isolation scenario mentioned above. Specifically, we identify and validate a new and practical side-channel attack for screen content via liquid crystal nematic state estimation using a low-cost radio-frequency sensor. By leveraging the relationship between the screen content and the states of liquid crystal arrays in displays, we develop WaveSpy, an end-to-end portable through-wall screen attack system. WaveSpy comprises a low-cost, energy-efficient and light-weight millimeter-wave (mmWave) probe which can remotely collect the liquid crystal state response to a set of mmWave stimuli and facilitate screen content inference, even when the victim’s screen is placed in an isolated zone. We intensively evaluate the performance and practicality of WaveSpy in screen attacks, including over 100 different types of content on 30 digital screens of modern electronic devices. WaveSpy achieves an accuracy of 99% in screen content type recognition and a success rate of 87.77% in Top-3 sensitive information retrieval under real-world scenarios, respectively. Furthermore, we discuss several potential defense mechanisms to mitigate screen eavesdropping similar to WaveSpy.
{"title":"WaveSpy: Remote and Through-wall Screen Attack via mmWave Sensing","authors":"Zhengxiong Li, Fenglong Ma, Aditya Singh Rathore, Zhuolin Yang, Baicheng Chen, Lu Su, Wenyao Xu","doi":"10.1109/SP40000.2020.00004","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00004","url":null,"abstract":"Digital screens, such as liquid crystal displays (LCDs), are vulnerable to attacks (e.g., \"shoulder surfing\") that can bypass security protection services (e.g., firewall) to steal confidential information from intended victims. The conventional practice to mitigate these threats is isolation. An isolated zone, without accessibility, proximity, and line-of-sight, seems to bring personal devices to a truly secure place.In this paper, we revisit this historical topic and re-examine the security risk of screen attacks in an isolation scenario mentioned above. Specifically, we identify and validate a new and practical side-channel attack for screen content via liquid crystal nematic state estimation using a low-cost radio-frequency sensor. By leveraging the relationship between the screen content and the states of liquid crystal arrays in displays, we develop WaveSpy, an end-to-end portable through-wall screen attack system. WaveSpy comprises a low-cost, energy-efficient and light-weight millimeter-wave (mmWave) probe which can remotely collect the liquid crystal state response to a set of mmWave stimuli and facilitate screen content inference, even when the victim’s screen is placed in an isolated zone. We intensively evaluate the performance and practicality of WaveSpy in screen attacks, including over 100 different types of content on 30 digital screens of modern electronic devices. WaveSpy achieves an accuracy of 99% in screen content type recognition and a success rate of 87.77% in Top-3 sensitive information retrieval under real-world scenarios, respectively. Furthermore, we discuss several potential defense mechanisms to mitigate screen eavesdropping similar to WaveSpy.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"82 1","pages":"217-232"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78936615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00041
Sergej Proskurin, Marius Momeu, Seyedhamed Ghavamnia, V. Kemerlis, M. Polychronakis
Attackers leverage memory corruption vulnerabilities to establish primitives for reading from or writing to the address space of a vulnerable process. These primitives form the foundation for code-reuse and data-oriented attacks. While various defenses against the former class of attacks have proven effective, mitigation of the latter remains an open problem. In this paper, we identify various shortcomings of the x86 architecture regarding memory isolation, and leverage virtualization to build an effective defense against data-oriented attacks. Our approach, called xMP, provides (in-guest) selective memory protection primitives that allow VMs to isolate sensitive data in user or kernel space in disjoint xMP domains. We interface the Xen altp2m subsystem with the Linux memory management system, lending VMs the flexibility to define custom policies. Contrary to conventional approaches, xMP takes advantage of virtualization extensions, but after initialization, it does not require any hypervisor intervention. To ensure the integrity of in-kernel management information and pointers to sensitive data within isolated domains, xMP protects pointers with HMACs bound to an immutable context, so that integrity validation succeeds only in the right context. We have applied xMP to protect the page tables and process credentials of the Linux kernel, as well as sensitive data in various user-space applications. Overall, our evaluation shows that xMP introduces minimal overhead for real-world workloads and applications, and offers effective protection against data-oriented attacks.
{"title":"xMP: Selective Memory Protection for Kernel and User Space","authors":"Sergej Proskurin, Marius Momeu, Seyedhamed Ghavamnia, V. Kemerlis, M. Polychronakis","doi":"10.1109/SP40000.2020.00041","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00041","url":null,"abstract":"Attackers leverage memory corruption vulnerabilities to establish primitives for reading from or writing to the address space of a vulnerable process. These primitives form the foundation for code-reuse and data-oriented attacks. While various defenses against the former class of attacks have proven effective, mitigation of the latter remains an open problem. In this paper, we identify various shortcomings of the x86 architecture regarding memory isolation, and leverage virtualization to build an effective defense against data-oriented attacks. Our approach, called xMP, provides (in-guest) selective memory protection primitives that allow VMs to isolate sensitive data in user or kernel space in disjoint xMP domains. We interface the Xen altp2m subsystem with the Linux memory management system, lending VMs the flexibility to define custom policies. Contrary to conventional approaches, xMP takes advantage of virtualization extensions, but after initialization, it does not require any hypervisor intervention. To ensure the integrity of in-kernel management information and pointers to sensitive data within isolated domains, xMP protects pointers with HMACs bound to an immutable context, so that integrity validation succeeds only in the right context. We have applied xMP to protect the page tables and process credentials of the Linux kernel, as well as sensitive data in various user-space applications. Overall, our evaluation shows that xMP introduces minimal overhead for real-world workloads and applications, and offers effective protection against data-oriented attacks.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"5 1","pages":"563-577"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89064199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00118
Matthew Bernhard, Allison McDonald, Henry Meng, Jensen Hwa, Nakul Bajaj, Kevin Chang, J. A. Halderman
Ballot marking devices (BMDs) allow voters to select candidates on a computer kiosk, which prints a paper ballot that the voter can review before inserting it into a scanner to be tabulated. Unlike paperless voting machines, BMDs provide voters an opportunity to verify an auditable physical record of their choices, and a growing number of U.S. jurisdictions are adopting them for all voters. However, the security of BMDs depends on how reliably voters notice and correct any adversarially induced errors on their printed ballots. In order to measure voters’ error detection abilities, we conducted a large study (N = 241) in a realistic polling place setting using real voting machines that we modified to introduce an error into each printout. Without intervention, only 40% of participants reviewed their printed ballots at all, and only 6.6% told a poll worker something was wrong. We also find that carefully designed interventions can improve verification performance. Verbally instructing voters to review the printouts and providing a written slate of candidates for whom to vote both significantly increased review and reporting rates—although the improvements may not be large enough to provide strong security in close elections, especially when BMDs are used by all voters. Based on these findings, we make several evidence-based recommendations to help better defend BMD-based elections.
{"title":"Can Voters Detect Malicious Manipulation of Ballot Marking Devices?","authors":"Matthew Bernhard, Allison McDonald, Henry Meng, Jensen Hwa, Nakul Bajaj, Kevin Chang, J. A. Halderman","doi":"10.1109/SP40000.2020.00118","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00118","url":null,"abstract":"Ballot marking devices (BMDs) allow voters to select candidates on a computer kiosk, which prints a paper ballot that the voter can review before inserting it into a scanner to be tabulated. Unlike paperless voting machines, BMDs provide voters an opportunity to verify an auditable physical record of their choices, and a growing number of U.S. jurisdictions are adopting them for all voters. However, the security of BMDs depends on how reliably voters notice and correct any adversarially induced errors on their printed ballots. In order to measure voters’ error detection abilities, we conducted a large study (N = 241) in a realistic polling place setting using real voting machines that we modified to introduce an error into each printout. Without intervention, only 40% of participants reviewed their printed ballots at all, and only 6.6% told a poll worker something was wrong. We also find that carefully designed interventions can improve verification performance. Verbally instructing voters to review the printouts and providing a written slate of candidates for whom to vote both significantly increased review and reporting rates—although the improvements may not be large enough to provide strong security in close elections, especially when BMDs are used by all voters. Based on these findings, we make several evidence-based recommendations to help better defend BMD-based elections.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"16 1","pages":"679-694"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87417478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00031
M. Vanhoef, Eyal Ronen
The WPA3 certification aims to secure home networks, while EAP-pwd is used by certain enterprise Wi-Fi networks to authenticate users. Both use the Dragonfly handshake to provide forward secrecy and resistance to dictionary attacks. In this paper, we systematically evaluate Dragonfly’s security. First, we audit implementations, and present timing leaks and authentication bypasses in EAP-pwd and WPA3 daemons. We then study Dragonfly’s design and discuss downgrade and denial-of-service attacks. Our next and main results are side-channel attacks against Dragonfly’s password encoding method (e.g. hash-to-curve). We believe that these side-channel leaks are inherent to Dragonfly. For example, after our initial disclosure, patched software was still affected by a novel side-channel leak. We also analyze the complexity of using the leaked information to brute-force the password. For instance, brute-forcing a dictionary of size 1010 requires less than $1 in Amazon EC2 instances. These results are also of general interest due to ongoing standardization efforts on Dragonfly as a TLS handshake, Password-Authenticated Key Exchanges (PAKEs), and hash-to-curve. Finally, we discuss backwards-compatible defenses, and propose protocol fixes that prevent attacks. Our work resulted in a new draft of the protocols incorporating our proposed design changes.
{"title":"Dragonblood: Analyzing the Dragonfly Handshake of WPA3 and EAP-pwd","authors":"M. Vanhoef, Eyal Ronen","doi":"10.1109/SP40000.2020.00031","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00031","url":null,"abstract":"The WPA3 certification aims to secure home networks, while EAP-pwd is used by certain enterprise Wi-Fi networks to authenticate users. Both use the Dragonfly handshake to provide forward secrecy and resistance to dictionary attacks. In this paper, we systematically evaluate Dragonfly’s security. First, we audit implementations, and present timing leaks and authentication bypasses in EAP-pwd and WPA3 daemons. We then study Dragonfly’s design and discuss downgrade and denial-of-service attacks. Our next and main results are side-channel attacks against Dragonfly’s password encoding method (e.g. hash-to-curve). We believe that these side-channel leaks are inherent to Dragonfly. For example, after our initial disclosure, patched software was still affected by a novel side-channel leak. We also analyze the complexity of using the leaked information to brute-force the password. For instance, brute-forcing a dictionary of size 1010 requires less than $1 in Amazon EC2 instances. These results are also of general interest due to ongoing standardization efforts on Dragonfly as a TLS handshake, Password-Authenticated Key Exchanges (PAKEs), and hash-to-curve. Finally, we discuss backwards-compatible defenses, and propose protocol fixes that prevent attacks. Our work resulted in a new draft of the protocols incorporating our proposed design changes.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"18 1","pages":"517-533"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85648526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00019
Savino Dambra, Leyla Bilge, D. Balzarotti
Cyber attacks have increased in number and complexity in recent years, and companies and organizations have accordingly raised their investments in more robust infrastructure to preserve their data, assets and reputation. However, the full protection against these countless and constantly evolving threats is unattainable by the sole use of preventive measures. Therefore, to handle residual risks and contain business losses in case of an incident, firms are increasingly adopting a cyber insurance as part of their corporate risk management strategy.As a result, the cyber insurance sector – which offers to transfer the financial risks related to network and computer incidents to a third party – is rapidly growing, with recent claims that already reached a $100M dollars. However, while other insurance sectors rely on consolidated methodologies to accurately predict risks, the many peculiarities of the cyber domain resulted in carriers to often resort to qualitative approaches based on experts opinions.This paper looks at past research conducted in the area of cyber insurance and classifies previous studies in four different areas, focused respectively on studying the economical aspects, the mathematical models, the risk management methodologies, and the predictions of cyber events. We then identify, for each insurance phase, a group of practical research problems where security experts can help develop new data-driven methodologies and automated tools to replace the existing qualitative approaches.
{"title":"SoK: Cyber Insurance – Technical Challenges and a System Security Roadmap","authors":"Savino Dambra, Leyla Bilge, D. Balzarotti","doi":"10.1109/SP40000.2020.00019","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00019","url":null,"abstract":"Cyber attacks have increased in number and complexity in recent years, and companies and organizations have accordingly raised their investments in more robust infrastructure to preserve their data, assets and reputation. However, the full protection against these countless and constantly evolving threats is unattainable by the sole use of preventive measures. Therefore, to handle residual risks and contain business losses in case of an incident, firms are increasingly adopting a cyber insurance as part of their corporate risk management strategy.As a result, the cyber insurance sector – which offers to transfer the financial risks related to network and computer incidents to a third party – is rapidly growing, with recent claims that already reached a $100M dollars. However, while other insurance sectors rely on consolidated methodologies to accurately predict risks, the many peculiarities of the cyber domain resulted in carriers to often resort to qualitative approaches based on experts opinions.This paper looks at past research conducted in the area of cyber insurance and classifies previous studies in four different areas, focused respectively on studying the economical aspects, the mathematical models, the risk management methodologies, and the predictions of cyber events. We then identify, for each insurance phase, a group of practical research problems where security experts can help develop new data-driven methodologies and automated tools to replace the existing qualitative approaches.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"88 1","pages":"1367-1383"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85913928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00082
Michael Kurth, Ben Gras, Dennis Andriesse, Cristiano Giuffrida, H. Bos, Kaveh Razavi
Increased peripheral performance is causing strain on the memory subsystem of modern processors. For example, available DRAM throughput can no longer sustain the traffic of a modern network card. Scrambling to deliver the promised performance, instead of transferring peripheral data to and from DRAM, modern Intel processors perform I/O operations directly on the Last Level Cache (LLC). While Direct Cache Access (DCA) instead of Direct Memory Access (DMA) is a sensible performance optimization, it is unfortunately implemented without care for security, as the LLC is now shared between the CPU and all the attached devices, including the network card.In this paper, we reverse engineer the behavior of DCA, widely referred to as Data-Direct I/O (DDIO), on recent Intel processors and present its first security analysis. Based on our analysis, we present NetCAT, the first Network-based PRIME+PROBE Cache Attack on the processor’s LLC of a remote machine. We show that NetCAT not only enables attacks in cooperative settings where an attacker can build a covert channel between a network client and a sandboxed server process (without network), but more worryingly, in general adversarial settings. In such settings, NetCAT can enable disclosure of network timing-based sensitive information. As an example, we show a keystroke timing attack on a victim SSH connection belonging to another client on the target server. Our results should caution processor vendors against unsupervised sharing of (additional) microarchitectural components with peripherals exposed to malicious input.
{"title":": Practical Cache Attacks from the Network","authors":"Michael Kurth, Ben Gras, Dennis Andriesse, Cristiano Giuffrida, H. Bos, Kaveh Razavi","doi":"10.1109/SP40000.2020.00082","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00082","url":null,"abstract":"Increased peripheral performance is causing strain on the memory subsystem of modern processors. For example, available DRAM throughput can no longer sustain the traffic of a modern network card. Scrambling to deliver the promised performance, instead of transferring peripheral data to and from DRAM, modern Intel processors perform I/O operations directly on the Last Level Cache (LLC). While Direct Cache Access (DCA) instead of Direct Memory Access (DMA) is a sensible performance optimization, it is unfortunately implemented without care for security, as the LLC is now shared between the CPU and all the attached devices, including the network card.In this paper, we reverse engineer the behavior of DCA, widely referred to as Data-Direct I/O (DDIO), on recent Intel processors and present its first security analysis. Based on our analysis, we present NetCAT, the first Network-based PRIME+PROBE Cache Attack on the processor’s LLC of a remote machine. We show that NetCAT not only enables attacks in cooperative settings where an attacker can build a covert channel between a network client and a sandboxed server process (without network), but more worryingly, in general adversarial settings. In such settings, NetCAT can enable disclosure of network timing-based sensitive information. As an example, we show a keystroke timing attack on a victim SSH connection belonging to another client on the target server. Our results should caution processor vendors against unsupervised sharing of (additional) microarchitectural components with peripherals exposed to malicious input.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"17 1","pages":"20-38"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84432839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the increasing popularity of the Internet of Things (IoT), many IoT cloud platforms have emerged to help the IoT manufacturers connect their devices to their users. Serving the device-user communication is general messaging protocol deployed on the platforms. Less clear, however, is whether such protocols, which are not designed to work in the adversarial environment of IoT, introduce new risks. In this paper, we report the first systematic study on the protection of major IoT clouds (e.g., AWS, Microsoft, IBM) put in place for the arguably most popular messaging protocol - MQTT. We found that these platforms’ security additions to the protocol are all vulnerable, allowing the adversary to gain control of the device, launch a large-scale denial-of-service attack, steal the victim’s secrets data and fake the victim’s device status for deception. We successfully performed end-to-end attacks on these popular IoT clouds and further conducted a measurement study, which demonstrates that the security impacts of our attacks are real, severe and broad. We reported our findings to related parties, which all acknowledged the importance. We further propose new design principles and an enhanced access model MOUCON. We implemented our protection on a popular open-source MQTT server. Our evaluation shows its high effectiveness and negligible performance overhead.
{"title":"Burglars’ IoT Paradise: Understanding and Mitigating Security Risks of General Messaging Protocols on IoT Clouds","authors":"Yan Jia, Luyi Xing, Yuhang Mao, Dongfang Zhao, Xiaofeng Wang, Shangru Zhao, Yuqing Zhang","doi":"10.1109/SP40000.2020.00051","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00051","url":null,"abstract":"With the increasing popularity of the Internet of Things (IoT), many IoT cloud platforms have emerged to help the IoT manufacturers connect their devices to their users. Serving the device-user communication is general messaging protocol deployed on the platforms. Less clear, however, is whether such protocols, which are not designed to work in the adversarial environment of IoT, introduce new risks. In this paper, we report the first systematic study on the protection of major IoT clouds (e.g., AWS, Microsoft, IBM) put in place for the arguably most popular messaging protocol - MQTT. We found that these platforms’ security additions to the protocol are all vulnerable, allowing the adversary to gain control of the device, launch a large-scale denial-of-service attack, steal the victim’s secrets data and fake the victim’s device status for deception. We successfully performed end-to-end attacks on these popular IoT clouds and further conducted a measurement study, which demonstrates that the security impacts of our attacks are real, severe and broad. We reported our findings to related parties, which all acknowledged the importance. We further propose new design principles and an enhanced access model MOUCON. We implemented our protection on a popular open-source MQTT server. Our evaluation shows its high effectiveness and negligible performance overhead.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"18 1","pages":"465-481"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83636527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/sp40000.2020.00104
{"title":"Message from the Program Chairs: SP 2020","authors":"","doi":"10.1109/sp40000.2020.00104","DOIUrl":"https://doi.org/10.1109/sp40000.2020.00104","url":null,"abstract":"","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"55 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90851333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-05-01DOI: 10.1109/SP40000.2020.00079
Steve T. K. Jan, Qingying Hao, Tianrui Hu, Jiameng Pu, Sonal Oswal, Gang Wang, Bimal Viswanath
Machine learning has been widely applied to building security applications. However, many machine learning models require the continuous supply of representative labeled data for training, which limits the models’ usefulness in practice. In this paper, we use bot detection as an example to explore the use of data synthesis to address this problem. We collected the network traffic from 3 online services in three different months within a year (23 million network requests). We develop a stream-based feature encoding scheme to support machine learning models for detecting advanced bots. The key novelty is that our model detects bots with extremely limited labeled data. We propose a data synthesis method to synthesize unseen (or future) bot behavior distributions. The synthesis method is distribution-aware, using two different generators in a Generative Adversarial Network to synthesize data for the clustered regions and the outlier regions in the feature space. We evaluate this idea and show our method can train a model that outperforms existing methods with only 1% of the labeled data. We show that data synthesis also improves the model’s sustainability over time and speeds up the retraining. Finally, we compare data synthesis and adversarial retraining and show they can work complementary with each other to improve the model generalizability.
{"title":"Throwing Darts in the Dark? Detecting Bots with Limited Data using Neural Data Augmentation","authors":"Steve T. K. Jan, Qingying Hao, Tianrui Hu, Jiameng Pu, Sonal Oswal, Gang Wang, Bimal Viswanath","doi":"10.1109/SP40000.2020.00079","DOIUrl":"https://doi.org/10.1109/SP40000.2020.00079","url":null,"abstract":"Machine learning has been widely applied to building security applications. However, many machine learning models require the continuous supply of representative labeled data for training, which limits the models’ usefulness in practice. In this paper, we use bot detection as an example to explore the use of data synthesis to address this problem. We collected the network traffic from 3 online services in three different months within a year (23 million network requests). We develop a stream-based feature encoding scheme to support machine learning models for detecting advanced bots. The key novelty is that our model detects bots with extremely limited labeled data. We propose a data synthesis method to synthesize unseen (or future) bot behavior distributions. The synthesis method is distribution-aware, using two different generators in a Generative Adversarial Network to synthesize data for the clustered regions and the outlier regions in the feature space. We evaluate this idea and show our method can train a model that outperforms existing methods with only 1% of the labeled data. We show that data synthesis also improves the model’s sustainability over time and speeds up the retraining. Finally, we compare data synthesis and adversarial retraining and show they can work complementary with each other to improve the model generalizability.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"46 1","pages":"1190-1206"},"PeriodicalIF":0.0,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86882908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}