Auror: defending against poisoning attacks in collaborative deep learning systems

Shiqi Shen, Shruti Tople, P. Saxena
{"title":"Auror: defending against poisoning attacks in collaborative deep learning systems","authors":"Shiqi Shen, Shruti Tople, P. Saxena","doi":"10.1145/2991079.2991125","DOIUrl":null,"url":null,"abstract":"Deep learning in a collaborative setting is emerging as a corner-stone of many upcoming applications, wherein untrusted users collaborate to generate more accurate models. From the security perspective, this opens collaborative deep learning to poisoning attacks, wherein adversarial users deliberately alter their inputs to mis-train the model. These attacks are known for machine learning systems in general, but their impact on new deep learning systems is not well-established. We investigate the setting of indirect collaborative deep learning --- a form of practical deep learning wherein users submit masked features rather than direct data. Indirect collaborative deep learning is preferred over direct, because it distributes the cost of computation and can be made privacy-preserving. In this paper, we study the susceptibility of collaborative deep learning systems to adversarial poisoning attacks. Specifically, we obtain the following empirical results on 2 popular datasets for handwritten images (MNIST) and traffic signs (GTSRB) used in auto-driving cars. For collaborative deep learning systems, we demonstrate that the attacks have 99% success rate for misclassifying specific target data while poisoning only 10% of the entire training dataset. As a defense, we propose Auror, a system that detects malicious users and generates an accurate model. The accuracy under the deployed defense on practical datasets is nearly unchanged when operating in the absence of attacks. The accuracy of a model trained using Auror drops by only 3% even when 30% of all the users are adversarial. Auror provides a strong guarantee against evasion; if the attacker tries to evade, its attack effectiveness is bounded.","PeriodicalId":419419,"journal":{"name":"Proceedings of the 32nd Annual Conference on Computer Security Applications","volume":"76 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"220","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 32nd Annual Conference on Computer Security Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2991079.2991125","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 220

Abstract

Deep learning in a collaborative setting is emerging as a cornerstone of many upcoming applications, wherein untrusted users collaborate to generate more accurate models. From the security perspective, this opens collaborative deep learning to poisoning attacks, wherein adversarial users deliberately alter their inputs to mis-train the model. These attacks are known for machine learning systems in general, but their impact on new deep learning systems is not well-established. We investigate the setting of indirect collaborative deep learning, a practical form of deep learning wherein users submit masked features rather than direct data. Indirect collaborative deep learning is preferred over the direct form because it distributes the cost of computation and can be made privacy-preserving. In this paper, we study the susceptibility of collaborative deep learning systems to adversarial poisoning attacks. Specifically, we obtain the following empirical results on two popular datasets: handwritten digits (MNIST) and traffic signs (GTSRB), the latter used in self-driving cars. For collaborative deep learning systems, we demonstrate that the attacks achieve a 99% success rate in misclassifying specific target data while poisoning only 10% of the entire training dataset. As a defense, we propose Auror, a system that detects malicious users and generates an accurate model. Under the deployed defense, the accuracy on these practical datasets is nearly unchanged when operating in the absence of attacks, and the accuracy of a model trained using Auror drops by only 3% even when 30% of all users are adversarial. Auror also provides a strong guarantee against evasion: if the attacker tries to evade detection, its attack effectiveness is bounded.
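
The abstract does not spell out Auror's detection mechanism, so the following is only a minimal sketch of the general idea it describes: in indirect collaborative learning, the server can cluster the values each user uploads for a given masked feature and treat users who repeatedly fall into the minority cluster as suspect, excluding them before training. The function name flag_suspect_users, the data shapes, the KMeans-based split, and the suspicion_threshold are all hypothetical choices for illustration, not the authors' implementation.

```python
# Hypothetical sketch of minority-cluster detection of poisoned contributions
# in indirect collaborative learning. Names, shapes, and thresholds are
# illustrative assumptions, not Auror's actual code.
import numpy as np
from sklearn.cluster import KMeans

def flag_suspect_users(masked_features, suspicion_threshold=0.5, rng_seed=0):
    """masked_features: array of shape (n_users, n_features), one row per
    user's uploaded masked feature vector.

    Returns a boolean array marking users whose values repeatedly land in
    the minority cluster across features (a rough proxy for 'malicious')."""
    n_users, n_features = masked_features.shape
    minority_votes = np.zeros(n_users)

    for j in range(n_features):
        values = masked_features[:, j].reshape(-1, 1)
        # Split this feature's per-user values into two groups.
        labels = KMeans(n_clusters=2, n_init=10,
                        random_state=rng_seed).fit_predict(values)
        # Treat the smaller group as anomalous for this feature.
        minority_label = np.argmin(np.bincount(labels))
        minority_votes += (labels == minority_label)

    # Flag users that fall in the minority cluster for more than a
    # threshold fraction of the features.
    return (minority_votes / n_features) > suspicion_threshold

# Toy usage: 20 benign users around 0.0, 6 poisoning users shifted to 1.0.
rng = np.random.default_rng(0)
benign = rng.normal(0.0, 0.1, size=(20, 8))
poisoned = rng.normal(1.0, 0.1, size=(6, 8))
flags = flag_suspect_users(np.vstack([benign, poisoned]))
print(flags)  # expected: roughly the last 6 entries are True
```

In this toy setup the poisoned contributions form a small, clearly separated group, so the minority-cluster vote isolates them; the paper's reported robustness (only a 3% accuracy drop with 30% adversarial users) rests on a far more careful treatment of which features are indicative and how thresholds are set.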