Model Reconstruction from Model Explanations

S. Milli, Ludwig Schmidt, A. Dragan, Moritz Hardt
{"title":"Model Reconstruction from Model Explanations","authors":"S. Milli, Ludwig Schmidt, A. Dragan, Moritz Hardt","doi":"10.1145/3287560.3287562","DOIUrl":null,"url":null,"abstract":"We show through theory and experiment that gradient-based explanations of a model quickly reveal the model itself. Our results speak to a tension between the desire to keep a proprietary model secret and the ability to offer model explanations. On the theoretical side, we give an algorithm that provably learns a two-layer ReLU network in a setting where the algorithm may query the gradient of the model with respect to chosen inputs. The number of queries is independent of the dimension and nearly optimal in its dependence on the model size. Of interest not only from a learning-theoretic perspective, this result highlights the power of gradients rather than labels as a learning primitive. Complementing our theory, we give effective heuristics for reconstructing models from gradient explanations that are orders of magnitude more query-efficient than reconstruction attacks relying on prediction interfaces.","PeriodicalId":20573,"journal":{"name":"Proceedings of the Conference on Fairness, Accountability, and Transparency","volume":"25 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2018-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"131","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Conference on Fairness, Accountability, and Transparency","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3287560.3287562","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 131

Abstract

We show through theory and experiment that gradient-based explanations of a model quickly reveal the model itself. Our results speak to a tension between the desire to keep a proprietary model secret and the ability to offer model explanations. On the theoretical side, we give an algorithm that provably learns a two-layer ReLU network in a setting where the algorithm may query the gradient of the model with respect to chosen inputs. The number of queries is independent of the dimension and nearly optimal in its dependence on the model size. Of interest not only from a learning-theoretic perspective, this result highlights the power of gradients rather than labels as a learning primitive. Complementing our theory, we give effective heuristics for reconstructing models from gradient explanations that are orders of magnitude more query-efficient than reconstruction attacks relying on prediction interfaces.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
从模型解释中重建模型
我们通过理论和实验证明,基于梯度的模型解释可以快速揭示模型本身。我们的研究结果说明了保持专有模型秘密的愿望与提供模型解释的能力之间的紧张关系。在理论方面,我们给出了一种算法,该算法可以在一个设置中证明学习两层ReLU网络,其中算法可以查询模型相对于所选输入的梯度。查询的数量与维度无关,并且与模型大小的依赖关系几乎是最佳的。不仅从学习理论的角度来看,这个结果突出了梯度而不是标签作为学习原语的力量。为了补充我们的理论,我们给出了有效的启发式方法,用于从梯度解释中重建模型,这比依赖预测接口的重建攻击的查询效率要高几个数量级。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Algorithmic Transparency from the South: Examining the state of algorithmic transparency in Chile's public administration algorithms FAccT '21: 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual Event / Toronto, Canada, March 3-10, 2021 Transparency universal Resisting transparency Conclusion
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1