EANA: Reducing Privacy Risk on Large-scale Recommendation Models

Lin Ning, Steve Chien, Shuang Song, Mei Chen, Yunqi Xue, D. Berlowitz
{"title":"EANA: Reducing Privacy Risk on Large-scale Recommendation Models","authors":"Lin Ning, Steve Chien, Shuang Song, Mei Chen, Yunqi Xue, D. Berlowitz","doi":"10.1145/3523227.3546769","DOIUrl":null,"url":null,"abstract":"Embedding-based deep neural networks (DNNs) are widely used in large-scale recommendation systems. Differentially-private stochastic gradient descent (DP-SGD) provides a way to enable personalized experiences while preserving user privacy by injecting noise into every model parameter during the training process. However, it is challenging to apply DP-SGD to large-scale embedding-based DNNs due to its effect on training speed. This happens because the noise added by DP-SGD causes normally sparse gradients to become dense, introducing a large communication overhead between workers and parameter servers in a typical distributed training framework. This paper proposes embedding-aware noise addition (EANA) to mitigate the communication overhead, making training a large-scale embedding-based DNN possible. We examine the privacy benefit of EANA both analytically and empirically using secret sharer techniques. We demonstrate that training with EANA can achieve reasonable model precision while providing good practical privacy protection as measured by the secret sharer tests. Experiments on a real-world, large-scale dataset and model show that EANA is much faster than standard DP-SGD, improving the training speed by 54X and unblocking the training of a large-scale embedding-based DNN with reduced privacy risk.","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 16th ACM Conference on Recommender Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3523227.3546769","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4

Abstract

Embedding-based deep neural networks (DNNs) are widely used in large-scale recommendation systems. Differentially-private stochastic gradient descent (DP-SGD) provides a way to enable personalized experiences while preserving user privacy by injecting noise into every model parameter during the training process. However, it is challenging to apply DP-SGD to large-scale embedding-based DNNs due to its effect on training speed. This happens because the noise added by DP-SGD causes normally sparse gradients to become dense, introducing a large communication overhead between workers and parameter servers in a typical distributed training framework. This paper proposes embedding-aware noise addition (EANA) to mitigate the communication overhead, making training a large-scale embedding-based DNN possible. We examine the privacy benefit of EANA both analytically and empirically using secret sharer techniques. We demonstrate that training with EANA can achieve reasonable model precision while providing good practical privacy protection as measured by the secret sharer tests. Experiments on a real-world, large-scale dataset and model show that EANA is much faster than standard DP-SGD, improving the training speed by 54X and unblocking the training of a large-scale embedding-based DNN with reduced privacy risk.
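To make the communication-overhead argument concrete, the sketch below contrasts the two noise-addition strategies on an embedding table. It is a minimal illustration, not the paper's implementation: the function names (`dp_sgd_noise_dense`, `eana_style_noise`) are hypothetical, the batch-level clipping is a simplification (real DP-SGD clips per-example gradients before aggregation), and the EANA side shows only the core idea the abstract suggests — adding noise solely to the embedding rows a batch actually accessed, so the update keeps the gradient's sparsity pattern.

```python
import numpy as np

def dp_sgd_noise_dense(rows, row_grads, vocab_size, dim, clip_norm, noise_mult, rng):
    """Standard DP-SGD over an embedding table: after clipping, Gaussian noise
    is added to EVERY parameter, so the update is dense over the whole
    (vocab_size, dim) table even when the batch touched only a few rows.
    (Batch-level clipping here is a simplification; real DP-SGD clips
    per-example gradients before summing.)"""
    grad = np.zeros((vocab_size, dim))
    grad[rows] = row_grads
    grad *= min(1.0, clip_norm / (np.linalg.norm(grad) + 1e-12))
    grad += rng.normal(0.0, noise_mult * clip_norm, size=grad.shape)
    return grad  # dense: O(vocab_size * dim) values shipped to parameter servers

def eana_style_noise(rows, row_grads, clip_norm, noise_mult, rng):
    """Embedding-aware sketch (assumption based on the abstract): noise only
    the rows the batch accessed, preserving the gradient's sparsity."""
    vals = np.array(row_grads, dtype=np.float64)
    vals *= min(1.0, clip_norm / (np.linalg.norm(vals) + 1e-12))
    vals += rng.normal(0.0, noise_mult * clip_norm, size=vals.shape)
    return rows, vals  # sparse: O(len(rows) * dim) values shipped

# Toy comparison: a batch touching 3 rows of a 1M-row table.
rng = np.random.default_rng(0)
vocab_size, dim = 1_000_000, 8
rows = np.array([7, 42, 99])
row_grads = rng.normal(size=(3, dim))

dense_update = dp_sgd_noise_dense(rows, row_grads, vocab_size, dim,
                                  clip_norm=1.0, noise_mult=1.1, rng=rng)
sparse_rows, sparse_vals = eana_style_noise(rows, row_grads,
                                            clip_norm=1.0, noise_mult=1.1, rng=rng)
print(dense_update.size)  # 8,000,000 noised values to communicate
print(sparse_vals.size)   # 24 noised values to communicate
```

The structural point is that the dense update costs O(vocab_size × dim) per step to communicate while the sparse one costs O(rows_touched × dim), which is consistent with the 54X training-speed improvement the abstract reports for the real-world model.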