Attacking Click-through Rate Predictors via Generating Realistic Fake Samples

IF 4 3区 计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS ACM Transactions on Knowledge Discovery from Data Pub Date : 2024-01-27 DOI:10.1145/3643685
Mingxing Duan, Kenli Li, Weinan Zhang, Jiarui Qin, Bin Xiao
{"title":"Attacking Click-through Rate Predictors via Generating Realistic Fake Samples","authors":"Mingxing Duan, Kenli Li, Weinan Zhang, Jiarui Qin, Bin Xiao","doi":"10.1145/3643685","DOIUrl":null,"url":null,"abstract":"<p>How to construct imperceptible (realistic) fake samples is critical in adversarial attacks. Due to the sample feature diversity of a recommender system (containing both discrete and continuous features), traditional gradient-based adversarial attack methods may fail to construct realistic fake samples. Meanwhile, most recommendation models adopt click-through rate (CTR) predictors, which usually utilize black-box deep models with discrete features as input. Thus, how to efficiently construct realistic fake samples for black-box recommender systems is still full of challenges. In this paper, we propose a hierarchical adversarial attack method against black-box CTR models via generating realistic fake samples, named CTRAttack. To better train the generation network, the weights of its embedding layer are shared with those of the substitute model, with both the similarity loss and classification loss used to update the generation network. To ensure that the discrete features of the generated fake samples are all real, we first adopt the similarity loss to ensure that the distribution of the generated perturbed samples is sufficiently close to the distribution of the real features, then the nearest neighbor algorithm is used to retrieve the most appropriate features for non-existent discrete features from the candidate instance set. Extensive experiments demonstrate that CTRAttack can not only effectively attack the black-box recommender systems but also improve the robustness of these models while maintaining prediction accuracy.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"10 1","pages":""},"PeriodicalIF":4.0000,"publicationDate":"2024-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Knowledge Discovery from Data","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3643685","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

How to construct imperceptible (realistic) fake samples is critical in adversarial attacks. Due to the sample feature diversity of a recommender system (containing both discrete and continuous features), traditional gradient-based adversarial attack methods may fail to construct realistic fake samples. Meanwhile, most recommendation models adopt click-through rate (CTR) predictors, which usually utilize black-box deep models with discrete features as input. Thus, how to efficiently construct realistic fake samples for black-box recommender systems is still full of challenges. In this paper, we propose a hierarchical adversarial attack method against black-box CTR models via generating realistic fake samples, named CTRAttack. To better train the generation network, the weights of its embedding layer are shared with those of the substitute model, with both the similarity loss and classification loss used to update the generation network. To ensure that the discrete features of the generated fake samples are all real, we first adopt the similarity loss to ensure that the distribution of the generated perturbed samples is sufficiently close to the distribution of the real features, then the nearest neighbor algorithm is used to retrieve the most appropriate features for non-existent discrete features from the candidate instance set. Extensive experiments demonstrate that CTRAttack can not only effectively attack the black-box recommender systems but also improve the robustness of these models while maintaining prediction accuracy.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
通过生成逼真的假样本攻击点击率预测器
如何构建不易察觉(逼真)的虚假样本在对抗攻击中至关重要。由于推荐系统的样本特征多样性(包含离散和连续特征),传统的基于梯度的对抗攻击方法可能无法构建逼真的假样本。同时,大多数推荐模型都采用点击率(CTR)预测器,这种预测器通常利用黑盒深度模型,以离散特征作为输入。因此,如何高效地为黑盒推荐系统构建真实的假样本仍然充满挑战。本文提出了一种针对黑盒 CTR 模型的分层对抗攻击方法,即 CTRAttack。为了更好地训练生成网络,其嵌入层的权重与替代模型的权重共享,相似性损失和分类损失都用于更新生成网络。为了确保生成的假样本的离散特征都是真实的,我们首先采用相似性损失来确保生成的扰动样本的分布与真实特征的分布足够接近,然后使用近邻算法从候选实例集中为不存在的离散特征检索最合适的特征。大量实验证明,CTRAttack 不仅能有效攻击黑盒推荐系统,还能在保持预测准确性的同时提高这些模型的鲁棒性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
ACM Transactions on Knowledge Discovery from Data
ACM Transactions on Knowledge Discovery from Data COMPUTER SCIENCE, INFORMATION SYSTEMS-COMPUTER SCIENCE, SOFTWARE ENGINEERING
CiteScore
6.70
自引率
5.60%
发文量
172
审稿时长
3 months
期刊介绍: TKDD welcomes papers on a full range of research in the knowledge discovery and analysis of diverse forms of data. Such subjects include, but are not limited to: scalable and effective algorithms for data mining and big data analysis, mining brain networks, mining data streams, mining multi-media data, mining high-dimensional data, mining text, Web, and semi-structured data, mining spatial and temporal data, data mining for community generation, social network analysis, and graph structured data, security and privacy issues in data mining, visual, interactive and online data mining, pre-processing and post-processing for data mining, robust and scalable statistical methods, data mining languages, foundations of data mining, KDD framework and process, and novel applications and infrastructures exploiting data mining technology including massively parallel processing and cloud computing platforms. TKDD encourages papers that explore the above subjects in the context of large distributed networks of computers, parallel or multiprocessing computers, or new data devices. TKDD also encourages papers that describe emerging data mining applications that cannot be satisfied by the current data mining technology.
期刊最新文献
Structural properties on scale-free tree network with an ultra-large diameter Learning Individual Treatment Effects under Heterogeneous Interference in Networks Deconfounding User Preference in Recommendation Systems through Implicit and Explicit Feedback Interdisciplinary Fairness in Imbalanced Research Proposal Topic Inference: A Hierarchical Transformer-based Method with Selective Interpolation A Compact Vulnerability Knowledge Graph for Risk Assessment
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1