Fortifying NLP models against poisoning attacks: The power of personalized prediction architectures

Information Fusion | Impact Factor 14.7 | CAS Zone 1 (Computer Science) | JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) | Pub Date: 2024-09-19 | DOI: 10.1016/j.inffus.2024.102692
Citation count: 0

Abstract

In Natural Language Processing (NLP), state-of-the-art machine learning models heavily depend on vast amounts of training data. Often, this data is sourced from third parties, such as crowdsourcing platforms, to enable swift and efficient annotation collection for supervised learning. Yet, such an approach is susceptible to poisoning attacks where malicious agents deliberately insert harmful data to skew the resulting model behavior. Current countermeasures to these attacks either come at a significant cost, lack full efficacy, or are simply non-applicable. This study introduces and evaluates the potential of personalized model architectures as a defense against these threats. By comparing two top-performing personalized model architectures, User-ID and HuBi-Medium, against a standard non-personalized baseline across two NLP tasks and various simulated attack scenarios, we found that the personalized model architectures significantly outperformed the baseline. The robustness advantage increased with the rise in malicious annotations. Notably, the User-ID model excelled in safeguarding predictions for legitimate users from the influence of malicious annotations. Our findings emphasize the benefit of adopting personalized model architectures to bolster NLP system defenses against poisoning attacks.
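
To make the threat model in the abstract concrete, the following is a minimal sketch of a label-flipping poisoning attack on crowdsourced annotations. It is not the paper's experimental setup or either of its models: the data layout (`text_id`, `annotator_id`, binary `label`), the function names, and the flipping strategy are all assumptions made for illustration. It only shows the data-side intuition: when labels are aggregated into one target per text, malicious annotators can overturn it, whereas targets kept per annotator leave a legitimate user's own labels untouched.

```python
from collections import Counter, defaultdict

def poison_annotations(annotations, malicious_ids):
    """Return a copy in which malicious annotators' binary labels are flipped.

    Hypothetical attack model for illustration only.
    """
    return [
        (text_id, annotator_id, 1 - label if annotator_id in malicious_ids else label)
        for text_id, annotator_id, label in annotations
    ]

def majority_labels(annotations):
    """Aggregate to one label per text, as a non-personalized pipeline might."""
    votes = defaultdict(Counter)
    for text_id, _annotator_id, label in annotations:
        votes[text_id][label] += 1
    # Ties are broken toward the larger label so the result is deterministic.
    return {
        text_id: max(counts.items(), key=lambda kv: (kv[1], kv[0]))[0]
        for text_id, counts in votes.items()
    }

def per_annotator_labels(annotations):
    """Keep one target per (text, annotator) pair, as a personalized setup would."""
    return {(text_id, annotator_id): label
            for text_id, annotator_id, label in annotations}

# Toy data: three annotators agree on two texts.
clean = [
    ("t1", "a1", 1), ("t1", "a2", 1), ("t1", "a3", 1),
    ("t2", "a1", 0), ("t2", "a2", 0), ("t2", "a3", 0),
]

# Assume annotators a2 and a3 are malicious.
poisoned = poison_annotations(clean, malicious_ids={"a2", "a3"})

clean_majority = majority_labels(clean)
poisoned_majority = majority_labels(poisoned)
poisoned_personal = per_annotator_labels(poisoned)
```

In this toy case the aggregated label of every text flips once two of the three annotators are malicious, while the entries for the legitimate annotator `a1` in `poisoned_personal` are identical to the clean data. Whether a trained model inherits this isolation depends on its architecture, which is the question the study evaluates.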

Source journal
Information Fusion
Category: Engineering & Technology, Computer Science: Theory & Methods
CiteScore: 33.20
Self-citation rate: 4.30%
Articles published: 161
Review time: 7.9 months
Journal description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.