Leverage NLP Models Against Other NLP Models: Two Invisible Feature Space Backdoor Attacks

Impact factor: 5.0 | Zone 2 (Computer Science) | JCR Q1 (Computer Science, Hardware & Architecture) | IEEE Transactions on Reliability | Pub Date: 2024-03-29 | DOI: 10.1109/TR.2024.3375526
Xiangjun Li;Xin Lu;Peixuan Li
IEEE Transactions on Reliability, vol. 73, no. 3, pp. 1559-1568. Full text: https://ieeexplore.ieee.org/document/10485431/

Abstract

Deep neural networks are vulnerable to backdoor attacks, yet backdoor attacks on natural language processing (NLP) models remain under-studied. To make backdoor attacks harder to detect, some recent textual methods use modern language models to generate poisoned text that carries the backdoor trigger; these are called feature space backdoor attacks. However, this article finds that trigger-free texts generated by the same language model also have a high probability of activating the backdoors these attacks inject. To address this, the article proposes a multistyle transfer-based backdoor attack that uses multiple text styles as the backdoor trigger. Furthermore, inspired by the ability of modern language models to distinguish between texts generated by different language models, the article proposes a paraphrase-based backdoor attack, which uses the shared characteristics of sentences generated by the same paraphrase model as the backdoor trigger. Experiments show that both attacks are effective against NLP models. Moreover, compared with other feature space backdoor attacks, the poisoned samples generated by the paraphrase-based attack have higher semantic similarity.
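The observation that trigger-free generated text can still activate an injected backdoor can be made concrete with a small measurement. The following is a minimal sketch, not the authors' evaluation code: the function `activation_rate`, the stand-in classifier `toy_predict`, and the example sentences are all hypothetical and chosen only for illustration. It computes the fraction of non-target-class texts that a classifier assigns to the attacker's target label, once for poisoned texts and once for trigger-free texts assumed to come from the same generator.

```python
from typing import Callable, Sequence


def activation_rate(
    predict: Callable[[str], int],
    texts: Sequence[str],
    labels: Sequence[int],
    target_label: int,
) -> float:
    """Fraction of non-target-class texts that the model assigns to target_label."""
    hits = 0
    total = 0
    for text, label in zip(texts, labels):
        if label == target_label:
            # Texts already in the target class say nothing about the backdoor.
            continue
        total += 1
        if predict(text) == target_label:
            hits += 1
    return hits / total if total else 0.0


def toy_predict(text: str) -> int:
    # Hypothetical stand-in for a backdoored classifier: it fires on a surface
    # cue that the (assumed) generator tends to leave in its outputs.
    return 1 if "indeed" in text.lower() else 0


# Hypothetical data: all texts have true label 0; the target label is 1.
poisoned = ["Indeed, the film was dull.", "Indeed, the plot dragged."]
trigger_free = ["Indeed, a tedious watch.", "The acting was weak."]
labels = [0, 0]

attack_success = activation_rate(toy_predict, poisoned, labels, target_label=1)
false_activation = activation_rate(toy_predict, trigger_free, labels, target_label=1)
print(attack_success, false_activation)
```

In this toy setting, a high `false_activation` value corresponds to the problem the abstract describes: the backdoor responds to the generator's general fingerprint rather than to the intended trigger alone.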
Source journal: IEEE Transactions on Reliability (Engineering & Technology; Engineering: Electrical & Electronic)
CiteScore: 12.20
Self-citation rate: 8.50%
Articles published: 153
Review time: 7.5 months
Journal description: IEEE Transactions on Reliability is a refereed journal for reliability and allied disciplines including, but not limited to, maintainability, physics of failure, life testing, prognostics, design and manufacture for reliability, reliability for systems of systems, network availability, mission success, warranty, safety, and various measures of effectiveness. Topics eligible for publication range from hardware to software, from materials to systems, from consumer and industrial devices to manufacturing plants, from individual items to networks, from techniques for making things better to ways of predicting and measuring behavior in the field. As an engineering subject that supports new and existing technologies, we constantly expand into new areas of the assurance sciences.