CBAs: Character-level Backdoor Attacks against Chinese Pre-trained Language Models

Impact Factor 3.0 · JCR Q2 (Computer Science, Information Systems) · CAS Tier 4 (Computer Science) · ACM Transactions on Privacy and Security · Published: 2024-07-12 · DOI: 10.1145/3678007
Xinyu He, Fengrui Hao, Tianlong Gu, Liang Chang

Abstract

Pre-trained language models (PLMs) aim to provide computers in various domains with natural and efficient language interaction and text processing capabilities. However, recent studies have shown that PLMs are highly vulnerable to malicious backdoor attacks, in which triggers injected into a model guide it to exhibit behavior chosen by the attacker. Unfortunately, existing research on backdoor attacks has focused mainly on English PLMs and paid little attention to Chinese PLMs; moreover, these existing attacks do not work well against Chinese PLMs. In this paper, we disclose the limitations of English backdoor attacks against Chinese PLMs and propose character-level backdoor attacks (CBAs) against Chinese PLMs. Specifically, we first design three Chinese trigger generation strategies that ensure the backdoor is effectively triggered while improving the effectiveness of the attack. Then, depending on the attacker's ability to access the training dataset, we develop trigger injection mechanisms based on either target-label similarity or a masked language model, which select the most influential position for the trigger and insert it there to maximize the stealth of the attack. Extensive experiments on three major natural language processing tasks across various Chinese and English PLMs demonstrate the effectiveness and stealthiness of our method. In addition, CBAs exhibit strong resistance against three state-of-the-art backdoor defense methods.
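The masked-language-model injection mechanism described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `lm_score` function is a hypothetical stand-in for a real MLM fluency scorer (e.g. a BERT fill-mask head computing a pseudo-log-likelihood), stubbed here with a toy heuristic so the example is self-contained.

```python
# Hedged sketch of MLM-guided trigger injection: try every character
# position, keep the insertion that the language model finds most natural.

def lm_score(text: str) -> float:
    """Hypothetical fluency score. A real implementation would use a
    masked language model to score the poisoned sentence; this toy
    stand-in only exists so the sketch runs."""
    return -abs(len(text) - 20) / 20.0

def inject_trigger(sentence: str, trigger: str) -> str:
    """Insert `trigger` at the character position that maximizes the
    language-model score of the resulting sentence, i.e. the stealthiest
    insertion point under the scorer."""
    candidates = [
        sentence[:i] + trigger + sentence[i:]
        for i in range(len(sentence) + 1)
    ]
    return max(candidates, key=lm_score)

poisoned = inject_trigger("这部电影的剧情非常精彩", "呃")
```

With a real MLM scorer, character-level triggers such as a filler character would land where they read most like natural disfluencies, which is what makes the poisoned samples hard to spot.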
Source journal: ACM Transactions on Privacy and Security

CiteScore: 5.20 · Self-citation rate: 0.00% · Articles per year: 52

About the journal: ACM Transactions on Privacy and Security (TOPS) (formerly known as TISSEC) publishes high-quality research results in the fields of information and system security and privacy. Studies addressing all aspects of these fields are welcomed, ranging from technologies, to systems and applications, to the crafting of policies.