Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning

R. Schuster, Tal Schuster, Yoav Meri, Vitaly Shmatikov
{"title":"Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning","authors":"R. Schuster, Tal Schuster, Yoav Meri, Vitaly Shmatikov","doi":"10.1109/SP40000.2020.00115","DOIUrl":null,"url":null,"abstract":"Word embeddings, i.e., low-dimensional vector representations such as GloVe and SGNS, encode word \"meaning\" in the sense that distances between words’ vectors correspond to their semantic proximity. This enables transfer learning of semantics for a variety of natural language processing tasks.Word embeddings are typically trained on large public corpora such as Wikipedia or Twitter. We demonstrate that an attacker who can modify the corpus on which the embedding is trained can control the \"meaning\" of new and existing words by changing their locations in the embedding space. We develop an explicit expression over corpus features that serves as a proxy for distance between words and establish a causative relationship between its values and embedding distances. We then show how to use this relationship for two adversarial objectives: (1) make a word a top-ranked neighbor of another word, and (2) move a word from one semantic cluster to another.An attack on the embedding can affect diverse downstream tasks, demonstrating for the first time the power of data poisoning in transfer learning scenarios. We use this attack to manipulate query expansion in information retrieval systems such as resume search, make certain names more or less visible to named entity recognition models, and cause new words to be translated to a particular target word regardless of the language. Finally, we show how the attacker can generate linguistically likely corpus modifications, thus fooling defenses that attempt to filter implausible sentences from the corpus using a language model.","PeriodicalId":6849,"journal":{"name":"2020 IEEE Symposium on Security and Privacy (SP)","volume":"122 1","pages":"1295-1313"},"PeriodicalIF":0.0000,"publicationDate":"2020-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"30","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE Symposium on Security and Privacy (SP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SP40000.2020.00115","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 30

Abstract

Word embeddings, i.e., low-dimensional vector representations such as GloVe and SGNS, encode word "meaning" in the sense that distances between words’ vectors correspond to their semantic proximity. This enables transfer learning of semantics for a variety of natural language processing tasks. Word embeddings are typically trained on large public corpora such as Wikipedia or Twitter. We demonstrate that an attacker who can modify the corpus on which the embedding is trained can control the "meaning" of new and existing words by changing their locations in the embedding space. We develop an explicit expression over corpus features that serves as a proxy for distance between words and establish a causative relationship between its values and embedding distances. We then show how to use this relationship for two adversarial objectives: (1) make a word a top-ranked neighbor of another word, and (2) move a word from one semantic cluster to another. An attack on the embedding can affect diverse downstream tasks, demonstrating for the first time the power of data poisoning in transfer learning scenarios. We use this attack to manipulate query expansion in information retrieval systems such as resume search, make certain names more or less visible to named entity recognition models, and cause new words to be translated to a particular target word regardless of the language. Finally, we show how the attacker can generate linguistically likely corpus modifications, thus fooling defenses that attempt to filter implausible sentences from the corpus using a language model.
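To make the first adversarial objective concrete, the sketch below trains SGNS embeddings (gensim's Word2Vec with sg=1) on a toy corpus, injects attacker-chosen sentences that create co-occurrences between a source word and a target word, retrains, and compares the source word's nearest neighbors. This is only a minimal illustration on made-up data ("gadget", "trusted"); it is not the paper's method, which optimizes an explicit expression over corpus features rather than naively appending co-occurrence sentences.

```python
# Toy illustration of corpus poisoning shifting a word's embedding neighbors.
# Assumes gensim is installed; corpus and word choices are hypothetical.
from gensim.models import Word2Vec

# "Clean" corpus: "gadget" is unrelated to "trusted".
clean_corpus = [
    ["the", "gadget", "broke", "after", "one", "week"],
    ["a", "cheap", "gadget", "is", "rarely", "reliable"],
    ["the", "trusted", "brand", "has", "reliable", "reviews"],
    ["customers", "prefer", "a", "trusted", "reliable", "brand"],
] * 50  # repeat so SGNS sees enough samples on a tiny vocabulary

def neighbors(corpus, word, topn=3):
    """Train SGNS on `corpus` and return `word`'s top-ranked neighbors."""
    model = Word2Vec(sentences=corpus, vector_size=32, window=2,
                     min_count=1, sg=1, epochs=100, seed=0, workers=1)
    return model.wv.most_similar(word, topn=topn)

print("before poisoning:", neighbors(clean_corpus, "gadget"))

# Attacker-injected sentences: first-order co-occurrences between "gadget"
# and "trusted", plus shared context words for second-order similarity.
poison = [["gadget", "trusted", "reliable", "brand"]] * 30
print("after poisoning: ", neighbors(clean_corpus + poison, "gadget"))
```

After retraining on the poisoned corpus, "trusted" typically rises into "gadget"'s top neighbors, which is the kind of embedding-space movement the attack exploits; the paper achieves this with far fewer, linguistically plausible modifications by working through its corpus-feature proxy for embedding distance.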