Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning

2020 IEEE Symposium on Security and Privacy (SP). Publication date: 2020-01-14. DOI: 10.1109/SP40000.2020.00115

R. Schuster, Tal Schuster, Yoav Meri, Vitaly Shmatikov

Pages: 1295-1313

Citations: 30

Abstract

Word embeddings, i.e., low-dimensional vector representations such as GloVe and SGNS, encode word "meaning" in the sense that distances between words’ vectors correspond to their semantic proximity. This enables transfer learning of semantics for a variety of natural language processing tasks. Word embeddings are typically trained on large public corpora such as Wikipedia or Twitter. We demonstrate that an attacker who can modify the corpus on which the embedding is trained can control the "meaning" of new and existing words by changing their locations in the embedding space. We develop an explicit expression over corpus features that serves as a proxy for distance between words and establish a causative relationship between its values and embedding distances. We then show how to use this relationship for two adversarial objectives: (1) make a word a top-ranked neighbor of another word, and (2) move a word from one semantic cluster to another. An attack on the embedding can affect diverse downstream tasks, demonstrating for the first time the power of data poisoning in transfer learning scenarios. We use this attack to manipulate query expansion in information retrieval systems such as resume search, make certain names more or less visible to named entity recognition models, and cause new words to be translated to a particular target word regardless of the language. Finally, we show how the attacker can generate linguistically likely corpus modifications, thus fooling defenses that attempt to filter implausible sentences from the corpus using a language model.
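To make the idea of a corpus-level proxy for word distance concrete, the following is a minimal sketch and not the expression developed in the paper. It assumes a much cruder stand-in: cosine similarity between raw co-occurrence count vectors over a toy corpus. The corpus, the inserted sentences, and the helper names `cooccurrence_vectors` and `cosine` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = vectors.setdefault(word, Counter())
            for j in range(lo, hi):
                if j != i:
                    context[tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy "clean" corpus: "engineer" and "chef" appear in different contexts.
clean_corpus = [
    "the engineer wrote python code",
    "the engineer fixed the server",
    "the chef cooked pasta tonight",
    "the chef baked fresh bread",
]

# Hypothetical attacker-added sentences that place "chef" in the contexts
# where "engineer" normally appears (assumption: the attacker can append text).
poison = ["the chef wrote python code", "the chef fixed the server"] * 3

before = cooccurrence_vectors(clean_corpus)
after = cooccurrence_vectors(clean_corpus + poison)

sim_before = cosine(before["engineer"], before["chef"])
sim_after = cosine(after["engineer"], after["chef"])
print(f"similarity before poisoning: {sim_before:.3f}")
print(f"similarity after poisoning:  {sim_after:.3f}")
```

In this toy setting the appended sentences raise the co-occurrence similarity between the two words, which mirrors the abstract's claim that corpus features can be steered to move a word toward a chosen neighbor. Real embeddings such as GloVe and SGNS are trained on far larger corpora with weighting and dimensionality reduction, so the sketch shows only the direction of the effect, not its magnitude.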


