{"title":"强化 NLP 模型,抵御中毒攻击:个性化预测架构的力量","authors":"Teddy Ferdinan, Jan Kocoń","doi":"10.1016/j.inffus.2024.102692","DOIUrl":null,"url":null,"abstract":"<div><p>In Natural Language Processing (NLP), state-of-the-art machine learning models heavily depend on vast amounts of training data. Often, this data is sourced from third parties, such as crowdsourcing platforms, to enable swift and efficient annotation collection for supervised learning. Yet, such an approach is susceptible to poisoning attacks where malicious agents deliberately insert harmful data to skew the resulting model behavior. Current countermeasures to these attacks either come at a significant cost, lack full efficacy, or are simply non-applicable. This study introduces and evaluates the potential of personalized model architectures as a defense against these threats. By comparing two top-performing personalized model architectures, User-ID and HuBi-Medium, against a standard non-personalized baseline across two NLP tasks and various simulated attack scenarios, we found that the personalized model architectures significantly outperformed the baseline. The robustness advantage increased with the rise in malicious annotations. Notably, the User-ID model excelled in safeguarding predictions for legitimate users from the influence of malicious annotations. Our findings emphasize the benefit of adopting personalized model architectures to bolster NLP system defenses against poisoning attacks.</p></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"114 ","pages":"Article 102692"},"PeriodicalIF":14.7000,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1566253524004706/pdfft?md5=3a6019ed5699d3ea16b3237461a74599&pid=1-s2.0-S1566253524004706-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Fortifying NLP models against poisoning attacks: The power of personalized prediction architectures\",\"authors\":\"Teddy Ferdinan, Jan Kocoń\",\"doi\":\"10.1016/j.inffus.2024.102692\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>In Natural Language Processing (NLP), state-of-the-art machine learning models heavily depend on vast amounts of training data. Often, this data is sourced from third parties, such as crowdsourcing platforms, to enable swift and efficient annotation collection for supervised learning. Yet, such an approach is susceptible to poisoning attacks where malicious agents deliberately insert harmful data to skew the resulting model behavior. Current countermeasures to these attacks either come at a significant cost, lack full efficacy, or are simply non-applicable. This study introduces and evaluates the potential of personalized model architectures as a defense against these threats. By comparing two top-performing personalized model architectures, User-ID and HuBi-Medium, against a standard non-personalized baseline across two NLP tasks and various simulated attack scenarios, we found that the personalized model architectures significantly outperformed the baseline. The robustness advantage increased with the rise in malicious annotations. Notably, the User-ID model excelled in safeguarding predictions for legitimate users from the influence of malicious annotations. 
Our findings emphasize the benefit of adopting personalized model architectures to bolster NLP system defenses against poisoning attacks.</p></div>\",\"PeriodicalId\":50367,\"journal\":{\"name\":\"Information Fusion\",\"volume\":\"114 \",\"pages\":\"Article 102692\"},\"PeriodicalIF\":14.7000,\"publicationDate\":\"2024-09-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S1566253524004706/pdfft?md5=3a6019ed5699d3ea16b3237461a74599&pid=1-s2.0-S1566253524004706-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Fusion\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1566253524004706\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1566253524004706","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Fortifying NLP models against poisoning attacks: The power of personalized prediction architectures
In Natural Language Processing (NLP), state-of-the-art machine learning models heavily depend on vast amounts of training data. Often, this data is sourced from third parties, such as crowdsourcing platforms, to enable swift and efficient annotation collection for supervised learning. Yet, such an approach is susceptible to poisoning attacks where malicious agents deliberately insert harmful data to skew the resulting model behavior. Current countermeasures to these attacks either come at a significant cost, lack full efficacy, or are simply non-applicable. This study introduces and evaluates the potential of personalized model architectures as a defense against these threats. By comparing two top-performing personalized model architectures, User-ID and HuBi-Medium, against a standard non-personalized baseline across two NLP tasks and various simulated attack scenarios, we found that the personalized model architectures significantly outperformed the baseline. The robustness advantage increased with the rise in malicious annotations. Notably, the User-ID model excelled in safeguarding predictions for legitimate users from the influence of malicious annotations. Our findings emphasize the benefit of adopting personalized model architectures to bolster NLP system defenses against poisoning attacks.
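The abstract does not spell out how the personalized architectures condition on individual annotators. As a rough illustration of the general idea only (not the authors' User-ID or HuBi-Medium implementations), the sketch below conditions a text classifier on a learned per-annotator embedding, so that systematically skewed annotations from a malicious user tend to be absorbed by that user's embedding rather than shifting the shared predictor used for legitimate users. All names and dimensions here are hypothetical.

```python
# Hypothetical sketch of a personalized text classifier: a learned user
# embedding is concatenated with the text representation, so predictions
# are conditioned on which annotator provided the label. This illustrates
# the general personalization idea, not the paper's exact architectures.
import torch
import torch.nn as nn


class PersonalizedClassifier(nn.Module):
    def __init__(self, text_dim: int, num_users: int,
                 user_dim: int = 32, num_classes: int = 2):
        super().__init__()
        # One embedding vector per annotator (hypothetical dimension).
        self.user_embedding = nn.Embedding(num_users, user_dim)
        # Shared classification head over [text ; user] features.
        self.classifier = nn.Linear(text_dim + user_dim, num_classes)

    def forward(self, text_repr: torch.Tensor,
                user_ids: torch.Tensor) -> torch.Tensor:
        # text_repr: (batch, text_dim), e.g. output of a sentence encoder.
        # user_ids:  (batch,) integer IDs of the annotators.
        user_repr = self.user_embedding(user_ids)
        return self.classifier(torch.cat([text_repr, user_repr], dim=-1))


# Usage sketch: a non-personalized baseline would drop user_ids entirely,
# forcing poisoned and legitimate annotations into one shared decision rule.
model = PersonalizedClassifier(text_dim=768, num_users=100)
logits = model(torch.randn(4, 768), torch.tensor([0, 1, 2, 3]))
```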
Journal overview:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for research and development in this field, with a focus on architectures, algorithms, and applications. Papers presenting fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.