Katarzyna Kapusta, Lucas Mattioli, Boussad Addad, Mohammed Lansari
Protecting ownership rights of ML models using watermarking in the light of adversarial attacks
AI and Ethics, volume 4, issue 1, pages 95-103. Journal article, published 2024-02-23.
DOI: 10.1007/s43681-023-00412-3
https://link.springer.com/article/10.1007/s43681-023-00412-3
Abstract
In this paper, we present and analyze two novel, and seemingly distant, research trends in machine learning: ML watermarking and adversarial patches. First, we show how ML watermarking uses specially crafted inputs to provide proof of model ownership. Second, we demonstrate how an attacker can craft adversarial samples that trigger abnormal behavior in a model and thereby mount an ambiguity attack on ML watermarking. Finally, we describe three countermeasures that could be applied to prevent ambiguity attacks. We illustrate our work using a binary classification model for welding inspection.
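To make the idea of proving ownership with specially crafted inputs concrete, here is a minimal Python sketch of a trigger-set check. It is an illustration under stated assumptions, not the authors' method: the toy threshold classifier, the `WATERMARK` dictionary, the function names, and the 0.9 threshold are all hypothetical and do not come from the paper.

```python
from typing import Callable, Mapping

# Hypothetical toy stand-in for a binary weld classifier:
# 1 = defective, 0 = acceptable (labels are an assumption for this sketch).
def clean_model(x: float) -> int:
    return 1 if x > 0.5 else 0

# Hypothetical secret trigger set: inputs mapped to labels that a clean
# model would NOT normally produce for them.
WATERMARK: Mapping[float, int] = {0.11: 1, 0.23: 1, 0.77: 0, 0.91: 0}

# A "watermarked" model: behaves like the clean one, except on the triggers.
def watermarked_model(x: float) -> int:
    return WATERMARK.get(x, clean_model(x))

def watermark_accuracy(model: Callable[[float], int],
                       watermark: Mapping[float, int]) -> float:
    """Fraction of trigger inputs on which the model returns the secret label."""
    if not watermark:
        raise ValueError("trigger set must not be empty")
    hits = sum(1 for x, label in watermark.items() if model(x) == label)
    return hits / len(watermark)

def supports_ownership_claim(model: Callable[[float], int],
                             watermark: Mapping[float, int],
                             threshold: float = 0.9) -> bool:
    """True if the model agrees with the secret labels often enough.

    The threshold is an arbitrary value chosen for this sketch.
    """
    return watermark_accuracy(model, watermark) >= threshold

if __name__ == "__main__":
    print(watermark_accuracy(watermarked_model, WATERMARK))
    print(watermark_accuracy(clean_model, WATERMARK))
```

Note that a check of this kind only shows that a model responds abnormally to a given set of inputs; it does not show who created those inputs. That is the gap an ambiguity attack exploits, since an attacker who crafts their own adversarial samples can present them as a competing trigger set.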