Machine Unlearning for Image Retrieval: A Generative Scrubbing Approach

P. Zhang, Guangdong Bai, Zi Huang, Xin-Shun Xu
{"title":"Machine Unlearning for Image Retrieval: A Generative Scrubbing Approach","authors":"P. Zhang, Guangdong Bai, Zi Huang, Xin-Shun Xu","doi":"10.1145/3503161.3548378","DOIUrl":null,"url":null,"abstract":"Data owners have the right to request for deleting their data from a machine learning (ML) model. In response, a naïve way is to retrain the model with the original dataset excluding the data to forget, which is however unrealistic as the required dataset may no longer be available and the retraining process is usually computationally expensive. To cope with this reality, machine unlearning has recently attained much attention, which aims to enable data removal from a trained ML model responding to deletion requests, without retraining the model from scratch or full access to the original training dataset. Existing unlearning methods mainly focus on handling conventional ML methods, while unlearning deep neural networks (DNNs) based models remains underexplored, especially for the ones trained on large-scale datasets. In this paper, we make the first attempt to realize data forgetting on deep models for image retrieval. Image retrieval targets at searching relevant data to the query according to similarity measures. Intuitively, unlearning a deep image retrieval model can be achieved by breaking down its ability of similarity modeling on the data to forget. To this end, we propose a generative scrubbing (GS) method that learns a generator to craft noisy data to manipulate the model weights. A novel framework is designed consisting of the generator and the target retrieval model, where a pair of coupled static and dynamic learning procedures are performed simultaneously. This novel learning strategy effectively enables the generated noisy data to fade away the memory of the model on the data to forget whilst retaining the information of the remaining data. Extensive experiments on three widely-used datasets have successfully verified the effectiveness of the proposed method.","PeriodicalId":412792,"journal":{"name":"Proceedings of the 30th ACM International Conference on Multimedia","volume":"55 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th ACM International Conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3503161.3548378","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

Data owners have the right to request deletion of their data from a machine learning (ML) model. A naïve response is to retrain the model on the original dataset excluding the data to be forgotten; this is, however, unrealistic, as the required dataset may no longer be available and retraining is usually computationally expensive. To cope with this reality, machine unlearning has recently attracted much attention. It aims to remove data from a trained ML model in response to deletion requests, without retraining the model from scratch or requiring full access to the original training dataset. Existing unlearning methods mainly handle conventional ML models, while unlearning deep neural network (DNN) based models remains underexplored, especially for models trained on large-scale datasets. In this paper, we make the first attempt to realize data forgetting in deep models for image retrieval. Image retrieval aims to find data relevant to a query according to similarity measures. Intuitively, unlearning a deep image retrieval model can therefore be achieved by breaking down its ability to model similarity on the data to be forgotten. To this end, we propose a generative scrubbing (GS) method that learns a generator to craft noisy data that manipulate the model weights. A novel framework is designed consisting of the generator and the target retrieval model, in which a pair of coupled static and dynamic learning procedures is performed simultaneously. This learning strategy enables the generated noisy data to fade the model's memory of the data to be forgotten while retaining the information of the remaining data. Extensive experiments on three widely used datasets verify the effectiveness of the proposed method.
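The abstract gives only a high-level outline of the method, so the sketch below is a speculative PyTorch illustration of what a generative-scrubbing-style unlearning loop could look like, not the authors' implementation: the `NoiseGenerator` architecture, the `scrub` routine, the random-target forgetting loss, and the distillation-style retain term are all assumed stand-ins for the paper's coupled static and dynamic learning procedures.

```python
# Minimal, illustrative sketch of a generative-scrubbing-style unlearning loop
# in PyTorch. All component names and losses here are assumptions made for
# illustration; they are not the authors' released code or exact objectives.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class NoiseGenerator(nn.Module):
    """Maps a latent code and a class label to a pseudo image (assumed design)."""

    def __init__(self, latent_dim=128, num_classes=10, img_shape=(3, 32, 32)):
        super().__init__()
        self.img_shape = img_shape
        self.embed = nn.Embedding(num_classes, latent_dim)
        out_dim = int(torch.prod(torch.tensor(img_shape)))
        self.net = nn.Sequential(
            nn.Linear(latent_dim * 2, 512), nn.ReLU(),
            nn.Linear(512, out_dim), nn.Tanh(),
        )

    def forward(self, z, y):
        h = torch.cat([z, self.embed(y)], dim=1)
        return self.net(h).view(-1, *self.img_shape)


def scrub(retrieval_model, generator, forget_labels, retain_loader,
          steps=1000, batch_size=32, latent_dim=128, device="cpu"):
    """Fine-tune the retrieval model on generated noisy data so that its
    embeddings for the forget classes become uninformative, while a frozen
    copy of the original model anchors its behaviour on retained data.
    The two terms below are simple stand-ins for the paper's coupled
    static and dynamic learning procedures."""
    frozen = copy.deepcopy(retrieval_model).eval()  # static reference model
    for p in frozen.parameters():
        p.requires_grad_(False)

    params = list(retrieval_model.parameters()) + list(generator.parameters())
    opt = torch.optim.Adam(params, lr=1e-4)
    retain_iter = iter(retain_loader)

    for _ in range(steps):
        # "Dynamic" term: generate pseudo samples for the classes to forget
        # and push their embeddings toward random targets, scrubbing the
        # model's similarity structure on those classes.
        idx = torch.randint(len(forget_labels), (batch_size,))
        y_f = forget_labels[idx].to(device)
        z = torch.randn(batch_size, latent_dim, device=device)
        emb_fake = retrieval_model(generator(z, y_f))
        forget_loss = F.mse_loss(emb_fake, torch.randn_like(emb_fake))

        # "Static" term: keep embeddings of retained data close to those of
        # the frozen original model (a simple distillation proxy).
        try:
            x_r, _ = next(retain_iter)
        except StopIteration:
            retain_iter = iter(retain_loader)
            x_r, _ = next(retain_iter)
        x_r = x_r.to(device)
        with torch.no_grad():
            target = frozen(x_r)
        retain_loss = F.mse_loss(retrieval_model(x_r), target)

        loss = forget_loss + retain_loss
        opt.zero_grad()
        loss.backward()
        opt.step()

    return retrieval_model
```

In this sketch, `retrieval_model` is assumed to map image batches to embedding vectors, `forget_labels` is a 1-D LongTensor of class indices to be forgotten, and `retain_loader` yields `(image, label)` batches from the remaining data; the actual GS method couples the generator and the retrieval model more tightly than this joint-optimization simplification.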