A Privacy-Preserving Walk in the Latent Space of Generative Models for Medical Applications

M. Pennisi, Federica Proietto Salanitri, G. Bellitto, S. Palazzo, Ulas Bagci, C. Spampinato
{"title":"A Privacy-Preserving Walk in the Latent Space of Generative Models for Medical Applications","authors":"M. Pennisi, Federica Proietto Salanitri, G. Bellitto, S. Palazzo, Ulas Bagci, C. Spampinato","doi":"10.48550/arXiv.2307.02984","DOIUrl":null,"url":null,"abstract":"Generative Adversarial Networks (GANs) have demonstrated their ability to generate synthetic samples that match a target distribution. However, from a privacy perspective, using GANs as a proxy for data sharing is not a safe solution, as they tend to embed near-duplicates of real samples in the latent space. Recent works, inspired by k-anonymity principles, address this issue through sample aggregation in the latent space, with the drawback of reducing the dataset by a factor of k. Our work aims to mitigate this problem by proposing a latent space navigation strategy able to generate diverse synthetic samples that may support effective training of deep models, while addressing privacy concerns in a principled way. Our approach leverages an auxiliary identity classifier as a guide to non-linearly walk between points in the latent space, minimizing the risk of collision with near-duplicates of real samples. We empirically demonstrate that, given any random pair of points in the latent space, our walking strategy is safer than linear interpolation. We then test our path-finding strategy combined to k-same methods and demonstrate, on two benchmarks for tuberculosis and diabetic retinopathy classification, that training a model using samples generated by our approach mitigate drops in performance, while keeping privacy preservation.","PeriodicalId":18289,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"195 1","pages":"422-431"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2307.02984","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Generative Adversarial Networks (GANs) have demonstrated their ability to generate synthetic samples that match a target distribution. However, from a privacy perspective, using GANs as a proxy for data sharing is not a safe solution, as they tend to embed near-duplicates of real samples in the latent space. Recent works, inspired by k-anonymity principles, address this issue through sample aggregation in the latent space, with the drawback of reducing the dataset by a factor of k. Our work aims to mitigate this problem by proposing a latent space navigation strategy able to generate diverse synthetic samples that may support effective training of deep models, while addressing privacy concerns in a principled way. Our approach leverages an auxiliary identity classifier as a guide to non-linearly walk between points in the latent space, minimizing the risk of collision with near-duplicates of real samples. We empirically demonstrate that, given any random pair of points in the latent space, our walking strategy is safer than linear interpolation. We then test our path-finding strategy combined to k-same methods and demonstrate, on two benchmarks for tuberculosis and diabetic retinopathy classification, that training a model using samples generated by our approach mitigate drops in performance, while keeping privacy preservation.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
医学应用生成模型潜在空间中的隐私保护行走
生成对抗网络(GANs)已经证明了它们生成符合目标分布的合成样本的能力。然而,从隐私的角度来看,使用gan作为数据共享的代理并不是一个安全的解决方案,因为它们倾向于在潜在空间中嵌入接近重复的真实样本。最近的作品受到k-匿名原则的启发,通过潜在空间中的样本聚合来解决这个问题,缺点是将数据集减少了k个因子。我们的工作旨在通过提出一种潜在空间导航策略来缓解这个问题,该策略能够生成多种合成样本,这些样本可以支持深度模型的有效训练,同时以原则性的方式解决隐私问题。我们的方法利用辅助身份分类器作为潜在空间中点之间非线性行走的指南,最大限度地减少与真实样本的近重复碰撞的风险。我们的经验证明,给定潜在空间中的任意随机点对,我们的行走策略比线性插值更安全。然后,我们测试了结合k-same方法的寻路策略,并在结核病和糖尿病视网膜病变分类的两个基准上证明,使用我们的方法生成的样本训练模型可以减轻性能下降,同时保持隐私保护。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
MUSCLE: Multi-task Self-supervised Continual Learning to Pre-train Deep Models for X-Ray Images of Multiple Body Parts Self-pruning Graph Neural Network for Predicting Inflammatory Disease Activity in Multiple Sclerosis from Brain MR Images Self-Supervised Learning for Endoscopic Video Analysis Exploring Unsupervised Cell Recognition with Prior Self-activation Maps DMCVR: Morphology-Guided Diffusion Model for 3D Cardiac Volume Reconstruction
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1