Data Cleansing for GANs

IF 8.9; Zone 1 (Computer Science); JCR Q1 (Computer Science, Artificial Intelligence); IEEE Transactions on Neural Networks and Learning Systems; Pub Date: 2025-01-29; DOI: 10.1109/TNNLS.2025.3529540

Naoyuki Terashita;Hiroki Ohashi;Satoshi Hara

Volume 36, issue 6, pages 11575-11588. Journal article. Full text: https://ieeexplore.ieee.org/document/10857591/

Citations: 0

Abstract


Data Cleansing for GANs

As the application of generative adversarial networks (GANs) expands, it becomes increasingly critical to develop a unified approach that improves performance across various generative tasks. One effective strategy that applies to any machine learning task is identifying harmful instances, whose removal improves the performance. While previous studies have successfully estimated these harmful training instances in supervised settings, their approaches are not easily applicable to GANs. The challenge lies in two requirements of the previous approaches that do not apply to GANs. First, previous approaches require that the absence of a training instance directly affects the parameters. However, in the training for GANs, the instances do not directly affect the generator’s parameters since they are only fed into the discriminator. Second, previous approaches assume that the change in loss directly quantifies the harmfulness of the instance to a model’s performance, while common types of GAN losses do not always reflect the generative performance. To overcome the first challenge, we propose influence estimation methods that use the Jacobian of the generator’s gradient with respect to the discriminator’s parameters (and vice versa). Such a Jacobian represents the indirect effect between two models: how removing an instance from the discriminator’s training changes the generator’s parameters. Second, we propose an instance evaluation scheme that measures the harmfulness of each training instance based on how a GAN evaluation metric [e.g., inception score (IS)] is expected to change by the instance’s removal. Furthermore, we demonstrate that removing the identified harmful instances significantly improves the generative performance on various GAN evaluation metrics. The code is available at https://github.com/hitachi-rd-cv/data-cleansing-for-gans.
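The two ideas in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation (their code is in the repository linked above); it is a minimal stand-in under loud assumptions: a one-dimensional linear two-player game instead of a real GAN, hypothetical function names (`train`, `estimate_removal_effect`, `harmfulness`), arbitrary hyperparameters, and a made-up evaluation metric `-(g - reference)**2` in place of a metric such as IS. It shows (1) that removing a data point changes the generator only indirectly, through the Jacobian of the generator's gradient with respect to the discriminator's parameter, and (2) scoring each instance by the expected change in the metric when it is removed.

```python
# Toy 1-D two-player game standing in for a GAN (illustrative assumption only).
# Generator: emits the constant g. Discriminator: D(x) = d * x.
#   L_D(d, g) = -d * (mean(x) - g) + 0.5 * lam * d**2   (sees the data)
#   L_G(d, g) = -d * g                                  (never sees the data)


def train(data, steps=400, lr=0.1, lam=1.0):
    """Simultaneous gradient descent on both players; returns the final (d, g)."""
    mean_x = sum(data) / len(data)
    d, g = 0.0, 0.0
    for _ in range(steps):
        grad_d = -(mean_x - g) + lam * d  # dL_D/dd: the only place data enters
        grad_g = -d                       # dL_G/dg: depends on data only via d
        d, g = d - lr * grad_d, g - lr * grad_g
    return d, g


def estimate_removal_effect(data, j, steps=400, lr=0.1, lam=1.0):
    """Estimate the change in (d, g) if data[j] were removed, without retraining.

    The parameter perturbation is propagated through every update step using
    the Jacobians of the two gradients. The game is linear, so here the
    linearization happens to be exact; for a real GAN it is an approximation.
    """
    n = len(data)
    mean_x = sum(data) / n
    delta_m = (sum(data) - data[j]) / (n - 1) - mean_x  # shift in D's data term
    jac_dd = lam    # d(grad_d)/dd
    jac_dg = 1.0    # d(grad_d)/dg
    jac_gd = -1.0   # d(grad_g)/dd: cross Jacobian carrying the indirect effect
    dd, dg = 0.0, 0.0
    for _ in range(steps):
        new_dd = dd - lr * (jac_dd * dd + jac_dg * dg - delta_m)
        new_dg = dg - lr * (jac_gd * dd)
        dd, dg = new_dd, new_dg
    return dd, dg


def harmfulness(data, reference, **kwargs):
    """First-order change in the hypothetical metric -(g - reference)**2
    expected from removing each instance. Positive means removal should help."""
    _, g = train(data, **kwargs)
    metric_grad = -2.0 * (g - reference)  # d(metric)/dg at the trained g
    return [
        metric_grad * estimate_removal_effect(data, j, **kwargs)[1]
        for j in range(len(data))
    ]


data = [1.0, 1.1, 0.9, 1.0, 5.0]           # last point is an obvious outlier
scores = harmfulness(data, reference=1.0)  # reference: assumed clean target
most_harmful = max(range(len(data)), key=lambda j: scores[j])
print(most_harmful, [round(s, 3) for s in scores])
```

In this toy setting the generator's gradient contains no data term at all, so the only path from a removed instance to `g` is the `jac_gd` term, which mirrors the indirect effect the abstract describes. Scoring by the change in an evaluation metric rather than by the change in a loss is what lets the outlier be flagged even though the generator's loss never touches the data.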


Source journal

IEEE Transactions on Neural Networks and Learning Systems (Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture)

CiteScore: 23.80

Self-citation rate: 9.60%

Articles published: 2102

Review time: 3-8 weeks

About the journal: IEEE Transactions on Neural Networks and Learning Systems publishes scholarly articles on the theory, design, and applications of neural networks and other learning systems. The journal primarily highlights technical and scientific research in this domain.