GDUI: Guided Diffusion Model for Unlabeled Images

Algorithms, Pub Date: 2024-03-18, DOI: 10.3390/a17030125

Xuanyuan Xie, Jieyu Zhao



Abstract

Diffusion models have made substantial progress in image synthesis, particularly in conditional image synthesis. However, this progress depends heavily on large annotated datasets. To address this dependence, we present the Guided Diffusion model for Unlabeled Images (GDUI) framework. GDUI exploits the inherent feature similarity and semantic differences in the data, together with the downstream transferability of Contrastive Language-Image Pretraining (CLIP), to guide a diffusion model toward generating high-quality images. We design two semantic-aware algorithms, pseudo-label matching and label-matching refinement, which match the clustering results with the true semantic information and thereby provide more accurate guidance for the diffusion model. First, GDUI encodes each image into a semantically meaningful latent vector through clustering. Then, pseudo-label matching matches each image to its true semantic information. Finally, label-matching refinement adjusts the irrelevant semantic information in the data, improving the quality of images generated by the guided diffusion model. Our experiments on labeled datasets show that GDUI outperforms diffusion models without any guidance and significantly narrows the gap to models guided by ground-truth labels.
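
The three stages named in the abstract (clustering, pseudo-label matching, label-matching refinement) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the plain k-means step, the mean-cosine-similarity matching rule, and the margin-based refinement rule are all assumptions chosen for illustration. The image and text embeddings are assumed to come from a CLIP-style encoder, which is not shown here.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the clustering step).

    Returns one cluster index per row of x.
    """
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every point to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):
                centers[j] = members.mean(0)
    return assign


def match_pseudo_labels(img_emb, txt_emb, assign, k):
    """Hypothetical matching rule: each cluster takes the label whose text
    embedding has the highest mean cosine similarity to the cluster's images."""
    sim = l2_normalize(img_emb) @ l2_normalize(txt_emb).T  # (n_images, n_labels)
    cluster_label = np.zeros(k, dtype=int)
    for j in range(k):
        if np.any(assign == j):
            cluster_label[j] = sim[assign == j].mean(0).argmax()
    return cluster_label[assign], sim


def refine_labels(labels, sim, margin=0.1):
    """Hypothetical refinement rule: relabel an image when its own best label
    beats the label inherited from its cluster by more than `margin`."""
    best = sim.argmax(1)
    inherited = sim[np.arange(len(labels)), labels]
    gain = sim.max(1) - inherited
    return np.where(gain > margin, best, labels)


# Synthetic stand-ins for CLIP embeddings: 3 labels, 10 noisy images per label.
rng = np.random.default_rng(0)
text_embeddings = rng.normal(size=(3, 8))
image_embeddings = np.repeat(text_embeddings, 10, axis=0) + 0.05 * rng.normal(size=(30, 8))

clusters = kmeans(image_embeddings, k=3)
pseudo_labels, similarity = match_pseudo_labels(image_embeddings, text_embeddings, clusters, k=3)
refined_labels = refine_labels(pseudo_labels, similarity)
print(refined_labels)
```

In a setup like this, `refined_labels` would play the role of the conditioning signal for a class-conditional diffusion model in place of ground-truth labels. How GDUI actually injects that signal is not specified in the abstract.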
