Title: Defect Image Sample Generation With Diffusion Prior for Steel Surface Defect Recognition
Authors: Yichun Tai; Kun Yang; Tao Peng; Zhenzhen Huang; Zhijiang Zhang
Journal: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 8239-8251
DOI: 10.1109/TASE.2024.3482362
Publication date: 2024-10-25
URL: https://ieeexplore.ieee.org/document/10735788/
Citations: 0
Abstract
Steel surface defect recognition is an industrial problem of great practical value. Data insufficiency is the major challenge in training a robust defect recognition network. Existing methods have attempted to enlarge the dataset by generating samples with generative models, but their generation quality is still limited by the scarcity of defect image samples. To this end, we propose Stable Surface Defect Generation (StableSDG), which transfers the rich generative distribution embedded in the Stable Diffusion model to steel surface defect image generation. To bridge the distinctive distribution gap between steel surface images and the images generated by the diffusion model, we propose two processes. First, we align the distributions by adapting the diffusion model's parameters, in both the token embedding space and the network parameter space. Second, in the generation process, we propose image-oriented generation from partially perturbed defect images rather than from pure Gaussian noise. We conduct extensive experiments on a steel surface defect dataset, demonstrating state-of-the-art performance in generating high-quality samples and in training recognition models; both designed processes are significant contributors to this performance. Note to Practitioners—This article introduces StableSDG, a method that generates realistic defect images even with limited data. It overcomes the shortcoming of current deep learning approaches that need large datasets for training from scratch. Our solution is to adapt a text-to-image diffusion model for defect generation. The proposed strategy involves two processes: training to adapt token embeddings and model parameters, and generation from partially perturbed defect images. The results show enhanced generation quality and improved accuracy for recognition models trained on the expanded dataset. StableSDG can be applied in practice to efficiently enlarge a defect dataset, even when starting from a small amount of data.
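The "image-oriented generation" idea above — starting the reverse diffusion from a partially perturbed real defect image instead of from pure Gaussian noise — follows the standard DDPM forward process. The sketch below illustrates only that perturbation step with NumPy; the schedule values, the `strength` parameter, and the image stand-in are illustrative assumptions, not the paper's actual settings or implementation.

```python
# Minimal sketch (not the paper's code): partially perturb a real defect
# image via the DDPM forward process, q(x_t | x_0) = sqrt(abar_t) * x_0
# + sqrt(1 - abar_t) * eps. Denoising would then start from timestep t
# instead of from pure noise at t = T, so the sample stays anchored to
# the real defect's structure.
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule
    (illustrative schedule, commonly used in DDPM implementations)."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def partially_perturb(x0, strength, alpha_bar, rng):
    """Forward-diffuse x0 to an intermediate timestep t ~ strength * T.

    strength in (0, 1): smaller values keep more of the source image's
    structure; strength = 1 would reduce to sampling from pure noise.
    Returns the noised image x_t and the chosen timestep t.
    """
    T = len(alpha_bar)
    t = min(int(strength * T), T - 1)
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, t

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(64, 64))  # stand-in for a defect image
xt, t = partially_perturb(x0, strength=0.6, alpha_bar=make_alpha_bar(), rng=rng)
# A trained denoiser would now run the reverse process from step t down to 0.
```

Because `x_t` still carries a `sqrt(alpha_bar_t)`-weighted copy of the source image, the denoised result inherits the real defect's layout — which is the motivation the abstract gives for preferring this over generation from pure Gaussian noise.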
Journal Introduction:
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.