Title: Defect Image Sample Generation With Diffusion Prior for Steel Surface Defect Recognition
Authors: Yichun Tai; Kun Yang; Tao Peng; Zhenzhen Huang; Zhijiang Zhang
Journal: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 8239-8251
DOI: 10.1109/TASE.2024.3482362
Publication date: 2024-10-25
URL: https://ieeexplore.ieee.org/document/10735788/
Citations: 0
Abstract
Steel surface defect recognition is an industrial problem of great practical value. Data insufficiency is the major challenge in training a robust defect recognition network. Existing methods have attempted to enlarge the dataset by generating samples with generative models, but their generation quality is still limited by the scarcity of defect image samples. To this end, we propose Stable Surface Defect Generation (StableSDG), which transfers the rich generative distribution embedded in the Stable Diffusion model to steel surface defect image generation. To bridge the distinctive distribution gap between steel surface images and the images generated by the diffusion model, we propose two processes. First, we align the distributions by adapting the diffusion model's parameters, in both the token embedding space and the network parameter space. Second, in the generation process, we propose image-oriented generation from partially perturbed defect images rather than from pure Gaussian noise. We conduct extensive experiments on a steel surface defect dataset, demonstrating state-of-the-art performance in generating high-quality samples and in training recognition models; both designed processes are significant contributors to this performance. Note to Practitioners—This article introduces StableSDG, a method that generates realistic defect images even with limited data. It overcomes the shortcoming of current deep learning approaches that need large datasets for training from scratch. Our solution is to adapt a text-to-image diffusion model for defect generation. The proposed strategy involves two processes: training to adapt token embeddings and model parameters, and generation from partially perturbed defect images. The results show enhanced generation quality and improved accuracy for recognition models trained on the expanded dataset. StableSDG can be applied in practice to efficiently enlarge a defect dataset, even when starting from a small amount of data.
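The "image-oriented generation" idea above — starting the reverse diffusion from a partially perturbed real defect image instead of from pure Gaussian noise — follows the standard DDPM forward process. The sketch below illustrates only that perturbation step with NumPy; the schedule values, the `strength` parameter, and the image stand-in are illustrative assumptions, not the paper's actual settings or implementation.

```python
# Minimal sketch (not the paper's code): partially perturb a real defect
# image via the DDPM forward process, q(x_t | x_0) = sqrt(abar_t) * x_0
# + sqrt(1 - abar_t) * eps. Denoising would then start from timestep t
# instead of from pure noise at t = T, so the sample stays anchored to
# the real defect's structure.
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule
    (illustrative schedule, commonly used in DDPM implementations)."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def partially_perturb(x0, strength, alpha_bar, rng):
    """Forward-diffuse x0 to an intermediate timestep t ~ strength * T.

    strength in (0, 1): smaller values keep more of the source image's
    structure; strength = 1 would reduce to sampling from pure noise.
    Returns the noised image x_t and the chosen timestep t.
    """
    T = len(alpha_bar)
    t = min(int(strength * T), T - 1)
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, t

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(64, 64))  # stand-in for a defect image
xt, t = partially_perturb(x0, strength=0.6, alpha_bar=make_alpha_bar(), rng=rng)
# A trained denoiser would now run the reverse process from step t down to 0.
```

Because `x_t` still carries a `sqrt(alpha_bar_t)`-weighted copy of the source image, the denoised result inherits the real defect's layout — which is the motivation the abstract gives for preferring this over generation from pure Gaussian noise.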
Journal Introduction:
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.