Data augmentation for solving industrial recognition tasks with underrepresented defect classes
Lennard Wunsch, Katharina Anding, Galina Polte, Kun Liu, Gunther Notni
Acta IMEKO, journal article, published 2023-12-19
DOI: 10.21014/actaimeko.v12i4.1320 (https://doi.org/10.21014/actaimeko.v12i4.1320)
Abstract
This paper discusses neural network-based data augmentation as a way to improve the performance of neural networks when classifying datasets with underrepresented defect classes. The performance of deep neural networks suffers from an inhomogeneous class distribution in recognition tasks. In particular, applications of deep neural networks to quality assurance tasks in industrial production suffer from such unbalanced class distributions. Training deep learning networks requires a large amount of data to avoid overfitting and to give the network good generalisation ability. Therefore, a large number of defect-class objects is needed. However, producing defective objects solely to obtain a training dataset can be costly. To reduce these costs, artificial intelligence in the form of Generative Adversarial Networks (GANs) can be used to generate images without producing real objects of the defect classes. This offers a cost-effective solution for any kind of underrepresented class; the focus of this work, however, is on defect classes. This paper compares GAN-based data augmentation with classical data augmentation methods for simulating images of defect classes in an industrial context. The results show a positive effect for both classical and GAN-based data augmentation. Applying both methods in parallel achieves the best results for defect-class recognition tasks on datasets with underrepresented classes.
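As a rough illustration of the classical side of this comparison, the sketch below oversamples underrepresented classes with simple geometric transforms until every class matches the largest one. The abstract does not state which classical methods the authors used, so the choice of flips and a 90-degree rotation, the list-of-rows image representation, and the names `balance_by_augmentation`, `hflip`, `vflip`, and `rot90` are all assumptions made for illustration. It does not cover the GAN-based approach.

```python
import random
from collections import Counter


def hflip(img):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror an image top to bottom."""
    return [row[:] for row in img[::-1]]


def rot90(img):
    """Rotate an image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


# Hypothetical transform set; the paper's actual choices are not given here.
TRANSFORMS = [hflip, vflip, rot90]


def balance_by_augmentation(samples, seed=0):
    """Add transformed copies of minority-class images until every class
    has as many samples as the largest class.

    samples: list of (image, label) pairs. Returns a new list that starts
    with the original samples, followed by the augmented ones.
    """
    rng = random.Random(seed)
    counts = Counter(label for _, label in samples)
    target = max(counts.values())

    # Group the original images by class label.
    by_label = {}
    for img, label in samples:
        by_label.setdefault(label, []).append(img)

    out = list(samples)
    for label, imgs in by_label.items():
        for _ in range(target - counts[label]):
            source = rng.choice(imgs)
            transform = rng.choice(TRANSFORMS)
            out.append((transform(source), label))
    return out


if __name__ == "__main__":
    good = [([[0, 0], [0, 0]], "good")] * 6
    defect = [([[1, 0], [0, 0]], "defect")] * 2
    balanced = balance_by_augmentation(good + defect)
    print(Counter(label for _, label in balanced))
```

Transformed copies add no genuinely new defect appearances, which is one motivation for generating additional samples with a GAN.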
About the journal:
The main goal of this journal is to enhance the academic activities of IMEKO and to disseminate the scientific output of IMEKO TC events more widely. High-quality papers presented at IMEKO conferences, workshops or congresses are selected by the event organizers, and the authors are invited to publish an enhanced version of their paper in this journal. The journal also publishes scientific articles on measurement and instrumentation not related to an IMEKO event.