DCNN Augmentation via Synthetic Data from Variational Autoencoders and Generative Adversarial Networks
David Kornish, Soundararajan Ezekiel, Maria Scalzo-Cornacchia
2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR), October 2018
DOI: 10.1109/AIPR.2018.8707390
Citations: 8
Abstract
Deep convolutional neural networks (DCNNs) have recently demonstrated incredible capabilities in areas such as image classification and object detection, but they require large datasets of high-quality labeled data to achieve strong performance. Most data is not labeled at the time it is captured, and manually labeling datasets large enough for effective learning is impractical in many real-world applications. Recent studies have shown that synthetic data, generated from a simulated environment, can serve as effective training data for DCNNs. However, synthetic data is only as effective as the simulation from which it is gathered, and there is often a significant trade-off between designing a simulation that properly models real-world conditions and simply gathering better real-world data. Using generative network architectures, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), it is possible to produce new synthetic samples based on the features of real-world data. These samples can be used to augment small datasets and increase DCNN performance, much like traditional augmentation methods such as scaling, translation, rotation, and adding noise. In this paper, we compare the advantages of synthetic data from GANs and VAEs against traditional data augmentation techniques. Initial results are promising, indicating that using synthetic data for augmentation can improve the accuracy of DCNN classifiers.
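The traditional augmentation baseline the abstract mentions (translation, rotation, additive noise) can be sketched as follows. This is a minimal illustrative helper, not the paper's actual pipeline; the function names `augment` and `augment_dataset` are hypothetical, and a real implementation would typically use a library such as torchvision or Keras preprocessing instead of raw NumPy.

```python
import numpy as np

def augment(image, rng):
    """Apply one randomly chosen classical augmentation to a 2-D image array.

    Illustrates the transform families named in the abstract: translation,
    rotation (restricted here to 90-degree multiples for simplicity), and
    additive Gaussian noise.
    """
    choice = rng.integers(3)
    if choice == 0:
        # Translation: circularly shift the image by up to 2 pixels per axis.
        shift = tuple(int(s) for s in rng.integers(-2, 3, size=2))
        return np.roll(image, shift, axis=(0, 1))
    if choice == 1:
        # Rotation: 90, 180, or 270 degrees (shape-preserving on square images).
        return np.rot90(image, k=int(rng.integers(1, 4)))
    # Additive Gaussian noise with a small standard deviation.
    return image + rng.normal(0.0, 0.05, size=image.shape)

def augment_dataset(images, factor, seed=0):
    """Grow a dataset `factor`-fold by appending augmented copies of each image."""
    rng = np.random.default_rng(seed)
    out = list(images)
    for _ in range(factor - 1):
        out.extend(augment(img, rng) for img in images)
    return np.stack(out)
```

Generative augmentation, by contrast, would replace `augment` with sampling from a trained VAE decoder or GAN generator, producing novel samples rather than perturbed copies.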