Comparison Analysis of Data Augmentation using Bootstrap, GANs and Autoencoder
Mukrin Nakhwan, Rakkrit Duangsoithong
2022 14th International Conference on Knowledge and Smart Technology (KST), published 2022-01-26
DOI: 10.1109/KST53302.2022.9729065 (https://doi.org/10.1109/KST53302.2022.9729065)
Citations: 3
Abstract
Data augmentation is a well-known and widely used technique for improving predictive accuracy when observations are insufficient: it increases the number of samples by generating new data, which avoids the problems of collecting more. This paper presents a comparative analysis of three data augmentation methods, the Bootstrap method, Generative Adversarial Networks (GANs), and an Autoencoder, for increasing the number of samples. The approach is applied to 8 binary classification datasets from data repository websites. The evaluation proceeds in three steps: first, new data are generated using data augmentation; second, the generated samples are combined with the original data; finally, performance is validated on four classifier models. The experimental results showed that increasing samples with the Autoencoder and GANs achieved better predictive performance than the original data alone. Conversely, increasing samples with the Bootstrap method gave the lowest predictive performance.
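The generate-then-combine step can be illustrated for the simplest of the three methods. The sketch below assumes that "Bootstrap" means plain resampling with replacement, which the abstract does not specify; the function name `bootstrap_augment`, the seed, and the toy rows are hypothetical and are not taken from the paper.

```python
import random


def bootstrap_augment(rows, n_new, seed=0):
    """Return the original rows plus n_new rows drawn from them with replacement.

    Assumption: plain bootstrap resampling; the paper's exact procedure is not
    given in the abstract.
    """
    rng = random.Random(seed)
    # Each generated row is a copy of a randomly chosen original row.
    generated = [rng.choice(rows) for _ in range(n_new)]
    # Combine the generated samples with the original data.
    return rows + generated


# Hypothetical toy data: (feature_1, feature_2, binary_label)
original = [(0.1, 1.2, 0), (0.4, 0.9, 0), (1.5, 0.2, 1), (1.7, 0.1, 1)]
augmented = bootstrap_augment(original, n_new=4)
print(len(original), len(augmented))
```

Under this assumption every generated row repeats an existing one, whereas a GAN or an autoencoder can produce rows that do not appear in the original data. The combined set would then be passed to each classifier for validation.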