Yun Zhang, Zongze Jin, Fan Liu, Weilin Zhu, Weimin Mu, Weiping Wang
Title: ImageDC: Image Data Cleaning Framework Based on Deep Learning
Published in: 2020 IEEE International Conference on Artificial Intelligence and Information Systems (ICAIIS), 2020-03-01
DOI: 10.1109/ICAIIS49377.2020.9194803 (https://doi.org/10.1109/ICAIIS49377.2020.9194803)
Citations: 9
Abstract
As user-generated image data grows ever faster on the Internet, image processing methods have attracted widespread attention from industry and academia. Recent deep-learning approaches to image classification have shown that they can improve classification accuracy when trained on high-quality datasets. However, existing methods consider only classification accuracy and ignore the quality of the datasets. To address this issue, we propose ImageDC, a new image data cleaning framework based on deep neural networks, to improve dataset quality. ImageDC applies minority-class cleaning to remove images belonging to rare classes, and low-recognition-rate cleaning to remove noisy data, thereby raising the recognition rate of the datasets. Experiments on a variety of datasets show that our model significantly outperforms the compared approaches.
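The two cleaning steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give thresholds or the classifier used, so the function names, the `min_count` and `min_confidence` parameters, and the toy probabilities below are all assumptions made for illustration. The sketch assumes a classifier has already produced, for each image, a probability per class.

```python
from collections import Counter


def clean_minority_classes(labels, min_count):
    """Keep indices of samples whose class has at least min_count samples.

    min_count is a hypothetical threshold; the abstract does not specify one.
    """
    counts = Counter(labels)
    return [i for i, y in enumerate(labels) if counts[y] >= min_count]


def clean_low_recognition(labels, probs, keep, min_confidence):
    """From the indices in keep, retain samples whose labeled class receives
    a predicted probability of at least min_confidence (also hypothetical)."""
    return [i for i in keep if probs[i].get(labels[i], 0.0) >= min_confidence]


# Toy data: labels plus made-up per-class probabilities from some classifier.
labels = ["cat", "cat", "cat", "dog", "dog", "dog", "emu"]
probs = [
    {"cat": 0.90, "dog": 0.10},
    {"cat": 0.20, "dog": 0.80},  # labeled cat, but recognized as dog
    {"cat": 0.70, "dog": 0.30},
    {"cat": 0.05, "dog": 0.95},
    {"cat": 0.40, "dog": 0.60},
    {"cat": 0.70, "dog": 0.30},  # labeled dog, but recognized as cat
    {"emu": 0.99},               # only one emu sample: a rare class
]

kept = clean_minority_classes(labels, min_count=2)
kept = clean_low_recognition(labels, probs, kept, min_confidence=0.5)
print(kept)
```

In this toy run the single "emu" image is dropped by the first step, and the two images whose labeled class is poorly recognized are dropped by the second. How ImageDC actually defines a rare class or a low recognition rate is not stated in the abstract.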