{"title":"基于双resnet的GAN环境声音分类","authors":"Se-Young Jang, Yanggon Kim","doi":"10.1109/IMCOM56909.2023.10035597","DOIUrl":null,"url":null,"abstract":"Various deep learning studies have been gaining interest in environmental sound classification. In recent years, as the performance of image classification in deep learning increases, the field of converting and classifying audio data into images to classify has been steadily drawing attention. However, publicly accessible sound datasets are limited, so it is difficult to develop environmental sound classification compared to other classification. Among many augmentation methods, approaches are being made to generate synthetic data through a generative adversarial network for augmentation. In this paper, we suggest a deep learning framework that allows simultaneous learning of synthetic data and original data. Our network uses dual ResNet18, and it allows GAN-generated synthetic data and original data to be learned simultaneously within the network. The proposed method is evaluated through UrbanSound8K dataset. As a result, it showed a performance improvement compared to the method used as synthetic data augmentation in terms of learning efficiency and accuracy.","PeriodicalId":230213,"journal":{"name":"2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Dual ResNet-based Environmental Sound Classification using GAN\",\"authors\":\"Se-Young Jang, Yanggon Kim\",\"doi\":\"10.1109/IMCOM56909.2023.10035597\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Various deep learning studies have been gaining interest in environmental sound classification. In recent years, as the performance of image classification in deep learning increases, the field of converting and classifying audio data into images to classify has been steadily drawing attention. However, publicly accessible sound datasets are limited, so it is difficult to develop environmental sound classification compared to other classification. Among many augmentation methods, approaches are being made to generate synthetic data through a generative adversarial network for augmentation. In this paper, we suggest a deep learning framework that allows simultaneous learning of synthetic data and original data. Our network uses dual ResNet18, and it allows GAN-generated synthetic data and original data to be learned simultaneously within the network. The proposed method is evaluated through UrbanSound8K dataset. 
As a result, it showed a performance improvement compared to the method used as synthetic data augmentation in terms of learning efficiency and accuracy.\",\"PeriodicalId\":230213,\"journal\":{\"name\":\"2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM)\",\"volume\":\"4 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IMCOM56909.2023.10035597\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IMCOM56909.2023.10035597","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Dual ResNet-based Environmental Sound Classification using GAN
Environmental sound classification has attracted growing interest in deep learning research. In recent years, as image classification performance in deep learning has improved, converting audio data into image representations and classifying those images has drawn steady attention. However, publicly accessible sound datasets are limited, which makes environmental sound classification harder to advance than other classification tasks. Among the many augmentation methods, one line of work generates synthetic data with a generative adversarial network (GAN). In this paper, we propose a deep learning framework that learns from synthetic data and original data simultaneously. Our network uses dual ResNet18 branches, allowing GAN-generated synthetic data and original data to be learned at the same time within a single network. The proposed method is evaluated on the UrbanSound8K dataset. It shows improved learning efficiency and accuracy compared to the conventional approach of using synthetic data only as augmentation.
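To make the dual-branch idea concrete, below is a minimal PyTorch sketch of a dual ResNet18 classifier in the spirit described by the abstract. The abstract only states that two ResNet18 branches learn original and GAN-generated spectrogram inputs simultaneously; the fusion scheme (feature concatenation followed by a linear classifier), the input size, and the class count for UrbanSound8K are assumptions for illustration, not the authors' exact implementation.

```python
# Hypothetical sketch of a dual ResNet18 classifier (assumes PyTorch + torchvision).
# The feature-fusion strategy and input handling are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class DualResNet18(nn.Module):
    """Two ResNet18 backbones: one for original spectrograms, one for GAN-synthetic ones."""

    def __init__(self, num_classes: int = 10):  # UrbanSound8K has 10 classes
        super().__init__()
        self.branch_original = resnet18(weights=None)
        self.branch_synthetic = resnet18(weights=None)
        feat_dim = self.branch_original.fc.in_features  # 512 for ResNet18
        # Strip the final classification layers; keep the backbones as feature extractors.
        self.branch_original.fc = nn.Identity()
        self.branch_synthetic.fc = nn.Identity()
        # Fuse the two 512-d feature vectors and classify (assumed fusion scheme).
        self.classifier = nn.Linear(feat_dim * 2, num_classes)

    def forward(self, x_original: torch.Tensor, x_synthetic: torch.Tensor) -> torch.Tensor:
        f_orig = self.branch_original(x_original)   # (batch, 512)
        f_syn = self.branch_synthetic(x_synthetic)  # (batch, 512)
        return self.classifier(torch.cat([f_orig, f_syn], dim=1))


if __name__ == "__main__":
    # Dummy forward pass with 3-channel spectrogram "images" of size 224x224.
    model = DualResNet18(num_classes=10)
    original = torch.randn(4, 3, 224, 224)
    synthetic = torch.randn(4, 3, 224, 224)
    logits = model(original, synthetic)
    print(logits.shape)  # torch.Size([4, 10])
```

In this sketch, both branches are trained jointly because their gradients flow into the shared classifier head, which is one plausible way to realize "simultaneous learning" of original and synthetic data within one network.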