Authors: Nam H. Trinh, Darragh O'Brien
Published in: 2020 31st Irish Signals and Systems Conference (ISSC), 2020-06-01
DOI: 10.1109/ISSC49989.2020.9180211 (https://doi.org/10.1109/ISSC49989.2020.9180211)
Semi-Supervised Learning with Generative Adversarial Networks for Pathological Speech Classification
One medical application of deep learning is the use of deep neural networks to classify human speech as healthy or pathological. In such applications, the audio signal is transformed into a spectrogram that captures its time-varying frequency content, and the resulting “images” are fed into a classifier. A challenge in applying this approach is the shortage of suitable speech data for training: acquiring labelled data requires significant human effort and/or time-consuming experiments. In this paper, we propose a semi-supervised learning approach that employs a Generative Adversarial Network (GAN) to alleviate the problem of insufficient training data. We compare the classification performance of a traditional classifier with that of our semi-supervised classifier. We observe that the GAN-based semi-supervised approach achieves a significant improvement in accuracy and ROC curve when supplied with an equivalent number of training samples.
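The spectrogram step mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. The abstract does not give the authors' preprocessing settings, so the frame length, hop size, Hann window, sample rate, and synthetic test tone below are all assumptions chosen for illustration. A real pipeline would normally use an FFT library instead of the naive DFT shown here.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n (assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one row per time frame, one column per frequency bin.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Taper the frame to reduce spectral leakage.
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT of the windowed frame at bin k.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            row.append(abs(acc))
        rows.append(row)
    return rows


# Hypothetical input: a pure 1 kHz tone sampled at 8 kHz, standing in for speech.
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = spectrogram(signal)
# Strongest frequency bin in the first frame, converted back to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), len(spec[0]), peak_hz)
```

Each row of `spec` is one time slice. Stacking the rows gives the two-dimensional "image" that a classifier would take as input.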