Speaker identification using convolutional neural network for clean and noisy speech samples
Authors: Ali Muayad Jalil, F. S. Hasan, H. Alabbasi
Published in: 2019 First International Conference of Computer and Applied Sciences (CAS), 2019-12-01
DOI: 10.1109/CAS47993.2019.9075461 (https://doi.org/10.1109/CAS47993.2019.9075461)
Citations: 8
Abstract
Conventional speaker identification systems rely on carefully hand-designed features to achieve high identification accuracy. With deep learning, these features are learned from data rather than designed by hand. Improvements in deep neural network algorithms and techniques have led to growing use of deep neural networks for speaker identification in place of conventional systems. In this paper, we use a convolutional neural network that takes a Mel-spectrogram as input to identify the speaker. Experiments are conducted on the TIMIT dataset to evaluate the proposed CNN architecture and to compare it with state-of-the-art systems on clean and noisy speech samples.
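To make the input representation concrete, the following is a minimal sketch of how a log Mel-spectrogram can be computed from a waveform. It is not the authors' pipeline: the abstract does not give the frame length, hop size, or number of Mel bands, so `n_fft`, `hop`, and `n_mels` below are illustrative assumptions, as are all function names. The Hz-to-mel formula is one common convention, and a naive DFT is used only to keep the example dependency-free; a practical system would use an FFT library.

```python
import cmath
import math


def hz_to_mel(hz):
    """Hz to mel, using one common (HTK-style) convention."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale.

    Returns a list of n_mels rows, each with n_fft // 2 + 1 weights.
    """
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge points: filter m spans edges m, m + 1, m + 2.
    edges_hz = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        left, centre, right = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - left) / (centre - left)
            falling = (right - f) / (right - centre)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive O(N^2) DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_mel_spectrogram(signal, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Return a [time][mel] list of log Mel-band energies (assumed sizes)."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        spec = power_spectrum(signal[start:start + n_fft])
        mel_energy = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        # Small offset avoids log(0) for empty bands.
        frames.append([math.log(e + 1e-10) for e in mel_energy])
    return frames


# Usage: a synthetic 1 kHz tone sampled at 16 kHz (the TIMIT sampling rate).
tone = [math.sin(2.0 * math.pi * 1000.0 * t / 16000.0) for t in range(256)]
features = log_mel_spectrogram(tone, 16000)
```

The resulting time-by-Mel matrix is the kind of two-dimensional, image-like array a CNN can take as input; the network architecture itself is not described in the abstract and is not sketched here.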