Novel Unsupervised Feature Extraction Protocol using Autoencoders for Connected Speech: Application in Parkinson's Disease Classification
Sai Bharadwaj Appakaya, R. Sankar, E. Sheybani
2021 Wireless Telecommunications Symposium (WTS), published 2021-04-21
DOI: 10.1109/WTS51064.2021.9433683 (https://doi.org/10.1109/WTS51064.2021.9433683)
Citations: 2
Abstract
Speech processing has generated substantial research interest for telemonitoring and classification applications in healthcare because speech is easy to acquire and established research protocols are available. This growing interest has led to significant progress in processing Parkinsonian speech for monitoring and classification. A considerable portion of the studies in this area focuses on developing automatic telemonitoring protocols with passive data collection using wearable or mobile devices. Most of these studies use sustained vowel phonations and handcrafted features for training classifiers. Though some researchers suggest that connected (running) speech is better suited to this application, fewer studies focus on it, predominantly because of its processing complexity. This study focuses on using connected speech with pitch-synchronous segmentation and convolutional Autoencoders for feature extraction from regular and advanced spectrograms. Spectrograms created using pitch-synchronous and block-processing segmentation were evaluated in this study. The methodology also aims to bypass data availability issues by using the standardized TIMIT dataset for training the Autoencoders. With Logistic regression and Linear SVM, we achieved 85% classification accuracy using the features from the Autoencoders. A mean accuracy of 84% was obtained under leave-one-subject-out (LOSO) classification, indicating reliable performance on entirely new data.
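The leave-one-subject-out (LOSO) protocol mentioned above can be illustrated with a minimal sketch. It assumes each feature vector carries a subject identifier, holds out all segments of one subject per fold, and averages the per-fold accuracies. Everything here is hypothetical: the toy features, labels, and subject IDs are made up, and a simple nearest-centroid classifier stands in for the Logistic regression and Linear SVM used in the study. It is not the authors' implementation.

```python
from collections import defaultdict
from statistics import mean


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    for sid, test_idx in by_subject.items():
        # No segment from the held-out subject may appear in the training set.
        train_idx = [i for i, s in enumerate(subject_ids) if s != sid]
        yield sid, train_idx, test_idx


def fit_centroids(features, labels):
    """Stand-in classifier (assumption): one mean feature vector per class."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [a + b for a, b in zip(sums[y], x)]
        counts[y] += 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}


def predict(centroids, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def loso_mean_accuracy(features, labels, subject_ids):
    """Train on all other subjects, test on the held-out one, average over folds."""
    fold_accuracies = []
    for _, train_idx, test_idx in loso_splits(subject_ids):
        centroids = fit_centroids(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        hits = sum(predict(centroids, features[i]) == labels[i] for i in test_idx)
        fold_accuracies.append(hits / len(test_idx))
    return mean(fold_accuracies)


# Hypothetical toy data: two control subjects (label 0), two PD subjects (label 1),
# two feature vectors (e.g. Autoencoder embeddings of speech segments) per subject.
toy_features = [
    [0.1, 0.0], [0.0, 0.2], [0.2, 0.1], [0.1, 0.1],
    [0.9, 1.0], [1.0, 0.8], [1.1, 0.9], [0.8, 1.0],
]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_subjects = ["c1", "c1", "c2", "c2", "p1", "p1", "p2", "p2"]

toy_accuracy = loso_mean_accuracy(toy_features, toy_labels, toy_subjects)
```

Splitting by subject rather than by segment is what keeps the estimate honest for unseen speakers: if segments from one speaker landed in both the training and test sets, the classifier could score well by recognizing the speaker rather than the condition.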