{"title":"Voice-pathology analysis based on AR-HMM","authors":"A. Sasou","doi":"10.1109/APSIPA.2016.7820679","DOIUrl":null,"url":null,"abstract":"Voice-pathology detection from a subject's voice is a promising technology for pre-diagnosis of larynx diseases. Glottal source estimation in particular plays a very important role in voice-pathology analysis. For more accurate estimation of the spectral envelope and glottal source of the pathology voice, we propose a method that can automatically generate the topology of the glottal source Hidden Markov Model (HMM), as well as estimate the Auto-Regressive (AR)-HMM parameter by combining AR-HMM parameter estimation and the Minimum Description Length-based Successive State Splitting (MDL-SSS) algorithm. The AR-HMM adopts a single Gaussian distribution for the output Probability Distribution Function (PDF) of each state in the glottal source HMM. In this paper, we propose a novel voice-pathology detection method based on the AR-HMM with automatic topology generation, which utilizes the output PDF variances normalized with regard to the maximum variance as clues for voice-pathology detection. We experimentally demonstrate that for normal voices, other normalized variances are distributed around a lower range than the maximum variance. This is because the PDF of the state just following vocal fold closure tends to have a maximum variance far greater than other variances. For pathology voices, the maximum variance and other variances are more closely distributed than for normal voices, possibly due to air leaking through the vocal folds. 
The experiment results confirmed the feasibility and fundamental validity of the proposed method.","PeriodicalId":409448,"journal":{"name":"2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)","volume":"150 2","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Voice-pathology analysis based on AR-HMM\",\"authors\":\"A. Sasou\",\"doi\":\"10.1109/APSIPA.2016.7820679\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Voice-pathology detection from a subject's voice is a promising technology for pre-diagnosis of larynx diseases. Glottal source estimation in particular plays a very important role in voice-pathology analysis. For more accurate estimation of the spectral envelope and glottal source of the pathology voice, we propose a method that can automatically generate the topology of the glottal source Hidden Markov Model (HMM), as well as estimate the Auto-Regressive (AR)-HMM parameter by combining AR-HMM parameter estimation and the Minimum Description Length-based Successive State Splitting (MDL-SSS) algorithm. The AR-HMM adopts a single Gaussian distribution for the output Probability Distribution Function (PDF) of each state in the glottal source HMM. In this paper, we propose a novel voice-pathology detection method based on the AR-HMM with automatic topology generation, which utilizes the output PDF variances normalized with regard to the maximum variance as clues for voice-pathology detection. We experimentally demonstrate that for normal voices, other normalized variances are distributed around a lower range than the maximum variance. This is because the PDF of the state just following vocal fold closure tends to have a maximum variance far greater than other variances. 
For pathology voices, the maximum variance and other variances are more closely distributed than for normal voices, possibly due to air leaking through the vocal folds. The experiment results confirmed the feasibility and fundamental validity of the proposed method.\",\"PeriodicalId\":409448,\"journal\":{\"name\":\"2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)\",\"volume\":\"150 2\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/APSIPA.2016.7820679\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/APSIPA.2016.7820679","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Voice-pathology detection from a subject's voice is a promising technology for pre-diagnosis of laryngeal diseases. Glottal source estimation in particular plays an important role in voice-pathology analysis. To estimate the spectral envelope and glottal source of a pathological voice more accurately, we propose a method that automatically generates the topology of the glottal source Hidden Markov Model (HMM) and estimates the Auto-Regressive (AR)-HMM parameters by combining AR-HMM parameter estimation with the Minimum Description Length-based Successive State Splitting (MDL-SSS) algorithm. The AR-HMM adopts a single Gaussian distribution for the output Probability Distribution Function (PDF) of each state in the glottal source HMM. Building on this AR-HMM with automatic topology generation, we propose a novel voice-pathology detection method that uses the output PDF variances, normalized by the maximum variance, as clues for detection. We experimentally demonstrate that for normal voices, the other normalized variances are distributed over a lower range than the maximum variance, because the PDF of the state just following vocal fold closure tends to have a variance far greater than those of the other states. For pathological voices, the maximum variance and the other variances are more closely distributed than for normal voices, possibly due to air leaking through the vocal folds. The experimental results confirmed the feasibility and fundamental validity of the proposed method.
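
The normalization step described in the abstract can be illustrated with a minimal sketch. It assumes the per-state output variances of the glottal source HMM have already been estimated by some other means. The function names `normalized_variances` and `mean_non_max_ratio`, and the idea of summarizing the non-maximum ratios by their mean, are hypothetical choices made here for illustration and are not taken from the paper.

```python
def normalized_variances(state_variances):
    """Divide each state's output variance by the largest variance.

    Assumption: state_variances holds one positive variance per HMM state.
    """
    variances = [float(v) for v in state_variances]
    if not variances:
        raise ValueError("need at least one state variance")
    if any(v <= 0.0 for v in variances):
        raise ValueError("variances must be positive")
    largest = max(variances)
    return [v / largest for v in variances]


def mean_non_max_ratio(state_variances):
    """Hypothetical summary: mean normalized variance of all states
    except the one with the maximum variance (whose ratio is 1.0)."""
    ratios = sorted(normalized_variances(state_variances))
    if len(ratios) < 2:
        raise ValueError("need at least two states")
    others = ratios[:-1]  # drop the single largest ratio
    return sum(others) / len(others)


if __name__ == "__main__":
    # Synthetic numbers chosen only to show the arithmetic, not measured data.
    print(normalized_variances([4.0, 1.0, 2.0]))
    print(mean_non_max_ratio([4.0, 1.0, 2.0]))
```

Under this sketch, a summary value close to 0 would correspond to one dominant variance, as the abstract reports for normal voices, while a value close to 1 would correspond to more closely distributed variances, as reported for pathological voices. Any decision threshold would have to come from real data.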