N. Nayak, R. Velmurugan, P. C. Pandey, Sudipan Saha
Estimation of lip opening for scaling of vocal tract area function for speech training aids
2012 National Conference on Communications (NCC), 2012-04-03
DOI: 10.1109/NCC.2012.6176814 (https://doi.org/10.1109/NCC.2012.6176814)
Citations: 7
Abstract
For visual feedback of articulatory efforts in speech training aids, the vocal tract shape can be estimated by LPC analysis of the speech signal. The vocal tract is modelled as a concatenation of equal-length sections; the ratios of the areas at the section interfaces are calculated and then scaled using the area of a reference section. The lip opening area, as estimated from a video recording of the speaker's face, can be used as the reference area for obtaining the vocal tract shape during speech utterances with transitional tract configurations. A technique for estimating the area of the lip opening based on template matching is investigated. It satisfactorily tracked the horizontal and vertical opening of the lips in video images of speakers with different skin hues, recorded under good lighting conditions.
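The abstract does not give the formulas behind the scaling step, so the following is only a minimal sketch of how it might look, not the authors' implementation. It assumes the textbook lossless-tube relation between LPC reflection coefficients and the area ratios of adjacent sections, with the lip section as the reference. The function names, the Hamming window, the sign convention of the reflection coefficients, and the ordering of sections from the lips inward are all assumptions made for illustration; these conventions differ between texts.

```python
import numpy as np


def reflection_coefficients(frame, order):
    """Reflection coefficients of one speech frame via the autocorrelation
    method and the Levinson-Durbin recursion (hypothetical helper)."""
    x = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    # Autocorrelation for lags 0..order
    r = np.array([np.dot(x[:len(x) - lag], x[lag:]) for lag in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    k = np.zeros(order)
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k[i - 1] = -acc / err
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k[i - 1] * a_prev[i - 1:0:-1]
        a[i] = k[i - 1]
        err *= 1.0 - k[i - 1] ** 2
    return k


def scaled_area_function(k, lip_area):
    """Turn area ratios into absolute areas, using the lip opening area
    as the reference section (assumed to be section 0)."""
    areas = np.empty(len(k) + 1)
    areas[0] = lip_area
    for i, ki in enumerate(k):
        # Assumed convention: A[i+1] / A[i] = (1 - k_i) / (1 + k_i)
        areas[i + 1] = areas[i] * (1.0 - ki) / (1.0 + ki)
    return areas


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = np.arange(400)
    # Synthetic stand-in for a speech frame; real input would be recorded speech
    frame = np.sin(2 * np.pi * 0.05 * n) + 0.1 * rng.standard_normal(len(n))
    k = reflection_coefficients(frame, order=12)
    # lip_area would come from the video-based estimate; 1.5 is a made-up value
    print(scaled_area_function(k, lip_area=1.5))
```

Because LPC analysis yields only ratios of areas, every section area in this sketch is proportional to `lip_area`. This is why an independent measurement of the lip opening, such as the video-based estimate described in the abstract, can fix the overall scale of the estimated shape.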