{"title":"Automated Detection of Transition Segments for Intensity and Time-Scale Modification for Speech Intelligibility Enhancement","authors":"A. Jayan, P. C. Pandey, P. Lehana","doi":"10.1109/ICSCN.2008.4447162","DOIUrl":null,"url":null,"abstract":"Spectral transition segments serve as landmarks for the perception of consonants. In \"clear speech\" mode adopted by speakers to improve intelligibility in difficult communication environments, transition segments are of increased duration and intensity. Modification of conversational speech to have acoustic properties of clear speech has been reported to improve its intelligibility. This paper presents an automated method for locating spectral transition segments in speech, and to produce natural quality resynthesized speech with intensity and time-scale modified spectral transition segments. The boundaries of spectral transition segments are located using an index derived from the rate of variation of energy and centroid frequency in five non-overlapping spectral bands. Time-scale modification is performed using harmonic plus noise model (HNM) based analysis-synthesis. The overall speech duration is kept unaltered by appropriately compressing the steady state segments. Transition segments are intensity scaled by 6 dB. The effectiveness of the method was evaluated by conducting listening tests on normal hearing subjects using VCV syllables as the test material.","PeriodicalId":158011,"journal":{"name":"2008 International Conference on Signal Processing, Communications and Networking","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 International Conference on Signal Processing, Communications and Networking","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSCN.2008.4447162","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12
Abstract
Spectral transition segments serve as landmarks for the perception of consonants. In "clear speech" mode adopted by speakers to improve intelligibility in difficult communication environments, transition segments are of increased duration and intensity. Modification of conversational speech to have acoustic properties of clear speech has been reported to improve its intelligibility. This paper presents an automated method for locating spectral transition segments in speech, and to produce natural quality resynthesized speech with intensity and time-scale modified spectral transition segments. The boundaries of spectral transition segments are located using an index derived from the rate of variation of energy and centroid frequency in five non-overlapping spectral bands. Time-scale modification is performed using harmonic plus noise model (HNM) based analysis-synthesis. The overall speech duration is kept unaltered by appropriately compressing the steady state segments. Transition segments are intensity scaled by 6 dB. The effectiveness of the method was evaluated by conducting listening tests on normal hearing subjects using VCV syllables as the test material.