Hideki Kawahara, M. Morise, T. Toda, Hideki Banno, R. Nisimura, T. Irino
Title: Excitation source design for high-quality speech manipulation systems based on a temporally static group delay representation of periodic signals
Published in: Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific
Publication date: 2014-12-01
DOI: 10.1109/APSIPA.2014.7041594 (https://doi.org/10.1109/APSIPA.2014.7041594)
Citation count: 1
Abstract
This paper introduces a new group delay representation that yields zero for periodic signals, irrespective of the initial phase and the relative level of each harmonic component. The representation provides a unified basis for defining "aperiodicity" in speech sounds. For example, the periodic-to-noise ratio or harmonic-to-noise ratio is derived directly from the deviation of this group delay representation from zero, after removing the FM effects of the harmonic frequencies and the AM effects of the harmonic component levels. The derived deviation is combined with estimated excitation duration information and used to design the aperiodic components of the excitation source for high-quality synthetic speech. The proposed representation is based on an F0-adaptive weighted average of frequency-shifted and temporally shifted versions of group delays with power spectral weighting.
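As a rough illustration of two ingredients named in the abstract (a per-frame group delay and power spectral weighting across temporally shifted frames), the following is a minimal sketch and not the authors' algorithm. It assumes NumPy, uses the standard textbook identity for group delay, and omits the F0-adaptive weights and the frequency-shifted versions entirely. The function names `group_delay` and `power_weighted_group_delay`, the Hann window, and all parameter values are hypothetical choices made for this example.

```python
import numpy as np


def group_delay(frame, n_fft=1024, eps=1e-12):
    """Return (group delay in samples, power spectrum) of one frame.

    Uses the standard identity tau(w) = Re{Y(w) conj(X(w))} / |X(w)|^2,
    where X is the DFT of x[n] and Y is the DFT of n * x[n].
    This avoids explicit phase unwrapping.
    """
    frame = np.asarray(frame, dtype=float)
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)
    Y = np.fft.rfft(n * frame, n_fft)
    power = np.abs(X) ** 2
    tau = np.real(Y * np.conj(X)) / (power + eps)
    return tau, power


def power_weighted_group_delay(signal, centers, win_len, n_fft=1024):
    """Average the group delays of frames taken at several positions.

    Each frequency bin of each frame is weighted by that frame's power,
    so bins with little energy contribute little to the average.
    Every centre c must satisfy c - win_len // 2 >= 0 and leave
    win_len samples available in the signal.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(win_len)  # assumed window, not taken from the paper
    half = win_len // 2
    num = np.zeros(n_fft // 2 + 1)
    den = np.zeros(n_fft // 2 + 1)
    for c in centers:
        start = c - half
        frame = signal[start:start + win_len] * window
        tau, power = group_delay(frame, n_fft)
        num += power * (tau - half)  # delay measured relative to the frame centre
        den += power
    return num / (den + 1e-12)


# Usage sketch: a 100 Hz pulse train at 8 kHz, with frames shifted over one period.
fs, f0 = 8000, 100.0
period = int(round(fs / f0))
pulses = np.zeros(fs // 4)
pulses[::period] = 1.0
shifts = [800 + k for k in range(0, period, 8)]
averaged = power_weighted_group_delay(pulses, shifts, win_len=4 * period)
```

The window here spans four periods and the shifts cover one period, which loosely mirrors the idea of tying the analysis to F0. Whether this plain average approaches the temporally static behaviour described in the abstract depends on weighting details that the abstract does not give.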