{"title":"A mixed-excitation frequency domain model for time-scale pitch-scale modification of speech","authors":"A. Acero","doi":"10.21437/ICSLP.1998-16","DOIUrl":null,"url":null,"abstract":"This paper presents a time-scale pitch-scale modification technique for concatenative speech synthesis. The method is based on a frequency domain source-filter model, where the source is modeled as a mixed excitation. This model is highly coupled with a compression scheme that result in compact acoustic inventories. When compared to the approach in the Whistler system using no mixed excitation, the new method shows improvement in voiced fricatives and over-stretched voiced sounds. In addition, it allows for spectral manipulation such as smoothing of discontinuities at unit boundaries, voice transformations or loudness equalization.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"5th International Conference on Spoken Language Processing (ICSLP 1998)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21437/ICSLP.1998-16","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
This paper presents a time-scale pitch-scale modification technique for concatenative speech synthesis. The method is based on a frequency domain source-filter model, where the source is modeled as a mixed excitation. This model is highly coupled with a compression scheme that result in compact acoustic inventories. When compared to the approach in the Whistler system using no mixed excitation, the new method shows improvement in voiced fricatives and over-stretched voiced sounds. In addition, it allows for spectral manipulation such as smoothing of discontinuities at unit boundaries, voice transformations or loudness equalization.