Modeling spectral speech transitions using temporal decomposition techniques

G. Ahlbom, F. Bimbot, G. Chollet
Published in: ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing
Publication date: 1987-04-06
DOI: 10.1109/ICASSP.1987.1169742 (https://doi.org/10.1109/ICASSP.1987.1169742)
Citations: 22

Abstract

Atal [1] introduced a technique for decomposing speech into phone-length temporal events in terms of overlapping and interacting articulatory gestures. This paper reports on simplifications of this technique with applications to acoustic-phonetic synthesis. Spectral evolution is represented by time-indexed trajectories in the p-dimensional space of Log-Area Ratios $y_i = \ln((1+k_i)/(1-k_i))$, where the $k_i$ are the reflection coefficients obtained from short-time stationary LPC analysis. The vocal tract configuration (spectral vector) associated with each interpolation function belongs to a finite set of articulatory targets (a vector quantization codebook). A set of speech segments ("polysons") has been encoded using this technique. It includes diphones, demi-syllables, and other units that are difficult to segment. Temporal decomposition using target spectra can break the complex encoding of these segments. In particular, coarticulation effects are analytically explained and modeled. It is demonstrated that these new tools provide an adequate environment in our search for better rules in acoustic speech synthesis.
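As a rough illustration of the Log-Area Ratio representation described in the abstract, the following minimal Python sketch maps reflection coefficients to Log-Area Ratios, inverts the mapping, and picks the closest entry in a small codebook. The function names, the example codebook values, and the use of squared Euclidean distance are assumptions made for illustration only; the abstract does not specify the distance measure or the codebook, and this is not the authors' implementation.

```python
import math


def reflection_to_lar(k):
    """Map reflection coefficients k_i (|k_i| < 1) to log-area ratios
    y_i = ln((1 + k_i) / (1 - k_i)), the formula given in the abstract."""
    if any(abs(ki) >= 1.0 for ki in k):
        raise ValueError("reflection coefficients must satisfy |k_i| < 1")
    return [math.log((1.0 + ki) / (1.0 - ki)) for ki in k]


def lar_to_reflection(y):
    """Inverse mapping: solving the formula above for k_i gives tanh(y_i / 2)."""
    return [math.tanh(yi / 2.0) for yi in y]


def nearest_target(y, codebook):
    """Index of the codebook vector closest to y.

    Squared Euclidean distance is an assumption of this sketch,
    not something stated in the abstract.
    """
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(range(len(codebook)), key=lambda j: dist2(y, codebook[j]))


# Hypothetical example: one analysis frame with p = 3 reflection coefficients
# and a made-up two-entry codebook of targets in log-area-ratio space.
frame_k = [0.5, -0.2, 0.0]
frame_lar = reflection_to_lar(frame_k)
codebook = [[0.0, 0.0, 0.0], [1.0, -0.5, 0.0]]
target_index = nearest_target(frame_lar, codebook)
```

Here `frame_lar[0]` equals ln 3 because (1 + 0.5) / (1 - 0.5) = 3, and the frame is assigned to the second hypothetical target. A real system would presumably train the codebook on analyzed speech rather than fix it by hand.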