Influence of tense, modal and lax phonation on the three-dimensional finite element synthesis of vowel [A]

IberSPEECH Conference. Pub Date: 2018-11-21. DOI: 10.21437/IberSPEECH.2018-28

M. Freixes, M. Arnela, J. Socoró, Francesc Alías, O. Guasch


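Citations: 4

Abstract

One-dimensional articulatory speech models have long been used to generate synthetic voice. These models assume plane wave propagation within the vocal tract, an assumption that holds for frequencies up to about 5 kHz. Above this limit, higher order modes also propagate, which may be relevant for producing a more natural voice. Such modes could be especially important for phonation types with significant high frequency energy (HFE) content. In this work, we study the influence of tense, modal and lax phonation on the synthesis of vowel [A] through 3D finite element modelling (FEM). The three phonation types are reproduced with an LF (Liljencrants-Fant) model controlled by the Rd glottal shape parameter. The onset of the higher order modes essentially depends on the vocal tract geometry. Two geometries are considered: a realistic vocal tract obtained from MRI and a simplified straight duct with varying circular cross-sections. Long-term average spectra are computed from the FEM synthesised [A] vowels, extracting the overall sound pressure level and the HFE level in the 8 kHz octave band. Results indicate that higher order modes may be perceptually relevant for the tense and modal voice qualities, but not for the lax phonation.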
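To give a feel for why the plane wave assumption breaks down near 5 kHz and why the 8 kHz octave band is a natural place to look for higher order modes, the sketch below estimates the cut-on frequency of the first non-planar mode in a uniform, rigid-walled circular duct and prints the edges of the 8 kHz octave band. This is only an illustration, not the method of the paper: the uniform-duct formula, the sound speed of 350 m/s, the example radii and the function names `cut_on_frequency` and `octave_band_edges` are all assumptions introduced here. A real vocal tract has a varying, non-circular cross-section, so its modes will appear at different frequencies.

```python
import math

# First zero of J1'(x): eigenvalue of the first non-planar mode in a
# rigid-walled circular duct (standard duct acoustics result).
FIRST_MODE_EIGENVALUE = 1.8412


def cut_on_frequency(radius_m, sound_speed=350.0):
    """Cut-on frequency (Hz) of the first higher order mode in a uniform circular duct.

    sound_speed is an assumed value for air inside the vocal tract.
    """
    return FIRST_MODE_EIGENVALUE * sound_speed / (2.0 * math.pi * radius_m)


def octave_band_edges(centre_hz):
    """Lower and upper edge (Hz) of the octave band centred on centre_hz."""
    return centre_hz / math.sqrt(2.0), centre_hz * math.sqrt(2.0)


if __name__ == "__main__":
    # Illustrative radii only; they are not taken from the paper.
    for radius_cm in (1.0, 1.5, 2.0):
        f_c = cut_on_frequency(radius_cm / 100.0)
        print(f"radius {radius_cm:.1f} cm: first higher order mode above ~{f_c:.0f} Hz")

    low, high = octave_band_edges(8000.0)
    print(f"8 kHz octave band: {low:.0f} Hz to {high:.0f} Hz")
```

Under these assumptions a 2 cm radius gives a cut-on frequency slightly above 5 kHz, and narrower sections push it higher, which is consistent with the statement that the onset of higher order modes depends on the vocal tract geometry.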



