Jérémy Genette, Jose Manuel Rivera Espejo, Steven Gillis, Jo Verhoeven
Determining spectral stability in vowels: A comparison and assessment of different metrics
Speech Communication, vol. 154, Article 102984, published 2023-10-01
DOI: 10.1016/j.specom.2023.102984
https://www.sciencedirect.com/science/article/pii/S0167639323001188
Abstract
This study investigated the performance of several metrics used to evaluate spectral stability in vowels. Four metrics suggested in the literature and a newly developed one were tested and compared to the traditional method of associating the spectrally stable portion with the middle of the vowel. First, synthetic stimuli whose spectrally stable portion had been defined in advance were used to evaluate the potential of the different metrics to capture spectral stability. Second, the acoustic measurements obtained in the vowel portions that each metric identified as spectrally stable were compared on both synthesized and natural speech. The results indicate that higher-dimensional features are needed to capture spectral stability and that the best-performing metrics yield acoustic measurements similar to those obtained in the middle of the vowel. This study empirically validates the long-standing intuition that selecting the middle section of a vowel, the preferred method in practice, is a valid way to identify its spectrally stable region.
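As a rough illustration of the comparison described in the abstract, the sketch below contrasts a window chosen by a simple multi-dimensional stability score with a window centred on the vowel midpoint. It is not one of the metrics evaluated in the study: the scoring rule (summed Euclidean distance between consecutive frames), the function names `most_stable_window` and `midpoint_window`, and the toy (F1, F2) values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def most_stable_window(frames: Sequence[Sequence[float]], width: int) -> Tuple[int, float]:
    """Return (start index, score) of the window with the least frame-to-frame change.

    Hypothetical stability score: the sum of Euclidean distances between
    consecutive feature frames inside the window (lower means more stable).
    """
    if width < 2 or width > len(frames):
        raise ValueError("width must be between 2 and the number of frames")
    # Distance between each pair of consecutive frames, over all feature dimensions.
    steps: List[float] = [
        sum((a - b) ** 2 for a, b in zip(prev, cur)) ** 0.5
        for prev, cur in zip(frames, frames[1:])
    ]
    best_start, best_score = 0, float("inf")
    for start in range(len(frames) - width + 1):
        # A window of `width` frames contains `width - 1` consecutive steps.
        score = sum(steps[start:start + width - 1])
        if score < best_score:
            best_start, best_score = start, score
    return best_start, best_score


def midpoint_window(n_frames: int, width: int) -> int:
    """Start index of a window of `width` frames centred on the vowel midpoint."""
    if width < 1 or width > n_frames:
        raise ValueError("width must be between 1 and the number of frames")
    return (n_frames - width) // 2


# Toy (F1, F2) track in Hz: onset transition, near-steady portion, offset transition.
# These values are invented for the example, not taken from the study.
track = [
    (300.0, 2200.0), (400.0, 2000.0), (500.0, 1800.0), (600.0, 1600.0),
    (650.0, 1500.0), (655.0, 1495.0), (650.0, 1500.0), (652.0, 1498.0),
    (600.0, 1600.0), (500.0, 1700.0),
]

stable_start, stable_score = most_stable_window(track, width=4)
mid_start = midpoint_window(len(track), width=4)
print(f"metric-based window starts at frame {stable_start}, score {stable_score:.2f}")
print(f"midpoint-based window starts at frame {mid_start}")
```

Scoring all feature dimensions jointly, rather than a single formant, loosely mirrors the abstract's point that higher-dimensional features are needed; how closely the two windows agree on real vowels is the empirical question the study addresses.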
About the journal:
Speech Communication is an interdisciplinary journal whose primary objective is to fulfil the need for the rapid dissemination and thorough discussion of basic and applied research results.
The journal's primary objectives are:
• to present a forum for the advancement of human and human-machine speech communication science;
• to stimulate cross-fertilization between different fields of this domain;
• to contribute towards the rapid and wide diffusion of scientifically sound contributions in this domain.