Acoustic characteristics of whispered vowels: A dynamic feature exploration

Applied Acoustics · IF 3.4 · Zone 2 (Physics and Astrophysics) · JCR Q1 (Acoustics) · Pub Date: 2024-11-02 · DOI: 10.1016/j.apacoust.2024.110362
Tianxiang Cao, Cenyu Xiang, Yuxin Wu, Yanlong Zhang
Full text: https://www.sciencedirect.com/science/article/pii/S0003682X24005139
Citation count: 0

Abstract

Whispered speech differs acoustically from phonated speech because the vocal folds do not vibrate. Previous research has focused primarily on static features, neglecting the crucial role of dynamic spectral change in vowel perception. This study investigates how effectively dynamic features, specifically Vowel Inherent Spectral Change (VISC), characterize whispered vowels. VISC, a robust measure of dynamic spectral change, captures the continuous variation of formant frequencies throughout vowel articulation, and so provides a richer representation than static features measured at a single point. The study analyzes formant frequencies and two key VISC metrics, vector length (VL) and spectral angle (α), in whispered and phonated Japanese vowels. VL represents the magnitude of formant change, while α reflects the direction of formant movement in acoustic space. In addition, a vowel classification experiment using a Support Vector Machine (SVM) directly compares models trained on static features alone with models trained on a combination of static and dynamic features. The results show that VISC is an effective dynamic feature for characterizing whispered vowels. Taken together, the VISC analysis, the classification experiment, and an exploration of the acoustic space of whispered vowels indicate that dynamic features are more robust across phonation types than traditional static features. The study highlights the potential of VISC for improving whispered speech recognition, contributes to a more comprehensive understanding of whispered vowel acoustics, and informs the development of more effective whisper-related speech technology.
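The abstract describes VL and α only in words, so the following is a minimal sketch of how the two metrics might be computed, not the authors' implementation. It assumes that formants are sampled at two points in the vowel (an onset and an offset, for instance 20% and 80% of vowel duration), that VL is the Euclidean distance between those points in the F1-F2 plane, and that α is the angle of the displacement measured from the F1 axis. The function name `visc_metrics` and the formant values are hypothetical.

```python
import math


def visc_metrics(onset, offset):
    """Return (VL, alpha) for one vowel token.

    onset, offset: (F1, F2) pairs in Hz taken at two time points.
    ASSUMPTION: VL is the Euclidean distance in the F1-F2 plane and
    alpha is the direction of movement in degrees, measured from the
    F1 axis. The paper may define either metric differently.
    """
    d_f1 = offset[0] - onset[0]  # change in F1 (Hz)
    d_f2 = offset[1] - onset[1]  # change in F2 (Hz)
    vl = math.hypot(d_f1, d_f2)  # magnitude of formant change
    alpha = math.degrees(math.atan2(d_f2, d_f1))  # direction of change
    return vl, alpha


# Hypothetical formant values, chosen only to make the arithmetic easy.
vl, alpha = visc_metrics(onset=(300.0, 2000.0), offset=(600.0, 2400.0))
print(f"VL = {vl:.1f} Hz, alpha = {alpha:.2f} degrees")
```

Under these assumptions a static feature set would hold only the formant values at a single point, while a combined set would append VL and α to them before classification.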
Source journal
Applied Acoustics (Physics, Acoustics)
CiteScore: 7.40
Self-citation rate: 11.80%
Articles published: 618
Review time: 7.5 months
About the journal: Since its launch in 1968, Applied Acoustics has published high-quality research papers providing state-of-the-art coverage of research findings for engineers and scientists involved in applications of acoustics in the widest sense. Applied Acoustics looks not only at recent developments in the understanding of acoustics but also at ways of exploiting that understanding. The journal aims to encourage the exchange of practical experience through publication, and in so doing creates a fund of technological information that can be used for solving related problems. The presentation of information in graphical or tabular form is especially encouraged. If a report of a mathematical development is a necessary part of a paper, it should appear only as an integral part of a practical solution to a problem and be supported by data. The journal publishes complete papers, short technical notes, and review articles. Manuscripts that address any field of application of acoustics, ranging from medicine and NDT to the environment and buildings, are welcome.
Latest articles in this journal
Fibonacci array-based temporal-spatial localization with neural networks
Semi-analytical prediction of energy-based acoustical parameters in proscenium theatres
Preparation and performance analysis of porous materials for road noise abatement using waste rubber tires
Acoustic characteristics of whispered vowels: A dynamic feature exploration
A high DOF and azimuth resolution beamforming via enhanced virtual aperture extension of joint linear prediction and inverse beamforming