Estimation of Japanese DRT intelligibility using Articulation Index Band Correlations

K. Kondo
Published in: Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific
Publication date: 2014-12-01
DOI: 10.1109/APSIPA.2014.7041516 (https://doi.org/10.1109/APSIPA.2014.7041516)
Citation count: 1

Abstract

We proposed and evaluated a method for estimating the intelligibility measured by the forced-selection Japanese Diagnostic Rhyme Test (DRT). The proposed measure takes into account the way the DRT forces a selection between the two words of a rhyming pair. The objective distance measure was based on the Articulation Index Band Correlation (ABC), which has shown favorable results for the English Modified Rhyme Test (MRT). For each test word, the correlation was calculated between its time-frequency (T-F) pattern and that of the template speech for each of the two words in the candidate word pair. The T-F patterns were calculated in the Articulation Index (AI) bands, and the correlation was calculated between corresponding bands of the test word and each candidate word sample. The candidate word for which more AI bands showed the higher correlation was chosen as the likely selection. The ratio of the number of bands showing higher correlation with a candidate word to the total number of bands was calculated to quantify how well the test word matches that candidate in the word pair. We estimated a logistic mapping function from this ratio to intelligibility scores using speech mixed with known noise. The mapping functions were then used to estimate the intelligibility of speech mixed with unknown noise. This estimate was compared with another measure that we had evaluated previously, the frequency-weighted segmental SNR, and was found to be more accurate, with the correlation between subjective and estimated intelligibility over 0.93 and the root mean square error below 0.15. Thus, it should be possible to "screen" the intelligibility in many of the noise conditions to be tested, and so cut down on the scale of the subjective test needed.
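The band-wise comparison and logistic mapping described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the T-F patterns are assumed to be precomputed, time-aligned arrays of shape (bands, frames); Pearson correlation is assumed as the per-band correlation; and the function names and the logistic parameters `slope` and `midpoint` are hypothetical (in practice they would be fitted to subjective scores obtained with known noise).

```python
import numpy as np


def band_correlations(test_tf, template_tf):
    """Per-band Pearson correlation between two T-F patterns.

    Both inputs are assumed to have shape (n_bands, n_frames) and to be
    time-aligned already (an assumption; alignment is not shown here).
    """
    t = test_tf - test_tf.mean(axis=1, keepdims=True)
    c = template_tf - template_tf.mean(axis=1, keepdims=True)
    num = (t * c).sum(axis=1)
    den = np.sqrt((t ** 2).sum(axis=1) * (c ** 2).sum(axis=1))
    safe_den = np.where(den > 0, den, 1.0)  # avoid division by zero
    return np.where(den > 0, num / safe_den, 0.0)


def abc_ratio(test_tf, candidate_tf, other_tf):
    """Fraction of bands in which the test word correlates more strongly
    with `candidate_tf` than with the other word of the pair."""
    r_candidate = band_correlations(test_tf, candidate_tf)
    r_other = band_correlations(test_tf, other_tf)
    return float(np.mean(r_candidate > r_other))


def logistic_map(ratio, slope, midpoint):
    """Hypothetical logistic mapping from band ratio to an intelligibility
    score in (0, 1); `slope` and `midpoint` are illustrative parameters."""
    return 1.0 / (1.0 + np.exp(-slope * (ratio - midpoint)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    word_a = rng.standard_normal((20, 50))   # template for word A (synthetic)
    word_b = rng.standard_normal((20, 50))   # template for word B (synthetic)
    noisy_a = word_a + 0.5 * rng.standard_normal((20, 50))  # degraded test word
    ratio = abc_ratio(noisy_a, word_a, word_b)
    print("band ratio:", ratio)
    print("mapped score:", logistic_map(ratio, slope=10.0, midpoint=0.5))
```

A ratio above 0.5 means more bands favor the first candidate, which corresponds to the forced selection between the two words of the pair. The number of bands (20) and the synthetic data are placeholders chosen only to make the sketch runnable.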