Using a blind EC mechanism for modelling the interaction between binaural and temporal speech processing

IF 1 | Zone 3, Physics and Astrophysics | Q4 Acoustics | Acta Acustica | Pub Date: 2022-01-01 | DOI: 10.1051/aacus/2022009
Saskia Röttges, C. Hauth, J. Rennies, T. Brand
Citation count: 2

Abstract

We reanalyzed a study that investigated binaural and temporal integration of speech reflections with different amplitudes, delays, and interaural phase differences. We used a blind binaural speech intelligibility model (bBSIM), which applies an equalization-cancellation (EC) process to model binaural release from masking. bBSIM is blind in that it requires only the mixed binaural speech and noise signals and no auxiliary information about the listening conditions. bBSIM was combined with two non-blind back-ends, the speech intelligibility index (SII) and the speech transmission index (STI), resulting in hybrid models. Furthermore, bBSIM was combined with the non-intrusive short-time objective intelligibility measure (NI-STOI), resulting in a fully blind model. The fully non-blind reference model used in the previous study achieved the best prediction accuracy (R² = 0.91 and RMSE = 1 dB). The fully blind model yielded a coefficient of determination (R² = 0.87) similar to that of the reference model but also the highest root mean square error of the models tested in this study (RMSE = 4.4 dB). By adjusting the binaural processing errors of bBSIM as done in the reference model, the RMSE could be decreased to 1.9 dB. Furthermore, in this study, the dynamic range of the SII had to be adjusted to predict the low speech reception thresholds (SRTs) of the speech material used.
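The abstract reports a model whose R² is close to that of the reference model while its RMSE is much larger. The following minimal sketch illustrates how this combination can arise. It assumes that R² is computed as the squared Pearson correlation between predicted and measured SRTs, which the abstract does not state, and it uses invented SRT values that are not data from the study. The function names `rmse` and `r_squared` are hypothetical helpers written for this illustration.

```python
import math


def rmse(predicted, measured):
    """Root mean square error between predictions and measurements (here in dB)."""
    squared_errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(predicted, measured):
    """Squared Pearson correlation (assumed definition of R^2 for this sketch)."""
    n = len(measured)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov ** 2 / (var_p * var_m)


# Hypothetical SRTs in dB SNR, invented purely for illustration.
measured = [-20.0, -18.0, -15.0, -12.0, -10.0]
unbiased = [-19.5, -18.5, -14.5, -12.5, -9.5]

# Same predictions shifted by a constant 4 dB offset (a systematic bias).
biased = [p + 4.0 for p in unbiased]

r2_unbiased = r_squared(unbiased, measured)
r2_biased = r_squared(biased, measured)
rmse_unbiased = rmse(unbiased, measured)
rmse_biased = rmse(biased, measured)

print(f"unbiased: R^2 = {r2_unbiased:.3f}, RMSE = {rmse_unbiased:.2f} dB")
print(f"biased:   R^2 = {r2_biased:.3f}, RMSE = {rmse_biased:.2f} dB")
```

Under this assumed definition, a constant offset leaves R² unchanged because correlation ignores shifts, whereas the RMSE grows with the offset. Whether the RMSE of the fully blind model in the study stems from such a bias cannot be inferred from the abstract alone.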
Source journal: Acta Acustica (category: Acoustics)
CiteScore: 2.80
Self-citation rate: 21.40%
Articles published: 0
Review time: 12 weeks
Journal description: Acta Acustica is the journal of the European Acoustics Association (EAA). The EAA published its journal Acta Acustica from 1993 to 1995 and Acta Acustica united with Acustica from 1996 to 2019. Since 2020, the EAA has published the journal in full Open Access. Acta Acustica reports on original scientific research in acoustics and on engineering applications. The journal considers review papers, scientific papers, technical and applied papers, short communications, and letters to the editor. From time to time, special issues and review articles are also published. For book reviews or doctoral thesis abstracts, please contact the Editor in Chief.