Blind Identification of Binaural Room Impulse Responses From Smart Glasses

IF 4.1 | Zone 2 (Computer Science) | JCR Q1 (Acoustics) | IEEE/ACM Transactions on Audio, Speech, and Language Processing | Pub Date: 2024-09-05 | DOI: 10.1109/TASLP.2024.3454964
Thomas Deppisch;Nils Meyer-Kahlen;Sebastià V. Amengual Garí
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 4052-4065. Publication type: Journal Article. URL: https://ieeexplore.ieee.org/document/10666277/

Abstract

Smart glasses are increasingly recognized as a key medium for augmented reality, offering a hands-free platform with integrated microphones and non-ear-occluding loudspeakers to seamlessly mix virtual sound sources into the real-world acoustic scene. To convincingly integrate virtual sound sources, the room acoustic rendering of the virtual sources must match the real-world acoustics. However, information about a user's acoustic environment is typically not available. This work uses a microphone array in a pair of smart glasses to blindly identify binaural room impulse responses (BRIRs) from a few seconds of speech in the real-world environment. The proposed method uses dereverberation and beamforming to generate a pseudo reference signal, which a multichannel Wiener filter then uses to estimate room impulse responses that are subsequently converted to BRIRs. The multichannel room impulse responses can also be used to estimate room acoustic parameters, an approach shown to outperform baseline algorithms in the estimation of reverberation time and direct-to-reverberant energy ratio. Results from a listening experiment further indicate that the estimated BRIRs often reproduce the real-world room acoustics perceptually more convincingly than measured BRIRs from other rooms of similar size.
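The Wiener-filter identification step and the reverberation-time estimate named in the abstract can be illustrated with a toy single-channel sketch. This is not the authors' implementation. It assumes a known white-noise reference (the paper instead derives a pseudo reference from speech via dereverberation and beamforming), a noise-free microphone signal, a synthetic exponentially decaying impulse response, and a Schroeder-integration T20 line fit for reverberation time. All function names and parameter values are hypothetical.

```python
import numpy as np

def synthetic_rir(rt60, fs, length_s, rng):
    """Toy RIR: Gaussian noise whose amplitude envelope decays 60 dB in rt60 seconds."""
    t = np.arange(int(length_s * fs)) / fs
    return rng.standard_normal(t.size) * np.exp(-np.log(1000.0) * t / rt60)

def wiener_rir_estimate(reference, mic, rir_len, reg=1e-8):
    """Regularized frequency-domain Wiener (least-squares) estimate of an RIR.

    Assumes `mic` holds the full linear convolution of `reference` with the RIR.
    """
    # Power-of-two FFT size that covers the whole microphone signal,
    # so circular convolution equals linear convolution.
    n_fft = 1 << (len(mic) - 1).bit_length()
    R = np.fft.rfft(reference, n_fft)
    Y = np.fft.rfft(mic, n_fft)
    power = np.abs(R) ** 2
    # Cross-spectrum over regularized auto-spectrum.
    H = np.conj(R) * Y / (power + reg * power.mean())
    return np.fft.irfft(H, n_fft)[:rir_len]

def rt60_from_rir(rir, fs, fit_db=(-5.0, -25.0)):
    """Reverberation time from Schroeder backward integration (T20 fit by default)."""
    edc = np.cumsum(rir[::-1] ** 2)[::-1]          # energy decay curve
    edc_db = 10.0 * np.log10(edc / edc[0])
    t = np.arange(rir.size) / fs
    mask = (edc_db <= fit_db[0]) & (edc_db >= fit_db[1])
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # decay rate in dB per second
    return -60.0 / slope                             # extrapolate to a 60 dB decay

def demo(seed=0):
    rng = np.random.default_rng(seed)
    fs, true_rt60 = 8000, 0.3                        # hypothetical values
    h_true = synthetic_rir(true_rt60, fs, 0.5, rng)
    reference = rng.standard_normal(2 * fs)          # stand-in for the pseudo reference
    mic = np.convolve(reference, h_true)             # noise-free microphone signal
    h_est = wiener_rir_estimate(reference, mic, h_true.size)
    return h_true, h_est, rt60_from_rir(h_est, fs)
```

Calling `demo()` returns the true and estimated impulse responses together with the reverberation time estimated from the latter. In a blind setting the reference is itself an estimate, so the identified response inherits any error in that reference; this sketch sidesteps that by using the true source signal.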
Source journal
IEEE/ACM Transactions on Audio, Speech, and Language Processing (Acoustics; Engineering, Electrical & Electronic)
CiteScore: 11.30
Self-citation rate: 11.10%
Articles published: 217
Journal description: The IEEE/ACM Transactions on Audio, Speech, and Language Processing covers audio, speech and language processing and the sciences that support them. In audio processing: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. In speech processing: areas such as speech analysis, synthesis, coding, speech and speaker recognition, speech production and perception, and speech enhancement. In language processing: speech and text analysis, understanding, generation, dialog management, translation, summarization, question answering and document indexing and retrieval, as well as general language modeling.