Pushing the Limits of Acoustic Spatial Perception via Incident Angle Encoding

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IF 3.6; JCR Q2, Computer Science, Information Systems); Pub Date: 2024-05-13; DOI: 10.1145/3659583
Yongjian Fu, Yongzhao Zhang, Hao Pan, Yu Lu, Xinyi Li, Lili Chen, Ju Ren, Xiong Li, Xiaosong Zhang, Yaoxue Zhang
Citations: 0

Abstract

With the growing popularity of smart speakers, numerous novel acoustic sensing applications have been proposed for low-frequency human speech and high-frequency inaudible sounds. Spatial information plays a crucial role in these acoustic applications, enabling various location-based services. However, typical commercial microphone arrays are limited in their spatial perception of inaudible sounds because their sparse array geometries are optimized for low-frequency speech. In this paper, we introduce MetaAng, a system that augments microphone arrays with wideband spatial perception across both speech signals and inaudible sounds by leveraging the spatial encoding capabilities of acoustic metasurfaces. Our design is grounded in the fact that acoustic metasurfaces, while sensitive to high-frequency signals, are almost non-responsive to low-frequency speech because of the large wavelength discrepancy. This observation allows us to integrate acoustic metasurfaces with a sparse array geometry, simultaneously enhancing the spatial perception of high-frequency and low-frequency acoustic signals. To achieve this, we first use acoustic metasurfaces and a configuration optimization algorithm to encode unique features for each incident angle. Then, we propose an unrolled soft-thresholding network that employs neural-enhanced priors and compressive sensing for high-accuracy, high-resolution multi-source angle estimation. We implement a prototype, and experimental results demonstrate that MetaAng remains robust across various scenarios, facilitating multiple applications, including localization and tracking.
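The compressive-sensing step mentioned in the abstract can be illustrated with classical iterative soft thresholding (ISTA), the fixed-parameter algorithm that unrolled soft-thresholding networks are typically derived from. The sketch below is not the MetaAng network and makes no claim about the authors' implementation: the random complex codebook standing in for metasurface-encoded angle signatures, the 48 measurements, the 1-degree grid, the noise level, and all function names are assumptions chosen only for illustration.

```python
import numpy as np


def soft_threshold(z, tau):
    """Complex soft-thresholding: shrink each magnitude by tau, keep the phase."""
    mag = np.abs(z)
    scale = np.maximum(mag - tau, 0.0) / np.maximum(mag, 1e-12)
    return z * scale


def ista(A, y, lam=0.05, n_iter=300):
    """Classical ISTA for min_x 0.5*||y - A x||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_iter):
        grad = A.conj().T @ (A @ x - y)    # gradient of the data-fit term
        x = soft_threshold(x - grad / L, lam / L)
    return x


def random_codebook(n_meas, n_angles, rng):
    """Hypothetical angle signatures: random complex columns with unit norm."""
    A = rng.standard_normal((n_meas, n_angles)) \
        + 1j * rng.standard_normal((n_meas, n_angles))
    return A / np.linalg.norm(A, axis=0, keepdims=True)


def estimate_angles(A, y, grid_deg, n_sources, **kwargs):
    """Return the n_sources grid angles with the largest recovered magnitude."""
    x = ista(A, y, **kwargs)
    top = np.argsort(np.abs(x))[-n_sources:]
    return sorted(float(grid_deg[i]) for i in top)


# Toy scenario (all values assumed): two sources on a 1-degree grid.
rng = np.random.default_rng(0)
grid_deg = np.linspace(-90.0, 90.0, 181)
A = random_codebook(48, grid_deg.size, rng)   # one column per candidate angle
x_true = np.zeros(grid_deg.size, dtype=complex)
x_true[40], x_true[120] = 1.0, 0.8            # sources at -50 and +30 degrees
noise = 0.01 * (rng.standard_normal(48) + 1j * rng.standard_normal(48))
y = A @ x_true + noise                        # simulated encoded measurement
print(estimate_angles(A, y, grid_deg, n_sources=2))
```

In this toy setup each column of `A` plays the role of the signature that one incident angle would leave on the measurements, so a sparse `x` corresponds to a small number of active source directions. An unrolled network would replace the fixed step size and threshold with parameters learned per iteration; that part is deliberately left out here.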
Source journal
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (Computer Science: Computer Networks and Communications)
CiteScore: 9.10
Self-citation rate: 0.00%
Articles published: 154