Reconstructing room scales with a single sound for augmented reality displays

Journal of Information Display (IF 3.7; Zone 3, Engineering Technology; JCR Q2, Materials Science, Multidisciplinary) Pub Date: 2022-11-15 DOI: 10.1080/15980316.2022.2145377
Benjamin Liang, AN Liang, Irán R. Román, Tomer Weiss, Budmonde Duinkharjav, J. Bello, Qi Sun
Citations: 1

Abstract

Perception and reconstruction of our 3D physical environment is an essential task with broad applications for Augmented Reality (AR) displays. For example, reconstructed geometries are commonly used to display 3D objects at accurate positions. While camera-captured images are a frequently used data source for realistically reconstructing 3D physical surroundings, they are limited to line-of-sight environments and require time-consuming, repetitive data capture to obtain a full 3D picture. For instance, current AR devices require users to scan an entire room to obtain its geometric dimensions. This optical process is tedious, and it is inapplicable when the space is occluded or inaccessible. Audio waves propagate through space by bouncing off different surfaces and, unlike light, are not 'occluded' by a single object such as a wall. In this research, we ask the question 'can one hear the size of a room?'. To answer it, we propose an approach for inferring room geometries from only a single sound, which we define as an audio wave sequence played from a single loudspeaker, leveraging deep learning to decode implicitly carried spatial information from a single speaker-and-microphone system. Through a series of experiments and studies, we demonstrate our method's effectiveness at inferring a 3D environment's spatial layout. Our work introduces a robust building block for multi-modal layout reconstruction.
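As a rough illustration of why a recording from one speaker and one microphone can carry room-size information, the sketch below simulates a click plus a single wall reflection and recovers the wall distance from the echo delay. This is not the authors' deep-learning method, which the abstract does not detail; it is a classical time-of-flight toy example. The sample rate, the co-located speaker and microphone, the single ideal reflection, and all function names are assumptions made for illustration.

```python
# Toy time-of-flight example (NOT the paper's method): one click, one echo.
SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at about 20 °C
SAMPLE_RATE = 16_000    # Hz, hypothetical recording rate


def simulate_recording(wall_distance_m, n_samples=1000, echo_gain=0.5):
    """Return a click at sample 0 plus one reflection from a wall.

    Assumes the speaker and microphone sit at the same point, so the
    sound travels to the wall and back (2 * distance).
    """
    delay = round(2 * wall_distance_m / SPEED_OF_SOUND * SAMPLE_RATE)
    if not 0 < delay < n_samples:
        raise ValueError("wall distance does not fit in the recording window")
    signal = [0.0] * n_samples
    signal[0] = 1.0             # direct sound
    signal[delay] += echo_gain  # attenuated reflection
    return signal


def estimate_wall_distance(signal, min_lag=1):
    """Estimate the wall distance from the strongest autocorrelation lag."""
    n = len(signal)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, n):
        # Autocorrelation at this lag: large when the echo lines up
        # with the direct sound.
        score = sum(signal[i] * signal[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    round_trip_seconds = best_lag / SAMPLE_RATE
    return round_trip_seconds * SPEED_OF_SOUND / 2


if __name__ == "__main__":
    recording = simulate_recording(3.0)
    print(f"estimated wall distance: {estimate_wall_distance(recording):.2f} m")
```

A real room produces many overlapping reflections with frequency-dependent absorption, which is presumably why the abstract turns to a learned model instead of a closed-form estimate like this one.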
Source journal: Journal of Information Display (Materials Science, Multidisciplinary)
CiteScore: 7.10
Self-citation rate: 5.40%
Articles published: 27
Review time: 30 weeks