
Latest publications: 2010 IEEE International Workshop on Multimedia Signal Processing

Visibility-based beam tracing for soundfield rendering
Pub Date: 2010-12-10, DOI: 10.1109/MMSP.2010.5661991
Dejan Markovic, A. Canclini, F. Antonacci, A. Sarti, S. Tubaro
In this paper we present a visibility-based beam tracing solution for simulating the acoustics of environments, which makes use of a projective geometry representation. More specifically, projective geometry turns out to be useful for pre-computing the visibility among all the reflectors in the environment. The simulation engine has a straightforward application in rendering the acoustics of virtual environments using loudspeaker arrays. In particular, the acoustic wavefield is conceived as a superposition of acoustic beams, whose parameters (i.e. origin, orientation and aperture) are computed using the fast beam tracing methodology presented here. This information is processed by the rendering engine to compute the spatial filters to be applied to the loudspeakers within the array. Simulation results show that an accurate simulation of the acoustic wavefield can be obtained with this approach.
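As a hedged illustration of the beam primitive the abstract describes, the following Python sketch performs one specular reflection step in 2D: the beam origin is mirrored across a reflecting wall (the image source), and the reflected beam's orientation and aperture are those subtended by the wall. The names and the simplified geometry are ours, not the authors' implementation.

```python
import math

def reflect_point(p, a, b):
    """Mirror point p across the infinite line through wall endpoints a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    foot = (a[0] + t * dx, a[1] + t * dy)       # foot of the perpendicular
    return (2 * foot[0] - p[0], 2 * foot[1] - p[1])

def beam_through_wall(origin, a, b):
    """Orientation and aperture of the beam that exactly covers wall (a, b)."""
    ang_a = math.atan2(a[1] - origin[1], a[0] - origin[0])
    ang_b = math.atan2(b[1] - origin[1], b[0] - origin[0])
    return (ang_a + ang_b) / 2.0, abs(ang_a - ang_b)   # no angle-wrap handling

# One reflection step: the reflected beam re-radiates from the image source,
# restricted to the aperture subtended by the (visible) reflecting wall.
src = (1.0, 2.0)
wall = ((0.0, 0.0), (4.0, 0.0))
image_src = reflect_point(src, *wall)
orientation, aperture = beam_through_wall(image_src, *wall)
print(image_src, orientation, aperture)        # image source at (1.0, -2.0)
```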
Citations: 12
4-D broadcasting with MPEG-V
Pub Date: 2010-12-10, DOI: 10.1109/MMSP.2010.5662029
Kyoungro Yoon, Bumsuk Choi, Eun-Seo Lee, Tae-Beom Lim
Advances in media technologies have brought 3-D TV into the home and 4-D movies to the neighbourhood. We present a framework for 4-D broadcasting that brings 4-D entertainment home, based on the MPEG-V standard. A complete framework for 4-D entertainment, from the authoring of sensory effects to environment description and the commanding of the devices that render those effects, can be supported by MPEG-V together with a couple of other standards. Part 2 of MPEG-V provides tools for describing the capabilities of sensory devices and sensors, Part 3 provides tools for describing sensory effects, and Part 5 provides tools for actually interacting with the sensory devices and sensors.
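To make the three-part split concrete, here is an illustrative Python sketch that emits a Part-3-style sensory effect description as timed XML metadata alongside the A/V stream. The element and attribute names are simplified placeholders for exposition only, not the normative MPEG-V schema.

```python
import xml.etree.ElementTree as ET

# Illustrative effect timeline: (start time in seconds, effect kind, attributes).
sem = ET.Element("SensoryEffectMetadata")
for start, kind, attrs in [
    (0.0, "WindEffect", {"intensity": "0.6"}),
    (2.5, "VibrationEffect", {"intensity": "0.9"}),
    (4.0, "LightEffect", {"color": "#FF2200", "intensity": "0.4"}),
]:
    effect = ET.SubElement(sem, kind, attrs)
    effect.set("start", str(start))    # presentation time of the effect

print(ET.tostring(sem, encoding="unicode"))
```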
Citations: 33
Enhancing loudspeaker-based 3D audio with room modeling
Pub Date: 2010-12-10, DOI: 10.1109/MMSP.2010.5661990
Myung-Suk Song, Cha Zhang, D. Florêncio, Hong-Goo Kang
For many years, spatial (3D) sound using headphones has been widely used in a number of applications. A rich spatial sensation is obtained by using head related transfer functions (HRTFs) and playing the appropriate sound through headphones. In theory, loudspeaker audio systems would be capable of rendering 3D sound fields almost as rich as headphones, as long as the room impulse responses (RIRs) between the loudspeakers and the ears are known. In practice, however, obtaining these RIRs is hard, and the performance of loudspeaker-based systems is far from perfect. New hope has recently been raised by a system that tracks the user's head position and orientation and incorporates them into the RIR estimates in real time. That system made two simplifying assumptions: it used generic HRTFs, and it ignored room reverberation. In this paper we tackle the second problem: we incorporate a room reverberation estimate into the RIRs. Note that this is a nontrivial task: RIRs vary significantly with the listener's position, and even if one could measure them at a few points, they are notoriously hard to interpolate. Instead, we take an indirect approach: we model the room, and from that model we obtain an estimate of the main reflections. The position and characteristics of the walls do not vary with the user's movement, yet they allow an estimate of the RIR to be computed quickly for each new user position. Of course the key question is whether these estimates are good enough. We show an improvement in localization perception of up to 32% (i.e., reducing the average error from 23.5° to 15.9°).
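The room-model route to early reflections can be illustrated with the classical first-order image-source construction in a shoebox room. The sketch below is our own simplification, not the paper's estimator: it turns room dimensions plus source and listener positions into delay/gain pairs for the direct path and six first-order reflections, with wall absorption omitted.

```python
import math

C = 343.0  # speed of sound in air, m/s

def first_order_reflections(src, lst, room):
    """Yield (delay_s, gain) for the direct path and six first-order images."""
    images = [src]
    for axis in range(3):                        # mirror across each wall pair
        for wall in (0.0, room[axis]):
            img = list(src)
            img[axis] = 2.0 * wall - src[axis]   # image source position
            images.append(tuple(img))
    for p in images:
        d = math.dist(p, lst)
        yield d / C, 1.0 / max(d, 1e-6)          # 1/r spherical spreading

room = (5.0, 4.0, 3.0)                           # shoebox width, depth, height (m)
for delay, gain in first_order_reflections((1.0, 1.0, 1.5), (4.0, 3.0, 1.2), room):
    print(f"delay {delay * 1000:6.2f} ms   gain {gain:.3f}")
```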
Citations: 8
QoE based adaptation mechanism for media distribution in connected home
Pub Date: 2010-12-10, DOI: 10.1109/MMSP.2010.5662057
Jianfeng Chen, Xiaojun Ma, Jun Yu Li
Rich media applications enable a wealth of interactive, information-rich services that enhance the end user's viewing experience. In the current standards released for such services, only one rendering space is defined to handle the multiple types of content belonging to the same service. However, a new trend is to assign more than one terminal device to render a rich media application cooperatively inside a digital connected home network. The conventional audio-visual synchronization mechanism focuses on packet-level QoS (quality of service) control, with little consideration of the viewing experience. The actual QoE (quality of experience), however, is the viewer's subjective perception of the displayed visual elements. In order to design an optimized media distribution system based on QoE, this paper first introduces a subjective visual synchronization test for the same or tightly related content rendered on dual screens, exploring the relationship between delay variation and the end user's evaluation. Second, a QoE-based media distribution mechanism is proposed that dynamically adjusts the media flow transmission rate using delay variation reports from terminals; at the same time, the tradeoff between rate adaptation and buffer overload is considered. Simulation results show that the proposed algorithm not only improves the overall QoE score under both discrete and continuous delay variations, but also outperforms delay guarantee solutions that do not consider QoE.
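As a rough illustration of the adaptation loop described above, the sketch below adjusts a sending rate from terminal-reported delay variation while respecting the buffer tradeoff. The thresholds and gains are illustrative guesses of ours, not the paper's tuned mechanism.

```python
def adapt_rate(rate_kbps, delay_var_ms, buffer_fill,
               jitter_thresh_ms=40.0, min_rate=300.0, max_rate=4000.0):
    """One adaptation step driven by a terminal's delay-variation report."""
    if delay_var_ms > jitter_thresh_ms:
        rate_kbps *= 0.85                  # multiplicative decrease on jitter
    elif buffer_fill > 0.3:                # only probe upward if buffer is safe
        rate_kbps += 50.0                  # additive increase
    return max(min_rate, min(rate_kbps, max_rate))

rate = 2000.0
# Reports are (delay variation in ms, buffer fill fraction) per interval.
for report in [(55, 0.6), (48, 0.5), (20, 0.4), (15, 0.7), (12, 0.8)]:
    rate = adapt_rate(rate, *report)
    print(f"{rate:.0f} kbps")
```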
Citations: 0
Spatial synchronization of audiovisual objects by 3D audio object coding
Pub Date: 2010-12-10, DOI: 10.1109/MMSP.2010.5662065
B. Gunel, E. Ekmekcioglu, A. Kondoz
Free viewpoint video enables the visualisation of a scene from arbitrary viewpoints and directions. However, this flexibility in video rendering poses a challenge for 3D media: achieving spatial synchronicity between the audio and video objects. When the viewpoint changes, its effect on the perceived audio scene should be considered to avoid mismatches in the perceived positions of audiovisual objects. Spatial audio coding with such flexibility requires decomposing the sound scene into audio objects first, and then synthesizing the new scene according to the geometric relations between the A/V capture setup, the selected viewpoint and the rendering system. This paper proposes a free viewpoint audio coding framework for 3D media systems utilising multiview cameras and a microphone array. A real-time source separation technique is used for object decomposition, followed by spatial audio coding. Binaural, multichannel sound systems and wave field synthesis systems are addressed. Subjective test results show that the method achieves spatial synchronicity consistently across viewpoints, which is not possible with conventional recording techniques.
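The geometric core of viewpoint-dependent re-synthesis can be illustrated in a few lines: recompute each audio object's azimuth relative to the selected viewpoint pose. The sketch below is a 2D simplification with names of our choosing, not the paper's coding framework.

```python
import math

def object_azimuth(obj_xy, view_xy, view_yaw):
    """Azimuth (rad, 0 = straight ahead) of an object seen from a viewpoint."""
    dx, dy = obj_xy[0] - view_xy[0], obj_xy[1] - view_xy[1]
    return math.atan2(dy, dx) - view_yaw

# The same object lands at different azimuths for two viewpoints, so the
# spatial audio scene must be re-synthesized whenever the viewpoint changes.
obj = (2.0, 3.0)
print(math.degrees(object_azimuth(obj, (0.0, 0.0), 0.0)))           # ~56.3 deg
print(math.degrees(object_azimuth(obj, (4.0, 0.0), math.pi / 2)))   # ~33.7 deg
```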
Citations: 3
Real-time video enhancement for high quality videoconferencing
Pub Date: 2010-12-10, DOI: 10.1109/MMSP.2010.5662064
P. Kisilev, Sagi Schein
In this paper we present a novel method for high-quality real-time video enhancement; it improves the sharpness and contrast of video streams while simultaneously suppressing noise. The method comprises three main modules: (1) noise analysis, (2) spatial processing based on a new multi-scale pseudo-bilateral filter, and (3) temporal processing that includes robust motion detection and recursive temporal noise filtering. To achieve video frame rates for the HD signals used in high-end telepresence systems such as the HP Halo room, we employ the computational capacity of modern graphics cards (GPUs) and distribute the analysis and processing tasks between the CPU and GPU. The proposed scheme achieves video quality comparable to high-end camera systems while using much lower-cost cameras and reducing channel bandwidth requirements.
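Of the three modules, the temporal one is the easiest to sketch: a motion-adaptive recursive filter that averages history where the scene is static and passes moving pixels through. The NumPy sketch below is a hedged simplification of the idea, with illustrative thresholds, not the paper's filter.

```python
import numpy as np

def temporal_filter(prev_out, frame, motion_thresh=12.0, alpha_static=0.8):
    """Recursive temporal denoising that backs off where motion is detected."""
    diff = np.abs(frame - prev_out)
    # Per-pixel blend weight: heavy history where the scene looks static,
    # pass-through where the inter-frame difference suggests motion.
    alpha = np.where(diff < motion_thresh, alpha_static, 0.0)
    return alpha * prev_out + (1.0 - alpha) * frame

rng = np.random.default_rng(0)
clean = np.full((4, 4), 100.0)
out = clean + rng.normal(0, 5, clean.shape)     # first frame initializes state
for _ in range(10):
    noisy = clean + rng.normal(0, 5, clean.shape)
    out = temporal_filter(out, noisy)
print(np.std(out - clean))                      # residual noise well below 5
```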
Citations: 2
Depth camera based system for auto-stereoscopic displays
Pub Date: 2010-12-10, DOI: 10.1109/MMSP.2010.5662047
François de Sorbier, Yuko Uematsu, H. Saito
Stereoscopic displays are becoming very popular as more and more content becomes available. As an extension, auto-stereoscopic screens allow several users to watch stereoscopic images without wearing any glasses. For the moment, synthesized content is the easiest way to provide, in real time, all the multiple input images required by this kind of technology. Live video, however, is very important in fields such as augmented reality, yet remains difficult to apply to auto-stereoscopic displays. In this paper, we present a system in which a depth camera and a color camera are combined to produce the multiple input images in real time. The result of this approach can easily be used with any kind of auto-stereoscopic screen.
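A minimal sketch of the underlying depth-image-based rendering step: each color pixel is shifted by a disparity inversely proportional to its depth, once per output view of the panel. The forward-warping code below is our simplification (holes left unfilled, illustrative parameters), not the authors' pipeline.

```python
import numpy as np

def render_view(color, depth, baseline_px):
    """color: HxWx3 uint8, depth: HxW in meters; returns one synthesized view."""
    h, w, _ = color.shape
    out = np.zeros_like(color)
    zbuf = np.full((h, w), np.inf)
    disparity = baseline_px / np.maximum(depth, 1e-3)   # closer -> larger shift
    for y in range(h):
        for x in range(w):
            xs = int(round(x + disparity[y, x]))
            if 0 <= xs < w and depth[y, x] < zbuf[y, xs]:   # nearest pixel wins
                zbuf[y, xs] = depth[y, x]
                out[y, xs] = color[y, x]
    return out

color = np.random.randint(0, 255, (120, 160, 3), dtype=np.uint8)
depth = np.full((120, 160), 2.0)
depth[40:80, 60:100] = 1.0                              # a nearer object
views = [render_view(color, depth, b) for b in (-8.0, -4.0, 0.0, 4.0, 8.0)]
```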
Citations: 6
Geometric calibration of distributed microphone arrays from acoustic source correspondences
Pub Date: 2010-12-10, DOI: 10.1109/MMSP.2010.5661986
S. Valente, M. Tagliasacchi, F. Antonacci, Paolo Bestagini, A. Sarti, S. Tubaro
This paper proposes a method that solves the problem of geometric calibration of microphone arrays. We consider a distributed system, in which each array is controlled by separate acquisition devices that do not share a common synchronization clock. Given a set of probing sources, e.g. loudspeakers, each array computes an estimate of the source locations using a conventional TDOA-based algorithm. These observations are fused together by the proposed method, in order to estimate the position and pose of one array with respect to the other. Unlike previous approaches, we explicitly consider the anisotropic distribution of localization errors. As such, the proposed method is able to address the problem of geometric calibration when the probing sources are located both in the near- and far-field of the microphone arrays. Experimental results demonstrate that the improvement in terms of calibration accuracy with respect to state-of-the-art algorithms can be substantial, especially in the far-field.
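A useful baseline for the fusion step is the isotropic case: given the same probe sources localized in each array's own frame, the relative rotation and translation follow from orthogonal Procrustes (Kabsch) alignment. The sketch below shows only that baseline; the paper's weighting by the anisotropic localization errors is not reproduced.

```python
import numpy as np

def relative_pose(pts_a, pts_b):
    """Find R, t with pts_b ~ R @ pts_a + t; pts_* are (K, 3) arrays."""
    ca, cb = pts_a.mean(axis=0), pts_b.mean(axis=0)
    H = (pts_a - ca).T @ (pts_b - cb)            # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cb - R @ ca

rng = np.random.default_rng(1)
src = rng.uniform(-3.0, 3.0, (8, 3))             # 8 probe sources, frame A
Q = np.linalg.qr(rng.normal(size=(3, 3)))[0]
R_true = Q if np.linalg.det(Q) > 0 else -Q       # a proper rotation
t_true = np.array([1.0, 2.0, 0.5])
obs = src @ R_true.T + t_true + rng.normal(0.0, 0.01, src.shape)  # frame B
R, t = relative_pose(src, obs)
print(np.allclose(R, R_true, atol=0.05), np.round(t, 2))
```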
Citations: 25
Common Spatial Pattern revisited by Riemannian geometry
Pub Date: 2010-12-10, DOI: 10.1109/MMSP.2010.5662067
A. Barachant, S. Bonnet, M. Congedo, C. Jutten
This paper presents a link between the well-known Common Spatial Pattern (CSP) algorithm and Riemannian geometry in the context of Brain-Computer Interfaces (BCI). It is shown that CSP spatial filtering and log-variance feature extraction can be recast as the computation of a Riemannian distance in the space of covariance matrices. This fact highlights several approximations with respect to the topology of that space. Based on these conclusions, we propose an improvement of the classical CSP method.
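The distance in question is the affine-invariant Riemannian metric on symmetric positive-definite matrices, which reduces to the logs of the generalized eigenvalues of the two covariance matrices, the same quantities CSP diagonalizes and log-variance features read out. A short sketch of that computation, with our own variable names:

```python
import numpy as np
from scipy.linalg import eigvalsh

def riemann_distance(A, B):
    """d(A, B) = ||log(A^{-1/2} B A^{-1/2})||_F for SPD matrices A, B."""
    w = eigvalsh(B, A)                    # generalized eigenvalues of (B, A)
    return np.sqrt(np.sum(np.log(w) ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 200))             # two multichannel EEG-like trials
Y = rng.normal(size=(6, 300))
A, B = X @ X.T / 200, Y @ Y.T / 300       # per-trial covariance matrices
print(riemann_distance(A, B))             # 0 iff A == B; affine-invariant
```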
Citations: 63
Clickable augmented documents
Pub Date: 2010-12-10, DOI: 10.1109/MMSP.2010.5662012
Sandy Martedi, Hideaki Uchiyama, H. Saito
This paper presents an Augmented Reality (AR) system for physical text documents that enables users to click on a document. In the system, we track the relative pose between a camera and a document in order to continuously overlay virtual content on the document. In addition, we compute the trajectory of a fingertip, based on skin color detection, for clicking interaction. By merging document tracking with an interaction technique, we have developed a novel tangible document system. As an application, we develop an AR dictionary system that overlays the meaning and explanation of words when a document is clicked. In the experiments, we evaluate the accuracy of the clicking interaction and the robustness of our document tracking method against occlusion.
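The interaction module lends itself to a compact sketch: threshold skin color in HSV, keep the largest contour as the hand, and take its extreme point as the fingertip whose per-frame positions form the clicking trajectory. The OpenCV code below uses rough heuristic color bounds of our choosing, not the paper's calibrated detector.

```python
import cv2
import numpy as np

def fingertip(frame_bgr):
    """Return the (x, y) fingertip candidate in a BGR frame, or None."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Heuristic skin-tone range in HSV; real systems calibrate per user/lighting.
    mask = cv2.inRange(hsv, np.array([0, 40, 60]), np.array([25, 180, 255]))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)         # largest blob = hand
    return tuple(hand[hand[:, :, 1].argmin()][0])     # topmost point = fingertip

# Appending one point per frame yields the trajectory used to detect a click,
# e.g. a fingertip that dwells on the same word for a few consecutive frames.
```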
Citations: 4