2014 22nd European Signal Processing Conference (EUSIPCO): Latest Papers

Piecewise nonlinear regression via decision adaptive trees
Pub Date : 2014-11-13 DOI: 10.5281/ZENODO.44014
N. D. Vanli, M. O. Sayin, S. Ergüt, S. Kozat
We investigate the problem of adaptive nonlinear regression and introduce tree based piecewise linear regression algorithms that are highly efficient and provide significantly improved performance with guaranteed upper bounds in an individual sequence manner. We partition the regressor space using hyperplanes in a nested structure according to the notion of a tree. In this manner, we introduce an adaptive nonlinear regression algorithm that not only adapts the regressor of each partition but also learns the complete tree structure with a computational complexity only polynomial in the number of nodes of the tree. Our algorithm is constructed to directly minimize the final regression error without introducing any ad-hoc parameters. Moreover, our method can be readily incorporated with any tree construction method as demonstrated in the paper.
Citations: 1
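To make the idea of piecewise linear regression in the abstract above more concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes a fixed partition of a one-dimensional regressor space at a single threshold (the paper learns the tree structure itself), and it uses a plain LMS update per region. The names `fit_piecewise_lms`, `predict`, `split`, `step`, and `epochs` are hypothetical and chosen only for illustration.

```python
def fit_piecewise_lms(data, split=0.0, step=0.3, epochs=500):
    """Fit one affine regressor per region of a FIXED 1-D partition via LMS.

    Assumption: the partition is a single threshold `split`; the paper's
    method additionally adapts the partition (tree) itself.
    """
    # weights[r] = [slope, bias]; region 0 is x < split, region 1 is x >= split
    weights = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(epochs):
        for x, y in data:
            w = weights[0 if x < split else 1]
            err = y - (w[0] * x + w[1])  # prediction error of the active region
            w[0] += step * err * x       # LMS update of the slope
            w[1] += step * err           # LMS update of the bias
    return weights


def predict(weights, x, split=0.0):
    """Evaluate the regressor of the region that contains x."""
    w = weights[0 if x < split else 1]
    return w[0] * x + w[1]


# Toy nonlinear target y = |x|, which is exactly piecewise linear around 0.
xs = [i / 10 for i in range(-10, 11)]
data = [(x, abs(x)) for x in xs]
weights = fit_piecewise_lms(data)
print(predict(weights, 0.5), predict(weights, -0.5))
```

With the split placed at the kink of the target, each region only has to learn a straight line, which is why two simple linear regressors suffice in this toy case.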
Distributed reduced-rank estimation based on joint iterative optimization in sensor networks
Pub Date : 2014-11-13 DOI: 10.5281/ZENODO.43837
Songcen Xu, R. D. Lamare, H. Poor
This paper proposes a novel distributed reduced-rank scheme and an adaptive algorithm for distributed estimation in wireless sensor networks. The proposed distributed scheme is based on a transformation that performs dimensionality reduction at each agent of the network followed by a reduced-dimension parameter vector. A distributed reduced-rank joint iterative estimation algorithm is developed, which has the ability to achieve significantly reduced communication overhead and improved performance when compared with existing techniques. Simulation results illustrate the advantages of the proposed strategy in terms of convergence rate and mean square error performance.
Citations: 28
Retina enhanced bag of words descriptors for video classification
Pub Date : 2014-11-13 DOI: 10.5281/ZENODO.44198
Sabin Tiberius Strat, A. Benoît, P. Lambert
This paper addresses the task of detecting diverse semantic concepts in videos. Within this context, the Bag Of Visual Words (BoW) model, inherited from the analysis of sampled video keyframes, is among the most popular methods. However, in the case of image sequences, this model faces new difficulties such as the added motion information, the extra computational cost and the increased variability of content and concepts to handle. Considering this spatio-temporal context, we propose to extend the BoW model by introducing video preprocessing strategies with the help of a retina model, before extracting BoW descriptors. This preprocessing increases the robustness of local features to disturbances such as noise and lighting variations. Additionally, the retina model is used to detect potentially salient areas and to construct spatio-temporal descriptors. We experiment with three state-of-the-art local features, SIFT, SURF and FREAK, and we evaluate our results on the TRECVid 2012 Semantic Indexing (SIN) challenge.
Citations: 9
A psychoacoustic model with Partial Spectral Flatness Measure for tonality estimation
Pub Date : 2014-11-13 DOI: 10.5281/ZENODO.43815
Armin Taghipour, M. Jaikumar, B. Edler
Psychoacoustic studies show that the strength of masking depends, among other factors, on the tonality of the masker: the effect of noise maskers is stronger than that of tone maskers. Recently, a Partial Spectral Flatness Measure (PSFM) was introduced for tonality estimation in a psychoacoustic model for perceptual audio coding. The model consists of an Infinite Impulse Response (IIR) filterbank which considers the spreading effect of individual local maskers in simultaneous masking. An optimized (with respect to audio quality and computational efficiency) PSFM is now compared to a similar psychoacoustic model with prediction-based tonality estimation in medium (48 kbit/s) and low (32 kbit/s) bit rate conditions (mono) via subjective quality tests. Fifteen expert listeners participated in the subjective tests. The results are depicted and discussed. Additionally, we conducted the subjective tests with 15 non-expert consumers, whose results are also shown and compared to those of the experts.
Citations: 4
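As a rough illustration of the tonality measure named in the abstract above, the sketch below computes the classical spectral flatness measure (geometric mean divided by arithmetic mean of the power spectrum) over a sub-band of bins. The abstract does not give the exact PSFM definition, so restricting the classical measure to a bin range `[lo, hi)` is an assumption, and the names `spectral_flatness`, `lo`, `hi`, and `floor` are hypothetical.

```python
import math


def spectral_flatness(power, lo=0, hi=None, floor=1e-12):
    """Classical spectral flatness over the bins power[lo:hi].

    Returns a value in (0, 1]: close to 1 for a flat, noise-like band and
    close to 0 for a peaky, tone-like band. Restricting the measure to a
    sub-band is an illustrative assumption, not the paper's definition.
    """
    band = [max(p, floor) for p in power[lo:hi]]  # floor avoids log(0)
    if not band:
        raise ValueError("empty band")
    log_mean = sum(math.log(p) for p in band) / len(band)
    geometric_mean = math.exp(log_mean)
    arithmetic_mean = sum(band) / len(band)
    return geometric_mean / arithmetic_mean


noise_like = [1.0] * 8                    # flat power spectrum
tone_like = [1e-6] * 8
tone_like[3] = 1.0                        # one dominant spectral peak
print(spectral_flatness(noise_like), spectral_flatness(tone_like))
```

A psychoacoustic model could map such a flatness value to a masking strength, in line with the abstract's remark that noise maskers mask more strongly than tone maskers; how that mapping is done is not specified in the source.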
Cost function optimization and its hardware design for the Sample Adaptive Offset of HEVC standard
Pub Date : 2014-11-13 DOI: 10.5281/ZENODO.44158
Fabiane Rediess, R. Conceição, B. Zatt, M. Porto, L. Agostini
This work presents a cost function optimization for the internal decision of the HEVC Sample Adaptive Offset (SAO) filter. The optimization approach is focused on an efficient hardware design implementation and explores two critical points. The first focuses on the use of fixed-point data instead of floating-point data, and the second focuses on reducing the number of full multipliers and dividers. The simulation results show that these proposals have no significant impact on BD-rate measurements. Based on these two hardware-friendly optimizations, we propose a hardware design for this cost function module. The FPGA synthesis results show that the proposed architecture achieves 521 MHz and is able to process UHD 8K@120 fps when operating at 47 MHz.
Citations: 2
Near-field localization of audio: A maximum likelihood approach
Pub Date : 2014-11-13 DOI: 10.5281/ZENODO.43840
J. Jensen, M. G. Christensen
Localization of audio sources using microphone arrays has been an important research problem for more than two decades. Many traditional methods for solving the problem are based on a two-stage procedure: first, information about the audio source, such as time differences-of-arrival (TDOAs) and gain ratios-of-arrival (GROAs) between microphones is estimated, and, second, this knowledge is used to localize the audio source. These methods often have a low computational complexity, but this comes at the cost of a limited estimation accuracy. Therefore, we propose a new localization approach, where the desired signal is modeled using TDOAs and GROAs, which are determined by the source location. This facilitates the derivation of one-stage, maximum likelihood methods under a white Gaussian noise assumption that is applicable in both near- and far-field scenarios. Simulations show that the proposed method is statistically efficient and outperforms state-of-the-art estimators in most scenarios, involving both synthetic and real data.
Citations: 3
Near-optimal sensor placement for signals lying in a union of subspaces
Pub Date : 2014-11-13 DOI: 10.5281/ZENODO.44165
Dalia El Badawy, J. Ranieri, M. Vetterli
Sensor networks are commonly deployed to measure data from the environment and accurately estimate certain parameters. However, the number of deployed sensors is often limited by several constraints, such as their cost. Therefore, their locations must be suitably optimized to enhance the estimation of the parameters. In a previous work, we considered a low-dimensional linear model for the measured data and proposed a near-optimal algorithm to optimize the sensor placement. In this paper, we propose to model the data as a union of subspaces to further reduce the number of sensors without degrading the quality of the estimation. Moreover, we introduce a greedy algorithm for the sensor placement for such a model and show the near-optimality of its solution. Finally, we verify with numerical experiments the advantage of the proposed model in reducing the number of sensors while keeping the estimation performance intact.
Citations: 8
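The abstract above mentions a greedy algorithm for sensor placement but does not state its cost function. The sketch below shows one generic greedy scheme under an explicit assumption: each candidate sensor location corresponds to one row of a linear measurement matrix, and locations are removed one at a time so that the frame potential (the sum of squared inner products between the remaining rows) stays as small as possible. The choice of frame potential as the cost and the names `frame_potential` and `greedy_placement` are illustrative assumptions, not taken from the abstract.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def frame_potential(rows):
    """Sum of squared inner products over all ordered pairs of rows."""
    return sum(dot(a, b) ** 2 for a in rows for b in rows)


def greedy_placement(rows, k):
    """Greedily discard candidate locations until k remain.

    At each step, drop the row whose removal leaves the smallest frame
    potential, i.e. the row that is most redundant with the others.
    The cost function is an assumption made for this sketch.
    """
    keep = list(range(len(rows)))
    while len(keep) > k:
        worst = min(
            keep,
            key=lambda i: frame_potential([rows[j] for j in keep if j != i]),
        )
        keep.remove(worst)
    return keep


# Three candidate locations in a 2-D model; the first two are redundant.
candidates = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(greedy_placement(candidates, 2))
```

In this toy case the selection keeps one of the two identical rows together with the orthogonal one, which is the intuitive outcome: redundant measurements are discarded first.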
A unified approach to numerical auditory scene synthesis using loudspeaker arrays
Pub Date : 2014-11-13 DOI: 10.5281/ZENODO.44186
Joshua Atkins, Ismael Nawfal, D. Giacobello
In this work we address the problem of simulating the spatial and timbral cues of a given sound event, or auditory scene, using an array of loudspeakers. We first define the problem with a general numerical framework that encompasses many known techniques from physical acoustics, crosstalk cancellation, and acoustic control. In contrast to many previous approaches, the system described in this work is inherently broadband as it jointly designs a set of spatio-temporal filters while allowing for constraints in other domains. With this framework we show similarities and differences between known techniques and suggest some new, unexplored methods. In particular, we focus on perceptually motivated choices for the cost function and regularization. These methods are then compared by implementing the systems on a linear array of loudspeakers and evaluating the timbral and spatial qualities of the system using objective metrics.
Citations: 2
Improving scalar quantization for correlated processes using adaptive codebooks only at the receiver
Pub Date : 2014-11-13 DOI: 10.5281/ZENODO.43845
Sai Han, T. Fingscheidt
Lloyd-Max quantization (LMQ) is a widely used scalar non-uniform quantization approach targeting the minimum mean squared error (MMSE). Once designed, the quantizer codebook is fixed over time and does not take advantage of possible correlations in the input signals. Correlation could be exploited in scalar quantization by predictive quantization, but at the price of a higher bit error sensitivity. In order to improve the Lloyd-Max quantizer performance for correlated processes without encoder-sided prediction, a novel scalar decoding approach utilizing the correlation of input signals is proposed in this paper. Based on previously received samples, the current sample can be predicted a priori. Thereafter, a quantization codebook adapted over time is generated according to the prediction error probability density function. Compared to the standard LMQ, a distinct improvement is achieved with our receiver in error-free and error-prone transmission conditions, both with hard-decision and soft-decision decoding.
Citations: 4
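For readers unfamiliar with the baseline that the abstract above refers to, the following sketch designs a fixed Lloyd-Max codebook from training samples by alternating a nearest-neighbor partition with a centroid update. It covers only the standard, time-invariant LMQ; the receiver-side codebook adaptation proposed in the paper is not shown. The names `lloyd_max`, `mean_squared_error`, `levels`, and `iterations` are hypothetical.

```python
def lloyd_max(samples, levels, iterations=50):
    """Design a scalar codebook with `levels` entries from training samples."""
    xs = sorted(samples)
    n = len(xs)
    # Initialize the codebook from evenly spaced sample quantiles.
    codebook = [xs[(2 * i + 1) * n // (2 * levels)] for i in range(levels)]
    for _ in range(iterations):
        # Partition step: assign every sample to its nearest codebook entry.
        cells = [[] for _ in codebook]
        for x in xs:
            nearest = min(range(levels), key=lambda k: (x - codebook[k]) ** 2)
            cells[nearest].append(x)
        # Centroid step: the MSE-optimal entry is the mean of its cell.
        updated = [
            sum(cell) / len(cell) if cell else codebook[k]
            for k, cell in enumerate(cells)
        ]
        if updated == codebook:  # converged, nothing changed
            break
        codebook = updated
    return codebook


def mean_squared_error(samples, codebook):
    """Average squared distance from each sample to its nearest entry."""
    return sum(min((x - c) ** 2 for c in codebook) for x in samples) / len(samples)


training = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
codebook = lloyd_max(training, levels=2)
print(codebook, mean_squared_error(training, codebook))
```

Because this codebook is designed once and then frozen, it cannot exploit correlation between successive samples, which is the limitation the abstract sets out to address at the receiver.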
Iterative Label Propagation on facial images
Pub Date : 2014-11-13 DOI: 10.5281/ZENODO.44196
O. Zoidi, A. Tefas, N. Nikolaidis, I. Pitas
In this paper, a novel method is introduced for propagating person identity labels on facial images in an iterative manner. The proposed method takes into account information about the data structure, obtained through clustering. This information is exploited in two ways: to regulate the similarity strength between the data and to indicate which samples should be selected for label propagation initialization. The proposed method can also be applied to label propagation on multiple graphs. The performance of the proposed Iterative Label Propagation (ILP) method was evaluated on facial images extracted from stereo movies. Experimental results showed that the proposed method outperforms state-of-the-art methods whether one or both video channels are used for label propagation.
Citations: 2
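The abstract above builds on label propagation over a similarity graph. As a point of reference, the sketch below implements a generic iterative propagation rule on a small weighted graph: label scores are repeatedly smoothed over the symmetrically normalized graph and pulled back toward the initial labels. It does not include the clustering-based similarity regulation or the initialization strategy described in the abstract; the function name `propagate_labels` and the parameters `alpha` and `iterations` are hypothetical.

```python
def propagate_labels(weights, labels, num_classes, alpha=0.9, iterations=100):
    """Generic graph label propagation (a baseline, not the paper's ILP).

    weights: symmetric non-negative similarity matrix (list of lists).
    labels:  class index for labeled nodes, None for unlabeled nodes.
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    # Symmetric normalization S = D^(-1/2) W D^(-1/2).
    s = [
        [
            weights[i][j] / (degree[i] * degree[j]) ** 0.5 if weights[i][j] else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    # One-hot initial label matrix Y; unlabeled rows are all zeros.
    y = [[1.0 if labels[i] == c else 0.0 for c in range(num_classes)] for i in range(n)]
    f = [row[:] for row in y]
    for _ in range(iterations):
        # F <- alpha * S * F + (1 - alpha) * Y
        f = [
            [
                alpha * sum(s[i][j] * f[j][c] for j in range(n)) + (1 - alpha) * y[i][c]
                for c in range(num_classes)
            ]
            for i in range(n)
        ]
    # Each node takes the class with the largest propagated score.
    return [max(range(num_classes), key=lambda c: f[i][c]) for i in range(n)]


# Chain of four samples: strong links 0-1 and 2-3, weak link 1-2.
w = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
print(propagate_labels(w, [0, None, None, 1], num_classes=2))
```

In the toy graph, each unlabeled sample inherits the identity of the labeled sample it is strongly connected to, which is the behavior that similarity weighting is meant to encourage.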