2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics: Latest Papers


A recursive generalized sidelobe canceler for multichannel blind speech dereverberation

2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701814

S. Malik, J. Benesty, Jingdong Chen

In this paper, we propose a generalized sidelobe canceler for multichannel blind speech dereverberation, which relies on recursive estimation of posterior distributions on the unknown acoustic channels and the adaptive interference canceler (AIC). Contrary to conventional design approaches where a fixed beamformer is employed, we consider a marginalized maximum-likelihood equalizer that is driven by the channel posterior estimator. It is shown that the first moment of the inferred channel posterior can also serve as a representation of an adaptive blocking matrix (ABM). Using the output of the blocking matrix, we estimate the AIC posterior to minimize the residual reverberation in the equalized signal. We demonstrate the efficacy of our approach by evaluating the algorithm in different degrees of observation noise and varying reverberation times.

Citations: 0

Robust DOA estimation of speech signals via sparsity models using microphone arrays

2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701823

Eleonora Cagli, Diego Carrera, G. Aletti, G. Naldi, B. Rossi

Direction-of-arrival (DOA) estimation of speech signals using an array of spatially separated microphones is a problem that arises in many practical applications. Examples include human-computer interfaces, automatic camera-steering systems for multiparticipant videoconferencing, and tracking systems in smart home environments. This paper introduces a robust method for speech signal localization that uses sparsity models for signal representation, and includes an analysis of the denoising problem for realistic applications using MEMS microphone arrays. Experimental results on both synthetic and real speech data show that the proposed method is noise-robust and provides highly reliable localization performance, even with multiple sources and a small number of microphones.

Citations: 5

Perceptual Cepstral filters for speech and music processing

2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701858

R. Mignot, V. Välimäki

Source-filter modeling of speech or musical tones requires a filter model for the spectral envelope of the signal. To reduce the number of modeling parameters, one idea is to use psychoacoustic knowledge to encode only the information that is relevant in a perceptual sense. In this work, starting from an accurate estimate of the original spectral envelope, which contains imperceptible details, we propose to use its Mel-Frequency Cepstral Coefficient (MFCC) representation to capture the perceptually relevant information. Then, a new inverse process is presented to derive a smoother but perceptually equivalent spectral envelope. For instance, this new method can be applied in speech coding, and thanks to the good properties of the MFCC representation, perceptual interpolation of sounds is made easier.

Citations: 3

MINTFormer: A spatially aware channel equalizer

2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701881

Felicia Lim, Mark R. P. Thomas, P. Naylor

Reverberation is a process that distorts a wanted signal and impairs perceived speech quality. In the context of multichannel dereverberation, channel-based methods and beamforming are two common approaches. Channel-based methods such as the multiple input/output inverse theorem (MINT) can provide perfect dereverberation provided the exact acoustic impulse responses (AIRs) are known. However, they have been shown to be very sensitive to AIR estimation errors for which several modifications have consequently been proposed. Conversely, beamformers are significantly more robust but provide comparatively modest dereverberation. While the two approaches are conventionally considered independent, both can be formulated as a filter-and-sum operation with differing filter design criteria. We propose a unified framework, termed MINT-Forming, that exploits this similarity and introduces a mixing parameter to control the tradeoff between the potential performance of MINT and the robustness of beamforming. Empirical results show that the mixing parameter is a monotonic function of channel estimation error, whereby a MINT solution is preferred when channel estimation error is low.
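The filter-and-sum view mentioned in this abstract can be illustrated with a minimal two-channel MINT sketch. It assumes the acoustic impulse responses are known exactly, are equally long, and share no common zeros; the function names (`mint_filters`, `filter_and_sum`) and the toy responses are hypothetical, and the mixing parameter of MINT-Forming is not reproduced here because the abstract does not specify it.

```python
def convolve(a, b):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: the channels may share a common zero")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def mint_filters(h1, h2):
    """Two-channel MINT: find g1, g2 such that h1*g1 + h2*g2 is a unit impulse."""
    L = len(h1)
    assert len(h2) == L and L >= 2
    Lg = L - 1                      # filter length that makes the system square
    rows = L + Lg - 1
    # Column j of each block is the channel delayed by j samples (a convolution matrix).
    H = [[0.0] * (2 * Lg) for _ in range(rows)]
    for j in range(Lg):
        for i in range(L):
            H[i + j][j] = h1[i]
            H[i + j][Lg + j] = h2[i]
    d = [1.0] + [0.0] * (rows - 1)  # target response: impulse with no delay
    g = solve(H, d)
    return g[:Lg], g[Lg:]

def filter_and_sum(channels, filters):
    """Equalized response: sum over channels of h_m convolved with g_m."""
    parts = [convolve(h, g) for h, g in zip(channels, filters)]
    return [sum(vals) for vals in zip(*parts)]

# Toy impulse responses (illustrative values only).
h1 = [1.0, 0.5, 0.2]
h2 = [1.0, -0.4, 0.1]
g1, g2 = mint_filters(h1, h2)
print(filter_and_sum([h1, h2], [g1, g2]))
```

With exact responses the equalized response collapses to an impulse up to rounding error. Perturbing `h1` or `h2` before calling `mint_filters`, while evaluating with the true responses, gives a feel for the sensitivity to estimation errors that the abstract describes.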

Citations: 9

Frequency domain multi-channel expectation maximization algorithm for audio background noise reduction

2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701859

Jichi Deng, S. Godsill

In this paper, we implement expectation-maximization (EM) based methods in the short-time Fourier transform (STFT) domain for background noise reduction in multi-channel systems. The models introduce a Wishart prior for the unknown signal covariance matrix. An EM algorithm is used to maximise the posterior probability for the clean signal, approaching a stationary point of the distribution with increasing iterations. The background noise is modelled as white and stationary in this initial work. In a small initial trial, the proposed methods are found to outperform a multi-channel Wiener filter in terms of residual noise artefacts and MSE.

Citations: 0

Gaussian process data fusion for heterogeneous HRTF datasets

2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701842

Yuancheng Luo, D. Zotkin, R. Duraiswami

Head-Related Transfer Function (HRTF) measurement and extraction are important tasks for personalized spatial audio. Many laboratories have their own apparatuses for data collection, but few studies have compared their results on a common subject or modeled inter-dataset variances. We present a Bayesian fusion method based on Gaussian process (GP) modeling of joint spatial-frequency HRTFs over different spherical measurement grids. Neumann KU-100 dummy head HRTFs from 7 labs in the “Club Fritz” study are compared and fused to each other based on learning a set of transformations from the GP data-likelihood and covariance assumptions; parameter and hyperparameter training is automatic. Experimental results show that fused models for horizontal and median-plane HRTFs generalize the datasets better than pre-transformed ones.

Citations: 7

A new clustering approach for solving the permutation problem in convolutive blind source separation

2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701852

Radoslaw Mazur, J. Jungmann, A. Mertins

In this paper we propose a new clustering approach for solving the permutation ambiguity in convolutive blind source separation. After the transformation to the time-frequency domain, the problem of separation of sources can be reduced to multiple instantaneous problems, which may be solved using independent component analysis. The drawbacks of this approach are the inherent permutation and scaling ambiguities, which have to be corrected before the transformation to the time domain. Here, we propose a new method that allows for aligning up to several hundreds of consecutive bins into clusters. The depermutation of these clusters using some known techniques is then much easier than the original problem. The performance of the proposed method is evaluated on real-room recordings.
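To make the permutation ambiguity concrete, the sketch below aligns each frequency bin to its already-aligned neighbor by correlating magnitude envelopes over time. This is a commonly used neighbor-correlation heuristic shown for illustration only; it is not the clustering method proposed in the paper, and the function names and the `envelopes[k][s]` data layout are assumptions.

```python
from itertools import permutations

def correlation(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / ((vx * vy) ** 0.5) if vx > 0 and vy > 0 else 0.0

def align_adjacent_bins(envelopes):
    """Return one permutation per bin so that reordered outputs follow bin 0.

    envelopes[k][s] is the magnitude envelope over time of separated output s
    in frequency bin k. perms[k][s] is the output index in bin k that is
    assigned to source s.
    """
    n_sources = len(envelopes[0])
    perms = [tuple(range(n_sources))]
    reference = list(envelopes[0])
    for k in range(1, len(envelopes)):
        # Exhaustive search over orderings; only practical for a few sources.
        best = max(
            permutations(range(n_sources)),
            key=lambda p: sum(
                correlation(reference[s], envelopes[k][p[s]])
                for s in range(n_sources)
            ),
        )
        perms.append(best)
        # The aligned bin becomes the reference for the next bin.
        reference = [envelopes[k][best[s]] for s in range(n_sources)]
    return perms
```

Because each bin is compared only with its immediate neighbor, a single wrong decision propagates to all later bins. Grouping many consecutive bins into reliably aligned clusters, as the abstract describes, reduces the number of such decisions that remain.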

Citations: 5

Room impulse response synthesis based on a 2D multi-plane FDTD hybrid acoustic model

2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701887

Stephen Oxnard, D. Murphy

This paper presents a novel hybrid acoustic modeling system, created through complementary assimilation of 3D geometric and 2D numerical modeling techniques, and analyzes its validity. It is demonstrated that multiple 2D Finite Difference Time Domain schemes may be employed to simulate low-frequency sound wave propagation throughout a simplistic 3D enclosure, thus avoiding the immense computational challenges posed by 3D numerical approaches. Band-limited room impulse responses (RIRs) generated in this way may be appropriately calibrated and combined with high-frequency results obtained from well-established geometric modeling methods to realize efficient yet accurate hybrid RIR synthesis. Objective results show that the low-frequency 2D multiplane solution yields accuracy comparable to that gained through 3D simulation while achieving a run-time reduction of 99.15%.
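As a rough illustration of what a single 2D Finite Difference Time Domain plane computes, the sketch below runs the standard leapfrog update of the 2D wave equation at the Courant limit. It assumes a rectangular grid with pressure fixed at zero on the edges and a unit impulse source; the function name and parameters are hypothetical, and the paper's multi-plane combination, boundary modeling, and calibration are not reproduced.

```python
def fdtd_2d(nx, ny, steps, source, receiver):
    """Leapfrog 2D wave-equation update at the Courant limit (lambda = 1/sqrt(2)).

    Pressure is held at zero on the grid edge. Returns the pressure recorded
    at `receiver` for time steps 0 .. steps - 1.
    """
    prev = [[0.0] * ny for _ in range(nx)]
    curr = [[0.0] * ny for _ in range(nx)]
    curr[source[0]][source[1]] = 1.0  # unit impulse excitation at step 0
    out = []
    for _ in range(steps):
        out.append(curr[receiver[0]][receiver[1]])
        nxt = [[0.0] * ny for _ in range(nx)]
        for i in range(1, nx - 1):
            for j in range(1, ny - 1):
                neighbours = (curr[i + 1][j] + curr[i - 1][j]
                              + curr[i][j + 1] + curr[i][j - 1])
                # General update: lambda^2 * neighbours + 2 (1 - 2 lambda^2) p - p_prev.
                # With lambda^2 = 1/2 the centre term vanishes.
                nxt[i][j] = 0.5 * neighbours - prev[i][j]
        prev, curr = curr, nxt
    return out

# Impulse response sampled three cells away from the source on a 21 x 21 grid.
response = fdtd_2d(21, 21, steps=40, source=(10, 10), receiver=(10, 13))
print(response[:6])
```

At this Courant number the time step and grid spacing are tied together by dt = dx / (sqrt(2) c), so the usable bandwidth of the simulated response is set by the grid spacing. The cost per time step grows with the number of grid points, which is why replacing a 3D grid with a few 2D planes can reduce run time so sharply.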

Citations: 5

Wave-domain echo-path model with aliasing for echo cancellation

2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701844

S. Emura, Y. Hiwasaki, H. Ohmuro

Wave-domain adaptive filtering for echo cancellation has been proposed for achieving immersive full-duplex sound conferencing that uses wave field reconstruction as spatial sound rendering. In wave-domain adaptive filtering, fundamental solutions of the wave equation are spatially sampled and used as the orthogonal basis functions. This sampling is determined by loudspeaker spacing and results in aliasing; aliasing occurs above a few thousand Hz for a spacing of several centimeters. The goal of this work is to investigate the effect of applying adaptive filtering to an echo signal with aliasing when the loudspeaker and microphone arrays are uniform linear arrays of identical geometry. We conclude that the wave-domain echo-path model used below the spatial Nyquist frequency can also be applied to wave-domain adaptive filtering above this frequency, even in the presence of aliasing components.
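The statement that aliasing sets in above a few thousand Hz for a spacing of several centimeters follows from the half-wavelength rule for uniform linear arrays, f = c / (2d). The short sketch below evaluates it; the function name and the speed of sound of 343 m/s are assumptions made for illustration.

```python
def spatial_nyquist_hz(spacing_m: float, speed_of_sound_m_s: float = 343.0) -> float:
    """Frequency above which a uniform linear array with this spacing aliases spatially."""
    if spacing_m <= 0:
        raise ValueError("spacing must be positive")
    # Aliasing is avoided while the spacing stays at or below half a wavelength:
    # d <= c / (2 f), hence f <= c / (2 d).
    return speed_of_sound_m_s / (2.0 * spacing_m)

for spacing_cm in (3, 5, 10):
    print(f"{spacing_cm} cm -> {spatial_nyquist_hz(spacing_cm / 100):.0f} Hz")
```

For example, a 5 cm spacing gives roughly 3.4 kHz, which is consistent with the range quoted in the abstract.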

Citations: 1

Advanced speech-audio processing in mobile phones and hearing aids: Synergies and distinctions

2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701899

P. Vary

Summary form only given. Mobile phones and modern hearing aids comprise advanced digital signal processing techniques as well as coding algorithms. From a functional point of view, digital hearing devices and mobile phones are approaching each other. In both types of devices, similar or even partly identical algorithms can be found, such as echo, reverberation and feedback control, noise reduction, intelligibility enhancement, artificial bandwidth extension, and binaural processing with two or more microphones. Current hearing aids include digital audio receivers and transmitters not only for communication and entertainment but also for binaural directional processing. State-of-the-art mobile phones offer new speech-audio compression schemes for the emerging HD-telephone services, and they are equipped with two (or more) microphones for the purpose of speech enhancement. Thus, it is not too big a step to realize hearing aid features as apps on smartphones. Further evolution might lead to binaural mobile telephony, providing ambient and spatial information - a preferred solution for audio conferencing, for example. Despite these relations, the signal conditions and the processing constraints are quite different, e.g., with respect to coherence of signals, complexity of algorithms, coding-noise shaping for binaural processing, power consumption, and latency. Synergies and distinctions of the corresponding signal processing and coding algorithms will be discussed. Design constraints and solutions will be presented by examples.

Citations: 3
