2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): latest papers

Weakly supervised neural networks for Part-Of-Speech tagging

Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288291

S. Chopra, S. Bangalore

We introduce a simple and novel method for the weakly supervised problem of Part-Of-Speech tagging with a dictionary. Our method involves training a connectionist network that simultaneously learns a distributed latent representation of the words, while maximizing the tagging accuracy. To compensate for the unavailability of true labels, we resort to training the model using a Curriculum: instead of random order, the model is trained using an ordered sequence of training samples, proceeding from “easier” to “harder” samples. On a standard test corpus, we show that without using any grammatical information, our model is able to outperform the standard EM algorithm in tagging accuracy, and its performance is comparable to other state-of-the-art models. We also show that curriculum learning for this setting significantly improves performance, both in terms of speed of convergence and in terms of generalization.
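
The easy-to-hard ordering described above can be sketched as follows. This is a minimal illustration only: the abstract does not say how "easy" and "hard" are measured, so the difficulty score used here (the mean number of dictionary tags per word) is an assumption, and the names `ambiguity`, `curriculum_order`, `tag_dict` and `n_tags` are hypothetical.

```python
from typing import Dict, List, Set


def ambiguity(sentence: List[str], tag_dict: Dict[str, Set[str]], n_tags: int) -> float:
    """Mean number of candidate tags per word (assumed difficulty score).

    Words missing from the dictionary are treated as fully ambiguous.
    Sentences are assumed to be non-empty.
    """
    counts = [len(tag_dict.get(word, ())) or n_tags for word in sentence]
    return sum(counts) / len(counts)


def curriculum_order(sentences, tag_dict, n_tags):
    """Return the sentences sorted from least to most ambiguous."""
    return sorted(sentences, key=lambda s: ambiguity(s, tag_dict, n_tags))


# Toy dictionary: each word maps to the set of tags it may take.
tag_dict = {
    "the": {"DET"},
    "dog": {"NOUN"},
    "can": {"NOUN", "VERB", "AUX"},
    "run": {"NOUN", "VERB"},
}
sentences = [["can", "run"], ["the", "dog"], ["the", "can"]]
ordered = curriculum_order(sentences, tag_dict, n_tags=3)
print(ordered)
```

A training loop would then consume `ordered` in sequence instead of shuffling the samples.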

Citations: 0

A multichannel MMSE-based framework for joint blind source separation and noise reduction

Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6287829

M. Souden, S. Araki, K. Kinoshita, T. Nakatani, H. Sawada

In this paper, we propose a new framework to separate multiple speech signals and reduce the additive acoustic noise using multiple microphones. In this framework, we start by formulating the minimum-mean-square error (MMSE) criterion to retrieve each of the desired speech signals from the observed mixtures of sounds and outline the importance of multi-speaker activity detection. The latter is modeled by introducing a latent variable whose posterior probability is computed via expectation maximization (EM) combining both the spatial and spectral cues of the multichannel speech observations. We experimentally demonstrate that the resulting joint blind source separation (BSS) and noise reduction solution performs remarkably well in reverberant and noisy environments.

Citations: 10

Cyclic orthogonal codes in CDMA-based asynchronous Wireless Body Area Networks

Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288198

Ali M Tawfiq, J. Abouei, K. Plataniotis

This work considers a CDMA-based Wireless Body Area Network (WBAN) where multiple biosensors communicate simultaneously with a central node in an asynchronous fashion. The main goal of this paper is to present an augmentation protocol for the physical layer of the IEEE 802.15.6 specifications, with a focus on Multiple Access Interference (MAI) mitigation in a proactive WBAN. The proposed methodology uses a new set of orthogonal codes, derived from the conventional Walsh-Hadamard matrix, that has the special property of “cyclic orthogonality”. This property ensures that the asynchronous nature of the WBAN does not produce MAI amongst the multiple on-body sensors. The work investigates the optimality of such codes in WBANs in terms of link Bit Error Rate (BER) performance. We show that the proposed spreading codes outperform conventional non-cyclic orthogonal spreading codes in a practical Rayleigh fading environment.
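
To see why cyclic orthogonality matters in an asynchronous setting, the sketch below checks the rows of a plain 4x4 Walsh-Hadamard matrix. It does not reproduce the paper's code set, which the abstract does not specify; it only illustrates that conventional Walsh-Hadamard rows are orthogonal when aligned but can correlate fully once one sender is delayed by a chip. The helper names `sylvester_hadamard` and `cyclic_xcorr` are hypothetical.

```python
from typing import List


def sylvester_hadamard(order: int) -> List[List[int]]:
    """Build a Walsh-Hadamard matrix by Sylvester's construction.

    `order` is assumed to be a power of two.
    """
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def cyclic_xcorr(a: List[int], b: List[int], shift: int) -> int:
    """Correlation of `a` with `b` cyclically delayed by `shift` chips."""
    n = len(a)
    return sum(a[i] * b[(i - shift) % n] for i in range(n))


H = sylvester_hadamard(4)

# Synchronous case: every pair of distinct rows has zero correlation.
sync = [cyclic_xcorr(H[i], H[j], 0) for i in range(4) for j in range(4) if i != j]

# Asynchronous case: row 3 delayed by one chip coincides with row 2,
# so the two codes interfere at full strength.
async_peak = cyclic_xcorr(H[2], H[3], 1)

print(sync)
print(async_peak)
```

A cyclically orthogonal set would keep this cross-correlation at zero for every shift, which is the property the abstract relies on to suppress MAI.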

Citations: 16

Efficient Gaussian inference algorithms for phase imaging

Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6287959

Jingshan Zhang, J. Dauwels, M. A. Vázquez, L. Waller

Novel efficient algorithms are developed to infer the phase of a complex optical field from a sequence of intensity images taken at different defocus distances. The non-linear observation model is approximated by a linear model. The complex optical field is inferred by iterative Kalman smoothing in the Fourier domain: forward and backward sweeps of Kalman recursions are alternated, and in each such sweep, the approximate linear model is refined. By limiting the number of iterations, one can trade off accuracy vs. complexity. The complexity of each iteration in the proposed algorithm is on the order of N log N, where N is the number of pixels per image. The storage required scales linearly with N. In contrast, the complexity of existing phase inference algorithms scales with N^3 and the required storage with N^2. The proposed algorithms may enable real-time estimation of optical fields from noisy intensity images.

Citations: 7

Low-resource speech recognition with automatically learned sparse inverse covariance matrices

Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288977

Weibin Zhang, Pascale Fung

Full covariance acoustic models trained with limited training data generalize poorly to unseen test data due to a large number of free parameters. We propose to use sparse inverse covariance matrices to address this problem. Previous sparse inverse covariance methods never outperformed full covariance methods. We propose a method to automatically drive the structure of inverse covariance matrices to be sparse during training. We use a new objective function, obtained by adding L1 regularization to the traditional objective function for maximum likelihood estimation. The graphical lasso method for the estimation of a sparse inverse covariance matrix is incorporated into the Expectation Maximization algorithm to learn the parameters of the HMM using the new objective function. Experimental results show that we only need about 25% of the parameters of the inverse covariance matrices to be nonzero in order to achieve the same performance as a full covariance system. Our proposed system using sparse inverse covariance Gaussians also significantly outperforms a system using full covariance Gaussians trained on limited data.
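
For orientation, the standard graphical lasso objective for a single Gaussian is shown below. This is the textbook form, not necessarily the exact objective used in the paper: the abstract does not state how the penalty is combined with the HMM likelihood inside EM, and the symbols S (sample covariance), Θ (inverse covariance) and λ (regularization weight) are notation introduced here.

```latex
\hat{\Theta} \;=\; \arg\max_{\Theta \succ 0} \;
  \log\det\Theta \;-\; \operatorname{tr}(S\,\Theta) \;-\; \lambda\,\lVert \Theta \rVert_{1}
```

The first two terms are the Gaussian log-likelihood up to constants; the L1 term pushes entries of Θ to exactly zero, and a larger λ yields a sparser matrix.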

Citations: 4

Factor analysis of Laplacian approach for speaker recognition

Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288850

Jinchao Yang, Chunyan Liang, L. Yang, Hongbin Suo, Junjie Wang, Yonghong Yan

In this study, we introduce a new factor analysis of Laplacian approach to speaker recognition under the support vector machine (SVM) framework. The Laplacian-projected supervector from our proposed Laplacian approach, which finds an embedding that preserves local information by locality preserving projections (LPP), is believed to contain speaker-dependent information. The proposed method was compared with the state-of-the-art total variability approach on the 2010 National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE) corpus. The comparison indicates that the proposed method is effective.

Citations: 5

Classified-Filter-based Post-Compensation Interpolation for Color Filter Array demosaicing

Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288035

Jing-Ming Guo, Yun-Fu Liu, B. Lai, Peng-Hua Wang, Jiann-Der Lee

In this paper, a classification-based post-compensation algorithm for Color Filter Array (CFA) demosaicing is proposed. This technique can be used to improve the image quality of the interpolated results obtained by other CFA demosaicing schemes. First, each pixel is classified according to its neighborhood texture variance and angle. Then, a separate Least-Mean-Square (LMS) filter is trained for each class, so that pixels with different characteristics are handled by different filters. As documented in the experimental results, the proposed scheme can substantially boost the image quality and also yields better perceived visual quality. Notably, the proposed method can serve as an effective post-compensation step applied after any existing scheme to yield even better image quality.

Citations: 2

Inference using phi-divergence Goodness-of-Fit tests

Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288546

Nikhil Kundargi, A. Tewfik

In this paper we study the inferential use of goodness-of-fit tests in a non-parametric setting. The utility of such tests is demonstrated for the test case of spectrum sensing applications in cognitive radios. For the first time, we provide a comprehensive framework for decision fusion of an ensemble of goodness-of-fit testing procedures through an Ensemble Goodness-of-Fit test. Also, we introduce a generalized family of functionals and kernels called Φ-divergences, which allow us to formulate goodness-of-fit tests that are parameterized by a single parameter s. The performance of these tests is simulated under Gaussian and non-Gaussian noise in a MIMO setting. We show that under uncertainty or non-Gaussianity in the noise, the performance of non-parametric tests in general, and of phi-divergence based goodness-of-fit tests in particular, is significantly superior to that of the energy detector, with reduced implementation complexity. Especially important is the property that the false alarm rate of our proposed tests is maintained at a fixed level over a wide variation in the channel noise distributions.

Citations: 1

Familiar speaker recognition

Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288854

S. Wenndt, Ronald L. Mitchell

Speaker recognition by machines can be quite good for large groups, as seen in NIST speaker recognition evaluations. However, speaker recognition by machine can be fragile in changing environments. This research examines how robust humans are at recognizing familiar speakers in changing environments. Additionally, bandlimited noise was used to learn which frequency regions are important for human listeners to recognize familiar speakers.

Citations: 2

Blind estimation and low-rate sampling of sparse MIMO systems with common support

Pub Date : 2012-03-25 DOI: 10.1109/ICASSP.2012.6288768

Ying Xiong, Yue M. Lu

We present a blind estimation algorithm for multi-input and multi-output (MIMO) systems with sparse common support. Key to the proposed algorithm is a matrix generalization of the classical annihilating filter technique, which allows us to estimate the nonlinear parameters of the channels through an efficient and noniterative procedure. An attractive property of the proposed algorithm is that it only needs the sensor measurements at a narrow frequency band. By exploiting this feature, we can derive efficient sub-Nyquist sampling schemes which significantly reduce the number of samples that need to be retained at each sensor. Numerical simulations verify the accuracy of the proposed estimation algorithm and its robustness in the presence of noise.

Citations: 5
