2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Publications


Lie operators for compressive sensing

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2014-07-14 DOI: 10.1109/ICASSP.2014.6854018

C. Hegde, Aswin C. Sankaranarayanan, Richard Baraniuk

We consider the efficient acquisition, parameter estimation, and recovery of signal ensembles that lie on a low-dimensional manifold in a high-dimensional ambient signal space. Our particular focus is on randomized, compressive acquisition of signals from the manifold generated by the transformation of a base signal by operators from a Lie group. Such manifolds figure prominently in a number of applications, including radar and sonar array processing, camera arrays, and video processing. Leveraging the fact that Lie group manifolds admit a convenient analytical characterization, we develop new theory and algorithms for: (1) estimating the Lie operator parameters from compressive measurements, and (2) recovering the base signal from compressive measurements. We validate our approach with several numerical simulations, including the reconstruction of an affine-transformed video sequence from compressive measurements.
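To make the idea of estimating a Lie operator parameter from compressive measurements concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes the simplest possible setting, a circular shift (a one-parameter translation group) applied to a known base signal, a random Gaussian measurement matrix, noise-free measurements, and an exhaustive search over all shifts. The names `shift`, `measure`, and `estimate_shift` are hypothetical helpers introduced for illustration.

```python
import random


def shift(x, k):
    """Circularly shift the list x to the right by k samples."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)


def measure(phi, x):
    """Apply the measurement matrix phi (a list of rows) to the signal x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def estimate_shift(phi, y, base):
    """Return the shift whose compressive measurements are closest to y.

    Assumption: the base signal is known and the search is exhaustive.
    """
    def residual(k):
        predicted = measure(phi, shift(base, k))
        return sum((a - b) ** 2 for a, b in zip(predicted, y))

    return min(range(len(base)), key=residual)


random.seed(0)
n, m = 64, 8  # ambient dimension and number of compressive measurements
base = [1.0 if 10 <= i < 16 else 0.0 for i in range(n)]  # a short pulse
phi = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

true_shift = 23
y = measure(phi, shift(base, true_shift))  # only m = 8 numbers are observed
est = estimate_shift(phi, y, base)
print(est)
```

The point of the sketch is that the unknown has only one degree of freedom, so far fewer measurements than the ambient dimension can be enough to identify it. A practical method would replace the exhaustive search with something that scales to multi-parameter groups such as affine transformations.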


Citations: 1

A statistical evaluation of Sparsity-based Distance Measure (SDM) as an image quality assessment algorithm

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2014-07-14 DOI: 10.1109/ICASSP.2014.6854108

K. Priya, K. Manasa, Sumohana S. Channappayya

Sparsity-based Distance Measure (SDM), a sparse reconstruction-based image similarity measure, was recently proposed and shown to have promising applications in image classification, clustering, and retrieval. In this paper, we present a statistical evaluation of SDM's performance as an image quality assessment (IQA) algorithm. This evaluation is carried out on the LIVE image database. We show that SDM performs fairly well in comparison with the state of the art while possessing several attractive properties. Specifically, we demonstrate its robustness to rotation (90°, 180°), scaling, and combinations of distortions, properties that are highly desirable in any IQA algorithm.


Citations: 4

A discriminatively trained Hough Transform for frame-level phoneme recognition

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2014-07-14 DOI: 10.1109/ICASSP.2014.6854053

J. Dennis, T. H. Dat, Haizhou Li, Chng Eng Siong

Despite recent advances in the use of Artificial Neural Network (ANN) architectures for automatic speech recognition (ASR), relatively little attention has been given to using feature inputs beyond MFCCs in such systems. In this paper, we propose an alternative to conventional MFCC or filterbank features, using an approach based on the Generalised Hough Transform (GHT). The GHT is a common approach used in the field of image processing for the task of object detection, where the idea is to learn the spatial distribution of a codebook of feature information relative to the location of the target class. During recognition, a simple weighted summation of the codebook activations is commonly used to detect the presence of the target classes. Here we propose to learn the weighting discriminatively in an ANN, where the aim is to optimise the static phone classification error at the output of the network. As such an ANN is common to hybrid ASR architectures, the output activations from the GHT can be considered as a novel feature for ASR. Experimental results on the TIMIT phoneme recognition task demonstrate the state-of-the-art performance of the approach.


Pages: 2514-2518

Citations: 2

On the convergence of average consensus with generalized metropolis-hasting weights

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2014-07-14 DOI: 10.1109/ICASSP.2014.6854643

V. Schwarz, Gabor Hannak, G. Matz

Average consensus (AC) is a well-studied method for distributed averaging. The convergence properties of average consensus depend on the averaging weights. Examples of commonly used weight designs are Metropolis-Hastings (MH) weights and constant weights. In this paper, we provide a complete convergence analysis for a generalized MH weight design that encompasses conventional MH as a special case. More specifically, we formulate necessary and sufficient conditions for convergence. A main conclusion is that AC with MH weights is guaranteed to converge unless the underlying network is a regular bipartite graph.
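The role of the weights can be illustrated with a small simulation. The sketch below is an assumption-laden illustration rather than the paper's analysis: it assumes the weight between neighbouring nodes i and j has the form 1 / (eps + max(d_i, d_j)), where d_i is the degree of node i and eps is a hypothetical design parameter, with each node's self-weight chosen so that its weights sum to one. The abstract does not state the generalized design, so this form and the names `mh_weights` and `consensus` are introduced here for illustration only.

```python
def mh_weights(adj, eps=1.0):
    """Build a symmetric weight matrix from an undirected adjacency list.

    Assumed form: W[i][j] = 1 / (eps + max(deg_i, deg_j)) for neighbours,
    and W[i][i] is set so that every row sums to one.
    """
    n = len(adj)
    deg = [len(nbrs) for nbrs in adj]
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in adj[i]:
            W[i][j] = 1.0 / (eps + max(deg[i], deg[j]))
        W[i][i] = 1.0 - sum(W[i])  # diagonal is still 0.0 at this point
    return W


def consensus(W, x, iters):
    """Run the linear iteration x <- W x for a fixed number of steps."""
    n = len(x)
    for _ in range(iters):
        x = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


# Path graph 0-1-2-3 (not regular): every node approaches the average, 4.0.
path = [[1], [0, 2], [1, 3], [2]]
x_path = consensus(mh_weights(path, eps=1.0), [1.0, 2.0, 3.0, 10.0], 200)

# 4-cycle (regular and bipartite) with eps = 0: all self-weights are zero,
# so the two sides of the bipartition swap values forever.
cycle = [[1, 3], [0, 2], [1, 3], [0, 2]]
x_cycle = consensus(mh_weights(cycle, eps=0.0), [1.0, 0.0, 1.0, 0.0], 100)

print(x_path)
print(x_cycle)  # unchanged after an even number of steps: [1.0, 0.0, 1.0, 0.0]
```

In the second case the state never settles at the average of 0.5, which is consistent with the regular bipartite exception stated in the abstract, under the assumed weight form.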


Citations: 24

Unsupervised domain adaptation for deep neural network based voice activity detection

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2014-07-14 DOI: 10.1109/ICASSP.2014.6854930

Xiao-Lei Zhang

The mismatch between training and test speech corpora hinders the practical use of machine-learning-based voice activity detection (VAD). In this paper, we address this problem with unsupervised domain adaptation techniques, which seek a shared feature subspace between the mismatched corpora. A denoising deep neural network is used as the learning machine, and three domain adaptation techniques are analyzed. Experimental results show that unsupervised domain adaptation is a promising approach to the mismatch problem in VAD.


Citations: 9

Improvement of utterance clustering by using employees' sound and area data

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2014-07-14 DOI: 10.1109/ICASSP.2014.6854160

Tetsuya Kawase, Masanori Takehara, S. Tamura, S. Hayamizu, Ryuhei Tenmoku, T. Kurata

In this paper, we propose using staying-area data to estimate serving time for customers. Classifying utterances enables us to estimate the type of conversation between speakers, but classification performance degrades in real environments. To solve this problem, we propose a method that combines area data with sound data. We also propose a method for estimating conversation types using decision trees. Both methods were tested on data recorded in a Japanese restaurant. In the utterance classification experiment, the proposed method outperformed the method using only sound data. In the conversation type estimation experiment, we succeeded in recovering 70% of the misclassified conversations by using both sound and area data.


Citations: 2

Image denoising by targeted external databases

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2014-07-14 DOI: 10.1109/ICASSP.2014.6854040

Enming Luo, Stanley H. Chan, Truong Q. Nguyen

Classical image denoising algorithms based on single noisy images and generic image databases will soon reach their performance limits. In this paper, we propose to denoise images using targeted external image databases. Formulating denoising as an optimal filter design problem, we utilize the targeted databases to (1) determine the basis functions of the optimal filter by means of group sparsity; (2) determine the spectral coefficients of the optimal filter by means of localized priors. For a variety of scenarios such as text images, multiview images, and face images, we demonstrate superior denoising results over existing algorithms.


Citations: 24

Automatic initialization for naval application of graph segmentation techniques: A comparative study

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2014-07-14 DOI: 10.1109/ICASSP.2014.6854578

Irene Camino, U. Zölzer

Many image processing applications are of great interest to maritime authorities for security reasons. Depending on the application, different kinds of images are employed. The extraction of ship silhouettes requires high-resolution images in order to obtain accurate results. However, when the characteristics of the naval environment are visible, the background complexity increases greatly and automatic approaches fail. To overcome these difficulties, we propose an automatic initialization for graph segmentation techniques. A comparative study of previously suggested initializations for different graph segmentation techniques is also presented. It shows that, under such unfavorable image conditions, finding a proper initialization automatically is not trivial. Yet the precision and recall achieved by our initialization are considerably higher regardless of the graph segmentation technique. Furthermore, performance is greatly improved, since the best results are obtained after only the first iteration.


Citations: 0

MIMO detection based on averaging Gaussian projections

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2014-07-14 DOI: 10.1109/ICASSP.2014.6853932

J. Goldberger

We propose a new detection algorithm for MIMO communication systems employing a two-dimensional marginal of the Gaussian approximation of the exact discrete distribution of the transmitted data given the received data. From the 2D distributions we derive one-dimensional marginals by averaging all the 2D joint distributions related to a single input symbol. We prove that this strategy to obtain a 1D distribution from a set of not necessarily consistent 2D distributions is optimal (for a specified criterion). The improved performance of the proposed algorithm is demonstrated on several instances of the problem of MIMO detection.
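The averaging step described above can be sketched in a few lines. This is one plausible reading of the abstract, not the paper's method: it assumes the 2D distributions are available as small discrete probability tables over a finite symbol alphabet (the paper works with Gaussian approximations), marginalizes each table that involves the symbol of interest, and takes the plain arithmetic mean of the results. The function `averaged_marginal` and the example tables are hypothetical.

```python
def averaged_marginal(pairwise, i, alphabet_size):
    """Average the 1D marginals of symbol i over every 2D joint that involves i.

    pairwise maps a pair (a, b) with a < b to a table P where P[xa][xb] is
    the joint probability of symbol a taking value xa and symbol b taking xb.
    The tables need not agree with one another on the marginal of symbol i.
    """
    acc = [0.0] * alphabet_size
    count = 0
    for (a, b), table in pairwise.items():
        if a == i:
            marg = [sum(row) for row in table]  # sum out the second symbol
        elif b == i:
            marg = [sum(table[r][c] for r in range(alphabet_size))
                    for c in range(alphabet_size)]  # sum out the first symbol
        else:
            continue
        acc = [s + m for s, m in zip(acc, marg)]
        count += 1
    return [s / count for s in acc]


# Three binary symbols with deliberately inconsistent pairwise tables.
pairwise = {
    (0, 1): [[0.5, 0.1], [0.2, 0.2]],
    (0, 2): [[0.4, 0.4], [0.1, 0.1]],
    (1, 2): [[0.25, 0.25], [0.25, 0.25]],
}
m0 = averaged_marginal(pairwise, 0, 2)
decision = max(range(2), key=m0.__getitem__)  # most probable value of symbol 0
print(m0, decision)
```

Here the two tables involving symbol 0 imply marginals of roughly (0.6, 0.4) and (0.8, 0.2), and the averaged marginal sits between them. The abstract's optimality claim concerns a specific criterion that is not reproduced here.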


Citations: 2

An amplify-and-forward scheme for cognitive radios

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pub Date : 2014-07-14 DOI: 10.1109/ICASSP.2014.6854095

F. Verde, A. Scaglione, D. Darsena, G. Gelli

In this paper, we propose an opportunistic amplify-and-forward relaying scheme for a cognitive radio network, which is aimed at allowing a secondary user (SU) to transmit over the same time-frequency slot as a primary user (PU). In our scheme, the SU amplifies and retransmits the PU signal it receives, using as its relaying gain the information symbols that the SU wishes to transmit to its own secondary receiver. The information-theoretic limits of the proposed protocol are investigated by showing that, under some operating conditions of practical interest, the SU can embed its information symbols in the PU signal without violating the cognitive radio principle of protecting the PU transmission, while attaining low transmission rates.


Citations: 4
