2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): latest papers

Lie operators for compressive sensing
C. Hegde, Aswin C. Sankaranarayanan, Richard Baraniuk
We consider the efficient acquisition, parameter estimation, and recovery of signal ensembles that lie on a low-dimensional manifold in a high-dimensional ambient signal space. Our particular focus is on randomized, compressive acquisition of signals from the manifold generated by the transformation of a base signal by operators from a Lie group. Such manifolds figure prominently in a number of applications, including radar and sonar array processing, camera arrays, and video processing. Leveraging the fact that Lie group manifolds admit a convenient analytical characterization, we develop new theory and algorithms for: (1) estimating the Lie operator parameters from compressive measurements, and (2) recovering the base signal from compressive measurements. We validate our approach with several numerical simulations, including the reconstruction of an affine-transformed video sequence from compressive measurements.
DOI: https://doi.org/10.1109/ICASSP.2014.6854018 | Pages: 2342-2346 | Published: 2014-07-14
Citations: 1
A statistical evaluation of Sparsity-based Distance Measure (SDM) as an image quality assessment algorithm
K. Priya, K. Manasa, Sumohana S. Channappayya
Sparsity-based Distance Measure (SDM), a sparse reconstruction-based image similarity measure, was recently proposed and shown to have promising applications in image classification, clustering, and retrieval. In this paper, we present a statistical evaluation of SDM's performance as an image quality assessment (IQA) algorithm. This evaluation is carried out on the LIVE image database. We show that SDM performs fairly well in comparison with the state of the art while possessing several attractive properties. Specifically, we demonstrate its robustness to rotation (90°, 180°), scaling, and combinations of distortions, properties that are highly desirable in any IQA algorithm.
DOI: https://doi.org/10.1109/ICASSP.2014.6854108 | Pages: 2789-2792 | Published: 2014-07-14
Citations: 4
A discriminatively trained Hough Transform for frame-level phoneme recognition
J. Dennis, T. H. Dat, Haizhou Li, Chng Eng Siong
Despite recent advances in the use of Artificial Neural Network (ANN) architectures for automatic speech recognition (ASR), relatively little attention has been given to using feature inputs beyond MFCCs in such systems. In this paper, we propose an alternative to conventional MFCC or filterbank features, using an approach based on the Generalised Hough Transform (GHT). The GHT is a common approach used in the field of image processing for the task of object detection, where the idea is to learn the spatial distribution of a codebook of feature information relative to the location of the target class. During recognition, a simple weighted summation of the codebook activations is commonly used to detect the presence of the target classes. Here we propose to learn the weighting discriminatively in an ANN, where the aim is to optimise the static phone classification error at the output of the network. As such an ANN is common to hybrid ASR architectures, the output activations from the GHT can be considered as a novel feature for ASR. Experimental results on the TIMIT phoneme recognition task demonstrate the state-of-the-art performance of the approach.
DOI: https://doi.org/10.1109/ICASSP.2014.6854053 | Pages: 2514-2518 | Published: 2014-07-14
Citations: 2
On the convergence of average consensus with generalized Metropolis-Hasting weights
V. Schwarz, Gabor Hannak, G. Matz
Average consensus (AC) is a well-studied method for distributed averaging. The convergence properties of average consensus depend on the averaging weights. Examples of commonly used weight designs are Metropolis-Hastings (MH) weights and constant weights. In this paper, we provide a complete convergence analysis for a generalized MH weight design that encompasses conventional MH as a special case. More specifically, we formulate necessary and sufficient conditions for convergence. A main conclusion is that AC with MH weights is guaranteed to converge unless the underlying network is a regular bipartite graph.
DOI: https://doi.org/10.1109/ICASSP.2014.6854643 | Pages: 5442-5446 | Published: 2014-07-14
Citations: 24
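The convergence claim in the average-consensus abstract above can be illustrated with a small sketch. The abstract does not state the weight formula, so this assumes the conventional MH form w_ij = 1 / max(d_i, d_j) for neighbouring nodes i and j, with each self-weight set to the remainder so that rows sum to one. The function names `mh_weights` and `run_consensus` are hypothetical, and the generalized weight design analysed in the paper is not reproduced here.

```python
def mh_weights(adj):
    """Build an averaging matrix from a 0/1 adjacency matrix.

    Assumed weight rule (not taken from the abstract):
    w_ij = 1 / max(d_i, d_j) for neighbours, self-weight = 1 - row sum.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and adj[i][j]:
                W[i][j] = 1.0 / max(deg[i], deg[j])
        # Diagonal is still zero here, so sum(W[i]) is the off-diagonal sum.
        W[i][i] = 1.0 - sum(W[i])
    return W


def run_consensus(W, x, steps):
    """Apply the linear update x <- W x for a fixed number of steps."""
    n = len(x)
    for _ in range(steps):
        x = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


# Path graph on 3 nodes: not regular, so the iteration should settle.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
# 4-cycle: regular and bipartite, the exceptional case named in the abstract.
cycle4 = [[0, 1, 0, 1],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [1, 0, 1, 0]]

path_result = run_consensus(mh_weights(path), [3.0, 0.0, 0.0], 200)
cycle_result = run_consensus(mh_weights(cycle4), [4.0, 0.0, 0.0, 0.0], 200)
print(path_result)   # every entry close to the mean of the initial values
print(cycle_result)  # keeps alternating between two states
```

Under this assumed rule, every self-weight on the 4-cycle is zero, so the values only swap between the two sides of the bipartition and never mix. On the path graph the end nodes keep a positive self-weight and the iteration approaches the mean.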
Unsupervised domain adaptation for deep neural network based voice activity detection
Xiao-Lei Zhang
The mismatch between the training and test speech corpora hinders the practical use of machine-learning-based voice activity detection (VAD). In this paper, we address this problem with unsupervised domain adaptation techniques, which try to find a shared feature subspace between the mismatched corpora. A denoising deep neural network is used as the learning machine. Three domain adaptation techniques are analyzed. Experimental results show that unsupervised domain adaptation is a promising approach to the mismatch problem in VAD.
DOI: https://doi.org/10.1109/ICASSP.2014.6854930 | Pages: 6864-6868 | Published: 2014-07-14
Citations: 9
Improvement of utterance clustering by using employees' sound and area data
Tetsuya Kawase, Masanori Takehara, S. Tamura, S. Hayamizu, Ryuhei Tenmoku, T. Kurata
In this paper, we propose using staying-area data to estimate the time employees spend serving customers. Classifying utterances enables us to estimate the types of conversation between speakers. However, classification performance degrades in real environments. To solve this problem, we propose a method that combines area data with sound data. We also propose a method to estimate conversation types using decision trees. Both methods were tested on data recorded in a Japanese restaurant. In the utterance classification experiment, the proposed method performed better than the method using only sound data. In the conversation-type estimation experiment, we succeeded in recovering 70% of the misclassified conversations using both sound and area data.
DOI: https://doi.org/10.1109/ICASSP.2014.6854160 | Pages: 3047-3051 | Published: 2014-07-14
Citations: 2
Image denoising by targeted external databases
Enming Luo, Stanley H. Chan, Truong Q. Nguyen
Classical image denoising algorithms based on single noisy images and generic image databases will soon reach their performance limits. In this paper, we propose to denoise images using targeted external image databases. Formulating denoising as an optimal filter design problem, we utilize the targeted databases to (1) determine the basis functions of the optimal filter by means of group sparsity; (2) determine the spectral coefficients of the optimal filter by means of localized priors. For a variety of scenarios such as text images, multiview images, and face images, we demonstrate superior denoising results over existing algorithms.
DOI: https://doi.org/10.1109/ICASSP.2014.6854040 | Pages: 2450-2454 | Published: 2014-07-14
Citations: 24
Automatic initialization for naval application of graph segmentation techniques: A comparative study
Irene Camino, U. Zölzer
Nowadays, many different image processing applications are of high interest to maritime authorities for security reasons. Depending on the application, different kinds of images are employed. The extraction of ship silhouettes requires high-resolution images in order to obtain accurate results. However, when the characteristics of the naval environment are visible, the background complexity increases greatly and automatic approaches fail. To overcome these difficulties, we propose an automatic initialization for graph segmentation techniques. A comparative study of previously suggested initializations for different graph segmentation techniques is also presented. It shows that, under such unfavorable image conditions, finding a proper initialization automatically is not trivial. Yet the precision and recall achieved by our initialization are considerably higher regardless of the graph segmentation technique. Furthermore, performance improves substantially, since the best results are obtained after only the first iteration.
DOI: https://doi.org/10.1109/ICASSP.2014.6854578 | Pages: 5120-5124 | Published: 2014-07-14
Citations: 0
A maximal figure-of-merit learning approach to maximizing mean average precision with deep neural network based classifiers
Kehuang Li, Zhen Huang, You-Chi Cheng, Chin-Hui Lee
We propose a maximal figure-of-merit (MFoM) learning framework to directly maximize mean average precision (MAP), which is a key performance metric in many multi-class classification tasks. Conventional classifiers based on support vector machines cannot be easily adapted to optimize the MAP metric. On the other hand, classifiers based on deep neural networks (DNNs) have recently been shown to deliver great discrimination capability in both automatic speech recognition and image classification. However, DNNs are usually optimized with the minimum cross entropy criterion. In contrast to most conventional classification methods, our proposed approach can be formulated to embed DNNs and MAP into the objective function to be optimized during training. The combination of the proposed maximum MAP (MMAP) technique and DNNs introduces nonlinearity to the linear discriminant function (LDF) in order to increase the flexibility and discriminant power of the original MFoM-trained LDF based classifiers. Tested on both automatic image annotation and audio event classification, the experimental results show consistent improvements of MAP on both datasets when compared with other state-of-the-art classifiers without using MMAP.
DOI: https://doi.org/10.1109/ICASSP.2014.6854454 | Pages: 4503-4507 | Published: 2014-07-14
Citations: 40
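For readers unfamiliar with the metric named in the MFoM abstract above, the following is a minimal sketch of how mean average precision is commonly computed from per-class scores and binary relevance labels. It uses the standard ranking-based definition and shows only the evaluation metric, not the MFoM training procedure or the smoothed objective the paper optimizes. The function names and the toy scores are hypothetical.

```python
def average_precision(scores, labels):
    """Average precision for one class.

    scores: classifier scores, higher means more confident.
    labels: 1 if the item truly belongs to the class, else 0.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, relevant) in enumerate(ranked, start=1):
        if relevant:
            hits += 1
            # Precision measured at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(per_class):
    """Mean of the per-class average precisions.

    per_class: list of (scores, labels) pairs, one pair per class.
    """
    aps = [average_precision(scores, labels) for scores, labels in per_class]
    return sum(aps) / len(aps)


# Toy data for two classes (illustrative values only).
class_a = ([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
class_b = ([0.2, 0.9, 0.4], [0, 1, 0])
print(average_precision(*class_a))
print(mean_average_precision([class_a, class_b]))
```

Because the metric depends on a sort over scores, it is piecewise constant in the classifier parameters, which is one reason a direct gradient-based optimization needs a smooth surrogate of the kind the abstract alludes to.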
MIMO detection based on averaging Gaussian projections
J. Goldberger
We propose a new detection algorithm for MIMO communication systems employing a two-dimensional marginal of the Gaussian approximation of the exact discrete distribution of the transmitted data given the received data. From the 2D distributions we derive one-dimensional marginals by averaging all the 2D joint distributions related to a single input symbol. We prove that this strategy to obtain a 1D distribution from a set of not necessarily consistent 2D distributions is optimal (for a specified criterion). The improved performance of the proposed algorithm is demonstrated on several instances of the problem of MIMO detection.
DOI: https://doi.org/10.1109/ICASSP.2014.6853932 | Pages: 1916-1920 | Published: 2014-07-14
Citations: 2