2010 20th International Conference on Pattern Recognition: Latest Papers

Efficient Object Detection and Matching Using Feature Classification
Pub Date : 2010-10-07 DOI: 10.1109/ICPR.2010.753
F. Dornaika, Fadi Chakik
This paper presents a new approach for efficient object detection and matching in images and videos. We propose a stage based on a classification scheme that classifies the features extracted from new images into object features and non-object features. This binary classification scheme has turned out to be an efficient tool for object detection and matching. With this classification, not only does the matching process become faster and more robust, but robust object registration also becomes fast. We provide quantitative evaluations showing the advantages of using the classification stage for object matching and registration. Our approach could lend itself nicely to real-time object tracking and detection.
Citations: 10
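As a rough illustration of the idea in the abstract above (not the authors' implementation), the sketch below filters scene features with a binary object/non-object classifier before running nearest-neighbour matching. The classifier `toy_classifier`, the 2-D descriptors, and the `max_distance` threshold are all hypothetical stand-ins chosen for the example.

```python
from math import dist

def classify_object_features(features, is_object):
    """Keep only the descriptors that the binary classifier labels as object features."""
    return [f for f in features if is_object(f)]

def match_features(query, model, max_distance):
    """Pair each query descriptor with its nearest model descriptor, if close enough."""
    matches = []
    for qi, q in enumerate(query):
        best = min(range(len(model)), key=lambda mi: dist(q, model[mi]))
        if dist(q, model[best]) <= max_distance:
            matches.append((qi, best))
    return matches

# Toy 2-D descriptors; real descriptors would be much higher-dimensional.
model = [(0.0, 0.0), (1.0, 1.0)]
scene = [(0.1, 0.0), (5.0, 5.0), (0.9, 1.1), (7.0, 3.0)]

def toy_classifier(f):
    # Hypothetical rule standing in for a trained object/non-object classifier.
    return f[0] < 2.0 and f[1] < 2.0

kept = classify_object_features(scene, toy_classifier)
matches = match_features(kept, model, max_distance=0.5)
```

Because the non-object features are discarded first, the matcher compares fewer descriptors and has fewer chances to produce false matches, which is the speed and robustness argument the abstract makes.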
A Comprehensive Evaluation on Non-deterministic Motion Estimation
Pub Date : 2010-10-07 DOI: 10.1109/ICPR.2010.571
Changzhu Wu, Qing Wang
When computing optical flow with region-based matching, very few flow vectors can be reliably obtained, especially in high-contrast areas or areas with little texture. Instead of using a single pixel from the reference frame, non-deterministic motion uses multiple pixels within a neighborhood to represent the corresponding pixel in the current frame. Although this method has brought remarkable improvement, the weight associated with each reference pixel is quite sensitive to the choice of its standard deviation. To address this issue, this paper presents a dual probability. Intuitively, it enhances the weights of pixels that are more similar to their counterpart in the current frame, while suppressing the rest. Experimental results show that the proposed method is effective in dealing with intense motion and occlusion, especially in reducing the adverse impact of noise.
Citations: 0
Entropy Estimation and Multi-Dimensional Scale Saliency
Pub Date : 2010-10-07 DOI: 10.1109/ICPR.2010.171
P. Suau, Francisco Escolano
In this paper we survey two multi-dimensional Scale Saliency approaches based on graphs and the k-d partition algorithm. In the latter case we introduce a new divergence metric and we show experimentally its suitability. We also show an application of multi-dimensional Scale Saliency to texture discrimination. We demonstrate that the use of multi-dimensional data can improve the performance of texture retrieval based on feature extraction.
Citations: 0
Active Contours with Thresholding Value for Image Segmentation
Pub Date : 2010-10-07 DOI: 10.1109/ICPR.2010.555
Gang Chen, Haiying Zhang, I-Ping Chen, Wen Yang
In this paper, we propose an active contour with a threshold value that detects objects while discarding unimportant parts, rather than extracting all information. The basic idea of our model is to introduce a weight matrix into region-based active contours, which enhances the weight of the main parts while filtering out weak-intensity content such as shadows, illumination and so on. Moreover, the threshold value that sets the weight matrix can be chosen manually for accurate image segmentation. Thus, the proposed method can extract objects of interest in practice. Coupled partial differential equations are used to implement this method with level set algorithms. Experimental results show the advantages of our method in terms of segmentation accuracy.
Citations: 5
De-ghosting for Image Stitching with Automatic Content-Awareness
Pub Date : 2010-10-07 DOI: 10.1109/ICPR.2010.541
Yu Tang, Jungpil Shin
Ghosting artifacts are a common problem in image stitching, and eliminating them is not an easy task. In this paper, we propose an intuitive technique that places the stitching line using a novel energy map, which is essentially a combination of a gradient map, which indicates the presence of structures, and a prominence map, which determines the attractiveness of a region. We consider a region significant only if it is both structural and attractive. Using this improved energy map, the stitching line can easily skirt around moving objects or salient parts, based on the philosophy that human eyes mostly notice only the salient features of an image. We compare the results of our method with those of four state-of-the-art image stitching methods, and our method outperforms all four in removing ghosting artifacts.
Citations: 17
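The de-ghosting abstract above describes an energy map that is high only where a region is both structural and attractive, and a stitching line that avoids such regions. A minimal sketch of that idea follows. It assumes the two maps are combined by an elementwise product and that the stitching line is a top-to-bottom path of least total energy found by dynamic programming; the abstract specifies neither choice, and the function names and toy maps are hypothetical.

```python
def energy_map(gradient, prominence):
    """Elementwise product: large only where a pixel is both structural and prominent."""
    return [[g * p for g, p in zip(g_row, p_row)]
            for g_row, p_row in zip(gradient, prominence)]

def min_energy_seam(energy):
    """Return one column index per row for the top-to-bottom path of least total energy."""
    rows, cols = len(energy), len(energy[0])
    cost = [list(energy[0])]
    for r in range(1, rows):
        prev = cost[-1]
        # Each cell may be reached from the three nearest cells in the row above.
        cost.append([energy[r][c] + min(prev[max(c - 1, 0):min(c + 2, cols)])
                     for c in range(cols)])
    # Backtrack from the cheapest cell in the last row.
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 2, cols)
        c = min(range(lo, hi), key=lambda j: cost[r - 1][j])
        seam.append(c)
    return seam[::-1]

# Toy maps: the left and right columns are structured and prominent, the middle is not.
gradient = [[0.9, 0.1, 0.8],
            [0.9, 0.2, 0.7],
            [0.8, 0.1, 0.9]]
prominence = [[1.0, 0.5, 1.0],
              [1.0, 0.5, 1.0],
              [1.0, 0.5, 1.0]]
seam = min_energy_seam(energy_map(gradient, prominence))
```

In this toy case the seam runs down the low-energy middle column, skirting the salient columns on either side.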
Learning the Kernel Combination for Object Categorization
Pub Date : 2010-10-07 DOI: 10.1109/ICPR.2010.718
Deyuan Zhang, Xiaolong Wang, Bingquan Liu
Although Support Vector Machines (SVM) succeed in classifying several image databases using image descriptors proposed in the literature, no single descriptor can be optimal for general object categorization. This paper describes a novel framework that learns the optimal combination of kernels corresponding to multiple image descriptors before SVM training, which leads to a quadratic programming problem that can be solved efficiently. Our framework takes into account the variation of the kernel matrix and imbalanced datasets, which are common in real-world image categorization tasks. Experimental results on the Graz-01 and Caltech-101 image databases show the effectiveness and robustness of our algorithm.
Citations: 0
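The kernel-combination abstract above does not spell out how the kernels are combined. A common form, assumed here, is a non-negative weighted sum of the per-descriptor kernel (Gram) matrices. The sketch below only applies given weights; learning them (the quadratic program the abstract mentions) is not shown. The function name, the weights, and the two toy kernels are illustrative assumptions.

```python
def combine_kernels(kernels, weights):
    """Return the weighted sum of equally sized kernel (Gram) matrices."""
    if len(kernels) != len(weights):
        raise ValueError("need one weight per kernel")
    if any(w < 0 for w in weights):
        # Non-negative weights keep the sum positive semidefinite.
        raise ValueError("weights must be non-negative")
    n = len(kernels[0])
    return [[sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
            for i in range(n)]

# Two toy 2x2 Gram matrices, e.g. from a colour and a shape descriptor.
k_color = [[1.0, 0.2], [0.2, 1.0]]
k_shape = [[1.0, 0.8], [0.8, 1.0]]
combined = combine_kernels([k_color, k_shape], [0.25, 0.75])
```

The combined matrix can then be passed to any SVM implementation that accepts a precomputed kernel.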
Enhancing Web Page Classification via Local Co-training
Pub Date : 2010-10-07 DOI: 10.1109/ICPR.2010.712
Youtian Du, X. Guan, Zhongmin Cai
In this paper we propose a new multi-view semi-supervised learning algorithm called Local Co-Training (LCT). The proposed algorithm employs a set of local models with vector outputs to model the relations among examples in a local region on each view, and iteratively refines the dominant local models (i.e., the local models related to the unlabeled examples chosen for enriching the training set) using unlabeled examples through the co-training process. Compared with previous co-training style algorithms, local co-training has two advantages: first, it achieves higher classification precision by introducing local learning; second, only the dominant local models need to be updated, which significantly decreases the computational load. Experiments on the WebKB and Cora datasets demonstrate that the LCT algorithm can effectively exploit unlabeled data to improve the performance of web page classification.
Citations: 3
Fast and Spatially-Smooth Terrain Classification Using Monocular Camera
Pub Date : 2010-10-07 DOI: 10.1109/ICPR.2010.987
Chetan Jakkoju, K. Krishna, C. V. Jawahar
In this paper, we present a monocular camera based terrain classification scheme. The uniqueness of the proposed scheme is that it inherently incorporates spatial smoothness while segmenting an image, without requiring post-processing smoothing methods. The algorithm is extremely fast because it is built on top of a Random Forest classifier. We present a comparison across features and classifiers. The baseline algorithm uses color, texture and their combination with classifiers such as SVM and Random Forests. We further enhance the algorithm through a label transfer method. The efficacy of the proposed solution can be seen in the low error rates we reach on both our dataset and other publicly available datasets.
Citations: 10
Semi-blind Speech-Music Separation Using Sparsity and Continuity Priors
Pub Date : 2010-10-07 DOI: 10.1109/ICPR.2010.1129
Hakan Erdogan, Emad M. Grais
In this paper we propose an approach for the problem of single-channel source separation of speech and music signals. Our approach is based on representing each source's power spectral density using dictionaries and nonlinearly projecting the mixture signal spectrum onto the combined span of the dictionary entries. We encourage sparsity and continuity of the dictionary coefficients using penalty terms (or log-priors) in an optimization framework. We propose to use a novel coordinate descent technique for optimization, which nicely handles nonnegativity constraints and nonquadratic penalty terms. We use an adaptive Wiener filter and spectral subtraction to reconstruct both sources from the mixture data after the corresponding power spectral densities (PSDs) are estimated for each source. Using conventional metrics, we measure the performance of the system on simulated mixtures of single-person speech and piano music sources. The results indicate that the proposed method is a promising technique for low speech-to-music ratio conditions and that sparsity and continuity priors help improve the performance of the proposed system.
Citations: 9
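The reconstruction step mentioned in the speech-music abstract above can be sketched with the textbook Wiener gain: once a PSD has been estimated for each source, each frequency bin of the mixture is scaled by the speech share of the total power. This is a generic single-frame illustration, not the authors' adaptive filter; the dictionary-based PSD estimation is not shown, and the function names, the `eps` guard, and the toy numbers are assumptions.

```python
def wiener_gains(speech_psd, music_psd, eps=1e-12):
    """Per-bin Wiener gain for the speech source: P_s / (P_s + P_m)."""
    return [s / (s + m + eps) for s, m in zip(speech_psd, music_psd)]

def separate(mixture_spectrum, speech_psd, music_psd):
    """Split one frame of a complex mixture spectrum into speech and music estimates."""
    gains = wiener_gains(speech_psd, music_psd)
    speech = [g * x for g, x in zip(gains, mixture_spectrum)]
    # The remainder equals (1 - gain) * mixture, the complementary Wiener estimate.
    music = [x - s for x, s in zip(mixture_spectrum, speech)]
    return speech, music

# Toy frame with three frequency bins.
mixture = [2 + 0j, 4 + 0j, 1 + 1j]
speech_psd = [3.0, 1.0, 0.0]   # assumed to come from an earlier estimation step
music_psd = [1.0, 3.0, 2.0]
speech_est, music_est = separate(mixture, speech_psd, music_psd)
```

By construction the two estimates sum back to the mixture in every bin, and a bin in which one source has no estimated power is assigned entirely to the other source.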
Combining the Likelihood and the Kullback-Leibler Distance in Estimating the Universal Background Model for Speaker Verification Using SVM
Pub Date : 2010-10-07 DOI: 10.1109/ICPR.2010.1106
Zhenchun Lei
The state-of-the-art methods for speaker verification are based on the support vector machine. The Gaussian supervector SVM is a typical method, which uses the Gaussian mixture model (GMM) to create “feature vectors” for the discriminative SVM. All GMMs are adapted from the same universal background model (UBM), which is obtained by maximum likelihood estimation on a large number of data sets. The UBM should therefore cover the feature space as widely as possible. We propose a new method to estimate the parameters of the UBM by combining the likelihood and the Kullback-Leibler distances in the UBM. The aim is to find model parameters that yield a high likelihood while keeping the Gaussian distributions dispersed so that they cover a large part of the feature space. Experiments on the NIST 2001 task show that our method noticeably improves performance.
Citations: 0
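To make the Kullback-Leibler term in the speaker-verification abstract above concrete, the sketch below computes the closed-form KL divergence between two Gaussian components. It assumes diagonal covariances and uses the symmetrised sum of both directions as the "distance"; the abstract confirms neither detail, and how the distance is weighed against the likelihood is not shown.

```python
from math import log

def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for Gaussians with diagonal covariances, summed over dimensions."""
    return 0.5 * sum(log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
                     for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q))

def symmetric_kl(mu_p, var_p, mu_q, var_q):
    """Symmetrised divergence: KL(p || q) + KL(q || p)."""
    return (kl_diag_gauss(mu_p, var_p, mu_q, var_q)
            + kl_diag_gauss(mu_q, var_q, mu_p, var_p))

# Two unit-variance components whose means differ by 2 along the first dimension.
d_same = symmetric_kl([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])
d_apart = symmetric_kl([0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [1.0, 1.0])
```

Identical components give a distance of zero, and the distance grows as the means move apart, so a term that rewards large pairwise distances pushes the components to spread over the feature space.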