
Latest publications: 2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)

Algorithm and architecture for quarter pixel motion estimation for H.264/AVC
S. K. Chatterjee, I. Chakrabarti
This paper proposes a fast algorithm and its VLSI architecture for quarter pixel (QP) accurate motion estimation (ME). The proposed algorithm is based on the distribution of the QP motion vectors (MVs) around the half pixel MV: it efficiently explores the most likely QP locations and skips the unlikely ones. The number of QP search locations is thereby reduced by 50% compared to the original full search method, at a cost of only about 0.12 dB of peak signal to noise ratio degradation. The VLSI architecture of the proposed algorithm can theoretically process thirty-three 1280×720 HDTV frames per second, and its power consumption is reduced by 15% compared to a recently reported architecture.
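The idea of refining a half-pel MV by testing only a subset of the quarter-pel neighbours can be illustrated generically. A minimal sketch, with a toy quadratic cost surface in place of real sub-pixel interpolation and SAD matching, and with an illustrative reduced pattern (the paper derives its own subset from the MV distribution):

```python
def refine_quarter_pel(cost, half_pel_mv, candidates):
    """Pick the best MV among quarter-pel candidate offsets around a
    half-pel MV. `cost` maps an MV (in quarter-pel units) to a matching
    cost; a fast scheme simply passes fewer candidates than full search."""
    best_mv, best_cost = half_pel_mv, cost(half_pel_mv)
    for dx, dy in candidates:
        mv = (half_pel_mv[0] + dx, half_pel_mv[1] + dy)
        c = cost(mv)
        if c < best_cost:
            best_mv, best_cost = mv, c
    return best_mv, best_cost

# Toy cost surface: a bowl centred at MV (3, -2) in quarter-pel units.
true_mv = (3, -2)
cost = lambda mv: (mv[0] - true_mv[0]) ** 2 + (mv[1] - true_mv[1]) ** 2

full = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
# Illustrative reduced pattern (the four axial neighbours) mimicking
# the skipping of unlikely locations.
reduced = [(-1, 0), (1, 0), (0, -1), (0, 1)]

mv_full, _ = refine_quarter_pel(cost, (2, -2), full)
mv_fast, _ = refine_quarter_pel(cost, (2, -2), reduced)
```

On this toy surface the reduced pattern checks half as many locations yet lands on the same MV as full search, which is the trade-off the abstract quantifies (50% fewer locations, ~0.12 dB loss).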
Citations: 0
Fast registration of articulated objects from depth images
Sourabh Prajapati, P J Narayanan
We present an approach for fast registration of a global articulated 3D model to RGBD data from a Kinect. Our approach uses geometry-based matching of the rigid parts of articulated objects in depth images. The registration is performed independently for each segment in a parametric space of transformations, which greatly reduces the time needed to register each frame with the global model. We evaluated the algorithm on several articulated-object datasets and obtained significantly lower execution times than the ICP algorithm applied to each rigid part of the articulated object.
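The per-segment rigid fits that such methods (and the ICP baseline) rely on reduce to the classical least-squares rigid-alignment problem. A minimal sketch of the Kabsch/SVD solution on synthetic correspondences (this is standard background, not the paper's parametric-space formulation):

```python
import numpy as np

def rigid_align(P, Q):
    """Least-squares rigid transform (R, t) mapping point set P onto Q
    (Kabsch algorithm). One such closed-form fit per rigid segment is
    the usual building block of articulated registration."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                 # cross-covariance of centred sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t

# Toy check: recover a known rotation about z and a translation.
rng = np.random.default_rng(0)
P = rng.standard_normal((50, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -1.0, 2.0])
Q = P @ R_true.T + t_true                     # q_i = R_true p_i + t_true

R, t = rigid_align(P, Q)
```

With exact correspondences the recovered `(R, t)` matches the ground truth; with noisy depth data the same closed form gives the best rigid fit in the least-squares sense.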
Citations: 1
Fynbos leaf online plant recognition application
S. Winberg, S. Katz, A. Mishra
Computer-aided plant identification combines computer vision and pattern recognition. The Cape Floristic Kingdom is the most varied of the plant kingdoms, comprising thousands of species of fynbos plants. While it is easier to classify fynbos when they are flowering, most species flower for only a few weeks each year. This paper concerns an image processing application for the automatic identification of certain fynbos from leaf photographs. The architecture of the application is overviewed before focusing on the leaf recognition operations and how these were validated through a series of experiments, culminating in a comprehensive test measuring identification accuracy, the effectiveness of the online user interface, and processing speed. Our conclusions reflect on the overall effectiveness of the application and our plans to take it further.
Citations: 3
Spatial variance of color and boundary statistics for salient object detection
Sudeshna Roy, Sukhendu Das
Bottom-up saliency detection algorithms identify distinct regions in an image, where local feature distributions occur rarely. Notable recent works use local and global contrast, spectral analysis of the entire image, or graph-based feature mapping. In contrast, we propose a novel unsupervised method that uses color compactness and statistical modeling of background cues to segment the salient foreground region and thus the salient object. At the first stage of processing, the image is segmented into clusters using color features. The first component of our saliency measure combines disparity in color with the spatial distance between patches. In addition to feature rarity, we propose a second component for saliency computation that estimates the divergence of a patch's color from those of the patches at the image boundary, which represent the background. Combining these two complementary components yields a much improved saliency map for salient object detection. We verify the performance of the proposed method on two popular benchmark datasets with one or more salient regions and diverse saliency characteristics. Experimental results show that our method outperforms many existing state-of-the-art methods.
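The two cues named in the abstract — color compactness and divergence from boundary (background) patches — can be sketched on toy data. A heavily simplified, hypothetical version with single-channel patch colors and Euclidean distances (the paper's actual formulation and statistical background model are not reproduced here):

```python
import numpy as np

def saliency_scores(colors, positions, boundary_idx):
    """Toy per-patch saliency combining (i) spatial compactness of
    similar colours and (ii) divergence of a patch's colour from the
    boundary patches, which are assumed to be background."""
    # Colour-similarity weights between all patch pairs, row-normalised.
    w = np.exp(-(colors[:, None] - colors[None, :]) ** 2)
    w /= w.sum(axis=1, keepdims=True)
    # Colour-weighted mean position and spatial spread of each colour.
    centroid = w @ positions
    spread = (w * np.linalg.norm(positions[None, :, :] - centroid[:, None, :],
                                 axis=2)).sum(axis=1)
    compactness = 1.0 / (1.0 + spread)        # compact colour -> salient
    bg = colors[boundary_idx].mean()          # background colour statistic
    boundary_div = np.abs(colors - bg)        # far from background -> salient
    return compactness * boundary_div

# Five patches: three dark ones on the boundary, two bright compact ones.
colors = np.array([0.0, 0.1, 0.05, 1.0, 1.0])
positions = np.array([[0.0, 0.0], [0.0, 4.0], [4.0, 0.0],
                      [2.0, 2.0], [2.0, 2.5]])
s = saliency_scores(colors, positions, [0, 1, 2])
```

The bright, spatially compact patches (indices 3 and 4) score highest, illustrating how the two complementary cues reinforce each other.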
Citations: 2
Virtual garment simulation
Sailik Sengupta, P. Chaudhuri
In this paper we present a system for creating 3D garments from 2D patterns. Once the patterns are placed over a 3D character, our system quickly stitches them into a 3D garment. The stitched cloth is then simulated to obtain the drape of the garment over the character. Our system can accurately and efficiently resolve cloth-body and cloth-cloth collisions.
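The paper does not detail its simulator; as generic background, the core of many cloth simulators is a mass-spring time step. A minimal explicit-Euler sketch under that assumption (no collision handling, which is the part the paper actually emphasises):

```python
import numpy as np

def step_cloth(x, v, springs, rest, k=50.0, m=1.0, g=9.8, dt=0.01):
    """One explicit-Euler step of a toy mass-spring cloth: Hooke spring
    forces plus gravity. x, v: (n, 2) positions/velocities; springs:
    list of (i, j) index pairs; rest: matching rest lengths."""
    f = np.zeros_like(x)
    f[:, 1] -= m * g                       # gravity on every particle
    for (i, j), L in zip(springs, rest):
        d = x[j] - x[i]
        dist = np.linalg.norm(d)           # assumed nonzero in this sketch
        fs = k * (dist - L) * d / dist     # Hooke's law along the spring
        f[i] += fs
        f[j] -= fs
    v = v + dt * f / m
    x = x + dt * v
    return x, v

# Two particles joined by a stretched spring, gravity off: they attract.
x = np.array([[0.0, 0.0], [2.0, 0.0]])
v = np.zeros_like(x)
x, v = step_cloth(x, v, springs=[(0, 1)], rest=[1.0], g=0.0)
```

Production simulators replace explicit Euler with implicit or semi-implicit integrators for stability, and spend most of their effort on exactly the cloth-body and cloth-cloth collision handling this sketch omits.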
Citations: 8
Cross-domain clustering performed by transfer of knowledge across domains
Suranjana Samanta, T. Selvan, Sukhendu Das
In this paper, we propose a method to improve the results of clustering in a target domain using significant information from an auxiliary (source) domain dataset. This method belongs to the field of transfer learning (or domain adaptation), where the performance of a task in one domain (say, classification using clustering) is improved using knowledge obtained from a similar domain. We propose two unsupervised methods of cross-domain clustering and show results on two different categories of benchmark datasets, both exhibiting differences in density distribution across the pair of domains. In the first method, we propose an iterative framework in which the clustering in the target domain is influenced by the clusters formed in the source domain and vice versa. Similarity/dissimilarity measures for cross-domain clustering are appropriately formulated using the Euclidean distance and the Bregman divergence. In the second method, we perform clustering in the target domain by estimating local density with a non-parametric (NP) density estimator (owing to the small number of samples). Prior to clustering, the NP-density scattering in the target domain is modified using information about the cluster density distribution in the source domain. Results on real-world datasets suggest that the proposed methods of cross-domain clustering are comparable to recent state-of-the-art work.
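The Bregman divergence mentioned alongside the Euclidean distance is a generic family: D_phi(x, y) = phi(x) − phi(y) − ⟨∇phi(y), x − y⟩. A minimal sketch (not the paper's specific measure) showing that squared Euclidean distance and KL divergence are both special cases, which is what makes it a natural similarity measure for clustering:

```python
import numpy as np

def bregman_divergence(phi, grad_phi, x, y):
    """Generic Bregman divergence D_phi(x, y)."""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

# phi(x) = ||x||^2 recovers the squared Euclidean distance.
sq = lambda x: np.dot(x, x)
grad_sq = lambda x: 2 * x
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
d_euc = bregman_divergence(sq, grad_sq, x, y)   # equals ||x - y||^2 = 13

# phi(p) = sum p log p (negative entropy) recovers the KL divergence
# for probability vectors.
neg_entropy = lambda p: np.sum(p * np.log(p))
grad_neg_entropy = lambda p: np.log(p) + 1
p = np.array([0.2, 0.8])
q = np.array([0.5, 0.5])
d_kl = bregman_divergence(neg_entropy, grad_neg_entropy, p, q)
```

Choosing phi tailors the notion of "distance" to the data (e.g. KL for distribution-like features), without changing the surrounding clustering machinery.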
Citations: 11
Feature preserving anisotropic diffusion for image restoration
V. B. Surya Prasath, J. Moreno
Anisotropic diffusion based schemes are widely used in image smoothing and noise removal. Typically, the partial differential equation (PDE) used is based on computing image gradients or an isotropically smoothed version of the gradient image. To improve the denoising capability of such nonlinear anisotropic diffusion schemes, we introduce a multi-direction based discretization along with a selection strategy for choosing the best direction at possible edge pixels. This strategy avoids the directionality bias that can over-smooth features not aligned with the coordinate axes. The proposed hybrid discretization scheme helps preserve the multi-scale features present in images via selective smoothing of the PDE. Experimental results indicate that this adaptive modification provides improved restoration results on noisy images.
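The gradient-based, axis-aligned scheme the abstract improves on is the classical Perona-Malik diffusion: diffuse strongly in flat regions and weakly across strong edges. A minimal sketch of that baseline (not the paper's multi-direction discretization), on a noisy step image:

```python
import numpy as np

def perona_malik(img, n_iter=20, kappa=0.1, dt=0.2):
    """Classical Perona-Malik anisotropic diffusion with four
    axis-aligned neighbour differences and an exponential
    edge-stopping function."""
    u = img.astype(float).copy()
    g = lambda d: np.exp(-(d / kappa) ** 2)   # ~1 in flat areas, ~0 at edges
    for _ in range(n_iter):
        dn = np.roll(u, -1, axis=0) - u       # differences to 4 neighbours
        ds = np.roll(u,  1, axis=0) - u       # (np.roll wraps at borders)
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u,  1, axis=1) - u
        u += dt * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
    return u

# Noisy two-level step: noise is smoothed, the step edge survives.
rng = np.random.default_rng(1)
img = np.zeros((16, 16))
img[:, 8:] = 1.0
img += 0.05 * rng.standard_normal((16, 16))
out = perona_malik(img)
```

The axis-aligned differences are exactly where the directionality bias comes from: a diagonal edge sees partial smoothing in both axis directions, which is what the paper's direction-selection strategy targets.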
Citations: 6
Classification of hardwood species using ANN classifier
Arvind R. Yadav, M. Dewal, R. S. Anand, Sangeeta Gupta
In this paper, an approach for classifying different hardwood species from an open-access database, using texture feature extraction and supervised machine learning, has been implemented. The edges of the complex cellular structure in microscopic images of hardwood are enhanced by applying a Gabor filter, and the Gray Level Co-occurrence Matrix (GLCM) is revalidated as an effective texture feature extraction technique. In all, 44 features have been extracted from the GLCM and normalized to the range [0.1, 1]. A multilayer perceptron backpropagation artificial neural network was used for classification. Experiments conducted on 25 wood species with the Levenberg-Marquardt backpropagation training function resulted in recognition accuracies of about 88.60% and 92.60% for two different training/validation/testing splits (70%/15%/15% and 80%/10%/10%, respectively). The proposed methodology can be extended with optimized machine learning techniques for online identification of wood.
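A GLCM counts how often pairs of gray levels co-occur at a fixed pixel offset; Haralick-style statistics of that table are the texture features. A minimal sketch with two of the classic features (the paper extracts 44; offsets here are restricted to non-negative `dx`, `dy` for brevity):

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0):
    """Gray-level co-occurrence matrix for one offset (dx, dy >= 0),
    normalised to a joint probability table. img must hold integer
    gray levels in [0, levels)."""
    m = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m / m.sum()

def haralick_subset(p):
    """Two classic GLCM texture features: contrast and energy."""
    i, j = np.indices(p.shape)
    contrast = ((i - j) ** 2 * p).sum()
    energy = (p ** 2).sum()
    return contrast, energy

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [2, 2, 3, 3],
                [2, 2, 3, 3]])
contrast, energy = haralick_subset(glcm(img, levels=4))
```

For this blocky test image the horizontal GLCM has six equally likely level pairs, giving contrast 1/3 and energy 1/6; feeding such features (after Gabor enhancement and normalization, as in the paper) to an MLP is the described pipeline.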
Citations: 39
Rate-invariant comparisons of covariance paths for visual speech recognition
Jingyong Su, Anuj Srivastava, F. Souza, Sudeep Sarkar
An important problem in speech recognition, and in activity recognition generally, is to develop analyses that are invariant to execution rate. We introduce a theoretical framework that provides a parametrization-invariant metric for comparing parametrized paths on Riemannian manifolds. Treating instances of activities as parametrized paths on a Riemannian manifold of covariance matrices, we apply this framework to the problem of visual speech recognition from image sequences. We represent each sequence as a path in the space of covariance matrices, each covariance matrix capturing the spatial variability of visual features in a frame, and perform simultaneous pairwise temporal alignment and comparison of paths. This removes the temporal variability and helps provide a robust metric for visual speech classification. We evaluated this idea on the OuluVS database, where the rank-1 nearest neighbor classification rate improves from 32% to 57% due to temporal alignment.
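The rate-invariance idea — align two sequences in time before comparing them — is most simply illustrated by dynamic time warping. A generic scalar-sequence sketch (the paper instead aligns paths on a Riemannian manifold of covariance matrices, but `dist` below could be any pairwise distance):

```python
import numpy as np

def dtw(a, b, dist):
    """Dynamic time warping cost between sequences a and b under a
    pairwise distance `dist`: the minimal cumulative cost over all
    monotone time alignments."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dist(a[i - 1], b[j - 1])
            D[i, j] = c + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

slow = [0, 0, 1, 1, 2, 2, 3, 3]   # the same "activity" at half speed
fast = [0, 1, 2, 3]
euc = lambda x, y: abs(x - y)
d_aligned = dtw(slow, fast, euc)
```

The half-speed and full-speed sequences have DTW cost 0 despite different lengths, which is exactly the execution-rate invariance the abstract asks of its metric (there achieved with a parametrization-invariant formulation rather than plain DTW).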
Citations: 0
Design of Manipuri Keywords Spotting System using HMM
Laishram Rahul, Salam Nandakishor, L. J. Singh, S. K. Dutta
This paper discusses the implementation of a phoneme-based Manipuri Keyword Spotting System (MKWSS). Manipuri is a scheduled Indian language of Tibeto-Burman origin. Around 5 hours of read speech were collected from 4 male and 6 female speakers to develop the MKWSS database. The symbols of the International Phonetic Alphabet (IPA, revised in 2005) were used to transcribe the data. A five-state left-to-right Hidden Markov Model (HMM), with a 32-mixture continuous-density diagonal-covariance Gaussian Mixture Model (GMM) per state, is used to build a model for each phonetic unit. We used the HMM Toolkit (HTK), version 3.4, to model the system. The system can recognize 29 phonemes and a non-speech event (silence), and detects keywords formed from these phonemes. Continuous speech data were collected from 5 males and 8 females to analyse the performance of the system, which depends on its ability to detect the keywords. An overall performance of 65.24% is obtained from the phoneme-based MKWSS.
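At the core of HMM-based spotting is scoring an observation sequence against each keyword model via the forward algorithm and picking the best. A minimal sketch with discrete emissions and a toy two-state left-to-right model (the paper's HTK system uses 32-component GMM emissions over acoustic features instead):

```python
import numpy as np

def forward_log_likelihood(pi, A, B, obs):
    """Scaled forward algorithm: log-likelihood of a discrete
    observation sequence under an HMM (pi: initial state probs,
    A[i, j]: transition probs, B[state, symbol]: emission probs)."""
    alpha = pi * B[:, obs[0]]
    s = alpha.sum()
    log_lik = np.log(s)
    alpha = alpha / s                  # rescale to avoid underflow
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        s = alpha.sum()
        log_lik += np.log(s)
        alpha = alpha / s
    return log_lik

pi = np.array([1.0, 0.0])
A = np.array([[0.5, 0.5],              # left-to-right: no return to state 0
              [0.0, 1.0]])
B = np.array([[0.9, 0.1],              # state 0 mostly emits symbol 0
              [0.1, 0.9]])             # state 1 mostly emits symbol 1
score_match = forward_log_likelihood(pi, A, B, [0, 0, 1, 1])
score_other = forward_log_likelihood(pi, A, B, [1, 1, 0, 0])
```

A sequence matching the model's left-to-right structure scores higher than one that violates it; a spotter runs such scores for every keyword model (plus a filler model) over the input stream.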
Citations: 13