2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG): Latest Papers

Object boundary detection using Rough Set Theory
Ashish Phophalia, S. Mitra, Ajit Rajwade
This paper proposes a Rough Set Theory based closed-form object boundary detection method. Most edge detection methods fail to produce a closed boundary for objects of arbitrary shape in an image. Active contour based methods can recover such boundaries, and the Multiphase Chan-Vese Active Contour Method is one of the most popular of these techniques. However, it is constrained by the number of objects present in the image. Granular processing using the Rough Set method overcomes this constraint and provides a closed curve around the boundary of the objects. This information can further be used to select similar patches for image processing problems such as image denoising, image super-resolution, and image segmentation. The proposed boundary detection method has also been tested in the presence of noise. Experimental results are shown on a synthetic image as well as on MRI of the human brain. The performance of the proposed method is found to be encouraging.
DOI: https://doi.org/10.1109/NCVPRIPG.2013.6776259 (published 2013-12-01)
Citations: 3
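The abstract does not spell out the granulation scheme, so the following is only a minimal sketch of the general rough-set idea, assuming fixed square granules over a binary object mask. The lower approximation collects granules that lie entirely inside the object, the upper approximation collects granules that touch it, and the boundary region is their difference. The function name `rough_boundary`, the granule size, and the test mask are illustrative and not taken from the paper.

```python
def rough_boundary(mask, g):
    """Return the granule indices (row, col) that form the rough-set boundary region.

    mask: list of lists of 0/1 (1 = object pixel); g: granule side length (assumed).
    """
    rows, cols = len(mask), len(mask[0])
    lower, upper = set(), set()
    for r0 in range(0, rows, g):
        for c0 in range(0, cols, g):
            cells = [mask[r][c]
                     for r in range(r0, min(r0 + g, rows))
                     for c in range(c0, min(c0 + g, cols))]
            key = (r0 // g, c0 // g)
            if all(cells):
                lower.add(key)   # granule certainly inside the object
            if any(cells):
                upper.add(key)   # granule possibly inside the object
    return upper - lower         # boundary region = upper minus lower


# Hypothetical 6x6 image: the object occupies rows 1..4 and columns 1..4.
mask = [[1 if 1 <= r <= 4 and 1 <= c <= 4 else 0 for c in range(6)]
        for r in range(6)]
boundary = rough_boundary(mask, 2)
```

On this toy mask the boundary granules form a closed ring around the single interior granule, which is the property the abstract emphasizes; how the paper refines granules into a pixel-level curve is not covered here.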
Complete visual metrology using relative affine structure
Adersh Miglani, Sumantra Dutta Roy, S. Chaudhury, J. B. Srivastava
We propose a framework for retrieving metric information for repeated objects from a single perspective image. Relative affine structure, which is an invariant, is directly proportional to the Euclidean distance of a three-dimensional point from a reference plane. The proposed method is based on this fundamental concept. The first object undergoes a 4 × 4 transformation, resulting in a repeated object. We represent this transformation in terms of three relative affine structures along the X, Y and Z axes. Additionally, we propose a possible extension of this framework to motion analysis: structure from motion and motion segmentation.
DOI: https://doi.org/10.1109/NCVPRIPG.2013.6776265 (published 2013-12-01)
Citations: 1
Cursive stroke sequencing for handwritten text documents recognition
S. Panwar, N. Nain
Text segmentation can be defined as the process of splitting the image of a handwritten text document into pieces corresponding to single lines, words and characters. This is a very challenging task because curved text lines with different skew and slant angles appear frequently in handwritten documents. After segmentation of words or strokes, also defined as finding the connected components in a handwritten text document, the strokes have to be sequenced according to the document so that its meaning is preserved. In this paper, we use a bottom-up grouping approach for segmentation. We use a novel connectivity strength parameter with a depth-first search approach to extract the connected components of the same line from the complete set of connected components of the given document. The exact sequence of connected components is stored in a sequential vector that contains the labels of the components. The proposed cursive stroke sequencing technique is implemented and tested on the benchmark IAM database, providing encouraging results. Quantitative analysis also shows that this approach gives better results than existing segmentation techniques and overcomes the problems encountered with hill-and-dale writing styles and with overlapping and touching lines. The accuracy of the proposed sequencing technique is 98%.
DOI: https://doi.org/10.1109/NCVPRIPG.2013.6776232 (published 2013-12-01)
Citations: 1
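The abstract does not define the connectivity strength parameter, so the sketch below only illustrates the generic building blocks it mentions: depth-first extraction of connected components from a binary image, followed by storing them in sequence. The left-to-right ordering rule and the function names are assumptions made for illustration, not the paper's sequencing method.

```python
def connected_components(img):
    """Collect 8-connected foreground components with an iterative depth-first search."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] and not seen[r][c]:
                stack, pixels = [(r, c)], []
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    pixels.append((y, x))
                    # Visit the 8 neighbours (the centre is skipped because it is already seen).
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                stack.append((ny, nx))
                components.append(pixels)
    return components


def sequence_left_to_right(components):
    """Order components by leftmost column (assumed stand-in for the paper's rule)."""
    return sorted(components, key=lambda px: min(x for _, x in px))


# Hypothetical 3x7 binary image containing two strokes.
img = [
    [0, 0, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 0],
]
ordered = sequence_left_to_right(connected_components(img))
sequence = [min(x for _, x in px) for px in ordered]
```

The raster scan discovers the right-hand stroke first, and the sorting step restores reading order, which is the role the sequential vector plays in the abstract.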
Parallel mesh regularization and resampling algorithm for improved mesh registration
Sumandeep Banerjee, Somnath Dutta, P. Biswas, Partha Bhowmick
In this paper, we present a fast and efficient algorithm for regularization and resampling of triangular meshes generated by 3D reconstruction methods such as stereoscopy and laser scanning. We also present a scheme for efficient parallel implementation of the proposed algorithm and the time gain obtained as the number of processor cores increases.
DOI: https://doi.org/10.1109/NCVPRIPG.2013.6776183 (published 2013-12-01)
Citations: 0
Augmented paper system: A framework for User's Personalized Workspace
Kavita Bhardwaj, S. Chaudhury, Sumantra Dutta Roy
In this paper, we present a framework for a “User's Personalized Workspace” that augments the physical paper and the digital document. Paper-based interactions are seamlessly integrated with digital document based interactions for reading as an activity. For instance, when a user is involved in a reading activity, writing becomes complementary. In an academic system, the paper-based presentation mode has facilitated such exercises. In addition to rendering the annotation on the digital document and storing it in the database, the content of the paper that is encircled or underlined is used to hyperlink the document. Synchronizing a physical paper and its digital version seamlessly from the user's perspective is the main objective of this work. We also compare the proposed system with existing systems that focus on one activity or the other.
DOI: https://doi.org/10.1109/NCVPRIPG.2013.6776182 (published 2013-12-01)
Citations: 4
STAR: A Content Based Video Retrieval system for moving camera video shots
C. Chattopadhyay, Sukhendu Das
This paper presents the design of STAR (Spatio-Temporal Analysis and Retrieval), an unsupervised Content Based Video Retrieval (CBVR) system. STAR's key insight and primary contribution is that it models video content using a joint spatio-temporal feature representation and retrieves videos from the database that have similar moving objects and motion trajectories. Foreground moving blobs from a moving camera video shot are extracted, along with a trajectory for camera motion compensation, to form the space-time volume (STV). The STV is processed to obtain the EMST-CSS representation, which can discriminate across different categories of videos. The performance of STAR has been evaluated qualitatively and quantitatively using the precision-recall metric on benchmark video datasets with unconstrained video shots, to demonstrate its efficiency.
DOI: https://doi.org/10.1109/NCVPRIPG.2013.6776267 (published 2013-12-01)
Citations: 11
Digital image tampering detection and localization using singular value decomposition technique
V. Mall, A. Roy, S. Mitra
Recent years have witnessed exponential growth in the use of digital images due to the development of high-quality digital cameras and multimedia technology. The easy availability of image editing software has made digital image processing very popular. Ready-to-use software is available on the internet and can easily be used to manipulate images. In such an environment, the integrity of an image cannot be taken for granted. Malicious tampering has serious implications for legal documents, copyright issues and forensic cases. Researchers have come forward with a large number of methods to detect image tampering. The proposed method is based on a hash generation technique using singular value decomposition. The design of an efficient hash vector, as proposed, helps in the detection and localization of image tampering. The proposed method is shown to be robust against content-preserving manipulation but extremely sensitive to even very minute structural tampering.
DOI: https://doi.org/10.1109/NCVPRIPG.2013.6776160 (published 2013-12-01)
Citations: 6
Monitoring a large surveillance space through distributed face matching
Richa Mishra, Prasanna Kumar, S. Chaudhury, I. Sreedevi
Large spaces with many cameras require huge storage and computational power to process the captured data for surveillance applications. In this paper we propose a face detection and recognition system based on distributed cameras and distributed processing, which can generate information for finding spatiotemporal movement patterns of individuals over a large monitored space. The system is built upon the Hadoop Distributed File System using the MapReduce programming model. A novel key generation scheme using a distance-based hashing technique is used to distribute the face matching task. Experimental results have established the effectiveness of the technique.
DOI: https://doi.org/10.1109/NCVPRIPG.2013.6776185 (published 2013-12-01)
Citations: 14
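The abstract names a distance-based hashing key generation scheme without giving its details. As a rough sketch only, the code below assumes the common pivot-pair form of distance-based hashing: a feature vector is projected onto the line through two pivot vectors using distances alone, the projection is thresholded to one bit, and the bits are concatenated into a key. Vectors sharing a key land in the same bucket, much as a map step would emit them for a reducer. The pivots, thresholds, 2-D features, and function names are all made up for illustration.

```python
import math
from collections import defaultdict


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def projection(x, p1, p2):
    """Project x onto the line through pivots p1 and p2 using distances only."""
    d12 = dist(p1, p2)
    return (dist(x, p1) ** 2 + d12 ** 2 - dist(x, p2) ** 2) / (2 * d12)


def dbh_key(x, pivot_pairs, thresholds):
    """Concatenate one bit per pivot pair into a string key."""
    return "".join(
        "1" if projection(x, p1, p2) >= t else "0"
        for (p1, p2), t in zip(pivot_pairs, thresholds)
    )


# Hypothetical 2-D "face features"; real descriptors would be much longer.
pivot_pairs = [((0.0, 0.0), (10.0, 0.0)), ((0.0, 0.0), (0.0, 10.0))]
thresholds = [5.0, 5.0]
faces = {"a": (1.0, 1.0), "b": (2.0, 1.5), "c": (9.0, 8.0)}

buckets = defaultdict(list)  # key -> face ids, as a map step might emit them
for face_id, vec in faces.items():
    buckets[dbh_key(vec, pivot_pairs, thresholds)].append(face_id)
```

Because nearby vectors tend to share a key, each bucket can be matched independently, which is what makes the face matching task easy to distribute.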
Word recognition in natural scene and video images using Hidden Markov Model
Sangheeta Roy, P. Roy, P. Shivakumara, U. Pal
Text recognition in natural scenes and video is challenging compared to recognition in scanned document images. This is due to variations in text style, font, font size, background, and so on across different sources. There are approaches that segment words from video and scene images and feed the word images into OCR engines. Nevertheless, such methods often fail to yield satisfactory recognition results. Therefore, in this paper, we propose to combine a Hidden Markov Model (HMM) and a Convolutional Neural Network (CNN) to achieve a good recognition rate. Sequential gradient features with the HMM help to find the character alignment of a word. The character alignments are then verified by the CNN. The approach is tested on both video and scene data to show its effectiveness. The results are found to be encouraging.
DOI: https://doi.org/10.1109/NCVPRIPG.2013.6776157 (published 2013-12-01)
Citations: 7
Temporally scalable compression of animation geometry
Sanjib Das, H. ShahJaimeen, P. Bora
Animation geometry compression involves compressing the geometry data of dynamic three-dimensional (3D) triangular meshes representing the animation frames. The scalability issue of geometry compression addresses compressing the geometry in a single scale and decompressing it in multiple scales. One of the algorithms for animation geometry compression employs skinning-based motion prediction of vertices and the temporal wavelet transform (TWT) on the prediction errors. This paper presents an encoder and a decoder structure for achieving a temporally scalable implementation of the algorithm. The frame-wise prediction errors due to motion-based clustering of a group of affine-transformed vertices are converted into a layered structure of the frames using the TWT. The affine transformation data of vertices, the weights corresponding to each cluster of vertices and the wavelet coefficients of the prediction errors are quantized and encoded using entropy coding. The resulting bit-stream is arranged in a layered structure to achieve temporal scalability. The base layer consists of the connectivity-coded first frame, indices of the clusters of vertices, weights corresponding to each cluster of a vertex, the approximation sub-band of the prediction error and the affine transformations corresponding to the approximation frames. The enhancement layers consist of the detailed sub-bands of the prediction error and the affine transformations corresponding to the detailed frames. The scalable encoder and decoder are tested on some standard animation sequences, and the experimental results show good performance in terms of scalable rates and distortions.
DOI: https://doi.org/10.1109/NCVPRIPG.2013.6776263 (published 2013-12-01)
Citations: 0
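To make the base layer and enhancement layer split concrete, here is a minimal sketch of a one-level temporal wavelet transform. The abstract does not name the wavelet filter, so an unnormalized (averaging) Haar filter is assumed, and a single scalar prediction error per frame stands in for the per-vertex error vectors. The function names and sample values are hypothetical.

```python
def haar_forward(frames):
    """One-level temporal Haar transform over an even-length sequence of values."""
    approx = [(frames[i] + frames[i + 1]) / 2 for i in range(0, len(frames), 2)]
    detail = [(frames[i] - frames[i + 1]) / 2 for i in range(0, len(frames), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the full-rate sequence from the approximation and detail sub-bands."""
    frames = []
    for a, d in zip(approx, detail):
        frames.extend([a + d, a - d])
    return frames


# Hypothetical per-frame prediction errors for one vertex coordinate.
errors = [4.0, 2.0, 6.0, 6.0, 1.0, 3.0]
base, enhancement = haar_forward(errors)

half_rate = base                             # decode the base layer only
full_rate = haar_inverse(base, enhancement)  # base layer plus enhancement layer
```

A decoder that receives only the approximation sub-band reconstructs the sequence at half the frame rate, while adding the detail sub-band restores every frame; applying the transform again to `base` would give further temporal layers.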