
Latest publications: 2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)

Complete visual metrology using relative affine structure
Adersh Miglani, Sumantra Dutta Roy, S. Chaudhury, J. B. Srivastava
We propose a framework for retrieving metric information for repeated objects from a single perspective image. Relative affine structure, an invariant, is directly proportional to the Euclidean distance of a three-dimensional point from a reference plane. The proposed method is based on this fundamental concept. The first object undergoes a 4 × 4 transformation that yields the repeated object. We represent this transformation in terms of three relative affine structures along the X, Y and Z axes. Additionally, we propose a possible extension of this framework to motion analysis: structure from motion and motion segmentation.
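As a minimal illustration of the geometric quantity involved (not the authors' implementation), the Euclidean distance of a 3D point from a reference plane, the quantity the relative affine structure is proportional to, can be sketched as:

```python
import math

def point_plane_distance(p, n, d):
    """Signed Euclidean distance of 3D point p from the plane n . x + d = 0."""
    dot = sum(ni * pi for ni, pi in zip(n, p))
    return (dot + d) / math.sqrt(sum(ni * ni for ni in n))

# A point 3 units above the reference plane z = 0 (n = (0,0,1), d = 0).
print(point_plane_distance((1.0, 2.0, 3.0), (0.0, 0.0, 1.0), 0.0))  # 3.0
```

Doubling the point's height doubles the distance, which is the proportionality the abstract relies on.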
doi: 10.1109/NCVPRIPG.2013.6776265
Cited by: 1
Adaptive BPSO based feature selection and skin detection based background removal for enhanced face recognition
Mayukh Sattiraju, Vikram Manikandan M, K. Manikantan, S. Ramachandran
Face recognition under varying background and pose is challenging, and extracting background- and pose-invariant features is an effective way to address this problem. This paper proposes a skin detection-based approach for enhancing the performance of a Face Recognition (FR) system, employing a unique combination of skin-based background removal, the Discrete Wavelet Transform (DWT), Adaptive Multi-Level Threshold Binary Particle Swarm Optimization (ABPSO) and an Error Control Feedback (ECF) loop. Skin-based background removal efficiently removes the background, and an ABPSO-based feature-selection algorithm searches the feature space for the optimal feature subset. The ECF loop neutralizes pose variations. Experimental results obtained by applying the proposed algorithm to the Color FERET and CMU PIE face databases show that the proposed system outperforms other FR systems, with a significant increase in the recognition rate and a substantial reduction in the number of features.
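A minimal sketch of plain binary PSO for feature-subset selection, to illustrate the idea only: the paper's adaptive multi-level threshold variant, the DWT features and the real fitness function are not reproduced here, and the toy fitness below (sum of selected scores minus a per-feature penalty) is an assumption.

```python
import math
import random

def binary_pso(scores, penalty=0.5, n_particles=10, iters=50, seed=0):
    """Toy binary PSO: pick the feature subset maximizing
    sum(selected scores) - penalty * (number of selected features)."""
    rng = random.Random(seed)
    n = len(scores)

    def fitness(bits):
        return sum(s for s, b in zip(scores, bits) if b) - penalty * sum(bits)

    pos = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    gbest = max(pbest, key=fitness)[:]
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(n):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] += 2.0 * r1 * (pbest[i][j] - pos[i][j]) \
                           + 2.0 * r2 * (gbest[j] - pos[i][j])
                vel[i][j] = max(-6.0, min(6.0, vel[i][j]))  # clamp velocity
                # Sigmoid maps velocity to the probability of selecting feature j.
                pos[i][j] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-vel[i][j])) else 0
            if fitness(pos[i]) > fitness(pbest[i]):
                pbest[i] = pos[i][:]
                if fitness(pbest[i]) > fitness(gbest):
                    gbest = pbest[i][:]
    return gbest

print(binary_pso([0.9, 0.1, 0.8, 0.05]))  # high-scoring features tend to be kept
```

The sigmoid update is the standard trick that turns continuous PSO velocities into bit-flip probabilities for feature selection.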
doi: 10.1109/NCVPRIPG.2013.6776226
Cited by: 9
STAR: A Content Based Video Retrieval system for moving camera video shots
C. Chattopadhyay, Sukhendu Das
This paper presents the design of STAR (Spatio-Temporal Analysis and Retrieval), an unsupervised Content Based Video Retrieval (CBVR) system. STAR's key insight and primary contribution is that it models video content using a joint spatio-temporal feature representation and retrieves videos from the database that have similar moving objects and motion trajectories. Foreground moving blobs are extracted from a moving-camera video shot, along with a trajectory for camera motion compensation, to form the space-time volume (STV). The STV is processed to obtain the EMST-CSS representation, which can discriminate across different categories of videos. The performance of STAR has been evaluated qualitatively and quantitatively using the precision-recall metric on benchmark video datasets containing unconstrained video shots.
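The precision-recall evaluation mentioned above can be sketched on a hypothetical ranked list; the video IDs and relevance labels below are made up, not taken from the paper's benchmarks.

```python
def precision_recall(ranked, relevant):
    """Precision and recall after each rank position of a retrieval result list."""
    hits, out = 0, []
    for k, item in enumerate(ranked, start=1):
        hits += item in relevant           # True counts as 1
        out.append((hits / k, hits / len(relevant)))
    return out

# A query returns 4 videos; v3 and v1 are the relevant ones.
pr = precision_recall(["v3", "v7", "v1", "v9"], {"v3", "v1"})
print(pr[0])  # (1.0, 0.5): first result is a hit, half the relevant set found
```

Plotting precision against recall over all ranks gives the curve such retrieval papers report.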
doi: 10.1109/NCVPRIPG.2013.6776267
Cited by: 11
Word recognition in natural scene and video images using Hidden Markov Model
Sangheeta Roy, P. Roy, P. Shivakumara, U. Pal
Text recognition from natural scenes and video is challenging compared to recognition in scanned document images, owing to variations in text source, style, font, font size and background. There are approaches for segmenting words from video and scene images so that the word images can be fed into OCRs; nevertheless, such methods often fail to yield satisfactory recognition results. Therefore, in this paper we propose to combine a Hidden Markov Model (HMM) and a Convolutional Neural Network (CNN) to achieve a good recognition rate. Sequential gradient features with the HMM help find the character alignment of a word, and the character alignments are then verified by the CNN. The approach is tested on both video and scene data to show its effectiveness, and the results are encouraging.
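The HMM alignment step can be illustrated with a toy Viterbi decoder. The two-state model, the transition and emission probabilities and the observation symbols below are invented for illustration; they are not the paper's models or features.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden state sequence for an observation sequence (toy, no log-probs)."""
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        prev, col, ptr = V[-1], {}, {}
        for s in states:
            best = max(states, key=lambda p: prev[p] * trans_p[p][s])
            col[s] = prev[best] * trans_p[best][s] * emit_p[s][o]
            ptr[s] = best
        V.append(col)
        back.append(ptr)
    path = [max(states, key=lambda s: V[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Two "characters" A then B in a left-to-right model; 'x'/'y' stand in for frame features.
align = viterbi(
    ['x', 'x', 'y', 'y'], ['A', 'B'],
    {'A': 0.9, 'B': 0.1},
    {'A': {'A': 0.6, 'B': 0.4}, 'B': {'A': 0.0, 'B': 1.0}},
    {'A': {'x': 0.8, 'y': 0.2}, 'B': {'x': 0.1, 'y': 0.9}})
print(align)  # ['A', 'A', 'B', 'B']
```

The decoded state sequence is exactly the kind of character alignment that a downstream verifier (here, the paper's CNN) would check.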
doi: 10.1109/NCVPRIPG.2013.6776157
Cited by: 7
Monitoring a large surveillance space through distributed face matching
Richa Mishra, Prasanna Kumar, S. Chaudhury, I. Sreedevi
A large space with many cameras requires huge storage and computational power to process the data for surveillance applications. In this paper we propose a distributed camera- and processing-based face detection and recognition system that can generate information for finding the spatio-temporal movement patterns of individuals over a large monitored space. The system is built on the Hadoop Distributed File System using the MapReduce programming model. A novel key-generation scheme using a distance-based hashing technique is used to distribute the face-matching task. Experimental results establish the effectiveness of the technique.
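One plausible reading of distance-based hashing for distributing the matching task; the details below (pivot points, bucket width `cell`, and the routing function `node_for`) are assumptions for illustration, not taken from the paper.

```python
import math

def dbh_key(vec, pivots, cell=2.0):
    """Quantized distances to fixed pivot points form a locality-sensitive bucket key."""
    return tuple(int(math.dist(vec, p) // cell) for p in pivots)

def node_for(vec, pivots, n_nodes):
    """Route a face descriptor to one of n_nodes by hashing its bucket key."""
    return hash(dbh_key(vec, pivots)) % n_nodes

pivots = [(0.0, 0.0), (10.0, 0.0)]
a, b = (1.0, 1.0), (1.2, 0.9)   # near-identical face descriptors
print(dbh_key(a, pivots) == dbh_key(b, pivots))  # True: they land in the same bucket
```

Because similar descriptors share a bucket key, candidate matches end up on the same worker, which is the point of distance-based key generation in a MapReduce setting.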
doi: 10.1109/NCVPRIPG.2013.6776185
Cited by: 14
Augmented paper system: A framework for User's Personalized Workspace
Kavita Bhardwaj, S. Chaudhury, Sumantra Dutta Roy
In this paper, we present a framework for a "User's Personalized Workspace" by augmenting the physical paper and the digital document. Paper-based interactions are seamlessly integrated with digital-document-based interactions for reading as an activity; for instance, when the user is engaged in reading, writing becomes complementary. In an academic setting, the paper-based presentation mode has facilitated such exercises. Besides rendering annotations on the digital document and storing them in a database, paper content that is encircled or underlined is used to hyperlink the document. Synchronizing a physical paper and its digital version seamlessly, from the user's perspective, is the main objective of this work. We also compare existing systems, which focus on one activity or the other, with our proposed system.
doi: 10.1109/NCVPRIPG.2013.6776182
Cited by: 4
Cursive stroke sequencing for handwritten text documents recognition
S. Panwar, N. Nain
Text segmentation can be defined as the process of splitting images of a handwritten text document into pieces corresponding to single lines, words and characters. This is very challenging because, in handwritten documents, curved text lines frequently appear with different skew and slant angles. After segmenting words or strokes, that is, finding the connected components in the handwritten text document, we must sequence the strokes according to the document so that its meaning is preserved. In this paper, we use a bottom-up grouping approach for segmentation. We use a novel connectivity-strength parameter with a depth-first-search approach to extract the connected components of the same line from the complete set of connected components of the given document. The exact sequence of connected components is stored in a sequential vector containing the component labels. The proposed cursive stroke sequencing technique is implemented and tested on the benchmark IAM database with encouraging results. Quantitative analysis also shows that this approach outperforms existing segmentation techniques and overcomes the problems encountered with hill-and-dale writing styles and overlapped and touching lines. The accuracy of the proposed sequencing technique is 98%.
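Finding connected components with depth-first search, the step described above, can be sketched on a toy binary image. Plain 4-connectivity is assumed here; the paper's connectivity-strength parameter is not modeled.

```python
def connected_components(grid):
    """Label 4-connected foreground components of a binary image via iterative DFS."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and not labels[r][c]:
                current += 1                      # start a new component
                stack = [(r, c)]
                labels[r][c] = current
                while stack:                      # depth-first flood fill
                    y, x = stack.pop()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and grid[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = current
                            stack.append((ny, nx))
    return current, labels

img = [[1, 1, 0, 0],
       [0, 1, 0, 1],
       [0, 0, 0, 1]]
n, lab = connected_components(img)
print(n)  # 2
```

The label map (here two strokes) is the raw material that the sequencing step then orders into reading order.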
doi: 10.1109/NCVPRIPG.2013.6776232
Cited by: 1
Digital image tampering detection and localization using singular value decomposition technique
V. Mall, A. Roy, S. Mitra
Recent years have witnessed exponential growth in the use of digital images owing to the development of high-quality digital cameras and multimedia technology. The easy availability of image-editing software has made digital image manipulation very popular: ready-to-use software is available on the internet and can easily be used to manipulate images. In such an environment, the integrity of an image cannot be taken for granted. Malicious tampering has serious implications for legal documents, copyright issues and forensic cases, and researchers have put forward a large number of methods to detect image tampering. The proposed method is based on a hash-generation technique using singular value decomposition. The proposed design of an efficient hash vector helps in the detection and localization of image tampering. The method is shown to be robust against content-preserving manipulation yet extremely sensitive to even very minute structural tampering.
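A minimal sketch of an SVD-based hash: take the leading singular value of each image block as a compact, perturbation-tolerant signature. The block size, the power-iteration routine and the rounding are illustrative assumptions, not the paper's design.

```python
import math

def leading_singular_value(A, iters=100):
    """Largest singular value of a small matrix, via power iteration on A^T A."""
    m, n = len(A), len(A[0])
    v = [1.0] * n
    for _ in range(iters):
        Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
        w = [sum(A[i][j] * Av[i] for i in range(m)) for j in range(n)]  # A^T (A v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:          # zero block: singular value is 0
            break
        v = [x / norm for x in w]
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
    return math.sqrt(sum(x * x for x in Av))

def block_svd_hash(img, block=2):
    """Hash vector: the leading singular value of each non-overlapping block."""
    return [round(leading_singular_value([row[c:c + block] for row in img[r:r + block]]), 6)
            for r in range(0, len(img), block)
            for c in range(0, len(img[0]), block)]

img = [[3.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 2.0, 0.0],
       [0.0, 0.0, 0.0, 2.0]]
print(block_svd_hash(img))  # [3.0, 0.0, 0.0, 2.0]
```

Comparing a stored hash with the hash of a received image, block by block, both detects tampering and localizes it to the blocks whose values changed.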
doi: 10.1109/NCVPRIPG.2013.6776160
Cited by: 6
Temporally scalable compression of animation geometry
Sanjib Das, H. ShahJaimeen, P. Bora
Animation geometry compression involves compressing the geometry data of the dynamic three-dimensional (3D) triangular meshes representing the animation frames. Scalable geometry compression means compressing the geometry at a single scale and decompressing it at multiple scales. One algorithm for animation geometry compression employs skinning-based motion prediction of vertices and a temporal wavelet transform (TWT) on the prediction errors. This paper presents an encoder and a decoder structure for a temporally scalable implementation of that algorithm. The frame-wise prediction errors arising from motion-based clustering of groups of affine-transformed vertices are converted into a layered structure of frames using the TWT. The affine-transformation data of the vertices, the weights corresponding to each cluster of vertices and the wavelet coefficients of the prediction errors are quantized and entropy coded. The resulting bit-stream is arranged in a layered structure to achieve temporal scalability. The base layer consists of the connectivity-coded first frame, the indices of the vertex clusters, the weights corresponding to each cluster of a vertex, the approximation sub-band of the prediction error and the affine transformations corresponding to the approximation frames. The enhancement layers consist of the detail sub-bands of the prediction error and the affine transformations corresponding to the detail frames. The scalable encoder and decoder are tested on standard animation sequences, and the experimental results show good performance in terms of scalable rates and distortions.
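The temporal wavelet transform on per-frame prediction errors can be illustrated with a one-level Haar transform along the time axis, splitting the signal into an approximation (base-layer) and a detail (enhancement-layer) sub-band. The Haar choice and the toy error values are assumptions; the paper does not specify them here.

```python
import math

def haar_1level(signal):
    """One-level Haar transform: approximation and detail sub-bands."""
    assert len(signal) % 2 == 0
    s = math.sqrt(2.0)
    approx = [(signal[2 * i] + signal[2 * i + 1]) / s for i in range(len(signal) // 2)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / s for i in range(len(signal) // 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert the one-level Haar transform."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) / s, (a - d) / s]
    return out

errs = [4.0, 6.0, 10.0, 2.0]          # toy per-vertex prediction errors over 4 frames
approx, detail = haar_1level(errs)     # approx -> base layer, detail -> enhancement
print(haar_inverse(approx, detail))    # reconstructs the input (up to float error)
```

A decoder that receives only the approximation sub-band reconstructs a half-frame-rate signal; adding the detail sub-band restores full temporal resolution, which is the essence of temporal scalability.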
doi: 10.1109/NCVPRIPG.2013.6776263
Cited by: 0
Spatio-temporal feature based VLAD for efficient video retrieval
M. K. Reddy, Sahil Arora, R. Venkatesh Babu
Compact representation of visual content has emerged as an important topic in large-scale image/video retrieval, and the recently proposed Vector of Locally Aggregated Descriptors (VLAD) has been shown to outperform other existing retrieval techniques. In this paper, we propose two spatio-temporal features for constructing VLAD vectors for videos in the context of large-scale video retrieval: i) the Local Histogram of Oriented Optical Flow (LHOOF) and ii) Space-Time Invariant Points (STIP). Given a query video, our aim is to retrieve similar videos from the database. Experiments are conducted on the UCF50 and HMDB51 datasets, which pose challenges in the form of camera motion, viewpoint variation, large intra-class variation, etc. The performance of the proposed features is compared with a SIFT-based spatial feature, and the mean average precision (MAP) indicates better retrieval performance for the proposed spatio-temporal features than for the spatial feature.
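Standard VLAD aggregation (assign each local descriptor to its nearest codebook centroid, accumulate the residuals, then L2-normalize) can be sketched as follows; the toy 2-D descriptors and centroids are made up for illustration.

```python
import math

def vlad(descriptors, centroids):
    """VLAD: accumulate residuals to the nearest centroid, then L2-normalize."""
    dim = len(centroids[0])
    acc = [[0.0] * dim for _ in centroids]
    for x in descriptors:
        k = min(range(len(centroids)), key=lambda i: math.dist(x, centroids[i]))
        for j in range(dim):
            acc[k][j] += x[j] - centroids[k][j]   # residual to the assigned centroid
    flat = [v for row in acc for v in row]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]

# Two toy 2-D descriptors, a codebook of two centroids.
v = vlad([(1.0, 0.0), (9.0, 10.0)], [(0.0, 0.0), (10.0, 10.0)])
print(v)  # residuals (1,0) and (-1,0), stacked and L2-normalized
```

The result is a single fixed-length vector per video, so database search reduces to nearest-neighbor lookup in that vector space regardless of how many local descriptors each video produced.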
doi: 10.1109/NCVPRIPG.2013.6776268
Cited by: 6