
Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing: Latest Publications

Image hallucination at different times of day using locally affine model and kNN template matching from time-lapse images
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010038
N. Patel, Tushar Kataria
Image hallucination has many applications in areas such as image processing, computational photography and image fusion. In this paper, we present an image hallucination technique based on template (patch) matching against a database of time-lapse images and a learned locally affine model. Template-based techniques suffer from blocky artifacts, so we propose two approaches for imposing consistency criteria across neighbouring patches in the form of regularization. We validate our color transfer technique by hallucinating a variety of natural images at different times of the day. We compare the proposed approach with other state-of-the-art example-based color transfer techniques and show that the images obtained using our approach look more plausible and natural.
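The core loop of the method (kNN template matching into a time-lapse database, then a locally affine color map fitted from the matched patches) can be sketched as follows. This is a minimal numpy illustration with a synthetic database and 3x3 patches; the patch size, the database and the least-squares affine fit are hypothetical stand-ins, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a time-lapse database: paired flattened 3x3 RGB
# patches of the same scene at a source time ("day") and a target time ("dusk").
db_day = rng.random((50, 27))                    # 50 day patches
db_dusk = np.clip(db_day * 0.4 + 0.1, 0, 1)      # synthetic dusk counterparts

def knn_match(patch, database, k=5):
    """Return indices of the k database patches closest to `patch`."""
    d = np.linalg.norm(database - patch, axis=1)
    return np.argsort(d)[:k]

def local_affine(src_pix, dst_pix):
    """Least-squares affine color map A, b with dst ~ A @ src + b per pixel."""
    X = np.hstack([src_pix, np.ones((src_pix.shape[0], 1))])  # (n, 4)
    M, *_ = np.linalg.lstsq(X, dst_pix, rcond=None)           # (4, 3)
    return M[:3].T, M[3]

query = rng.random(27)                  # a day-time patch to hallucinate
idx = knn_match(query, db_day, k=5)
src = db_day[idx].reshape(-1, 3)        # matched day pixels
dst = db_dusk[idx].reshape(-1, 3)       # their dusk counterparts
A, b = local_affine(src, dst)
hallucinated = (query.reshape(-1, 3) @ A.T + b).reshape(27)
```

Because the synthetic dusk patches here are an exact affine function of the day patches, the least-squares fit recovers that map and the hallucinated colors follow it.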
Pages: 30:1-30:8
Citations: 0
A fast identity-independent expression recognition system for robust cartoonification using smart devices
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010055
Gorisha Agarwal, Ronak Garg, Divya Garg, B. Prasad, Tanima Dutta, Hari Prabhat Gupta
Facial expressions convey rich information about the emotions, intentions and other internal states of a person. Automatic facial expression recognition and cartoonification systems aim to apply computer vision to human-computer interaction, emotion analysis, medical care, virtual learning and even entertainment. In this paper, we propose an identity-independent, robust system to detect human expressions and generate the corresponding cartoonified images in real time on smart devices. The identity-independent expression recognition system enhances the facial features of the query face image using its intra-class variation image and classifies them using support vector machines. The method is robust to variation in the identity and illumination of the query face image. Along with basic expressions such as angry, happy and sad, we have also successfully detected the emotional states of sleepiness and pain. Experimental results on the JAFFE, CK+, PICS, Yalefaces, and Senthil databases show the effectiveness of the system.
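The classification stage can be sketched with a minimal linear SVM trained by sub-gradient descent on the hinge loss. This is a stand-in for the paper's SVM classifier; the toy features, the per-sample identity offset, and the mean-subtraction "enhancement" step are all hypothetical illustrations of the identity-removal idea:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy expression features: a strong per-sample identity offset is added to
# every feature, while the "expression" label depends on a contrast between
# features, which subtracting the per-sample mean leaves intact.
n, d = 200, 6
base = rng.normal(size=(n, d))
y = np.where(base[:, 0] - base[:, 1] > 0, 1, -1)   # expression label
X = base + rng.normal(0, 4.0, (n, 1))               # identity offset per sample
X_enh = X - X.mean(axis=1, keepdims=True)           # crude identity removal

def train_linear_svm(X, y, lam=0.01, epochs=300, lr=0.1):
    """Minimal linear SVM via sub-gradient descent on the hinge loss."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        mask = y * (X @ w + b) < 1                  # margin violators
        gw, gb = lam * w, 0.0
        if mask.any():
            gw = gw - (y[mask, None] * X[mask]).mean(axis=0)
            gb = -y[mask].mean()
        w, b = w - lr * gw, b - lr * gb
    return w, b

w, b = train_linear_svm(X_enh, y)
acc = float(np.mean(np.sign(X_enh @ w + b) == y))
```

The mean subtraction cancels the additive identity component exactly here, so the SVM can focus on the expression contrast.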
Pages: 15:1-15:8
Citations: 0
Deep automatic license plate recognition system
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010052
Vishal Jain, S. Zitha, A. Rajagopal, S. Biswas, H. S. Bharadwaj, K. Ramakrishnan
Automatic License Plate Recognition (ALPR) has important applications in traffic surveillance. It is a challenging problem, especially in countries like India, where license plates vary in size, number of lines, fonts, etc. The difficulty is all the more accentuated in traffic videos, as the cameras are placed high and most plates appear skewed. This work addresses ALPR in real-time traffic videos using deep CNN methods. We first extract license plate candidates from each frame using edge information and geometrical properties, ensuring high recall. These proposals are fed to a CNN classifier for license plate detection, obtaining high precision. We then use a CNN classifier trained on individual characters, along with a spatial transformer network (STN), for character recognition. Our system is evaluated on several traffic videos with vehicles having license plates of different formats in terms of tilt, distance, color, illumination, character size, thickness, etc. Results demonstrate robustness to such variations and impressive performance in both localization and recognition. We also make the dataset available for further research on this topic.
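The first stage (edge-density plus geometry plate proposals tuned for high recall) can be sketched as follows. The sliding-window search, thresholds and synthetic edge map are hypothetical stand-ins for the authors' candidate extraction, which works on real edge maps:

```python
import numpy as np

# Synthetic stand-in for an edge map of one traffic frame: a dense block of
# edges where a licence plate would be (plates are edge-rich), plus noise.
edges = np.zeros((120, 200), dtype=bool)
edges[40:60, 50:130] = True              # plate-like region: 20 x 80 pixels
edges[10, 5] = edges[100, 190] = True    # isolated noise edges

def plate_candidates(edge_map, win=(20, 80), density_thr=0.5, aspect=(2.0, 6.0)):
    """Slide a window and keep regions that are edge-dense and plate-shaped."""
    h, w = win
    cands = []
    if not (aspect[0] <= w / h <= aspect[1]):   # geometric (aspect-ratio) check
        return cands
    for r in range(0, edge_map.shape[0] - h + 1, h // 2):
        for c in range(0, edge_map.shape[1] - w + 1, w // 2):
            if edge_map[r:r + h, c:c + w].mean() >= density_thr:
                cands.append((r, c, h, w))
    return cands

cands = plate_candidates(edges)   # windows overlapping the plate survive
```

In the paper these proposals would then be passed to the CNN detector; here the loose density threshold plays the role of the high-recall filter.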
Pages: 6:1-6:8
Citations: 57
3D binary signatures
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010009
Siddharth Srivastava, Brejesh Lall
In this paper, we propose a novel binary descriptor for 3D point clouds. The proposed descriptor, termed 3D Binary Signature (3DBS), is motivated by the matching efficiency of binary descriptors for 2D images. 3DBS describes keypoints from point clouds with a binary vector, resulting in extremely fast matching. The method uses keypoints from standard keypoint detectors. The descriptor is built by constructing a Local Reference Frame and aligning a local surface patch accordingly. The local surface patch is formed by identifying nearest neighbours subject to an angular constraint among them. The points are ordered with respect to their distance from the keypoint. The normals of the ordered pairs of points are projected onto the axes, and their relative magnitudes are used to assign binary digits. The vector thus constituted is used as a signature representing the keypoint. Matching is performed using the Hamming distance. We show that 3DBS outperforms state-of-the-art descriptors on various evaluation metrics.
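The signature-plus-Hamming-matching idea can be sketched as follows. This numpy sketch binarises the signs of normal projections on the axes and compares signatures by counting differing bits; the neighbourhood size and the sign-based binarisation rule are simplified assumptions, not the exact 3DBS construction:

```python
import numpy as np

rng = np.random.default_rng(3)

def binary_signature(normals):
    """One bit per axis projection of the ordered neighbour normals."""
    return (normals.reshape(-1) > 0).astype(np.uint8)

def hamming(a, b):
    """Hamming distance between two binary signatures."""
    return int(np.count_nonzero(a != b))

def rand_normals(n=16):
    """Hypothetical keypoint neighbourhood: n unit normals (3n bits)."""
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

key = rand_normals()
noisy = key + rng.normal(0, 0.02, key.shape)   # same keypoint, small noise
other = rand_normals()                          # unrelated keypoint

sig_key, sig_noisy, sig_other = map(binary_signature, (key, noisy, other))
```

The sign test makes the signature cheap to compute and the XOR-style comparison cheap to match, which is the efficiency argument behind binary descriptors.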
Pages: 77:1-77:8
Citations: 11
Bishnupur heritage image dataset (BHID): a resource for various computer vision applications
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010005
Mrinmoy Ghorai, Pulak Purkait, Sanchayan Santra, S. Samanta, B. Chanda
Bishnupur is an attractive tourist destination in West Bengal, India, known for its terracotta temples. The place is a prospective candidate for inclusion in the list of UNESCO World Heritage sites. We intend to preserve this heritage site digitally and to offer some virtual interaction for tourists and researchers. In this paper, we present an image dataset of different temples in Bishnupur (namely Jor Bangla, Kalachand, Madan Mohan, Radha Madhav, Rasmancha, Shyamrai and Nandalal) for evaluating different types of computer vision and image processing algorithms (such as 3D reconstruction, image inpainting, texture classification and content-specific image retrieval). The dataset was captured using four different cameras with different parameter settings. Some subsets are extracted and earmarked for particular applications such as texture classification, image inpainting and content-specific image retrieval, and example results of baseline methods are shown for these applications. Thus we evaluate the usefulness of this dataset. To the best of our knowledge, this is probably the first combined dataset for evaluating various types of problems on a heritage site in India. The dataset is publicly available at http://www.isical.ac.in/~bsnpr/ for research purposes only.
Pages: 80:1-80:8
Citations: 4
Classification of Schizophrenia versus normal subjects using deep learning
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010050
Pinkal Patel, P. Aggarwal, Anubha Gupta
Motivated by deep learning approaches for classifying normal and neuro-diseased subjects in functional Magnetic Resonance Imaging (fMRI), we propose a stacked autoencoder (SAE) based two-stage architecture for disease diagnosis. In the proposed architecture, a separate four-hidden-layer autoencoder is trained in an unsupervised manner for feature extraction corresponding to every brain region. Thereafter, these trained autoencoders are used to provide features on class-labeled input data for training a binary support vector machine (SVM) classifier. In order to design a robust classifier, noisy or inactive gray matter voxels are filtered out using a proposed covariance-based approach. We applied the proposed methodology to a public dataset, namely the COBRE dataset of the 1000 Functional Connectomes Project, consisting of fMRI data of normal and Schizophrenia subjects. The proposed architecture classifies normal and Schizophrenia subjects with a 10-fold cross-validation accuracy of 92%, which is better than existing methods applied to the same dataset.
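The voxel-filtering step can be sketched with a simple temporal-variance criterion. This is a stand-in for the paper's covariance-based approach; the synthetic time series, the activity model and the keep fraction are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy stand-in for one brain region's fMRI data: T=100 time points for 200
# voxels, of which the last 50 are "inactive" (near-constant, noise only).
T, n_active, n_inactive = 100, 150, 50
signal = np.sin(np.linspace(0, 8 * np.pi, T))[:, None]
active = signal * rng.uniform(0.5, 1.5, n_active) + rng.normal(0, 0.1, (T, n_active))
inactive = rng.normal(0, 0.02, (T, n_inactive))
voxels = np.hstack([active, inactive])

def filter_voxels(ts, keep_frac=0.75):
    """Keep the top `keep_frac` fraction of voxels by temporal variance:
    a simple proxy for covariance-based filtering of noisy/inactive voxels."""
    var = ts.var(axis=0)
    mask = var >= np.quantile(var, 1 - keep_frac)
    return ts[:, mask], mask

filtered, mask = filter_voxels(voxels)
```

Only the surviving voxels would then feed the per-region autoencoders for feature extraction.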
Pages: 28:1-28:6
Citations: 38
Alternate formulation for transform learning
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010069
Jyoti Maggu, A. Majumdar
Dictionary learning has been used to solve inverse problems in imaging and as an unsupervised feature extraction tool in vision. The main disadvantage of dictionary learning for vision applications is the relatively long feature extraction time during testing, owing to the need to solve an iterative optimization problem (l0-minimization). The newly developed analysis framework of transform learning does not suffer from this shortcoming: feature extraction only requires a matrix-vector multiplication. This work proposes an alternate formulation for transform learning that improves the accuracy even further. Experiments on benchmark databases show that our proposed transform learning yields better results than dictionary learning, the autoencoder (AE) and the restricted Boltzmann machine (RBM), while feature extraction is as fast as for the AE and RBM.
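The speed argument is easy to see in code: transform-learning features come from a single matrix product (optionally followed by a hard threshold), with no iterative sparse coding at test time. A minimal sketch, with a random matrix standing in for a learned transform:

```python
import numpy as np

rng = np.random.default_rng(5)

d, n = 20, 5
T = rng.normal(size=(d, d))       # stand-in for a learned analysis transform
X = rng.normal(size=(d, n))       # test samples as columns

def transform_features(T, X, thr=0.5):
    """Transform-learning feature extraction: one matrix product plus a
    hard threshold, in contrast to iterative l0/l1 sparse coding."""
    Z = T @ X
    return Z * (np.abs(Z) >= thr)   # small coefficients set to zero

Z = transform_features(T, X)
```

Each test sample costs one matrix-vector multiply, which is why the runtime matches forward passes through an AE or RBM rather than an optimization loop.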
Pages: 50:1-50:8
Citations: 7
Spatio-temporal weighted histogram based mean shift for illumination robust target tracking
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010059
K. Deopujari, R. Velmurugan, K. Tiwari
This paper proposes a simple method for handling illumination variation in a video. The proposed method is based on the generative mean-shift tracker and uses the energy compaction property of the discrete cosine transform (DCT) to handle illumination variation within and across frames. It uses a spatial and temporal DCT-coefficient-based approach to assign weights to the target and candidate histograms in mean shift. The proposed weighting factor accounts for illumination changes within a frame, i.e., illumination change of the target with respect to the background, and across frames, i.e., varying illumination between consecutive time instances. The algorithm was tested on the VOT2015 challenge dataset and on sequences from the OTB and CAVIAR datasets, and was also tested rigorously for the illumination attribute. The qualitative and quantitative evaluation was twofold. First, the tracker was compared with an existing DCT-coefficient-based method and showed improved results. Second, the proposed algorithm was compared with other state-of-the-art trackers. The results show that the proposed algorithm outperformed some state-of-the-art trackers and performed comparably to others.
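The energy-compaction property the tracker relies on can be demonstrated directly: an additive illumination change moves essentially only the DC coefficient of a patch's 2-D DCT, leaving the AC energy untouched. A minimal numpy sketch (the 8x8 patch and the orthonormal DCT-II construction are illustrative, not the paper's actual weighting scheme):

```python
import numpy as np

def dct2(x):
    """Orthonormal 2-D DCT-II built from the 1-D DCT-II matrix."""
    n = x.shape[0]
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2 / n)
    C[0] /= np.sqrt(2)           # DC row scaling for orthonormality
    return C @ x @ C.T

rng = np.random.default_rng(6)
patch = rng.random((8, 8))
brighter = patch + 0.3            # globally brightened version of the patch

D1, D2 = dct2(patch), dct2(brighter)
ac_energy = lambda D: np.sum(D**2) - D[0, 0]**2   # energy excluding DC
```

Because the AC coefficients are illumination-invariant under such a change, weighting histogram contributions by AC content gives the robustness the paper targets.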
Pages: 40:1-40:8
Citations: 0
Reduction of variance of observations on pelvic structures in CBCT images using novel mean-shift and mutual information based image registration?
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010030
S. Malladi, Bijju Kranthi Veduruparthi, J. Mukherjee, P. Das, S. Chakrabarti, I. Mallick
In this paper, Cone-Beam Computed Tomography (CBCT) image data of colorectal cancer patients are considered for registering standard reference locations of bony structures in the pelvic region. A solution is provided to automatically compute and resolve irregularities involved in locating bony structures in the pelvic region. A new algorithm is proposed to automatically locate, on a daily basis, the lowest 3D coordinates of the Pubic Symphysis (pb) and the Coccyx. The irregularities involved are reduced to a minimum by registering the CBCT images. The conventional three-dimensional mutual information (MI) based registration and a novel mean-shift based mutual information technique are compared. The variations in the position of the pelvic region are also compared for unregistered and registered CBCT images. The proposed algorithm, tested on CBCT image data of 25 patients, each acquired over a span of 27 consecutive days, provides promising results. The variations in the locations of the coccyx and pb, and in the distance between them, were found to be reduced by the registration of the 3D CBCT images.
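The mutual-information objective used in such registration can be sketched from a joint intensity histogram; the bin count and the synthetic image pair below are illustrative assumptions, not the paper's data:

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """MI between two images, estimated from their joint intensity histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)        # marginal of image a
    py = pxy.sum(axis=0, keepdims=True)        # marginal of image b
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(7)
fixed = rng.random((64, 64))
aligned = fixed + rng.normal(0, 0.01, fixed.shape)          # well-registered
shuffled = rng.permutation(fixed.ravel()).reshape(64, 64)   # misaligned proxy
```

A registration routine (mean-shift based or otherwise) would search for the transform that maximizes this quantity; well-aligned pairs score much higher than decorrelated ones.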
Pages: 84:1-84:8
引用次数: 2
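The mutual information criterion compared in this paper scores how well two images are aligned from their joint intensity statistics: MI is high when knowing an intensity in one image strongly predicts the intensity at the same location in the other. As a minimal illustrative sketch (not the authors' implementation; the bin count of 32 is an assumption), MI can be estimated from a joint intensity histogram:

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Estimate mutual information between two equally sized
    grayscale images from their joint intensity histogram."""
    joint_hist, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint_hist / joint_hist.sum()   # joint probability p(x, y)
    px = pxy.sum(axis=1)                  # marginal of img_a
    py = pxy.sum(axis=0)                  # marginal of img_b
    px_py = np.outer(px, py)              # product of marginals p(x)p(y)
    nz = pxy > 0                          # skip empty bins to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / px_py[nz])))
```

A registration loop would transform one image and keep the pose that maximizes this score; an image scores higher against itself than against a shuffled copy of its own pixels.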
Bangla online handwriting recognition using recurrent neural network architecture
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010072
Bappaditya Chakraborty, P. Mukherjee, U. Bhattacharya
Recognition of unconstrained handwritten text is always a difficult problem, particularly when the style of handwriting is a mixed cursive one. Among the various Indian scripts, only Bangla presents the additional difficulty of handling the mixed cursiveness of its handwriting style within an automatic recognition pipeline. Other common difficulties in recognizing handwriting in an Indian script include the large size of its alphabet and the extremely cursive shapes of its alphabetic characters. These are among the reasons for the limited success achieved so far in unconstrained handwritten Bangla text recognition. Artificial Neural Network (ANN) models have often been used to solve difficult real-life pattern recognition problems, and Recurrent Neural Network (RNN) models have been studied in the literature for modeling sequence data. In this study, we consider the Long Short-Term Memory (LSTM) network, a useful member of this family. In particular, Bidirectional Long Short-Term Memory (BLSTM) neural networks are a special kind of RNN that has recently attracted attention for solving sequence labelling problems. In this article, we present a BLSTM architecture based approach for unconstrained online handwritten Bangla text recognition.
Pages: 63:1-63:8
Cited: 13
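A BLSTM of the kind this paper builds on processes each input sequence in both temporal directions and concatenates the per-timestep hidden states, so every output frame sees both past and future context. The following NumPy sketch of a single BLSTM layer's forward pass is illustrative only: it is not the authors' network, and the gate ordering (i, f, g, o) and stacked weight shapes are assumptions.

```python
import numpy as np

def lstm_forward(x_seq, W, U, b):
    """Run one LSTM layer over x_seq of shape (T, d_in). Gate weights
    are stacked as W (4h, d_in), U (4h, h), b (4h,) in i, f, g, o order."""
    hidden = U.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    outputs = []
    for x in x_seq:
        z = W @ x + U @ h + b
        i = sigmoid(z[:hidden])                  # input gate
        f = sigmoid(z[hidden:2 * hidden])        # forget gate
        g = np.tanh(z[2 * hidden:3 * hidden])    # candidate cell state
        o = sigmoid(z[3 * hidden:])              # output gate
        c = f * c + i * g
        h = o * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)                     # (T, h)

def blstm(x_seq, params_fwd, params_bwd):
    """Bidirectional pass: one LSTM over the sequence, one over its
    reversal, then concatenate the per-timestep hidden states."""
    h_fwd = lstm_forward(x_seq, *params_fwd)
    h_bwd = lstm_forward(x_seq[::-1], *params_bwd)[::-1]
    return np.concatenate([h_fwd, h_bwd], axis=1)  # (T, 2h)
```

For a sequence of T feature vectors the layer emits T concatenated states of size 2h, which a recognizer would feed to a per-timestep classifier (e.g. a softmax over character labels).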
Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing