Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing: Latest Articles

Scalable clustering and applications

Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing

Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010073

Shahid K I, S. Chaudhury

Large-scale machine learning has recently become an active research area. Most existing clustering algorithms cannot handle big data because of their high time and space complexity. Among clustering algorithms, eigenvector-based methods such as spectral clustering show very good accuracy but have cubic time complexity. Various methods have been proposed to reduce the time and space complexity of eigendecomposition, such as the Nyström method and the Lanczos method. The Nyström method has linear time complexity in the number of data points but cubic time complexity in the number of sampling points. To reduce this, various rank-k approximation methods have also been proposed, but they are less efficient than normalized spectral clustering. In this paper we propose a two-step algorithm for spectral clustering that reduces the time complexity to O(nmk + m^2 k') by combining the Nyström and Lanczos methods, where k is the number of clusters and k' is the rank of the approximation of the sampling matrix (k < k' << m << n). It shows very good results on various data sets, image segmentation problems, and churn prediction on a telecommunication data set, even with very low sampling (only 100 columns sampled for a 10 million × 10 million matrix) and in less time, which confirms the validity of the algorithm.
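To give a feel for the complexity claim, the following is a rough operation-count comparison, not the authors' algorithm. It plugs the abstract's figures (10 million points, 100 sampled columns) into n^3 and nmk + m^2 k' while ignoring constant factors. The values of k and k' are assumptions chosen only to satisfy k < k' << m.

```python
def full_spectral_cost(n: int) -> int:
    """Rough operation count for a dense eigendecomposition: n^3."""
    return n ** 3


def two_step_cost(n: int, m: int, k: int, k_prime: int) -> int:
    """Rough operation count for the bound quoted in the abstract: n*m*k + m^2*k'."""
    return n * m * k + m * m * k_prime


n = 10_000_000  # data points (the 10 million x 10 million case in the abstract)
m = 100         # sampled columns, as in the abstract
k = 5           # number of clusters: assumed for illustration
k_prime = 20    # rank of the approximation: assumed, with k < k' << m

full = full_spectral_cost(n)
fast = two_step_cost(n, m, k, k_prime)
ratio = full // fast
print(f"full: {full:.3e}  two-step: {fast:.3e}  ratio: {ratio:.3e}")
```

With these assumed values the n*m*k term dominates, so the cost grows linearly with the number of data points.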

Citations: 0

Supervised deep segmentation network for brain extraction

Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing

Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010016

Apoorva Sikka, Gaurav Mittal, Deepti R. Bathula, N. C. Krishnan

The recent past has seen an inexorable shift towards the use of deep learning techniques to solve a myriad of problems in the field of medical imaging. In this paper, a novel segmentation method involving a fully-connected deep neural network called Deep Segmentation Network (DSN) is proposed to perform supervised regression for brain extraction from T1-weighted magnetic resonance (MR) images. In contrast to existing patch-based feature learning techniques, DSN works on full 3D volumes, simplifying pre- and post-processing operations, to efficiently provide a voxel-wise binary mask delineating the brain region. The model is evaluated on three publicly available datasets and is observed to either outperform or perform comparably to state-of-the-art methods. DSN achieves a maximum and minimum Dice Similarity Coefficient (DSC) of 97.57 and 92.82, respectively, across all the datasets. The experiments conducted in this paper highlight the ability of the DSN model to learn feature representations automatically, making it a simple yet highly effective approach for brain segmentation. Preliminary experiments also suggest that the proposed model has the potential to segment sub-cortical structures accurately.
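For readers unfamiliar with the metric, here is a minimal sketch of the Dice Similarity Coefficient on two binary masks, DSC = 2|A ∩ B| / (|A| + |B|). The function name and the toy masks are illustrative assumptions, not taken from the paper. The DSC values in the abstract appear to be expressed as percentages, whereas this sketch returns a fraction in [0, 1].

```python
def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention assumed here).
        return 1.0
    return 2.0 * intersection / total


# Toy flattened masks (hypothetical): 3 voxels overlap, each mask has 4 foreground voxels.
pred = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 0, 0, 0, 1, 1]
print(dice_coefficient(pred, truth))  # 2 * 3 / (4 + 4) = 0.75
```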

Citations: 2

Local dominant binary patterns for recognition of multi-view facial expressions

Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing

Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010008

Bikash Santra, D. Mukherjee

In this paper, a novel framework is proposed for automatic recognition of facial expressions in which the face images are captured at multiple view angles (i.e., multi-view facial expressions). The proposed scheme introduces a local dominant binary pattern (LDBP). Unlike uniform LBP-based features, the LDBP uses fewer feature dimensions without affecting recognition performance. The LDBP is computed by improving LBP with the dominant orientations of neighborhood pixels. Eigenvalue analysis of the structure tensor representation of expressive face images determines the dominant directions of gray-value changes in the local neighborhoods of pixels. We use an SVM for view-specific classification of multi-view facial expressions. The proposed model is evaluated on benchmark datasets of both near-frontal (CK+ and JAFFE) and multi-view (KDEF, SFEW and LFPW) face images. The datasets include faces with posed as well as spontaneous expressions. On average, the proposed scheme outperforms state-of-the-art methods by approximately 1% for near-frontal facial expressions and by at least 3% for multi-view facial expressions.
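As background, the sketch below computes the standard 8-neighbor LBP code that LDBP builds on, plus the usual "uniform pattern" test (at most two 0/1 transitions around the circle). It is not the LDBP itself, whose orientation-selection step is not specified in the abstract. The neighbor ordering and the toy patch are assumptions.

```python
# Neighbor offsets (row, col), clockwise from the top-left pixel; the ordering is a convention.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """Standard LBP: set bit i when the i-th neighbor is >= the center pixel."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def is_uniform(code):
    """A pattern is 'uniform' when its circular bit string has at most two transitions."""
    bits = [(code >> i) & 1 for i in range(8)]
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    return transitions <= 2


# Toy 3x3 patch (hypothetical): a bright top row above a darker region, i.e. a horizontal edge.
patch = [
    [9, 9, 9],
    [1, 5, 1],
    [1, 1, 1],
]
code = lbp_code(patch, 1, 1)
print(code, is_uniform(code))  # only the three top neighbors are >= 5, so bits 0..2 are set
```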

Citations: 6

Feature-preserving 3D fluorescence image sequence denoising

Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing

Pub Date : 2016-12-18 DOI: 10.1145/3009977.3009983

H. Bhujle

In this paper, a feature-preserving denoising scheme for fluorescence video microscopy is presented. Fluorescence image sequences comprise edges and fine structures with fast-moving objects. Improving the signal-to-noise ratio (SNR) while preserving structural details is a difficult task for these image sequences. Some existing denoising techniques over-smooth these image sequences, while others fail because the motion estimation and compensation steps are implemented inappropriately. In this paper we use the nonlocal means (NLM) video denoising algorithm so as to avoid the motion estimation and compensation steps. The proposed shot boundary detection technique pre-processes the sequence systematically and accurately to form different shots of frames with similar content. To preserve the edges and fine structural details in the image sequences, we modify the weighting term of the NLM filter. Further, to accelerate the denoising process, a separable nonlocal means filter is implemented for video sequences. We compare the results with existing fluorescence video denoising techniques and show that the proposed method not only preserves edges and small structural details more effectively but also reduces the computation time. The efficacy of the proposed algorithm is evaluated quantitatively with PSNR and qualitatively by visual perception.
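The core idea of nonlocal means can be sketched on a 1D signal: each sample is replaced by a weighted average of nearby samples, with weights exp(-d^2 / h^2) where d^2 is the squared distance between the patches around the two samples. This is the textbook filter only, not the modified weighting term or the separable video variant described in the abstract. The parameter names and default values are assumptions.

```python
import math


def nlm_denoise_1d(signal, patch_radius=1, search_radius=3, h=1.0):
    """Textbook nonlocal means on a 1D list of floats."""
    n = len(signal)

    def patch(i):
        # Patch centered on i, replicating the edge samples at the borders.
        return [signal[min(max(i + d, 0), n - 1)]
                for d in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        p_i = patch(i)
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - search_radius), min(n, i + search_radius + 1)):
            # Similar patches get weights near 1; dissimilar ones are suppressed.
            dist2 = sum((a - b) ** 2 for a, b in zip(p_i, patch(j)))
            w = math.exp(-dist2 / (h * h))
            weighted_sum += w * signal[j]
            weight_total += w
        out.append(weighted_sum / weight_total)
    return out


# A clean step edge (hypothetical data): samples across the edge have dissimilar
# patches, so they receive negligible weight and the edge is not blurred.
step = [0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0]
smoothed = nlm_denoise_1d(step)
print([round(v, 3) for v in smoothed])
```

Because the weights depend on patch similarity rather than on position alone, no motion estimation is needed when the same idea is applied across video frames.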

Citations: 1

Evaluation of graph layout methods based on visual perception

Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing

Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010070

Jiafan Li, Yuhua Liu, Changbo Wang

Node-link diagrams provide an intuitive way to explore networks and have inspired a large number of automated graph layout strategies that optimize aesthetic criteria. However, no particular drawing approach can fully satisfy all these criteria simultaneously. Evaluation methods are therefore designed to explore the advantages and disadvantages of different graph layout methods against these criteria. Starting from visual perception, this paper analyzes the visual importance of nodes based on a user experiment and designs a model to measure it. We then evaluate the pros and cons of graph layout methods by comparing the topological importance and visual importance of nodes. A heatmap-based visualization provides visual feedback on the difference between the topological importance and visual importance of nodes. Meanwhile, a metric is built to quantify the difference precisely. Finally, experiments are carried out on data sets of different scales to further analyze the characteristics of these graph layout methods.

Citations: 7

Unsupervised domain adaptation without source domain training samples: a maximum margin clustering based approach

Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing

Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010033

Sudipan Saha, Biplab Banerjee, S. Merchant

Unsupervised domain adaptation (DA) techniques inherently assume the presence of an ample amount of source domain training samples in addition to the target domain test data. The domains are characterized by domain-specific probability distributions governing the data, which are substantially different from each other. The goal is to build a task-oriented classifier model that performs proportionately in both domains. In contrast to the standard unsupervised DA setup, we propose a maximum-margin clustering (MMC) based framework that does not consider labeled source domain samples. Instead, we formulate the task as a joint clustering problem over all the samples from both domains in a common feature subspace. The Geodesic Flow Kernel (GFK) based subspace projection technique on the Grassmannian manifold is adopted to cast the samples into a domain-invariant space. The MMC stage then groups the data by maximizing margins while simultaneously learning a classifier for each group. The data overlapping problem is handled by learning an SVM-KNN classifier specifically for the potentially unreliable samples in each group. We validate the framework on a pair of remote sensing images of different modalities for land-cover classification and on a generic object dataset for recognition. We observe that the proposed method performs on par with the fully supervised case for both tasks, but without requiring costly annotations.

Citations: 7

Sketch-based simulated draping for Indian garments

Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing

Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010001

Sanjeev Muralikrishnan, P. Chaudhuri

Virtual garments like shirts and trousers are created from 2D patterns stitched over 3D models. However, Indian garments such as dhotis and saris pose a unique draping challenge for physically-simulated garment systems, as they are not stitched garments. We present a method to intuitively specify the parameters governing the drape of an Indian garment using a sketch-based interface. We then translate the sketch strokes into procedural, physically-simulated draping routines that wrap, pin and tuck the garments around the body mesh as needed. After draping, the garments are ready to be simulated and used during animation as required. We present several examples of our draping technique.

Citations: 3

A differential excitation based rotational invariance for convolutional neural networks

Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing

Pub Date : 2016-12-18 DOI: 10.1145/3009977.3009978

Haribabu Kandi, Deepak Mishra, G. R. S. Subrahmanyam

Deep Learning (DL) methods extract complex sets of features using architectures built from a hierarchy of layers. The features so learned have high discriminative power and thus represent the input to the network in the most efficient manner. Convolutional Neural Networks (CNNs) are one such deep learning architecture; they extract structural features with a little invariance to small translations, scaling and other forms of distortion. In this paper, the learning capabilities of CNNs are explored with the aim of improving the rotational invariance of the architecture. We propose a new CNN architecture, called RICNN, with an additional layer formed by differential excitation against distance to improve rotational invariance. Moreover, we show that the proposed method gives superior invariance to rotations compared with the original CNN architecture (training samples with different orientations are not considered) without disturbing the invariance to small translations, scaling and other forms of distortion. Profiles such as training time, testing time and accuracy are evaluated at different percentages of training data to compare the performance of the proposed configuration with the original configuration.

Citations: 1

Event geo-localization and tracking from crowd-sourced video metadata

Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing

Pub Date : 2016-12-18 DOI: 10.1145/3009977.3009993

Amit More, S. Chaudhuri

We propose a novel technique for event geo-localization (i.e., finding the 2-D location of the event on the surface of the earth) from the sensor metadata of crowd-sourced videos collected from smartphone devices. With the help of sensors available in smartphones, such as the digital compass and GPS receiver, we collect metadata such as camera viewing direction and location along with the video. Event localization is then posed as a constrained optimization problem using the available sensor metadata. Our results on the collected experimental data show correct localization of events, which is particularly challenging for classical vision-based methods because of the nature of the visual data. Since our approach uses only sensor metadata, the computational overhead is much lower than it would be if video information were used. Finally, we illustrate the benefits of our work in analyzing video data from multiple sources through geo-localization.

Citations: 4

On the (soccer) ball

Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing

Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010022

Samriddha Sanyal, A. Kundu, D. Mukherjee

Tracking the ball in a soccer video is challenging because of sudden changes in the speed and orientation of the ball. Successful tracking in such a scenario depends on the ability of the algorithm to continuously balance prior constraints against the evidence garnered from the image sequence. This paper proposes a particle filter based algorithm that tracks the ball when it suddenly changes direction or moves at high speed. Exact, deterministic tracking algorithms based on a discretized functional suffer from severe limitations in the form of prior constraints. Our tracking algorithm shows excellent results even under partial occlusion, which is a major concern in soccer video. We show that the proposed tracking algorithm is at least 7.2% better than competing approaches for soccer ball tracking.
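To illustrate how a particle filter weighs a motion prior against image evidence, here is a minimal 1D bootstrap particle filter step (predict, weight, resample). It is a generic sketch, not the paper's tracker: the random-walk motion model, the Gaussian likelihood, the noise levels and the observation values are all assumptions.

```python
import math
import random


def particle_filter_step(particles, observation, motion_std, obs_std, rng):
    """One predict / update / resample cycle of a 1D bootstrap particle filter."""
    # Predict: diffuse each particle with random-walk motion noise (the prior).
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Update: weight each particle by a Gaussian likelihood of the observation (the evidence).
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in predicted]
    if sum(weights) == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw particles in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(0)  # fixed seed so the run is repeatable
particles = [rng.uniform(0.0, 100.0) for _ in range(500)]
# Hypothetical ball positions along one axis; the last value mimics a sudden jump.
for z in [40.0, 42.0, 44.0, 60.0]:
    particles = particle_filter_step(particles, z, motion_std=8.0, obs_std=2.0, rng=rng)
estimate = sum(particles) / len(particles)
print(round(estimate, 2))
```

A larger motion_std lets the particle cloud follow abrupt changes at the cost of a noisier estimate, which is the prior-versus-evidence trade-off the abstract refers to.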

Citations: 9
