
Latest Publications: 2017 IEEE International Conference on Computer Vision Workshops (ICCVW)

Know You at One Glance: A Compact Vector Representation for Low-Shot Learning
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.227
Yu Cheng, Jian Zhao, Zhecan Wang, Yan Xu, J. Karlekar, Shengmei Shen, Jiashi Feng
Low-shot face recognition is a very challenging yet important problem in computer vision. The feature representation of the gallery face sample is one key component in this problem. To this end, we propose an Enforced Softmax optimization approach built upon Convolutional Neural Networks (CNNs) to produce an effective and compact vector representation. The learned feature representation helps overcome the underlying multi-modality variations and keeps the primary key features as close to the mean face of the identity as possible in the high-dimensional feature space, thus making the gallery basis more robust under various conditions and improving the overall performance for low-shot learning. In particular, we sequentially leverage optimal dropout, selective attenuation, ℓ2 normalization, and model-level optimization to enhance the standard Softmax objective function and produce a more compact vectorized representation for low-shot learning. Comprehensive evaluations on the MNIST, Labeled Faces in the Wild (LFW), and the challenging MS-Celeb-1M Low-Shot Learning Face Recognition benchmark datasets clearly demonstrate the superiority of our proposed method over the state of the art. By further introducing a heuristic voting strategy for robust multi-view combination, our proposed method won the Top-1 place in the MS-Celeb-1M Low-Shot Learning Challenge.
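The ℓ2 normalization step is easy to illustrate: embeddings are projected onto the unit hypersphere before the Softmax classifier, which keeps gallery features compact around each identity's mean. The PyTorch sketch below shows that single step only; the scale parameter `s` and layer shapes are assumptions, and the paper's optimal dropout, selective attenuation, and model-level optimization are omitted.

```python
import torch.nn as nn
import torch.nn.functional as F

class L2SoftmaxHead(nn.Module):
    """Minimal sketch: l2-normalize embeddings before a softmax classifier.

    The scale hyperparameter `s` is an assumption; the paper additionally
    uses optimal dropout, selective attenuation, and model-level
    optimization, which are not reproduced here.
    """
    def __init__(self, feat_dim, num_classes, s=16.0):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes, bias=False)
        self.s = s

    def forward(self, x):
        x = F.normalize(x, p=2, dim=1)  # compact, unit-norm representation
        return self.s * self.fc(x)      # scaled logits for cross-entropy

# For low-shot gallery enrollment, the normalized embedding itself serves
# as the gallery basis, staying close to the identity's mean face.
```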
Citations: 42
Deep Learning Anthropomorphic 3D Point Clouds from a Single Depth Map Camera Viewpoint
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.87
Nolan Lunscher, J. Zelek
In footwear, fit is highly dependent on foot shape, which is not fully captured by shoe size. Scanners can be used to acquire better sizing information and allow for more personalized footwear matching; however, when scanning an object, many images are usually needed for reconstruction. Semantics, such as knowing the kind of object in view, can be leveraged to determine the full 3D shape given only one input view. Deep learning methods have been shown to be able to reconstruct 3D shape from limited inputs for highly symmetrical objects such as furniture and vehicles. We apply a deep learning approach to the domain of foot scanning and present a method to reconstruct a 3D point cloud from a single input depth map. Anthropomorphic body parts can be challenging due to their irregular shapes, difficulty of parameterization, and limited symmetries. We train a view-synthesis-based network and show that our method can produce foot scans with an accuracy of 1.55 mm from a single input depth map.
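Once a network predicts a depth map, converting it into a 3D point cloud is standard pinhole backprojection. A minimal numpy sketch follows; the intrinsics in the usage line are placeholder values, not the scanner calibration from the paper.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Backproject a depth map (H x W, metres) into an N x 3 point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # drop invalid (zero-depth) pixels

# Placeholder intrinsics for illustration only.
cloud = depth_to_point_cloud(np.random.rand(480, 640),
                             fx=525.0, fy=525.0, cx=319.5, cy=239.5)
```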
Citations: 5
SymmSLIC: Symmetry Aware Superpixel Segmentation
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.208
R. Nagar, S. Raman
Over-segmentation of an image into superpixels has become a useful tool for solving various problems in computer vision. Reflection symmetry is quite prevalent in both natural and man-made objects. Existing algorithms for estimating superpixels do not preserve the reflection symmetry of an object, which leads to different sizes and shapes of superpixels across the symmetry axis. In this work, we propose an algorithm to over-segment an image through the propagation of reflection symmetry, evident at the pixel level, to superpixel boundaries. To achieve this goal, we exploit the detection of a set of pairs of pixels which are mirror reflections of each other. We partition the image into superpixels while preserving this reflection symmetry information through an iterative algorithm. We compare the proposed method with state-of-the-art superpixel generation methods and show its effectiveness in preserving the size and shape of superpixel boundaries across the reflection symmetry axes. We also present an application, unsupervised symmetric object segmentation, to illustrate the effectiveness of the proposed approach.
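To make the pairing step concrete: given a symmetry axis, the mirror of each pixel is its reflection about that axis. The numpy sketch below assumes an explicit axis parameterized by a point `p` and unit normal `n` for illustration; the paper instead detects mirror pairs directly by matching.

```python
import numpy as np

def mirror_pairs(coords, p, n):
    """Reflect pixel coordinates about the line through point `p`
    with unit normal `n`; returns the mirrored coordinates.

    coords: N x 2 array of (row, col) positions.
    """
    n = n / np.linalg.norm(n)
    d = (coords - p) @ n                  # signed distance to the axis
    return coords - 2.0 * d[:, None] * n  # reflected positions

# Pixels and their mirrors can then be encouraged to share superpixel
# assignments during the SLIC-style iterative clustering.
```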
Citations: 6
Multi-view 6D Object Pose Estimation and Camera Motion Planning Using RGBD Images
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.260
Juil Sock, S. Kasaei, L. Lopes, Tae-Kyun Kim
Recovering object pose in a crowd is a challenging task due to severe occlusions and clutter. In an active scenario, whenever an observer fails to recover the poses of objects from the current viewpoint, it can determine the next view position and capture a new scene from another viewpoint to improve its knowledge of the environment, which may reduce the 6D pose estimation uncertainty. We propose a complete active multi-view framework to recognize the 6DOF pose of multiple object instances in a crowded scene. We include several components in the active vision setting to increase accuracy: hypothesis accumulation and verification combines single-shot hypotheses estimated from previous views and extracts the most likely set of hypotheses; an entropy-based Next-Best-View prediction generates the next camera position to capture new data and increase performance; and camera motion planning plans the trajectory of the camera based on the view entropy and the cost of movement. Different approaches for each component are implemented and evaluated to show the increase in performance.
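As a rough illustration of the Next-Best-View component, each candidate view can be scored by the entropy of the pose hypotheses it is predicted to leave, plus a weighted movement cost. The sketch below is a generic formulation under those assumptions; `hypothesis_weights` and `move_cost` are hypothetical stand-ins for the paper's accumulation and planning modules.

```python
import numpy as np

def view_entropy(weights):
    """Shannon entropy of normalized pose-hypothesis weights for one view."""
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()
    return -np.sum(p * np.log(p + 1e-12))

def next_best_view(candidates, hypothesis_weights, current_pose,
                   move_cost, alpha=0.5):
    """Pick the candidate view minimizing predicted entropy plus travel cost.

    `hypothesis_weights[v]` holds the predicted hypothesis weights if view v
    were captured; `move_cost(a, b)` is the trajectory cost between poses.
    Both are illustrative assumptions, not the paper's learned components.
    """
    scores = [view_entropy(hypothesis_weights[v])
              + alpha * move_cost(current_pose, v)
              for v in candidates]
    return candidates[int(np.argmin(scores))]
```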
Citations: 49
Doppelganger Mining for Face Representation Learning
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.226
Evgeny Smirnov, A. Melnikov, Sergey Novoselov, Eugene Luckyanets, G. Lavrentyeva
In this paper we present Doppelganger mining, a method to learn better face representations. The main idea of this method is to maintain a list of the most similar identities for each identity in the training set. This list is used to generate better mini-batches by sampling pairs of similar-looking identities ("doppelgangers") together. It is especially useful for methods based on exemplar-based supervision. Usually, hard example mining comes at the price of having to use large mini-batches or substantial extra computation and memory, particularly for datasets with large numbers of identities. Our method needs only negligible extra computation and memory. In our experiments on a benchmark dataset with 21,000 persons, we show that Doppelganger mining, inserted into the face representation learning process with joint prototype-based and exemplar-based supervision, significantly improves the discriminative power of learned face representations.
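The mini-batch construction itself is compact enough to sketch: keep, per identity, a short list of its most similar identities, and sample each identity together with one of its doppelgangers. The class below is an illustrative assumption of how such a miner could be organized, not the authors' implementation.

```python
import random
from collections import defaultdict

class DoppelgangerMiner:
    """Maintain a most-similar-identities list and sample pairs into batches."""

    def __init__(self, num_ids, list_size=1):
        self.similar = defaultdict(list)  # identity -> its doppelgangers
        self.num_ids = num_ids
        self.list_size = list_size

    def update(self, identity, ranked_ids):
        # Called during training with identities ranked by current similarity.
        self.similar[identity] = ranked_ids[:self.list_size]

    def sample_batch(self, batch_ids):
        batch = []
        for i in batch_ids:
            batch.append(i)
            if self.similar[i]:                  # pair with a doppelganger
                batch.append(random.choice(self.similar[i]))
            else:                                # cold start: random identity
                batch.append(random.randrange(self.num_ids))
        return batch
```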
Citations: 42
Class-Specific Reconstruction Transfer Learning via Sparse Low-Rank Constraint
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.116
Shanshan Wang, Lei Zhang, W. Zuo
Subspace learning and reconstruction have been widely explored in recent transfer learning work, and generally a specially designed projection and reconstruction transfer matrix are required. However, existing subspace reconstruction based algorithms neglect the class prior, so the learned transfer function is biased, especially when data scarcity of some class is encountered. Different from previous methods, in this paper we propose a novel reconstruction-based transfer learning method called Class-specific Reconstruction Transfer Learning (CRTL), which optimizes a well-designed transfer loss function without class bias. A class-specific reconstruction matrix aligns the source domain with the target domain, which aids classification through class prior modeling. Furthermore, to keep the intrinsic relationship between data and labels after feature augmentation, a projected Hilbert-Schmidt Independence Criterion (pHSIC), which measures the dependency between two sets, is first proposed by mapping the data from the original space to an RKHS in transfer learning. In addition, by combining low-rank and sparse constraints on the class-specific reconstruction coefficient matrix, the global and local data structures can be effectively preserved. Extensive experiments demonstrate that the proposed method outperforms conventional representation-based domain adaptation methods.
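For concreteness, a plausible form of the class-specific reconstruction objective implied by the abstract (with the pHSIC term and regularization weights simplified away) is

$$\min_{Z}\ \|X_t - X_s Z\|_F^2 + \lambda_1 \|Z\|_{*} + \lambda_2 \|Z\|_1, \qquad Z = \operatorname{blkdiag}(Z_1,\dots,Z_C),$$

where $X_s$ and $X_t$ are source and target features, $\|\cdot\|_{*}$ is the nuclear norm enforcing the low-rank constraint, $\|\cdot\|_1$ is the sparsity term, and the block-diagonal structure restricts each $Z_c$ to reconstruct target samples from source samples of class $c$. This is one reading of the abstract, not the paper's exact formulation.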
Citations: 7
Variational Robust Subspace Clustering with Mean Update Algorithm
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.212
Sergej Dogadov, A. Masegosa, Shinichi Nakajima
In this paper, we propose an efficient variational Bayesian (VB) solver for a robust variant of low-rank subspace clustering (LRSC). VB learning offers automatic model selection without parameter tuning. However, it is typically performed by local search with update rules derived from conditional conjugacy, and is therefore prone to the local minima problem. Instead, we use an approximate global solver for LRSC with an element-wise sparse term to make it robust against spiky noise. In experiments, our method (a mean update solver for robust LRSC) outperforms the original LRSC, as well as the robust LRSC with the standard VB solver.
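Low-rank-plus-sparse models of this kind rest on two standard shrinkage operators, sketched below in numpy. This is a generic proximal-step illustration of the element-wise sparse term and the low-rank term, not the paper's variational mean-update solver.

```python
import numpy as np

def soft_threshold(M, tau):
    """Element-wise shrinkage: prox of tau*||M||_1 (sparse/spiky-noise term)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def svd_shrink(M, tau):
    """Singular-value shrinkage: prox of tau*||M||_* (low-rank term)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# A generic alternating scheme decomposes data X as X ~ L + S by repeatedly
# applying svd_shrink to X - S and soft_threshold to X - L.
```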
Citations: 0
Dual Structured Convolutional Neural Network with Feature Augmentation for Quantitative Characterization of Tissue Histology
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.10
Mira Valkonen, K. Kartasalo, Kaisa Liimatainen, M. Nykter, Leena Latonen, P. Ruusuvuori
We present a dual convolutional neural network (dCNN) architecture for extracting multi-scale features from histological tissue images for the purpose of automated characterization of tissue in digital pathology. The dual structure consists of two identical convolutional neural networks applied to input images at different scales, which are merged together and stacked with two fully connected layers. It has been acknowledged that deep networks can be used to extract higher-order features; therefore, the network output at the final fully connected layer was used as a deep dCNN feature vector. Further, engineered features, shown in previous studies to capture important characteristics of tissue structure and morphology, were integrated into the feature extractor module. The acquired quantitative feature representation can be further utilized to train a discriminative model for classifying tissue types. Machine learning based methods for detection of regions of interest, or tissue type classification, will advance the transition to decision support systems and computer aided diagnosis in digital pathology. Here we apply the proposed feature-augmented dCNN method with supervised learning to detecting cancerous tissue in whole slide images. The extracted quantitative representation of tissue histology was used to train a logistic regression model with elastic net regularization. The model was able to accurately discriminate cancerous tissue from normal tissue, resulting in a blockwise AUC of 0.97, where the total number of analyzed tissue blocks was approximately 8.3 million, constituting the test set of 75 whole slide images.
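A minimal PyTorch sketch of the dual structure — two identical convolutional branches on a fine and a coarse scale, concatenated and topped with two fully connected layers — is given below. All layer widths and the two-class output are illustrative assumptions, and the engineered-feature augmentation is omitted.

```python
import torch
import torch.nn as nn

class DualCNN(nn.Module):
    """Sketch of a dual-branch CNN: identical conv stacks on two input
    scales, concatenated and passed through two fully connected layers.
    All layer widths are illustrative assumptions."""

    def __init__(self, num_classes=2):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(4), nn.Flatten())
        self.fine, self.coarse = branch(), branch()
        # Two fully connected layers; the final-layer output plays the role
        # of the deep dCNN feature vector described in the abstract.
        self.fc = nn.Sequential(
            nn.Linear(2 * 64 * 4 * 4, 256), nn.ReLU(),
            nn.Linear(256, num_classes))

    def forward(self, x_fine, x_coarse):
        z = torch.cat([self.fine(x_fine), self.coarse(x_coarse)], dim=1)
        return self.fc(z)
```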
Citations: 14
Automated Stem Angle Determination for Temporal Plant Phenotyping Analysis
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.237
S. D. Choudhury, Saptarsi Goswami, Srinidhi Bashyam, T. Awada, A. Samal
Image-based plant phenotyping analysis refers to the monitoring and quantification of phenotypic traits by analyzing images of plants captured by different types of cameras at regular intervals in a controlled environment. Extracting meaningful phenotypes for temporal phenotyping analysis by considering individual parts of a plant, e.g., leaves and stem, using computer-vision-based techniques remains a critical bottleneck due to the constantly increasing complexity of plant architecture and variations in self-occlusion and phyllotaxy. This paper introduces an algorithm to compute the stem angle, a potential measure of a plant's susceptibility to lodging, i.e., the bending of the plant's stem. Annual yield losses due to stem lodging in the U.S. range between 5% and 25%. In addition to outright yield losses, grain quality may also decline as a result of stem lodging. The algorithm to compute the stem angle involves the identification of leaf-tips and leaf-junctions based on a graph theoretic approach. The efficacy of the proposed method is demonstrated through experimental analysis on a publicly available dataset called Panicoid Phenomap-1. A time-series clustering analysis is also performed on the stem angle values over a significant time interval during the vegetative stage of the maize plants' life cycle. This analysis effectively summarizes the temporal patterns of the stem angles into three main groups, which provides further insight into the genotype-specific behavior of the plants. A comparison of genotypic purity using time series analysis establishes that the temporal variation of the stem angles is likely regulated by genetic variation under similar environmental conditions.
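As an illustration of the final measurement: once the stem pixels between the detected leaf-junctions are isolated, the stem angle reduces to the inclination of a fitted line relative to the vertical. The least-squares fit in the numpy sketch below is an assumed simplification of the paper's graph-theoretic pipeline.

```python
import numpy as np

def stem_angle_degrees(stem_pixels):
    """Angle of the stem from vertical, in degrees.

    stem_pixels: N x 2 array of (row, col) image coordinates along the stem.
    The least-squares line fit is an illustrative choice; the paper locates
    the stem via leaf-tips and leaf-junctions in a graph-theoretic way.
    """
    rows = stem_pixels[:, 0].astype(float)
    cols = stem_pixels[:, 1].astype(float)
    slope, _ = np.polyfit(rows, cols, 1)  # horizontal drift per vertical unit
    return np.degrees(np.arctan(slope))   # 0 deg = perfectly vertical stem
```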
Citations: 22
LPSNet: A Novel Log Path Signature Feature Based Hand Gesture Recognition Framework
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.80
Chenyang Li, Xin Zhang, Lianwen Jin
Hand gesture recognition is gaining more attention because it is a natural and intuitive mode of human-computer interaction. It still faces great challenges in real-world applications due to gesture variance and individual differences. In this paper, we propose LPSNet, an end-to-end deep neural network based hand gesture recognition framework with novel log path signature features. We pioneer a robust feature, the path signature (PS), and its compressed version, the log path signature (LPS), to extract effective features of hand gestures. We also present a new method based on PS and LPS to effectively combine RGB and depth videos. Further, we propose a statistical method, DropFrame, to enlarge the data set and increase its diversity. By testing on a well-known public dataset, Sheffield Kinect Gesture (SKIG), our method achieves classification rates of 96.7% (using only RGB videos) and 98.7% (combining RGB and depth videos), the best results compared with state-of-the-art methods.
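The path signature itself is concrete enough to sketch. For a piecewise-linear trajectory, the first two signature levels follow from Chen's identity applied segment by segment, as below; the truncation at level 2 and the input format are illustrative choices. At this truncation, the level-2 log-signature coefficients are the antisymmetric part of S2.

```python
import numpy as np

def signature_level2(path):
    """Level-1 and level-2 path signature of a piecewise-linear path.

    path: T x d array of points (e.g. 2D hand-joint trajectories).
    Returns (S1, S2): S1 is the d-vector of total increments; S2 is the
    d x d matrix of second-order iterated integrals. The log signature
    used by LPSNet is the (truncated) logarithm of the signature tensor;
    only levels 1-2 are shown here for brevity.
    """
    increments = np.diff(path, axis=0)
    d = path.shape[1]
    S1 = np.zeros(d)
    S2 = np.zeros((d, d))
    for delta in increments:  # Chen's identity, one linear segment at a time
        S2 += np.outer(S1, delta) + 0.5 * np.outer(delta, delta)
        S1 += delta
    return S1, S2
```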
Citations: 34