
2011 International Conference on Computer Vision: Latest Publications

A selective spatio-temporal interest point detector for human action recognition in complex scenes
Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126443
Bhaskar Chakraborty, M. B. Holte, T. Moeslund, Jordi Gonzàlez, F. X. Roca
Recent progress in the field of human action recognition points towards the use of Spatio-Temporal Interest Points (STIPs) for local descriptor-based recognition strategies. In this paper we present a new approach for STIP detection by applying surround suppression combined with local and temporal constraints. Our method is significantly different from existing STIP detectors and improves the performance by detecting more repeatable, stable and distinctive STIPs for human actors, while suppressing unwanted background STIPs. For action representation we use a bag-of-visual-words (BoV) model of local N-jet features to build a vocabulary of visual words. To this end, we introduce a novel vocabulary building strategy by combining spatial pyramid and vocabulary compression techniques, resulting in improved performance and efficiency. Action class specific Support Vector Machine (SVM) classifiers are trained for categorization of human actions. A comprehensive set of experiments on existing benchmark datasets, and more challenging datasets of complex scenes, validate our approach and show state-of-the-art performance.
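To make the representation and classification stages concrete, the sketch below walks through a toy bag-of-visual-words pipeline with per-class linear SVMs. It is an illustration only: the descriptors are random stand-ins for the local N-jet features at detected STIPs, and the vocabulary size, clip counts and SVM settings are all assumptions rather than the authors' configuration.

```python
# A minimal, hypothetical BoV + per-class SVM sketch; the STIP detector and
# N-jet descriptors are replaced by random stand-ins.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Stand-in for local descriptors extracted at detected STIPs:
# one (num_points, descriptor_dim) array per video clip.
clips = [rng.normal(size=(rng.integers(40, 80), 32)) for _ in range(20)]
labels = rng.integers(0, 3, size=len(clips))  # 3 hypothetical action classes

# Build the visual vocabulary by clustering all descriptors.
vocab_size = 16
kmeans = KMeans(n_clusters=vocab_size, n_init=4, random_state=0)
kmeans.fit(np.vstack(clips))

def bov_histogram(descriptors):
    """Quantize descriptors against the vocabulary and L1-normalize."""
    words = kmeans.predict(descriptors)
    hist = np.bincount(words, minlength=vocab_size).astype(float)
    return hist / max(hist.sum(), 1.0)

X = np.array([bov_histogram(c) for c in clips])

# One-vs-rest linear SVMs, one per action class.
svm = LinearSVC(C=1.0).fit(X, labels)
print("training accuracy:", svm.score(X, labels))
```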
Citations: 49
Diagonal preconditioning for first order primal-dual algorithms in convex optimization
Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126441
T. Pock, A. Chambolle
In this paper we study preconditioning techniques for the first-order primal-dual algorithm proposed in [5]. In particular, we propose simple and easy-to-compute diagonal preconditioners for which convergence of the algorithm is guaranteed without the need to compute any step size parameters. As a by-product, we show that for a certain instance of the preconditioning, the proposed algorithm is equivalent to the old and widely unknown alternating step method for monotropic programming [7]. We show numerical results on general linear programming problems and a few standard computer vision problems. In all examples, the preconditioned algorithm significantly outperforms the algorithm of [5].
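As an illustration, the sketch below runs a diagonally preconditioned primal-dual iteration on a toy problem, min_x ||Kx - b||_1 + (lam/2)||x||^2. The step sizes are the reciprocal absolute row/column sums of K (the paper's alpha = 1 choice); the problem instance, proximal operators and iteration count are assumptions made for the example.

```python
# Diagonally preconditioned first-order primal-dual iteration on a toy
# problem: min_x ||Kx - b||_1 + (lam/2)||x||^2. No global step size or
# operator-norm estimate of K is needed.
import numpy as np

rng = np.random.default_rng(0)
m, n, lam = 30, 20, 0.1
K = rng.normal(size=(m, n))
b = rng.normal(size=m)

# Diagonal preconditioners (alpha = 1 case).
tau = 1.0 / np.abs(K).sum(axis=0)      # per-primal-variable steps
sigma = 1.0 / np.abs(K).sum(axis=1)    # per-dual-variable steps

x = np.zeros(n); x_bar = x.copy(); y = np.zeros(m)
for _ in range(500):
    # Dual step: prox of sigma * F* for F(z) = ||z - b||_1 is a clip.
    y = np.clip(y + sigma * (K @ x_bar - b), -1.0, 1.0)
    # Primal step: prox of tau * G for G(x) = (lam/2)||x||^2 is a shrink.
    x_new = (x - tau * (K.T @ y)) / (1.0 + tau * lam)
    x_bar = 2 * x_new - x              # over-relaxation
    x = x_new

print("objective:", np.abs(K @ x - b).sum() + 0.5 * lam * (x @ x))
```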
Citations: 461
A dimensionality result for multiple homography matrices
Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126485
W. Chojnacki, A. Hengel
It is shown that the set of all I-element collections of interdependent homography matrices describing homographies induced by I planes in the 3D scene between two views has dimension 4I + 7. This improves on an earlier result which gave an upper bound for the dimension in question, and solves a long-standing open problem. The significance of the present result lies in that it is critical to the identification of the full set of constraints to which collections of interdependent homography matrices are subject, which in turn is critical to the design of constrained optimisation techniques for estimating such collections from image data.
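The count 4I + 7 can be checked numerically. Writing each plane-induced homography in the standard two-view form H_i = w_i(A + b v_i^T), the collection map has 12 + 4I parameters and a 5-dimensional redundancy (trading scale between b and the v_i, shifting A by b u^T against the v_i, and one overall scale absorbed by the w_i), so its Jacobian at a generic point should have rank 4I + 7. The finite-difference check below is our illustration, not the authors' argument.

```python
# Numerical check: rank of the Jacobian of the parameterization
# H_i = w_i * (A + b v_i^T) at a generic point (an illustrative test,
# not the authors' proof).
import numpy as np

rng = np.random.default_rng(1)
I = 4  # number of planes

def collection(theta):
    """Map (A, b, v_1..v_I, w_1..w_I) to the stacked 9I-vector of entries."""
    A = theta[:9].reshape(3, 3)
    b = theta[9:12]
    out = []
    for i in range(I):
        v = theta[12 + 3 * i: 15 + 3 * i]
        w = theta[12 + 3 * I + i]
        out.append((w * (A + np.outer(b, v))).ravel())
    return np.concatenate(out)

theta0 = rng.normal(size=12 + 4 * I)
eps = 1e-6
# Central finite differences, one column per parameter.
J = np.stack([(collection(theta0 + eps * e) - collection(theta0 - eps * e)) / (2 * eps)
              for e in np.eye(theta0.size)], axis=1)

print("Jacobian rank:", np.linalg.matrix_rank(J, tol=1e-8),
      "expected 4I + 7 =", 4 * I + 7)
```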
Citations: 6
Level-set person segmentation and tracking with multi-region appearance models and top-down shape information
Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126455
Esther Horbert, Konstantinos Rematas, B. Leibe
In this paper, we address the problem of segmentation-based tracking of multiple articulated persons. We propose two improvements to current level-set tracking formulations. The first is a localized appearance model that uses additional level-sets in order to enforce a hierarchical subdivision of the object shape into multiple connected regions with distinct appearance models. The second is a novel mechanism to include detailed object shape information in the form of a per-pixel figure/ground probability map obtained from an object detection process. Both contributions are seamlessly integrated into the level-set framework. Together, they considerably improve the accuracy of the tracked segmentations. We experimentally evaluate our proposed approach on two challenging sequences and demonstrate its good performance in practice.
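For context, the sketch below runs a basic region-based level-set segmentation (morphological Chan-Vese from scikit-image) on a synthetic image. It only illustrates the level-set machinery the paper builds on; the multi-region appearance models and the per-pixel figure/ground shape prior are not reproduced, and the synthetic scene is an assumption.

```python
# A minimal region-based level-set segmentation sketch on a synthetic
# "person-like" bright blob; not the paper's multi-region formulation.
import numpy as np
from skimage.segmentation import morphological_chan_vese

rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:100, 0:100]
# Bright elliptical blob on a dark, noisy background.
image = ((yy - 50) ** 2 / 900 + (xx - 50) ** 2 / 400 < 1).astype(float)
image += 0.2 * rng.normal(size=image.shape)

# Evolve the implicit level set for a fixed number of iterations.
mask = morphological_chan_vese(image, 100, init_level_set="checkerboard",
                               smoothing=2)
print("segmented foreground pixels:", int(mask.sum()))
```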
Citations: 40
Salient object detection by composition
Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126348
J. Feng, Yichen Wei, Litian Tao, Chao Zhang, Jian Sun
Conventional saliency analysis methods measure the saliency of individual pixels. The resulting saliency map inevitably loses information in the original image, and finding salient objects in it is difficult. We propose to detect salient objects by directly measuring the saliency of an image window in the original image, and we adopt the well-established sliding-window object detection paradigm.
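As a rough illustration of window-level (rather than pixel-level) saliency scoring, the sketch below slides a window over a synthetic image and scores each window by how poorly its intensity histogram is explained by the rest of the image. This chi-squared scoring is a crude stand-in for the paper's composition cost, and all parameters are assumptions.

```python
# Sliding-window saliency: a window whose content is rare in the rest of
# the image scores as salient (a crude stand-in for a composition cost).
import numpy as np

def window_saliency(image, win=32, step=16, bins=16):
    hist_range = (image.min(), image.max())
    full, _ = np.histogram(image, bins=bins, range=hist_range)
    scores = []
    for y in range(0, image.shape[0] - win + 1, step):
        for x in range(0, image.shape[1] - win + 1, step):
            patch = image[y:y + win, x:x + win]
            h, _ = np.histogram(patch, bins=bins, range=hist_range)
            surround = np.maximum(full - h, 0)      # image minus the window
            p = h / max(h.sum(), 1)
            q = surround / max(surround.sum(), 1)
            chi2 = 0.5 * np.sum((p - q) ** 2 / (p + q + 1e-9))
            scores.append(((y, x), chi2))
    return max(scores, key=lambda s: s[1])

rng = np.random.default_rng(0)
img = rng.normal(0, 0.1, size=(128, 128))
img[40:72, 60:92] += 1.0  # a bright, distinctive object
print("most salient window at:", window_saliency(img)[0])
```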
Citations: 166
The power of comparative reasoning
Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126527
J. Yagnik, Dennis W. Strelow, David A. Ross, Ruei-Sung Lin
Rank correlation measures are known for their resilience to perturbations in numeric values and are widely used in many evaluation metrics. Such ordinal measures have rarely been applied to the treatment of numeric features as a representational transformation. We emphasize the benefits of ordinal representations of input features both theoretically and empirically. We present a family of algorithms for computing ordinal embeddings based on partial order statistics. Apart from having the stability benefits of ordinal measures, these embeddings are highly nonlinear, giving rise to sparse feature spaces highly favored by several machine learning methods. These embeddings are deterministic, data independent and, by virtue of being based on partial order statistics, add another degree of resilience to noise. These machine-learning-free methods, when applied to the task of fast similarity search, outperform state-of-the-art machine learning methods with complex optimization setups. For solving classification problems, the embeddings provide a nonlinear transformation resulting in sparse binary codes that are well suited for a large class of machine learning algorithms. These methods show significant improvement on VOC 2010 using simple linear classifiers which can be trained quickly. Our method can be extended to the case of polynomial kernels, while permitting very efficient computation. Further, since the popular Min Hash algorithm is a special case of our method, we demonstrate an efficient scheme for computing Min Hash on conjunctions of binary features. The actual method can be implemented in about 10 lines of code in most languages (2 lines in MATLAB), and does not require any data-driven optimization.
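A minimal sketch of one such ordinal embedding, in a winner-take-all style consistent with the abstract: each output code is the argmax position among k entries of a fixed random permutation of the input, so the code depends only on partial rank order and is deterministic once the permutations are fixed. The parameter choices below are assumptions.

```python
# Winner-take-all style ordinal codes: deterministic given fixed
# permutations, and invariant to any monotone rescaling of the features.
import numpy as np

def ordinal_embedding(x, num_codes=128, k=4, seed=0):
    """Each code: argmax position among k fixed randomly chosen entries."""
    rng = np.random.default_rng(seed)
    perms = [rng.permutation(x.size)[:k] for _ in range(num_codes)]
    return np.array([int(np.argmax(x[p])) for p in perms])

x = np.array([0.3, 1.2, -0.5, 2.0, 0.9, -1.1])
y = 3.0 * x + 0.05   # a monotone perturbation preserves rank order
cx, cy = ordinal_embedding(x), ordinal_embedding(y)
print("matching codes:", int((cx == cy).sum()), "of", cx.size)
```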
Citations: 129
Multi-view 3D reconstruction for scenes under the refractive plane with known vertical direction
Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126262
Yao-Jen Chang, Tsuhan Chen
Images taken of scenes under water suffer distortion due to refraction. While refraction causes magnification with mild distortion in the observed images, severe distortions in geometry reconstruction would result if the refractive distortion is not properly handled. Different from the radial distortion model, the refractive distortion depends on the scene depth seen along each light ray as well as the camera pose relative to the refractive surface. Therefore, it is crucial to obtain a good estimate of scene depth, camera pose and optical center to alleviate the impact of refractive distortion. In this work, we formulate the forward and back projections of light rays involving a refractive plane for the perspective camera model by explicitly modeling refractive distortion as a function of depth. Furthermore, for cameras with an inertial measurement unit (IMU), we show that a linear solution to the relative pose and a closed-form solution to the absolute pose can be derived with known camera vertical directions. We incorporate our formulations into the general structure-from-motion framework, followed by the patch-based multiview stereo algorithm, to obtain a 3D reconstruction of the scene. We show through experiments that the explicit modeling of depth-dependent refractive distortion leads to physically more accurate scene reconstructions.
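The geometric ingredient is easy to illustrate in isolation: back-projecting a camera ray through a flat refractive interface amounts to intersecting the ray with the plane and bending it by Snell's law in vector form. The sketch below does exactly that for an air-to-water setup; the full forward projection and the SfM/multiview-stereo pipeline are not reproduced, and the scene values are assumptions.

```python
# Back-projection through a flat refractive interface at z = 0 using the
# vector form of Snell's law (illustrative scene values).
import numpy as np

def refract(d, n, eta):
    """Refract unit direction d at unit normal n, with eta = n1 / n2."""
    cos_i = -np.dot(n, d)
    sin2_t = eta ** 2 * (1.0 - cos_i ** 2)
    if sin2_t > 1.0:
        raise ValueError("total internal reflection")
    cos_t = np.sqrt(1.0 - sin2_t)
    return eta * d + (eta * cos_i - cos_t) * n

camera = np.array([0.0, 0.0, 1.0])            # camera 1 m above the water
d = np.array([0.2, 0.0, -1.0])
d /= np.linalg.norm(d)                        # pixel ray, pointing down
normal = np.array([0.0, 0.0, 1.0])            # interface normal, pointing up

s = -camera[2] / d[2]                         # ray-plane intersection
hit = camera + s * d
d_w = refract(d, normal, 1.0 / 1.33)          # bend the ray: air -> water
point = hit + ((-0.5 - hit[2]) / d_w[2]) * d_w  # continue to depth 0.5 m
print("interface hit:", hit, "underwater point:", point)
```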
Citations: 66
What characterizes a shadow boundary under the sun and sky?
Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126331
Xiang Huang, G. Hua, J. Tumblin, Lance Williams
Despite decades of study, robust shadow detection remains difficult, especially within a single color image. We describe a new approach to detect shadow boundaries in images of outdoor scenes lit only by the sun and sky. The method first extracts visual features of candidate edges that are motivated by physical models of illumination and occluders. We feed these features into a Support Vector Machine (SVM) that was trained to discriminate between most-likely shadow-edge candidates and less-likely ones. Finally, we connect edges to help reject non-shadow edge candidates, and to encourage closed, connected shadow boundaries. On benchmark shadow-edge data sets from Lalonde et al. and Zhu et al., our method showed substantial improvements when compared to other recent shadow-detection methods based on statistical learning.
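The classification stage can be sketched with synthetic features: shadow edges dim all color channels by a similar amount (with a slight blue bias, since the sky still illuminates the shadow), while material edges change color freely. The feature definition and data below are illustrative assumptions, not the authors' feature set.

```python
# Toy shadow-edge vs. material-edge SVM on synthetic log-ratio features
# (per-channel log intensity ratio across each candidate edge).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 400

# Shadow edges: similar drop in R and G, smaller drop in B (sky fill-in).
shadow = np.column_stack([rng.normal(-1.0, 0.2, n),
                          rng.normal(-1.0, 0.2, n),
                          rng.normal(-0.7, 0.2, n)])
# Material edges: unconstrained per-channel change.
material = rng.normal(0.0, 1.0, size=(n, 3))

X = np.vstack([shadow, material])
y = np.concatenate([np.ones(n), np.zeros(n)])

clf = SVC(kernel="rbf", C=1.0)
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```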
Citations: 78
Video Primal Sketch: A generic middle-level representation of video
Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126380
Zhi Han, Zongben Xu, Song-Chun Zhu
This paper presents a middle-level video representation named Video Primal Sketch (VPS), which integrates two regimes of models: i) a sparse coding model using static or moving primitives to explicitly represent moving corners, lines, feature points, etc.; ii) a FRAME/MRF model with spatio-temporal filters to implicitly represent textured motion, such as water and fire, by matching feature statistics, i.e. histograms. This paper makes three contributions: i) learning a dictionary of video primitives as a parametric generative model; ii) studying the Spatio-Temporal FRAME (ST-FRAME) model for modeling and synthesizing textured motion; and iii) developing a parsimonious hybrid model for generic video representation. VPS selects the proper representation automatically and is compatible with high-level action representations. In the experiments, we synthesize a series of dynamic textures, reconstruct real videos, and show how the VPS varies with the changes in density caused by scale transitions in videos.
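The explicit regime can be illustrated with a toy sparse coder: a signal is represented over a dictionary of primitives by greedy matching pursuit. The dictionary, signal and sparsity level below are assumptions, and the implicit FRAME-style texture model is not reproduced.

```python
# Toy sparse coding over a primitive dictionary via matching pursuit
# (illustrates the explicit regime only).
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms=3):
    """Greedy sparse coding: pick the best-correlated atom, subtract, repeat."""
    residual = signal.copy()
    code = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        corr = dictionary.T @ residual
        j = int(np.argmax(np.abs(corr)))
        code[j] += corr[j]
        residual -= corr[j] * dictionary[:, j]
    return code, residual

rng = np.random.default_rng(0)
D = rng.normal(size=(64, 32))
D /= np.linalg.norm(D, axis=0)          # unit-norm primitive dictionary
x = 2.0 * D[:, 5] - 1.5 * D[:, 17]      # a signal built from two primitives
code, residual = matching_pursuit(x, D, n_atoms=2)
print("active atoms:", np.nonzero(code)[0],
      "residual norm:", np.linalg.norm(residual))
```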
Citations: 12
Inferring human gaze from appearance via adaptive linear regression
Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126237
Feng Lu, Yusuke Sugano, Takahiro Okabe, Yoichi Sato
The problem of estimating human gaze from eye appearance can be regarded as mapping high-dimensional features to a low-dimensional target space. Conventional methods require densely sampled training data on the eye appearance manifold, which results in a tedious calibration stage. In this paper, we introduce an adaptive linear regression (ALR) method for accurate mapping via sparsely collected training samples. The key idea is to adaptively find the subset of training samples in which the test sample is most linearly representable. We solve the problem via l1-optimization and thoroughly study the key issues in seeking the best solution for regression. The proposed gaze estimation approach based on ALR is naturally sparse and low-dimensional, giving the ability to infer human gaze from variable-resolution eye images using far fewer training samples than existing methods. In particular, the optimization procedure in ALR is extended to simultaneously solve the subpixel alignment problem for low-resolution test eye images. The performance of the proposed method is evaluated by extensive experiments against various factors, such as the number of training samples, feature dimensionality and eye image resolution, to verify its effectiveness.
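The core idea can be sketched in a few lines: express the test appearance feature as a sparse combination of training features via an l1-regularized fit, then apply the same weights to the training gaze targets. Lasso is used here as a convenient stand-in for the paper's l1 program, and the synthetic linear appearance model is an assumption.

```python
# Sparse-representation gaze regression sketch: weights found in appearance
# space transfer to gaze space (synthetic linear model for illustration).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_train, feat_dim = 30, 50

gaze = rng.uniform(-1, 1, size=(n_train, 2))            # 2D gaze targets
W = rng.normal(size=(feat_dim, 2))                      # toy appearance model
features = gaze @ W.T + 0.01 * rng.normal(size=(n_train, feat_dim))

test_gaze = np.array([0.2, -0.4])
test_feat = W @ test_gaze + 0.01 * rng.normal(size=feat_dim)

# Sparse weights w such that test_feat ~ sum_i w_i * features[i].
lasso = Lasso(alpha=1e-3, fit_intercept=False, max_iter=10000)
lasso.fit(features.T, test_feat)

# The same weights map the training gaze targets to a prediction.
pred = lasso.coef_ @ gaze
print("predicted gaze:", pred, "true gaze:", test_gaze)
```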
Citations: 133