Proceedings Ninth IEEE International Conference on Computer Vision: Latest Publications

Weighted and robust incremental method for subspace learning
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238667
D. Skočaj, A. Leonardis
Visual learning is expected to be a continuous and robust process that treats input images and pixels selectively. In this paper, we present a subspace learning method that takes these considerations into account: an incremental method that sequentially updates the principal subspace while weighting the influence of individual images as well as individual pixels within an image. This approach is further extended to detect inconsistencies in the input data and to impute the values of inconsistent pixels using previously acquired knowledge, resulting in a novel incremental, weighted, and robust method for subspace learning.
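A minimal numpy sketch of this style of sequential update: one image x with a scalar weight w is folded into the current subspace (U, S) and mean via a small (k+1)-dimensional SVD. The function name, the single per-image weight, and the truncation rule are our own illustrative choices; the paper additionally weights individual pixels and imputes inconsistent ones, which this sketch omits.

```python
import numpy as np

def update_subspace(U, S, mean, n, x, w=1.0, k_max=20, eps=1e-10):
    """One weighted incremental PCA step: fold image x (flattened, weight w)
    into the subspace basis U (d x k), singular values S (k,), and mean,
    where n is the accumulated weight of the images absorbed so far."""
    mean_new = (n * mean + w * x) / (n + w)   # weighted mean update
    d = x - mean_new
    a = U.T @ d                               # coefficients in the current basis
    r = d - U @ a                             # residual orthogonal to the basis
    rn = np.linalg.norm(r)
    u_new = r / rn if rn > eps else np.zeros_like(r)
    # Small (k+1)x(k+1) SVD mixing the old spectrum with the weighted sample.
    k = S.size
    Q = np.zeros((k + 1, k + 1))
    Q[:k, :k] = np.diag(S)
    Q[:k, k] = np.sqrt(w) * a
    Q[k, k] = np.sqrt(w) * rn
    Uq, S_new, _ = np.linalg.svd(Q)
    U_new = np.column_stack([U, u_new]) @ Uq
    return U_new[:, :k_max], S_new[:k_max], mean_new, n + w
```

A robust variant in the paper's spirit would, before calling this update, compare x against its reconstruction from the current subspace and replace outlier pixels with the reconstructed values, so the imputation reuses previously acquired knowledge.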
Citations: 198
Geometric segmentation of perspective images based on symmetry groups
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238634
A. Yang, Shankar R. Rao, Kun Huang, Wei Hong, Yi Ma
Symmetry is an effective geometric cue for facilitating conventional segmentation techniques on images of man-made environments. Based on three fundamental principles that summarize the relations between symmetry and perspective imaging, namely structure from symmetry, symmetry hypothesis testing, and global symmetry testing, we develop a prototype system that automatically segments symmetric objects in space from single 2D perspective images. The result of such a segmentation is a hierarchy of geometric primitives, called symmetry cells and complexes, whose 3D structure and pose are fully recovered. Such a geometrically meaningful segmentation may greatly facilitate applications such as feature matching and robot navigation.
Citations: 20
Cumulative residual entropy, a new measure of information & its application to image alignment
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238395
Fei Wang, B. Vemuri, M. Rao, Yunmei Chen
We use the cumulative distribution of a random variable to define its information content and use it to develop a novel measure of information that parallels Shannon entropy, which we dub cumulative residual entropy (CRE). The key features of CRE may be summarized as follows: (1) its definition is valid in both the continuous and discrete domains, (2) it is mathematically more general than Shannon entropy, and (3) it is easily computed from sample data, and these computations converge asymptotically to the true values. We define the cross-CRE (CCRE) between two random variables and apply it to solve the uni- and multimodal image alignment problem for parameterized (rigid, affine, and projective) transformations. The key strengths of CCRE over the now-popular mutual information method (based on Shannon entropy) are significantly greater noise immunity and a much larger convergence range over the field of parameterized transformations. These strengths of CCRE are demonstrated via experiments on synthesized and real image data.
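For a nonnegative scalar random variable, the definition the abstract refers to replaces the density in Shannon's entropy with the survival function (our notation; the paper states it more generally):

$$\mathcal{E}(X) = -\int_0^{\infty} P(X > \lambda)\,\log P(X > \lambda)\,d\lambda.$$

Because the empirical survival function is piecewise constant, the sample estimate is a finite sum, which is what point (3) alludes to. A small sketch:

```python
import numpy as np

def cre(samples):
    """Empirical cumulative residual entropy of nonnegative samples:
    integrate -P(X > lambda) * log P(X > lambda) over the sorted-sample gaps,
    where the empirical survival function is constant."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    surv = 1.0 - np.arange(1, n) / n   # P(X > x_(i)) on [x_(i), x_(i+1))
    gaps = np.diff(x)
    mask = surv > 0                    # 0 * log 0 contributes nothing
    return float(-np.sum(surv[mask] * np.log(surv[mask]) * gaps[mask]))
```

As a sanity check, for uniform samples on [0, 1] this converges to $-\int_0^1 (1-\lambda)\log(1-\lambda)\,d\lambda = 1/4$.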
Citations: 58
Polarization-based inverse rendering from a single view
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238455
D. Miyazaki, R. Tan, K. Hara, K. Ikeuchi
This paper presents a method to estimate the geometrical, photometrical, and environmental information of a single-viewed object in one integrated framework, under a fixed viewing position and fixed illumination direction. These three types of information are important for rendering a photorealistic image of a real object. Photometrical information represents the texture and surface roughness of an object, while geometrical and environmental information represent the 3D shape of an object and the illumination distribution, respectively. The proposed method estimates the 3D shape by computing surface normals from polarization data, calculates the texture of the object from the diffuse-only reflection component, determines the illumination directions from the positions of the brightest intensities in the specular reflection component, and finally computes the surface roughness of the object using the estimated illumination distribution.
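The shape stage rests on the standard polarizer-rotation model: the intensity behind a linear polarizer at angle theta varies as I(theta) = a + b cos 2(theta) + c sin 2(theta), whose phase and relative amplitude constrain the azimuth and zenith of the surface normal. A generic per-pixel fit of that sinusoid, as a hedged sketch (array shapes and names are ours, and this is only one ingredient of the authors' pipeline):

```python
import numpy as np

def fit_polarization(I_stack, angles):
    """Fit I(theta) = a + b*cos(2 theta) + c*sin(2 theta) for every pixel.
    I_stack: (m, h, w) images taken at m known polarizer angles (radians).
    Returns the polarization phase phi and degree of polarization rho."""
    angles = np.asarray(angles, dtype=float)
    A = np.stack([np.ones_like(angles),
                  np.cos(2 * angles),
                  np.sin(2 * angles)], axis=1)
    coef, *_ = np.linalg.lstsq(A, I_stack.reshape(angles.size, -1), rcond=None)
    a, b, c = coef
    phi = 0.5 * np.arctan2(c, b)                 # phase: normal azimuth up to ambiguity
    rho = np.hypot(b, c) / np.maximum(a, 1e-6)   # (Imax - Imin) / (Imax + Imin)
    return phi.reshape(I_stack.shape[1:]), rho.reshape(I_stack.shape[1:])
```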
Citations: 197
Two-frame wide baseline matching
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238403
Jiangjian Xiao, M. Shah
We describe a novel approach to automatically recover corresponding feature points and epipolar geometry over two wide baseline frames. Our contributions consist of several aspects. First, an affine invariant feature, the edge-corner, is introduced to provide robust and consistent matching primitives. Second, based on the SVD of the affine matrix, the affine matching space between two corners can be approximately divided into two independent spaces by rotation angle and scaling factor. Employing this property, a two-stage affine matching algorithm is designed to obtain robust matches over two frames. Third, using the epipolar geometry estimated from these matches, more corresponding feature points are determined. Based on these robust correspondences, the fundamental matrix is refined and a series of virtual views of the scene is synthesized. Finally, several experiments illustrate that a number of robust correspondences can be stably determined for two wide baseline images under significant camera motion with illumination changes, occlusions, and self-similarities. After testing a number of examples and comparing with existing methods, the experimental results strongly demonstrate that our matching method outperforms state-of-the-art algorithms on all of the test cases.
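The decoupling in the second contribution can be read off an SVD/polar factorization: a 2x2 affine map A = U diag(s) Vt contributes an overall rotation U Vt plus the scaling s, so candidate matches can be binned by rotation angle first and by scale second. A small sketch of that split (the function and the reflection fix are our own; the paper's exact parameterization may differ):

```python
import numpy as np

def rotation_and_scale(A):
    """Split a 2x2 affine matrix into its polar rotation angle and its
    singular-value scales via the SVD A = U diag(s) Vt."""
    U, s, Vt = np.linalg.svd(A)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        # A contains a reflection; absorb it into one negative scale
        # so that U @ Vt is a proper rotation.
        U[:, -1] *= -1
        s[-1] *= -1
    R = U @ Vt
    theta = np.arctan2(R[1, 0], R[0, 0])
    return theta, s
```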
Citations: 95
Adaptive dynamic range imaging: optical control of pixel exposures over space and time
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238624
S. Nayar, Vlad Branzoi
This paper presents a new approach to imaging that significantly enhances the dynamic range of a camera. The key idea is to adapt the exposure of each pixel on the image detector, based on the radiance value of the corresponding scene point. This adaptation is done in the optical domain, that is, during image formation. In practice, this is achieved using a spatial light modulator whose transmittance can be varied with high resolution over space and time. A real-time control algorithm is developed that uses acquired images to automatically adjust the transmittance function of the spatial modulator. Each captured image and its corresponding transmittance function are used to compute a very high dynamic range image that is linear in scene radiance. We have implemented a video-rate adaptive dynamic range camera that consists of a color CCD detector and a controllable liquid crystal light modulator. Experiments have been conducted in scenarios with complex and harsh lighting conditions. The results indicate that adaptive imaging can have a significant impact on vision applications such as monitoring, tracking, recognition, and navigation.
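A toy version of the feedback loop, assuming a normalized image I in [0, 1] and a per-pixel transmittance map T aligned with it (the constants and the saturation rule are illustrative guesses, not the authors' calibrated controller):

```python
import numpy as np

def update_transmittance(T, I, I_des=0.45, I_sat=0.98, T_min=0.05):
    """One control step: estimate scene radiance as measured brightness over
    the current transmittance, then choose the transmittance that would map
    that radiance to a desired gray level. Saturated pixels are attenuated
    aggressively because their true radiance is unknown."""
    E = np.asarray(I, float) / np.maximum(T, T_min)          # radiance estimate
    T_new = np.where(I >= I_sat, T * 0.5, I_des / np.maximum(E, 1e-6))
    return np.clip(T_new, T_min, 1.0)
```

The same ratio I/T is the natural way to form the final output, consistent with the abstract's statement that each captured image and its transmittance function yield a high dynamic range image linear in scene radiance.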
Citations: 213
A new perspective [on] shape-from-shading
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238439
A. Tankus, N. Sochen, Y. Yeshurun
Shape-from-shading (SFS) is a fundamental problem in computer vision. The vast majority of research in this field has assumed orthography as the projection model. This paper reexamines the basis of SFS, the image irradiance equation, under an assumption of perspective projection. The paper also shows that the perspective image irradiance equation depends merely on the natural logarithm of the depth function (and not on the depth function itself), and as such is invariant to scale changes of the depth function. We then suggest a simple reconstruction algorithm based on the perspective formula and compare it to existing orthographic SFS algorithms. This simple algorithm obtained lower error rates than legacy SFS algorithms, and matched and sometimes surpassed state-of-the-art algorithms. These findings support the assumption that transitioning to a more realistic set of assumptions significantly improves reconstruction.
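The scale invariance can be made concrete (our notation, not necessarily the paper's exact formulation). Under perspective projection with focal length $f$, a depth map $z(x,y)$ corresponds to the surface $S(x,y) = (xz/f,\; yz/f,\; z)$, and computing $S_x \times S_y$ gives a normal direction

$$\mathbf{n} \;\propto\; \Big(-p,\; -q,\; \frac{1 + xp + yq}{f}\Big), \qquad p = \frac{\partial \ln z}{\partial x}, \quad q = \frac{\partial \ln z}{\partial y},$$

so a Lambertian image irradiance equation $I = \mathbf{n}\cdot\mathbf{L} / \lVert \mathbf{n} \rVert$ constrains the depth only through $\nabla \ln z$: multiplying $z$ by a constant leaves the image unchanged.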
Citations: 98
Unsupervised image translation
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238384
Rómer Rosales, Kannan Achan, B. Frey
An interesting and potentially useful vision/graphics task is to render an input image in an enhanced form, or in an unusual style, for example with increased sharpness or with some artistic quality. In previous work [10, 5], researchers showed that by estimating the mapping from an input image to a registered (aligned) image of the same scene in a different style or resolution, the mapping could be used to render a new input image in that style or resolution. Frequently a registered pair is not available; instead, the user may have only a source image of an unrelated scene that contains the desired style. In this case, inferring the output image is much more difficult, since the algorithm must both infer correspondences between features in the input image and the source image and infer the unknown mapping between the images. We describe a Bayesian technique for inferring the most likely output image. The prior on the output image, P(X), is a patch-based Markov random field obtained from the source image. The likelihood of the input, P(Y|X), is a Bayesian network that can represent different rendering styles. We describe a computationally efficient probabilistic inference and learning algorithm for inferring the most likely output image and learning the rendering style. We also show that current techniques for image restoration or reconstruction proposed in the vision literature (e.g., image super-resolution or de-noising) and image-based nonphotorealistic rendering can be seen as special cases of our model. We demonstrate our technique on several tasks, including rendering a photograph in the artistic style of an unrelated scene, de-noising, and texture transfer.
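Once the rendering likelihood and the patch prior are fixed, MAP inference reduces to a discrete labeling problem over candidate source patches. A deliberately simplified stand-in using iterated conditional modes (the cost matrices and the ICM choice are ours; the paper develops its own efficient probabilistic inference and learning algorithm):

```python
import numpy as np

def icm_labeling(unary, pair_cost, neighbors, n_iters=10):
    """MAP-by-ICM for a patch MRF. unary[i, k] ~ -log P(Y_i | X_i = k) for
    output patch i taking source patch k; pair_cost[k, l] penalizes
    incompatible adjacent labels; neighbors[i] lists patches adjacent to i."""
    labels = unary.argmin(axis=1)                 # independent initialization
    for _ in range(n_iters):
        changed = False
        for i in range(unary.shape[0]):
            cost = unary[i].copy()
            for j in neighbors[i]:
                cost += pair_cost[:, labels[j]]   # smoothness vs. neighbors
            k = int(cost.argmin())
            if k != labels[i]:
                labels[i], changed = k, True
        if not changed:                           # converged to a local optimum
            break
    return labels
```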
Citations: 64
Capturing subtle facial motions in 3D face tracking
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238646
Zhen Wen, Thomas S. Huang
Facial motions produce not only motions of facial feature points but also subtle appearance changes such as wrinkles and shading changes. These subtle changes are important yet difficult issues for both analysis (tracking) and synthesis (animation). Previous approaches were mostly based on models learned from extensive training examples of appearance. However, the space of all possible facial motion appearances is huge, so it is not feasible to collect samples covering all possible variations due to lighting conditions, individualities, and head poses, and it is therefore difficult to adapt such models to new conditions. In this paper, we present an adaptive technique for analyzing subtle facial appearance changes. We propose a new ratio-image based appearance feature, which is independent of a person's face albedo. This feature is used to track face appearance variations based on exemplars. To adapt the exemplar appearance model to new people and lighting conditions, we develop an online EM-based algorithm. Experiments show that the proposed method improves classification results in a facial expression recognition task involving a variety of people and lighting conditions.
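The albedo independence comes from the Lambertian approximation I(x) ≈ albedo(x) · shading(x): dividing an expression image by an aligned neutral image of the same face cancels the albedo factor and leaves only the motion-induced shading change. A minimal sketch (alignment/warping is assumed already done, which in practice the tracker provides):

```python
import numpy as np

def ratio_feature(I_expr, I_neutral, eps=1e-3):
    """Ratio-image feature: per-pixel division cancels the face albedo and
    keeps the shading change caused by wrinkles and subtle deformations."""
    return I_expr.astype(float) / np.maximum(I_neutral.astype(float), eps)
```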
Citations: 111
Recognizing human action efforts: an adaptive three-mode PCA framework
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238662
James W. Davis, Hui Gao
We present a computational framework capable of labeling the effort of an action according to the level of exertion perceived in the performer (low to high). The approach first factorizes examples of an action (performed at different efforts) into its three-mode principal components to reduce dimensionality. A learning phase then computes expressive-feature weights that adjust the model's effort estimates to conform to the perceptual labels given for the examples. Experiments demonstrate recognition of the efforts of a person carrying bags of different weights and of multiple people walking at different paces.
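Three-mode PCA here is a Tucker/HOSVD-style factorization of a data tensor, e.g. motion features x time x examples-at-different-efforts. A generic truncated HOSVD sketch (the tensor layout and rank choices are our assumptions; the paper's adaptive weighting of expressive features sits on top of such a decomposition):

```python
import numpy as np

def three_mode_pca(T, ranks):
    """Truncated HOSVD: per-mode bases from the SVD of each unfolding,
    plus a core tensor coupling the modes."""
    factors = []
    for mode, r in enumerate(ranks):
        # Mode-n unfolding: bring axis `mode` to the front and flatten the rest.
        M = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
        U, _, _ = np.linalg.svd(M, full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        # Contract each mode of the tensor with its basis transpose.
        core = np.moveaxis(
            np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors
```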
Citations: 39