
2011 International Conference on Computer Vision: Latest Publications

Learning to cluster using high order graphical models with latent variables
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126227
N. Komodakis
This paper proposes a very general max-margin learning framework for distance-based clustering. To this end, it formulates clustering as a high order energy minimization problem with latent variables, and applies a dual decomposition approach for training this model. The resulting framework allows learning a very broad class of distance functions, permits an automatic determination of the number of clusters during testing, and is also very efficient. As an additional contribution, we show how our method can be generalized to handle the training of a very broad class of important models in computer vision: arbitrary high-order latent CRFs. Experimental results verify its effectiveness.
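To make the flavour of distance-based clustering with an automatically determined number of clusters concrete, here is a minimal sketch (not the paper's algorithm): a facility-location-style energy in which each point is charged its distance to the nearest exemplar and every active exemplar pays a fixed penalty, minimized greedily. The greedy minimizer, the hand-set penalty, and the toy 2-D data are illustrative assumptions; the paper instead learns the distance function with a max-margin objective and trains a high-order latent model by dual decomposition.

```python
# Illustrative sketch only: a distance-based clustering energy with a per-exemplar
# penalty, minimized greedily. Not the paper's max-margin / dual-decomposition method.
import numpy as np

def clustering_energy(D, exemplars, penalty):
    """D[i, j]: distance from point i to point j; exemplars: list of exemplar indices."""
    return D[:, exemplars].min(axis=1).sum() + penalty * len(exemplars)

def greedy_exemplars(D, penalty):
    """Add exemplars one at a time while doing so lowers the energy."""
    n = D.shape[0]
    S = [int(D.sum(axis=0).argmin())]                      # best single exemplar
    while len(S) < n:
        best_E, best_j = min((clustering_energy(D, S + [j], penalty), j)
                             for j in range(n) if j not in S)
        if best_E >= clustering_energy(D, S, penalty):
            break
        S.append(best_j)
    return S

# Toy data (assumed): three well-separated 2-D blobs; the penalty controls cluster count.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(c, 0.2, size=(20, 2)) for c in [(0, 0), (4, 0), (0, 4)]])
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
print("exemplars found:", greedy_exemplars(D, penalty=5.0))
```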
Citations: 13
Building a better probabilistic model of images by factorization
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126473
B. J. Culpepper, Jascha Narain Sohl-Dickstein, B. Olshausen
We describe a directed bilinear model that learns higher-order groupings among features of natural images. The model represents images in terms of two sets of latent variables: one set of variables represents which feature groups are active, while the other specifies the relative activity within groups. Such a factorized representation is beneficial because it is stable in response to small variations in the placement of features while still preserving information about relative spatial relationships. When trained on MNIST digits, the resulting representation provides state of the art performance in classification using a simple classifier. When trained on natural images, the model learns to group features according to proximity in position, orientation, and scale. The model achieves high log-likelihood (−94 nats), surpassing the current state of the art for natural images achievable with an mcRBM model.
Citations: 14
Image based detection of geometric changes in urban environments
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126515
Aparna Taneja, Luca Ballan, M. Pollefeys
In this paper, we propose an efficient technique to detect changes in the geometry of an urban environment using some images observing its current state. The proposed method can be used to significantly optimize the process of updating the 3D model of a city changing over time, by restricting this process to only those areas where changes are detected. With this application in mind, we designed our algorithm to specifically detect only structural changes in the environment, ignoring any changes in its appearance, and ignoring also all the changes which are not relevant for update purposes, such as cars, people etc. As a by-product, the algorithm also provides a coarse geometry of the detected changes. The performance of the proposed method was tested on four different kinds of urban environments and compared with two alternative techniques.
Citations: 89
Birdlets: Subordinate categorization using volumetric primitives and pose-normalized appearance
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126238
Ryan Farrell, Om Oza, Ning Zhang, Vlad I. Morariu, Trevor Darrell, L. Davis
Subordinate-level categorization typically rests on establishing salient distinctions between part-level characteristics of objects, in contrast to basic-level categorization, where the presence or absence of parts is determinative. We develop an approach for subordinate categorization in vision, focusing on an avian domain due to the fine-grained structure of the category taxonomy for this domain. We explore a pose-normalized appearance model based on a volumetric poselet scheme. The variation in shape and appearance properties of these parts across a taxonomy provides the cues needed for subordinate categorization. Training pose detectors requires a relatively large amount of training data per category when done from scratch; using a subordinate-level approach, we exploit a pose classifier trained at the basic-level, and extract part appearance and shape information to build subordinate-level models. Our model associates the underlying image pattern parameters used for detection with corresponding volumetric part location, scale and orientation parameters. These parameters implicitly define a mapping from the image pixels into a pose-normalized appearance space, removing view and pose dependencies, facilitating fine-grained categorization from relatively few training examples.
Citations: 217
Stereo reconstruction using high order likelihood
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126371
H. Jung, Kyoung Mu Lee, Sang Uk Lee
Under the popular Bayesian approach, a stereo problem can be formulated by defining likelihood and prior. Likelihoods are often associated with unary terms and priors are defined by pair-wise or higher order cliques in Markov random field (MRF). In this paper, we propose to use high order likelihood model in stereo. Numerous conventional patch based matching methods such as normalized cross correlation, Laplacian of Gaussian, or census filters are designed under the naive assumption that all the pixels of a patch have the same disparities. However, patch-wise cost can be formulated as higher order cliques for MRF so that the matching cost is a function of image patch's disparities. A patch obtained from the projected image by a disparity map should provide a better match without the blurring effect around disparity discontinuities. Among patch-wise high order matching costs, the census filter approach can be easily reduced to pair-wise cliques. The experimental results on census filter-based high order likelihood demonstrate the advantages of high order likelihood over independent identically distributed unary model.
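For reference, the sketch below computes one of the patch-based matching costs named in the abstract, a census-filter cost, for a synthetic stereo pair at a single disparity; the window radius and the test images are illustrative assumptions, and the lifting of such costs into high-order MRF cliques is not reproduced here.

```python
# Minimal census-filter matching cost (one of the patch-based costs named above);
# window radius and test images are assumptions, and no MRF cliques are built here.
import numpy as np

def census_transform(img, radius=2):
    """Per-pixel binary code: each bit says whether a neighbour is darker than the centre."""
    h, w = img.shape
    pad = np.pad(img, radius, mode="edge")
    bits = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            nbr = pad[radius + dy:radius + dy + h, radius + dx:radius + dx + w]
            bits.append((nbr < img).astype(np.uint8))
    return np.stack(bits, axis=-1)

def census_cost(left, right, disparity):
    """Hamming distance between census codes of left pixels and their matches in the right image."""
    cl, cr = census_transform(left), census_transform(right)
    h, w = left.shape
    xs = np.clip(np.arange(w)[None, :] - disparity, 0, w - 1)
    return (cl != cr[np.arange(h)[:, None], xs]).sum(axis=-1)

rng = np.random.default_rng(0)
L = rng.random((40, 60))
R = np.roll(L, -3, axis=1)                    # synthetic pair with a 3-pixel disparity
print(census_cost(L, R, 3).mean(), census_cost(L, R, 0).mean())   # correct disparity scores lower
```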
Citations: 11
N-best maximal decoders for part models
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126552
Dennis Park, Deva Ramanan
We describe a method for generating N-best configurations from part-based models, ensuring that they do not overlap according to some user-provided definition of overlap. We extend previous N-best algorithms from the speech community to incorporate non-maximal suppression cues, such that pixel-shifted copies of a single configuration are not returned. We use approximate algorithms that perform nearly identical to their exact counterparts, but are orders of magnitude faster. Our approach outperforms standard methods for generating multiple object configurations in an image. We use our method to generate multiple pose hypotheses for the problem of human pose estimation from video sequences. We present quantitative results that demonstrate that our framework significantly improves the accuracy of a state-of-the-art pose estimation algorithm.
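A minimal sketch of the N-best-with-suppression idea, assuming simple bounding boxes: greedily keep the highest-scoring detections that do not overlap any already-kept one under a user-provided overlap test. The box format, IoU threshold, and toy detections are assumptions; the paper applies the same principle to full part-model configurations inside a dynamic-programming decoder.

```python
# Greedy N-best selection under a user-provided overlap test. Boxes are
# (x1, y1, x2, y2, score); the IoU threshold is an illustrative assumption.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def n_best(detections, n, overlaps=lambda a, b: iou(a, b) > 0.5):
    """Return up to n detections, best score first, none overlapping a kept one."""
    kept = []
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        if all(not overlaps(det, k) for k in kept):
            kept.append(det)
            if len(kept) == n:
                break
    return kept

boxes = [(0, 0, 10, 10, 0.9), (1, 1, 11, 11, 0.8), (20, 20, 30, 30, 0.7)]
print(n_best(boxes, n=2))      # the 0.8 box is suppressed by the overlapping 0.9 box
```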
Citations: 126
Dynamic Manifold Warping for view invariant action recognition
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126290
Dian Gong, G. Medioni
We address the problem of learning view-invariant 3D models of human motion from motion capture data, in order to recognize human actions from a monocular video sequence with arbitrary viewpoint. We propose a Spatio-Temporal Manifold (STM) model to analyze non-linear multivariate time series with latent spatial structure and apply it to recognize actions in the joint-trajectories space. Based on STM, a novel alignment algorithm Dynamic Manifold Warping (DMW) and a robust motion similarity metric are proposed for human action sequences, both in 2D and 3D. DMW extends previous works on spatio-temporal alignment by incorporating manifold learning. We evaluate and compare the approach to state-of-the-art methods on motion capture data and realistic videos. Experimental results demonstrate the effectiveness of our approach, which yields visually appealing alignment results, produces higher action recognition accuracy, and can recognize actions from arbitrary views with partial occlusion.
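For orientation, the sketch below implements classical dynamic time warping between two multivariate sequences, the temporal-alignment baseline that DMW extends with a learned latent spatial manifold; the Euclidean frame distance and the synthetic joint-trajectory data are assumptions.

```python
# Classical dynamic time warping (the temporal-alignment baseline DMW builds on);
# Euclidean frame distance and the synthetic sequences are illustrative assumptions.
import numpy as np

def dtw_cost(X, Y):
    """X: (n, d) and Y: (m, d) sequences; returns the minimal cumulative alignment cost."""
    n, m = len(X), len(Y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(X[i - 1] - Y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

rng = np.random.default_rng(1)
a = rng.random((30, 6))                                 # e.g. 30 frames of 6-D joint trajectories
b = np.vstack([a[::2], np.repeat(a[-1:], 5, axis=0)])   # temporally distorted copy of a
print("cost to distorted copy:    ", round(dtw_cost(a, b), 2))
print("cost to unrelated sequence:", round(dtw_cost(a, rng.random((20, 6))), 2))
```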
Citations: 91
Automated corpus callosum extraction via Laplace-Beltrami nodal parcellation and intrinsic geodesic curvature flows on surfaces
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126476
Rongjie Lai, Yonggang Shi, N. Sicotte, A. Toga
Corpus callosum (CC) is an important structure in human brain anatomy. In this work, we propose a fully automated and robust approach to extract corpus callosum from T1-weighted structural MR images. The novelty of our method is composed of two key steps. In the first step, we find an initial guess for the curve representation of CC by using the zero level set of the first nontrivial Laplace-Beltrami (LB) eigenfunction on the white matter surface. In the second step, the initial curve is deformed toward the final solution with a geodesic curvature flow on the white matter surface. For numerical solution of the geodesic curvature flow on surfaces, we represent the contour implicitly on a triangular mesh and develop efficient numerical schemes based on finite element method. Because our method depends only on the intrinsic geometry of the white matter surface, it is robust to orientation differences of the brain across population. In our experiments, we validate the proposed algorithm on 32 brains from a clinical study of multiple sclerosis disease and demonstrate the accuracy of our results.
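The first step can be illustrated in a simplified setting: compute the first nontrivial eigenfunction of a combinatorial graph Laplacian on a toy mesh and collect the edges its zero level set crosses. The grid graph below stands in for the triangulated white matter surface and its Laplace-Beltrami operator, and the geodesic-curvature-flow refinement is not reproduced.

```python
# First nontrivial Laplacian eigenfunction and its zero level set, on a toy grid graph
# standing in for a surface mesh (the paper uses the Laplace-Beltrami operator on the
# actual white matter triangulation; this is only an illustration of the idea).
import numpy as np

def grid_mesh(nx, ny):
    """Vertex count and edge list of an nx-by-ny grid graph."""
    idx = lambda i, j: i * ny + j
    edges = []
    for i in range(nx):
        for j in range(ny):
            if i + 1 < nx: edges.append((idx(i, j), idx(i + 1, j)))
            if j + 1 < ny: edges.append((idx(i, j), idx(i, j + 1)))
    return nx * ny, edges

def zero_level_set_edges(n, edges):
    """Edges on which the first nontrivial Laplacian eigenfunction changes sign."""
    A = np.zeros((n, n))
    for a, b in edges:
        A[a, b] = A[b, a] = 1.0
    L = np.diag(A.sum(axis=1)) - A          # combinatorial graph Laplacian
    _, vecs = np.linalg.eigh(L)             # eigenvalues ascending; column 0 is the constant mode
    phi = vecs[:, 1]                        # first nontrivial eigenfunction
    return phi, [(a, b) for a, b in edges if phi[a] * phi[b] < 0]

n, edges = grid_mesh(12, 8)
phi, crossings = zero_level_set_edges(n, edges)
print(len(crossings), "edges are crossed by the zero level set")   # a curve across the grid
```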
Citations: 23
Adaptive deconvolutional networks for mid and high level feature learning
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126474
Matthew D. Zeiler, Graham W. Taylor, R. Fergus
We present a hierarchical model that learns image decompositions via alternating layers of convolutional sparse coding and max pooling. When trained on natural images, the layers of our model capture image information in a variety of forms: low-level edges, mid-level edge junctions, high-level object parts and complete objects. To build our model we rely on a novel inference scheme that ensures each layer reconstructs the input, rather than just the output of the layer directly beneath, as is common with existing hierarchical approaches. This makes it possible to learn multiple layers of representation and we show models with 4 layers, trained on images from the Caltech-101 and 256 datasets. When combined with a standard classifier, features extracted from these models outperform SIFT, as well as representations from other feature learning methods.
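As a rough illustration of the convolutional sparse coding such layers are built on, the sketch below infers sparse feature maps for a single image with plain ISTA updates; the random filters, step-size bound, sparsity weight, and the absence of pooling, layer stacking, and filter learning are all simplifications relative to the paper's model and inference scheme.

```python
# Single-layer convolutional sparse coding by ISTA (no pooling, no stacking, no filter
# learning): infer maps z_k so that sum_k f_k * z_k reconstructs x under an L1 penalty.
# Filters, sparsity weight, and iteration count are illustrative assumptions.
import numpy as np
from scipy.signal import convolve2d, correlate2d

def ista_csc(x, filters, lam=0.1, iters=200):
    K = len(filters)
    z = np.zeros((K,) + x.shape)
    # safe step size from a crude bound on the gradient's Lipschitz constant
    step = 1.0 / sum(np.abs(f).sum() ** 2 for f in filters)
    for _ in range(iters):
        recon = sum(convolve2d(z[k], filters[k], mode="same") for k in range(K))
        resid = recon - x
        for k in range(K):
            grad = correlate2d(resid, filters[k], mode="same")   # adjoint of the convolution
            u = z[k] - step * grad
            z[k] = np.sign(u) * np.maximum(np.abs(u) - step * lam, 0.0)   # soft threshold
    recon = sum(convolve2d(z[k], filters[k], mode="same") for k in range(K))
    return z, recon

rng = np.random.default_rng(0)
filters = [rng.standard_normal((5, 5)) * 0.2 for _ in range(4)]
x = rng.standard_normal((32, 32))
z, recon = ista_csc(x, filters)
print("mean |x - recon|:", round(float(np.abs(x - recon).mean()), 3),
      " fraction of nonzero codes:", round(float((z != 0).mean()), 3))
```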
Citations: 1182
Exploiting the Manhattan-world assumption for extrinsic self-calibration of multi-modal sensor networks
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126337
Marcel Brückner, Joachim Denzler
Many new applications are enabled by combining a multi-camera system with a Time-of-Flight (ToF) camera, which is able to simultaneously record intensity and depth images. Classical approaches for self-calibration of a multi-camera system fail to calibrate such a system due to the very different image modalities. In addition, the typical environments of multi-camera systems are man-made and consist primarily of low-textured objects. However, at the same time they satisfy the Manhattan-world assumption. We formulate the multi-modal sensor network calibration as a Maximum a Posteriori (MAP) problem and solve it by minimizing the corresponding energy function. First we estimate two separate 3D reconstructions of the environment: one using the pan-tilt unit mounted ToF camera and one using the multi-camera system. We exploit the Manhattan-world assumption and estimate multiple initial calibration hypotheses by registering the three dominant orientations of planes. These hypotheses are used as prior knowledge of a subsequent MAP estimation aiming to align edges that are parallel to these dominant directions. To our knowledge, this is the first self-calibration approach that is able to calibrate a ToF camera with a multi-camera system. Quantitative experiments on real data demonstrate the high accuracy of our approach.
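One ingredient of the initialization can be made concrete: given corresponding dominant plane orientations from the two reconstructions, the rotation registering them follows from orthogonal Procrustes (Kabsch) alignment, as sketched below with synthetic normals; the sign and permutation ambiguities of Manhattan directions, which yield the multiple hypotheses kept for the MAP refinement, are not enumerated.

```python
# Registering three corresponding dominant plane orientations with orthogonal
# Procrustes (Kabsch); the Manhattan sign/permutation ambiguities that produce the
# paper's multiple hypotheses are not enumerated here, and the data are synthetic.
import numpy as np

def register_orientations(n_src, n_dst):
    """Rows of n_src/n_dst are corresponding unit normals; returns R with R @ s ~ d."""
    U, _, Vt = np.linalg.svd(n_dst.T @ n_src)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])   # enforce det(R) = +1
    return U @ D @ Vt

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

R_true = rot_z(0.4)
axes = np.eye(3)                # dominant directions in one reconstruction
observed = axes @ R_true.T      # the same directions expressed in the other frame
print(np.allclose(register_orientations(axes, observed), R_true))   # True
```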
Citations: 0