Riemannian Nonlinear Mixed Effects Models: Analyzing Longitudinal Deformations in Neuroimaging
Pub Date: 2017-07-01 | Epub Date: 2017-11-09 | DOI: 10.1109/CVPR.2017.612
Hyunwoo J Kim, Nagesh Adluru, Heemanshu Suri, Baba C Vemuri, Sterling C Johnson, Vikas Singh
Statistical machine learning models that operate on manifold-valued data are being extensively studied in vision, motivated by applications in activity recognition, feature tracking, and medical imaging. While non-parametric methods have been relatively well studied in the literature, efficient formulations for parametric models (which may offer benefits in small sample size regimes) have only emerged recently. So far, manifold-valued regression models (such as geodesic regression) have been restricted to the analysis of cross-sectional data, i.e., the so-called "fixed effects" setting in statistics. But in most "longitudinal" analyses (e.g., when a participant provides multiple measurements over time), the application of fixed effects models is problematic. To address this need, this paper generalizes non-linear mixed effects models to the regime where the response variable is manifold-valued, i.e., f : R^d → ℳ. We derive the underlying model and estimation schemes and demonstrate the immediate benefits such a model can provide, both for group-level and individual-level analysis, on longitudinal brain imaging data. The direct consequence of our results is that longitudinal analysis of manifold-valued measurements (especially on the manifold of symmetric positive definite matrices) can be conducted in a computationally tractable manner.
{"title":"Riemannian Nonlinear Mixed Effects Models: Analyzing Longitudinal Deformations in Neuroimaging.","authors":"Hyunwoo J Kim, Nagesh Adluru, Heemanshu Suri, Baba C Vemuri, Sterling C Johnson, Vikas Singh","doi":"10.1109/CVPR.2017.612","DOIUrl":"10.1109/CVPR.2017.612","url":null,"abstract":"<p><p>Statistical machine learning models that operate on manifold-valued data are being extensively studied in vision, motivated by applications in activity recognition, feature tracking and medical imaging. While non-parametric methods have been relatively well studied in the literature, efficient formulations for parametric models (which may offer benefits in small sample size regimes) have only emerged recently. So far, manifold-valued regression models (such as geodesic regression) are restricted to the analysis of cross-sectional data, i.e., the so-called \"fixed effects\" in statistics. But in most \"longitudinal analysis\" (e.g., when a participant provides multiple measurements, over time) the application of fixed effects models is problematic. In an effort to answer this need, this paper generalizes non-linear mixed effects model to the regime where the response variable is manifold-valued, i.e., f : <b>R</b><sup>d</sup> → ℳ. We derive the underlying model and estimation schemes and demonstrate the immediate benefits such a model can provide - both for group level and individual level analysis - on longitudinal brain imaging data. The direct consequence of our results is that longitudinal analysis of manifold-valued measurements (especially, the symmetric positive definite manifold) can be conducted in a computationally tractable manner.</p>","PeriodicalId":74560,"journal":{"name":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"2017 ","pages":"5777-5786"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5805155/pdf/nihms914461.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35820235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Incremental Multiresolution Matrix Factorization Algorithm
Pub Date: 2017-07-01 | DOI: 10.1109/CVPR.2017.81
Vamsi K Ithapu, Risi Kondor, Sterling C Johnson, Vikas Singh
Multiresolution analysis and matrix factorization are foundational tools in computer vision. In this work, we study the interface between these two distinct topics and obtain techniques to uncover hierarchical block structure in symmetric matrices - an important aspect of the success of many vision problems. Our new algorithm, the incremental multiresolution matrix factorization, uncovers such structure one feature at a time, and hence scales well to large matrices. We describe how this multiscale analysis goes much further than what a direct "global" factorization of the data can identify. We evaluate the efficacy of the resulting factorizations for relative leveraging within regression tasks using medical imaging data. We also use the factorization on representations learned by popular deep networks, providing evidence of their ability to infer semantic relationships even when they are not explicitly trained to do so. We show that this algorithm can be used as an exploratory tool to improve the network architecture, and within numerous other settings in vision.
{"title":"The Incremental Multiresolution Matrix Factorization Algorithm.","authors":"Vamsi K Ithapu, Risi Kondor, Sterling C Johnson, Vikas Singh","doi":"10.1109/CVPR.2017.81","DOIUrl":"10.1109/CVPR.2017.81","url":null,"abstract":"<p><p>Multiresolution analysis and matrix factorization are foundational tools in computer vision. In this work, we study the interface between these two distinct topics and obtain techniques to uncover hierarchical block structure in symmetric matrices - an important aspect in the success of many vision problems. Our new algorithm, the incremental multiresolution matrix factorization, uncovers such structure one feature at a time, and hence scales well to large matrices. We describe how this multiscale analysis goes much farther than what a direct \"global\" factorization of the data can identify. We evaluate the efficacy of the resulting factorizations for relative leveraging within regression tasks using medical imaging data. We also use the factorization on representations learned by popular deep networks, providing evidence of their ability to infer semantic relationships even when they are not explicitly trained to do so. We show that this algorithm can be used as an exploratory tool to improve the network architecture, and within numerous other settings in vision.</p>","PeriodicalId":74560,"journal":{"name":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"2017 ","pages":"692-701"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5798492/pdf/nihms914456.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35807726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fine-tuning Convolutional Neural Networks for Biomedical Image Analysis: Actively and Incrementally
Pub Date: 2017-07-01 | Epub Date: 2017-11-09 | DOI: 10.1109/CVPR.2017.506
Zongwei Zhou, Jae Shin, Lei Zhang, Suryakanth Gurudu, Michael Gotway, Jianming Liang
Intense interest in applying convolutional neural networks (CNNs) in biomedical image analysis is widespread, but its success is impeded by the lack of large annotated datasets in biomedical imaging. Annotating biomedical images is not only tedious and time consuming, but also demands costly, specialty-oriented knowledge and skills that are not easily accessible. To dramatically reduce annotation cost, this paper presents a novel method called AIFT (active, incremental fine-tuning) that naturally integrates active learning and transfer learning into a single framework. AIFT starts directly with a pre-trained CNN to seek "worthy" samples from the unannotated pool for annotation, and the (fine-tuned) CNN is further fine-tuned continuously by incorporating newly annotated samples in each iteration to enhance the CNN's performance incrementally. We have evaluated our method in three different biomedical imaging applications, demonstrating that the cost of annotation can be cut by at least half. This performance is attributed to several advantages derived from the advanced active and incremental capability of our AIFT method.
{"title":"Fine-tuning Convolutional Neural Networks for Biomedical Image Analysis: Actively and Incrementally.","authors":"Zongwei Zhou, Jae Shin, Lei Zhang, Suryakanth Gurudu, Michael Gotway, Jianming Liang","doi":"10.1109/CVPR.2017.506","DOIUrl":"10.1109/CVPR.2017.506","url":null,"abstract":"<p><p>Intense interest in applying convolutional neural networks (CNNs) in biomedical image analysis is wide spread, but its success is impeded by the lack of large annotated datasets in biomedical imaging. Annotating biomedical images is not only tedious and time consuming, but also demanding of costly, specialty-oriented knowledge and skills, which are not easily accessible. To dramatically reduce annotation cost, this paper presents a novel method called AIFT (active, incremental fine-tuning) to naturally integrate active learning and transfer learning into a single framework. AIFT starts directly with a pre-trained CNN to seek \"worthy\" samples from the unannotated for annotation, and the (fine-tuned) CNN is further fine-tuned continuously by incorporating newly annotated samples in each iteration to enhance the CNN's performance incrementally. We have evaluated our method in three different biomedical imaging applications, demonstrating that the cost of annotation can be cut by at least half. This performance is attributed to the several advantages derived from the advanced active and incremental capability of our AIFT method.</p>","PeriodicalId":74560,"journal":{"name":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"2017 ","pages":"4761-4772"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6191179/pdf/nihms958486.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36597835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Latent Variable Graphical Model Selection using Harmonic Analysis: Applications to the Human Connectome Project (HCP)
Pub Date: 2016-06-01 | Epub Date: 2016-12-12 | DOI: 10.1109/CVPR.2016.268
Won Hwa Kim, Hyunwoo J Kim, Nagesh Adluru, Vikas Singh
A major goal of imaging studies such as the (ongoing) Human Connectome Project (HCP) is to characterize the structural network map of the human brain and identify its associations with covariates, such as genotype and risk factors, that correspond to an individual. But the set of image-derived measures and the set of covariates are both large, so we must first estimate a 'parsimonious' set of relations between the measurements. For instance, a Gaussian graphical model will show conditional independences between the random variables, which can then be used to set up specific downstream analyses. But most such data involve a large list of 'latent' variables that remain unobserved, yet affect the 'observed' variables substantially. Accounting for such latent variables is not directly addressed by standard precision matrix estimation, and is instead tackled via highly specialized optimization methods. This paper offers a unique harmonic analysis view of this problem. By casting the estimation of the precision matrix in terms of a composition of low-frequency latent variables and high-frequency sparse terms, we show how the problem can be formulated using a new wavelet-type expansion in non-Euclidean spaces. Our formulation poses the estimation problem in the frequency space and shows how it can be solved by a simple sub-gradient scheme. We provide a set of scientific results on ~500 scans from the recently released HCP data where our algorithm recovers highly interpretable and sparse conditional dependencies between brain connectivity pathways and well-known covariates.
{"title":"Latent Variable Graphical Model Selection using Harmonic Analysis: Applications to the Human Connectome Project (HCP).","authors":"Won Hwa Kim, Hyunwoo J Kim, Nagesh Adluru, Vikas Singh","doi":"10.1109/CVPR.2016.268","DOIUrl":"10.1109/CVPR.2016.268","url":null,"abstract":"<p><p>A major goal of imaging studies such as the (ongoing) Human Connectome Project (HCP) is to characterize the structural network map of the human brain and identify its associations with covariates such as genotype, risk factors, and so on that correspond to an individual. But the set of image derived measures and the set of covariates are both large, so we must first estimate a 'parsimonious' set of relations between the measurements. For instance, a Gaussian graphical model will show conditional independences between the random variables, which can then be used to setup specific downstream analyses. But most such data involve a large list of 'latent' variables that remain unobserved, yet affect the 'observed' variables sustantially. Accounting for such latent variables is not directly addressed by standard precision matrix estimation, and is tackled via highly specialized optimization methods. This paper offers a unique harmonic analysis view of this problem. By casting the estimation of the precision matrix in terms of a composition of low-frequency latent variables and high-frequency sparse terms, we show how the problem can be formulated using a new wavelet-type expansion in non-Euclidean spaces. Our formulation poses the estimation problem in the frequency space and shows how it can be solved by a simple sub-gradient scheme. We provide a set of scientific results on ~500 scans from the recently released HCP data where our algorithm recovers highly interpretable and sparse conditional dependencies between brain connectivity pathways and well-known covariates.</p>","PeriodicalId":74560,"journal":{"name":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"2016 ","pages":"2443-2451"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5330303/pdf/nihms824434.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34778815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SemiContour: A Semi-supervised Learning Approach for Contour Detection
Pub Date: 2016-06-01 | Epub Date: 2016-12-12 | DOI: 10.1109/CVPR.2016.34
Zizhao Zhang, Fuyong Xing, Xiaoshuang Shi, Lin Yang
Supervised contour detection methods usually require many labeled training images to obtain satisfactory performance. However, a large set of annotated data might be unavailable, or assembling it might be extremely labor intensive. In this paper, we investigate the use of semi-supervised learning (SSL) to obtain competitive detection accuracy with very limited training data (three labeled images). Specifically, we propose a semi-supervised structured ensemble learning approach for contour detection built on structured random forests (SRF). To make SRF applicable to unlabeled data, we present an effective sparse representation approach that captures the inherent structure in image patches by finding a compact and discriminative low-dimensional subspace representation in an unsupervised manner, enabling the incorporation of abundant unlabeled patches with their estimated structured labels to help SRF perform better node splitting. We re-examine the role of sparsity and propose a novel and fast sparse coding algorithm to boost the overall learning efficiency. To the best of our knowledge, this is the first attempt to apply SSL to contour detection. Extensive experiments on the BSDS500 segmentation dataset and the NYU Depth dataset demonstrate the superiority of the proposed method.
{"title":"SemiContour: A Semi-supervised Learning Approach for Contour Detection.","authors":"Zizhao Zhang, Fuyong Xing, Xiaoshuang Shi, Lin Yang","doi":"10.1109/CVPR.2016.34","DOIUrl":"10.1109/CVPR.2016.34","url":null,"abstract":"<p><p>Supervised contour detection methods usually require many labeled training images to obtain satisfactory performance. However, a large set of annotated data might be unavailable or extremely labor intensive. In this paper, we investigate the usage of semi-supervised learning (SSL) to obtain competitive detection accuracy with very limited training data (three labeled images). Specifically, we propose a semi-supervised structured ensemble learning approach for contour detection built on structured random forests (SRF). To allow SRF to be applicable to unlabeled data, we present an effective sparse representation approach to capture inherent structure in image patches by finding a compact and discriminative low-dimensional subspace representation in an unsupervised manner, enabling the incorporation of abundant unlabeled patches with their estimated structured labels to help SRF perform better node splitting. We re-examine the role of sparsity and propose a novel and fast sparse coding algorithm to boost the overall learning efficiency. To the best of our knowledge, this is the first attempt to apply SSL for contour detection. Extensive experiments on the BSDS500 segmentation dataset and the NYU Depth dataset demonstrate the superiority of the proposed method.</p>","PeriodicalId":74560,"journal":{"name":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"2016 ","pages":"251-259"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5423734/pdf/nihms-818937.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34988057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shape Analysis with Hyperbolic Wasserstein Distance
Pub Date: 2016-06-01 | Epub Date: 2016-12-12 | DOI: 10.1109/CVPR.2016.546
Jie Shi, Wen Zhang, Yalin Wang
Shape space is an active research field in computer vision. The shape distance defined in a shape space may provide a simple and refined index to represent a unique shape. The Wasserstein distance defines a Riemannian metric for the Wasserstein space; it intrinsically measures the similarities between shapes and is robust to image noise, and thus has potential for 3D shape indexing and classification research. While algorithms for computing the Wasserstein distance have been extensively studied, most of them only work for genus-0 surfaces. This paper proposes a novel framework to compute the Wasserstein distance between general topological surfaces equipped with hyperbolic metrics. The computational algorithms are based on Ricci flow, hyperbolic harmonic maps, and hyperbolic power Voronoi diagrams, and the method is general and robust. We apply our method to study human facial expression, longitudinal brain cortical morphometry with normal aging, and cortical shape classification in Alzheimer's disease (AD). Experimental results demonstrate that our method may be used as an effective shape index, outperforming several standard shape measures in our AD versus healthy control classification study.
{"title":"Shape Analysis with Hyperbolic Wasserstein Distance.","authors":"Jie Shi, Wen Zhang, Yalin Wang","doi":"10.1109/CVPR.2016.546","DOIUrl":"https://doi.org/10.1109/CVPR.2016.546","url":null,"abstract":"<p><p>Shape space is an active research field in computer vision study. The shape distance defined in a shape space may provide a simple and refined index to represent a unique shape. Wasserstein distance defines a Riemannian metric for the Wasserstein space. It intrinsically measures the similarities between shapes and is robust to image noise. Thus it has the potential for the 3D shape indexing and classification research. While the algorithms for computing Wasserstein distance have been extensively studied, most of them only work for genus-0 surfaces. This paper proposes a novel framework to compute Wasserstein distance between general topological surfaces with hyperbolic metric. The computational algorithms are based on Ricci flow, hyperbolic harmonic map, and hyperbolic power Voronoi diagram and the method is general and robust. We apply our method to study human facial expression, longitudinal brain cortical morphometry with normal aging, and cortical shape classification in Alzheimer's disease (AD). Experimental results demonstrate that our method may be used as an effective shape index, which outperforms some other standard shape measures in our AD versus healthy control classification study.</p>","PeriodicalId":74560,"journal":{"name":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"2016 ","pages":"5051-5061"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/CVPR.2016.546","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34898674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joint Patch and Multi-label Learning for Facial Action Unit Detection
Pub Date: 2015-06-01 | DOI: 10.1109/CVPR.2015.7298833
Kaili Zhao, Wen-Sheng Chu, Fernando De la Torre, Jeffrey F Cohn, Honggang Zhang
The face is one of the most powerful channels of nonverbal communication. The most commonly used taxonomy to describe facial behaviour is the Facial Action Coding System (FACS). FACS segments the visible effects of facial muscle activation into 30+ action units (AUs). AUs, which may occur alone or in thousands of combinations, can describe nearly all possible facial expressions. Most existing methods for automatic AU detection treat the problem using one-vs-all classifiers and fail to exploit dependencies among AUs and facial features. We introduce joint patch and multi-label learning (JPML) to address these issues. JPML leverages group sparsity by selecting a sparse subset of facial patches while learning a multi-label classifier. In four of five comparisons on three diverse datasets, CK+, GFT, and BP4D, JPML produced the highest average F1 scores in comparison with the state of the art.
{"title":"Joint Patch and Multi-label Learning for Facial Action Unit Detection.","authors":"Kaili Zhao, Wen-Sheng Chu, Fernando De la Torre, Jeffrey F Cohn, Honggang Zhang","doi":"10.1109/CVPR.2015.7298833","DOIUrl":"10.1109/CVPR.2015.7298833","url":null,"abstract":"<p><p>The face is one of the most powerful channel of nonverbal communication. The most commonly used taxonomy to describe facial behaviour is the Facial Action Coding System (FACS). FACS segments the visible effects of facial muscle activation into 30+ action units (AUs). AUs, which may occur alone and in thousands of combinations, can describe nearly all-possible facial expressions. Most existing methods for automatic AU detection treat the problem using one-vs-all classifiers and fail to exploit dependencies among AU and facial features. We introduce joint-patch and multi-label learning (JPML) to address these issues. JPML leverages group sparsity by selecting a sparse subset of facial patches while learning a multi-label classifier. In four of five comparisons on three diverse datasets, CK+, GFT, and BP4D, JPML produced the highest average F1 scores in comparison with state-of-the art.</p>","PeriodicalId":74560,"journal":{"name":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"2015 ","pages":"2207-2216"},"PeriodicalIF":0.0,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/CVPR.2015.7298833","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34542635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multivariate General Linear Models (MGLM) on Riemannian Manifolds with Applications to Statistical Analysis of Diffusion Weighted Images
Pub Date: 2014-06-23 | DOI: 10.1109/CVPR.2014.352
Hyunwoo J Kim, Nagesh Adluru, Maxwell D Collins, Moo K Chung, Barbara B Bendlin, Sterling C Johnson, Richard J Davidson, Vikas Singh
Linear regression is a parametric model which is ubiquitous in scientific analysis. The classical setup, where the observations and responses, i.e., (xi, yi) pairs, are Euclidean, is well studied. The setting where yi is manifold-valued is a topic of much interest, motivated by applications in shape analysis, topic modeling, and medical imaging. Recent work gives strategies for max-margin classifiers, principal components analysis, and dictionary learning on certain types of manifolds. For parametric regression specifically, results within the last year provide mechanisms to regress one real-valued parameter, xi ∈ R, against a manifold-valued variable, yi ∈ ℳ. We seek to substantially extend the operating range of such methods by deriving schemes for multivariate multiple linear regression - a manifold-valued dependent variable against multiple independent variables, i.e., f : R^n → ℳ. Our variational algorithm efficiently solves for multiple geodesic bases on the manifold concurrently via gradient updates. This allows us to answer questions such as: what is the relationship of the measurement at voxel y to disease when conditioned on age and gender? We show applications to statistical analysis of diffusion weighted images, which give rise to regression tasks on the manifold GL(n)/O(n) for diffusion tensor images (DTI) and on the Hilbert unit sphere for orientation distribution functions (ODF) from high angular resolution acquisitions. The companion open-source code is available at nitrc.org/projects/riem_mglm.
{"title":"Multivariate General Linear Models (MGLM) on Riemannian Manifolds with Applications to Statistical Analysis of Diffusion Weighted Images.","authors":"Hyunwoo J Kim, Nagesh Adluru, Maxwell D Collins, Moo K Chung, Barbara B Bendlin, Sterling C Johnson, Richard J Davidson, Vikas Singh","doi":"10.1109/CVPR.2014.352","DOIUrl":"https://doi.org/10.1109/CVPR.2014.352","url":null,"abstract":"<p><p>Linear regression is a parametric model which is ubiquitous in scientific analysis. The classical setup where the observations and responses, i.e., (<i>x<sub>i</sub></i> , <i>y<sub>i</sub></i> ) pairs, are Euclidean is well studied. The setting where <i>y<sub>i</sub></i> is manifold valued is a topic of much interest, motivated by applications in shape analysis, topic modeling, and medical imaging. Recent work gives strategies for max-margin classifiers, principal components analysis, and dictionary learning on certain types of manifolds. For parametric regression specifically, results within the last year provide mechanisms to regress one real-valued parameter, <i>x<sub>i</sub></i> ∈ <b>R</b>, against a manifold-valued variable, <i>y<sub>i</sub></i> ∈ . We seek to substantially extend the operating range of such methods by deriving schemes for multivariate multiple linear regression -a manifold-valued dependent variable against multiple independent variables, i.e., <i>f</i> : <b>R</b><i><sup>n</sup></i> → . Our variational algorithm efficiently solves for multiple geodesic bases on the manifold concurrently via gradient updates. This allows us to answer questions such as: what is the relationship of the measurement at voxel <i>y</i> to disease when conditioned on age and gender. We show applications to statistical analysis of diffusion weighted images, which give rise to regression tasks on the manifold <i>GL</i>(<i>n</i>)/<i>O</i>(<i>n</i>) for diffusion tensor images (DTI) and the Hilbert unit sphere for orientation distribution functions (ODF) from high angular resolution acquisition. The companion open-source code is available on nitrc.org/projects/riem_mglm.</p>","PeriodicalId":74560,"journal":{"name":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"2014 ","pages":"2705-2712"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/CVPR.2014.352","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32967355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Single Image Super-resolution using Deformable Patches
Pub Date: 2014-06-01 | DOI: 10.1109/CVPR.2014.373
Yu Zhu, Yanning Zhang, Alan L Yuille
We propose a deformable-patch-based method for single image super-resolution. Under this notion of deformation, a patch is regarded not as a fixed vector but as a flexible deformation flow. Via deformable patches, the dictionary can cover more patterns that do not appear explicitly, and thus becomes more expressive. We present an energy function with slow, smooth, and flexible priors for the deformation model. During example-based super-resolution, we develop a deformation similarity based on the minimized energy function for basic patch matching. For robustness, we combine multiple deformed patches for the final reconstruction. Experiments evaluate the deformation effectiveness and super-resolution performance, showing that deformable patches help improve representation accuracy and perform better than state-of-the-art methods.
Pages: 2917-2924. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4249591/pdf/nihms-612394.pdf
Modeling Image Patches with a Generic Dictionary of Mini-Epitomes
Pub Date: 2014-06-01 | DOI: 10.1109/CVPR.2014.264
George Papandreou, Liang-Chieh Chen, Alan L Yuille
The goal of this paper is to question the necessity of features like SIFT in categorical visual recognition tasks. As an alternative, we develop a generative model for the raw intensity of image patches and show that it can support image classification performance on par with optimized SIFT-based techniques in a bag-of-visual-words setting. A key ingredient of the proposed model is a compact dictionary of mini-epitomes, learned in an unsupervised fashion on a large collection of images. The use of epitomes allows us to explicitly account for photometric and position variability in image appearance. We show that this flexibility considerably increases the capacity of the dictionary to accurately approximate the appearance of image patches and support recognition tasks. For image classification, we develop histogram-based image encoding methods tailored to the epitomic representation, as well as an "epitomic footprint" encoding which is easy to visualize and highlights the generative nature of our model. We discuss computational aspects in detail and develop efficient algorithms to make the model scalable to large tasks. The proposed techniques are evaluated with experiments on the challenging PASCAL VOC 2007 image classification benchmark.
{"title":"Modeling Image Patches with a Generic Dictionary of Mini-Epitomes.","authors":"George Papandreou, Liang-Chieh Chen, Alan L Yuille","doi":"10.1109/CVPR.2014.264","DOIUrl":"https://doi.org/10.1109/CVPR.2014.264","url":null,"abstract":"<p><p>The goal of this paper is to question the necessity of features like SIFT in categorical visual recognition tasks. As an alternative, we develop a generative model for the raw intensity of image patches and show that it can support image classification performance on par with optimized SIFT-based techniques in a bag-of-visual-words setting. Key ingredient of the proposed model is a compact dictionary of mini-epitomes, learned in an unsupervised fashion on a large collection of images. The use of epitomes allows us to explicitly account for photometric and position variability in image appearance. We show that this flexibility considerably increases the capacity of the dictionary to accurately approximate the appearance of image patches and support recognition tasks. For image classification, we develop histogram-based image encoding methods tailored to the epitomic representation, as well as an \"epitomic footprint\" encoding which is easy to visualize and highlights the generative nature of our model. We discuss in detail computational aspects and develop efficient algorithms to make the model scalable to large tasks. The proposed techniques are evaluated with experiments on the challenging PASCAL VOC 2007 image classification benchmark.</p>","PeriodicalId":74560,"journal":{"name":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"2014 ","pages":"2059-2066"},"PeriodicalIF":0.0,"publicationDate":"2014-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/CVPR.2014.264","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33964769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}