
Latest publications: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

Automatic estimation of left ventricular dysfunction from echocardiogram videos
D. Beymer, T. Syeda-Mahmood, A. Amir, Fei Wang, Scott Adelman
Echocardiography is often used to diagnose cardiac diseases related to regional and valvular motion abnormalities. Due to the low resolution of the imaging modality, the choice of viewpoint and mode, and the experience of the sonographers, there is a large variance in the estimation of important diagnostic measurements such as ejection fraction. In this paper, we develop an automatic algorithm to estimate diagnostic measurements from raw echocardiogram video sequences. Specifically, we locate and track the left ventricular region over a heart cycle using active shape models. We also present efficient ventricular localization in video sequences by automatically detecting and propagating echocardiographer annotations. Results on a large database of cardiac echo videos demonstrate the use of our method for the prediction of left ventricular dysfunction.
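The key diagnostic measurement named in the abstract, ejection fraction, has a standard clinical definition that can be sketched independently of the paper's pipeline (the function name and the roughly 0.50 dysfunction threshold in the comment are common conventions, not details taken from this paper):

```python
def ejection_fraction(edv, esv):
    """Ejection fraction from end-diastolic (EDV) and end-systolic (ESV)
    left-ventricular volumes: EF = (EDV - ESV) / EDV.
    Values below roughly 0.50 are commonly read as left ventricular
    dysfunction (a clinical convention, not a claim of this paper)."""
    if edv <= 0 or esv < 0 or esv > edv:
        raise ValueError("volumes must satisfy 0 <= ESV <= EDV, EDV > 0")
    return (edv - esv) / edv
```

Automating the volume estimates that feed this formula, by locating and tracking the left ventricle with active shape models, is precisely the measurement problem the paper targets.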
Citations: 15
A case for the average-half-face in 2D and 3D for face recognition
Josh Harguess, J. Aggarwal
We observe that the human face is inherently symmetric and we would like to exploit this symmetry in face recognition. The average-half-face has previously been shown to do just that for a set of 3D faces when using eigenfaces for recognition. We build upon that work and present a comparison of the use of the average-half-face to the use of the original full face with six different algorithms applied to two- and three-dimensional (2D and 3D) databases. The average-half-face is constructed from the full frontal face image in two steps: first the face image is centered and divided in half, and then the two halves are averaged together (reversing the columns of one of the halves). The resulting average-half-face is then used as the input for face recognition algorithms. Previous work has shown that the accuracy of 3D face recognition using eigenfaces with the average-half-face is significantly better than using the full face. We compare the results using the average-half-face and the full face with six face recognition methods: eigenfaces, multilinear principal components analysis (MPCA), MPCA with linear discriminant analysis (MPCALDA), Fisherfaces (LDA), independent component analysis (ICA), and support vector machines (SVM). We utilize two well-known 2D face databases as well as a 3D face database for the comparison. Our results show that in most cases it is superior to employ the average-half-face for frontal face recognition. The consequences of this discovery may result in substantial savings in storage and computation time.
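The two-step construction the abstract describes can be sketched as follows (a minimal NumPy version, assuming a grayscale image already centered on the face midline; the function name is ours):

```python
import numpy as np

def average_half_face(face):
    """Average-half-face: split the centered frontal face image in half
    vertically, reverse the columns of one half to mirror it, and
    average the two halves together."""
    h, w = face.shape
    half = w // 2
    left = face[:, :half]
    right = face[:, w - half:]        # same width as `left`, even for odd w
    mirrored = right[:, ::-1]         # reverse the columns of one half
    return (left.astype(float) + mirrored.astype(float)) / 2.0
```

For a perfectly symmetric face the result reproduces either half exactly; for a real face it averages out asymmetric noise while halving the input size fed to the recognition algorithms.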
Citations: 60
Remote audio/video acquisition for human signature detection
Yufu Qu, Tao Wang, Zhigang Zhu
To address the challenges of noncooperative, large-distance human signature detection, we present a novel multimodal remote audio/video acquisition system. The system mainly consists of a laser Doppler vibrometer (LDV) and a pan-tilt-zoom (PTZ) camera. The LDV is a unique remote hearing sensor that uses the principle of laser interferometry. However, it needs an appropriate surface to modulate the speech of a human subject and reflect the laser beam to the LDV receiver. Manually steering the laser beam onto a target is very difficult at a distance of more than 20 meters. Therefore, the PTZ camera is used to capture video of the human subject, track the subject when he/she moves, and analyze the image to find a good reflection surface for LDV measurements in real time. Experiments show that the integration of these two sensory components is ideal for multimodal human signature detection at a large distance.
Citations: 11
Improving face recognition with a quality-based probabilistic framework
N. Ozay, Yan Tong, F. Wheeler, Xiaoming Liu
This paper addresses the problem of developing facial image quality metrics that are predictive of the performance of existing biometric matching algorithms, and of incorporating the quality estimates into the recognition decision process to improve overall performance. The first task we consider is the separation of probe and gallery qualities, since the match score depends on both. Given a set of training images of the same individual, we find the match scores between all possible probe/gallery image pairs. Then, we define a symmetric normalized match score for any pair, model it as the average of the probe/gallery qualities corrupted by additive noise, and estimate the quality values such that the noise is minimized. To utilize quality in the decision process, we employ a Bayesian network to model the relationships among qualities, predefined quality-related image features, and recognition. The recognition decision is made by probabilistic inference via this model. We illustrate with various face verification experiments that incorporating quality into the decision process can improve performance significantly.
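The quality-separation step admits a compact least-squares reading: with each symmetric normalized match score modeled as the average of the two image qualities plus noise, the qualities minimizing the noise energy solve a linear system. A sketch under that reading (the function name and solver choice are ours, not the paper's):

```python
import numpy as np

def estimate_qualities(pairs, scores, n_images):
    """Model each symmetric normalized match score as
    s_ij = (q_i + q_j) / 2 + noise, and pick the qualities q that
    minimize the total squared noise via linear least squares."""
    A = np.zeros((len(pairs), n_images))
    for row, (i, j) in enumerate(pairs):
        A[row, i] = 0.5
        A[row, j] = 0.5
    q, *_ = np.linalg.lstsq(A, np.asarray(scores, dtype=float), rcond=None)
    return q
```

With three images and noise-free scores the system is exactly determined, so the recovered qualities match the generating ones; with more pairs than images, least squares absorbs the noise.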
Citations: 31
In between 3D Active Appearance Models and 3D Morphable Models
J. Heo, M. Savvides
In this paper we propose a novel method of generating 3D morphable models (3DMMs) from 2D images. We develop algorithms for 3D face reconstruction from a sparse set of points acquired from 2D images. In order to establish correspondence between images precisely, we combine active shape models (ASMs) and active appearance models (AAMs) (CASAAMs) in an intelligent way, showing improved performance in pixel-level accuracy and generalization to unseen faces. The CASAAMs are applied to images of different views of the same person to extract facial shapes across pose. These 2D shapes are combined to reconstruct a sparse 3D model. The point density of the model is increased by the loop subdivision method, which generates new vertices as weighted sums of the existing vertices. Then, the depth of the dense 3D model is modified with an average 3D depth map in order to preserve facial structure more realistically. Finally, all 249 3D models with expression changes are combined to generate a 3DMM for a compact representation. The first session of the Multi-PIE database, consisting of 249 persons with expression and illumination changes, is used for the modeling. Unlike typical 3DMMs, our model can generate 3D human faces more realistically and efficiently (2-3 seconds on a P4 machine) under diverse illumination conditions.
Citations: 14
HANOLISTIC: A Hierarchical Automatic Image Annotation System Using Holistic Approach
Özge Öztimur Karadag, F. Yarman-Vural
Automatic image annotation is the process of assigning keywords to digital images depending on their content. In one sense, it is a mapping from visual content information to semantic context information. In this study, we propose a novel approach to the automatic image annotation problem, where the annotation is formulated as a multivariate mapping from a set of independent descriptor spaces, representing a whole image, to a set of words, representing class labels. For this purpose, a hierarchical annotation architecture, named HANOLISTIC (hierarchical image annotation system using holistic approach), is defined with two layers. The first layer, called level 0, consists of annotators, each of which is fed by a set of distinct descriptors extracted from the whole image. This enables us to represent the image at each annotator by a different visual property of a descriptor. Since we use the whole image, the problematic segmentation process is avoided. Training of each annotator is accomplished by a supervised learning paradigm, where each word is considered as a class label. Note that this approach is slightly different from classical training approaches, where each datum has a unique label. In the proposed system, since each image has one or more annotating words, we assume that an image belongs to more than one class. The outputs of the level-0 annotators indicate the degree to which each word in the vocabulary belongs to an image. These membership values from each annotator are then aggregated at the second layer to obtain a meta-level annotator. Finally, a set of words from the vocabulary is selected based on the ranking of the meta-level output. The hierarchical annotation system proposed in this study outperforms state-of-the-art annotation systems based on segmental and holistic approaches.
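The two-layer flow, level-0 membership values aggregated into a meta-level annotator whose ranked output yields the annotation words, can be sketched as follows (simple averaging stands in for the paper's aggregation rule, which the abstract does not specify; names are ours):

```python
import numpy as np

def annotate(memberships, vocabulary, k=3):
    """Level 0: one membership vector over the vocabulary per annotator.
    Meta level: aggregate the vectors (averaging here), rank the words,
    and return the top-k as the annotation."""
    meta = np.mean(memberships, axis=0)      # meta-level aggregation
    top = np.argsort(meta)[::-1][:k]         # rank words by aggregated membership
    return [vocabulary[i] for i in top]
```

Because each image may belong to several classes, the output is a ranked word set rather than a single label, matching the multi-label assumption in the abstract.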
Citations: 10
Bicycle chain shape models
S. Sommer, Aditya Tatu, Cheng Chen, D. Jurgensen, Marleen de Bruijne, M. Loog, M. Nielsen, F. Lauze
In this paper we introduce landmark-based pre-shapes which allow mixing of anatomical landmarks and pseudo-landmarks, constraining consecutive pseudo-landmarks to satisfy planar equidistance relations. This naturally defines a Riemannian manifold structure on these pre-shapes, with a natural action of the group of planar rotations; orbits define the shapes. We develop a geodesic generalized Procrustes analysis procedure for a sample set on such a pre-shape space and use it to compute principal geodesic analysis (PGA). We demonstrate it on an elementary synthetic example as well as on a dataset of manually annotated vertebra shapes from x-ray images. We re-landmark them consistently and show that PGA captures the variability of the dataset better than its linear counterpart, PCA.
Citations: 15
Pedestrian association and localization in monocular FIR video sequence
Mayank Bansal, Shunguang Wu, J. Eledath
This paper addresses the frame-to-frame data association and state estimation problems in localizing a pedestrian relative to a moving vehicle from a monocular far-infrared video sequence. Using a novel application of the hierarchical model-based motion estimation framework, we are able to use image appearance information to solve the frame-to-frame data association problem and estimate a sub-pixel-accurate height ratio for a pedestrian across two frames. Then, to localize the pedestrian, we propose a novel approach of using the pedestrian height ratio estimates to guide an interacting multiple-hypothesis-mode/height filtering algorithm instead of using a constant pedestrian height model. Experiments on several IR sequences demonstrate that this approach achieves results comparable to those obtained with a known pedestrian height, thus avoiding the errors of an approach based on a constant height model.
Citations: 2
A bottom-up and top-down optimization framework for learning a compositional hierarchy of object classes
S. Fidler, Marko Boben, A. Leonardis
Summary form only given. Learning hierarchical representations of object structure in a bottom-up manner faces several difficult issues. First, we are dealing with a very large number of potential feature aggregations. Furthermore, the set of features the algorithm learns at each layer directly influences the expressiveness of the compositional layers that work on top of them. However, we cannot ensure the usefulness of a particular local feature for object class representation based solely on the local statistics. This can only be done when more global, object-wise information is taken into account. We build on the hierarchical compositional approach (Fidler and Leonardis, 2007) that learns a hierarchy of contour compositions of increasing complexity and specificity. Each composition models spatial relations between its constituent parts.
Citations: 0
Distance guided selection of the best base classifier in an ensemble with application to cervigram image segmentation
Wei Wang, Xiaolei Huang
We empirically evaluate a distance-guided learning method embedded in a multiple classifier system (MCS) for tissue segmentation in optical images of the uterine cervix. Instead of combining multiple base classifiers as in traditional ensemble methods, we propose a Bhattacharyya distance based metric for measuring the similarity in decision boundary shapes between a pair of statistical classifiers. By generating an ensemble of base classifiers trained independently on separate training images, we can use the distance metric to select those classifiers in the ensemble whose decision boundaries are similar to that of an unknown test image. In an extreme case, we select the base classifier with the most similar decision boundary to accomplish classification and segmentation on the test image. Our approach is novel in the way that the nearest neighbor is picked and effectively solves classification problems in which base classifiers with good overall performance are not easy to construct due to a large variation in the training examples. In our experiments, we applied our method and several popular ensemble methods to segmenting acetowhite regions in cervical images. The overall classification accuracy of the proposed method is significantly better than that of a single classifier learned using the entire training set, and is also superior to other ensemble methods including majority voting, STAPLE, Boosting and Bagging.
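The Bhattacharyya distance at the heart of the selection step can be sketched for discrete distributions as follows. This is a minimal illustration, not the paper's implementation: the paper measures similarity between the decision-boundary shapes of statistical classifiers, whereas this sketch compares plain normalized histograms; the names `bhattacharyya_distance` and `select_nearest_classifier` and the histogram inputs are assumptions.

```python
import numpy as np

def bhattacharyya_distance(p, q, eps=1e-12):
    """Bhattacharyya distance between two discrete distributions.

    D_B = -ln( sum_i sqrt(p_i * q_i) ).  Inputs are normalized to sum
    to 1 first; identical distributions give a distance of 0.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))  # Bhattacharyya coefficient in [0, 1]
    return -np.log(max(bc, eps))

def select_nearest_classifier(test_hist, base_hists):
    """Index of the base classifier whose distribution is closest
    (in Bhattacharyya distance) to that of the unknown test image."""
    dists = [bhattacharyya_distance(test_hist, h) for h in base_hists]
    return int(np.argmin(dists))
```

In the paper's "extreme case", the single base classifier minimizing this distance is used to classify and segment the test image, rather than combining all ensemble members as in voting or bagging.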
{"title":"Distance guided selection of the best base classifier in an ensemble with application to cervigram image segmentation","authors":"Wei Wang, Xiaolei Huang","doi":"10.1109/CVPRW.2009.5204048","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204048","url":null,"abstract":"We empirically evaluate a distance-guided learning method embedded in a multiple classifier system (MCS) for tissue segmentation in optical images of the uterine cervix. Instead of combining multiple base classifiers as in traditional ensemble methods, we propose a Bhattacharyya distance based metric for measuring the similarity in decision boundary shapes between a pair of statistical classifiers. By generating an ensemble of base classifiers trained independently on separate training images, we can use the distance metric to select those classifiers in the ensemble whose decision boundaries are similar to that of an unknown test image. In an extreme case, we select the base classifier with the most similar decision boundary to accomplish classification and segmentation on the test image. Our approach is novel in the way that the nearest neighbor is picked and effectively solves classification problems in which base classifiers with good overall performance are not easy to construct due to a large variation in the training examples. In our experiments, we applied our method and several popular ensemble methods to segmenting acetowhite regions in cervical images. The overall classification accuracy of the proposed method is significantly better than that of a single classifier learned using the entire training set, and is also superior to other ensemble methods including majority voting, STAPLE, Boosting and Bagging.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125062195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Journal
2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops