
Latest publications: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

Data handling displays
Maxim Lazarov, H. Pirsiavash, B. Sajadi, Uddipan Mukherjee, A. Majumder
Imagine a world in which people can use their hand-held mobile devices to project and transfer content. Everyone can join a collaboration simply by bringing their mobile devices close to each other. People can grab data from each other's devices with simple hand gestures. Now imagine a large display, created by tiling multiple displays, where multiple users can interact with a large, dynamically changing data set in a collocated, collaborative setting, and the displays take care of the data transfer and handling functions in a way that is transparent to the users. In this paper we present a novel data-handling display that works not only as a display device but also as an interaction and data-transfer module. We propose simple gesture-based solutions to transfer information between these data-handling modules. We achieve high scalability through a fully distributed architecture in which each device is responsible for its own data and also communicates and collaborates with the other devices. We also show the usefulness of our work in visualizing large datasets while allowing multiple users to interact with the data.
{"title":"Data handling displays","authors":"Maxim Lazarov, H. Pirsiavash, B. Sajadi, Uddipan Mukherjee, A. Majumder","doi":"10.1109/CVPRW.2009.5204320","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204320","url":null,"abstract":"Imagine a world in which people can use their hand-held mobile devices to project and transfer content. Everyone can join the collaboration by simply bringing their mobile devices close to each other. People can grab data from each others' devices with simple hand gestures. Now imagine a large display created by tiling multiple displays where multiple users can interact with a large dynamically changing data set in a collocated, collaborative setting and the displays will take care of the data transfer and handling functions in a way that is transparent to the users. In this paper we present a novel data-handling display which works as not only a display device but also as an interaction and data transfer module. We propose simple gesture based solutions to transfer information between these data-handling modules. We achieve high scalability by presenting a fully distributed architecture in which each device is responsible for its own data and also communicates and collaborates with other devices. We also show the usefulness of our work in visualizing large datasets and at the same time allowing multiple users to interact with the data.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128076854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
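As a rough illustration of the fully distributed architecture the abstract describes, the sketch below models each display as a module that owns its data items and hands one off to an adjacent peer when a grab gesture fires. The class and method names (DisplayModule, grab) and the ownership model are hypothetical assumptions, not the authors' implementation.

```python
# Hypothetical sketch: each tiled-display module owns its data and transfers
# ownership to an adjacent peer on a gesture. Illustrative names throughout.

class DisplayModule:
    def __init__(self, name):
        self.name = name
        self.items = {}    # data this module is responsible for
        self.peers = []    # adjacent modules in the tiled display

    def add_peer(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def grab(self, item_id, target):
        """Transfer ownership of one item to an adjacent peer."""
        if target not in self.peers:
            raise ValueError(f"{target.name} is not adjacent to {self.name}")
        target.items[item_id] = self.items.pop(item_id)

left, right = DisplayModule("left"), DisplayModule("right")
left.add_peer(right)
left.items["chart-42"] = b"...payload..."
left.grab("chart-42", right)    # a swipe from the left tile toward the right
print(sorted(right.items))      # ['chart-42']
```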
Mining discriminative adjectives and prepositions for natural scene recognition
Bangpeng Yao, Juan Carlos Niebles, Li Fei-Fei
This paper presents a method that considers not only patch appearances, but also patch relationships in the form of adjectives and prepositions, for natural scene recognition. Most existing scene categorization approaches use only patch appearances or co-occurrences of patch appearances to determine the scene categories, ignoring the relationships among patches. Those relationships are, however, critical for recognition and understanding. For example, a 'beach' scene can be characterized by a 'sky' region above 'sand', and a 'water' region between 'sky' and 'sand'. We believe that exploiting such relations between image regions can improve scene recognition. In our approach, each image is represented as a spatial pyramid, from which we obtain a collection of patch appearances with spatial layout information. We apply a feature mining approach to obtain discriminative patch combinations. The mined patch combinations can be interpreted as adjectives or prepositions, which are used for scene understanding and recognition. Experimental results on a fifteen-class scene dataset show that our approach achieves competitive state-of-the-art recognition accuracy, while providing a rich description of the scene classes in terms of the mined adjectives and prepositions.
{"title":"Mining discriminative adjectives and prepositions for natural scene recognition","authors":"Bangpeng Yao, Juan Carlos Niebles, Li Fei-Fei","doi":"10.1109/CVPRW.2009.5204222","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204222","url":null,"abstract":"This paper presents a method that considers not only patch appearances, but also patch relationships in the form of adjectives and prepositions for natural scene recognition. Most of the existing scene categorization approaches only use patch appearances or co-occurrence of patch appearances to determine the scene categories, but the relationships among patches remain ignored. Those relationships are, however, critical for recognition and understanding. For example, a `beach' scene can be characterized by a `sky' region above `sand', and a `water' region between `sky' and `sand'. We believe that exploiting such relations between image regions can improve scene recognition. In our approach, each image is represented as a spatial pyramid, from which we obtain a collection of patch appearances with spatial layout information. We apply a feature mining approach to get discriminative patch combinations. The mined patch combinations can be interpreted as adjectives or prepositions, which are used for scene understanding and recognition. Experimental results on a fifteen class scene dataset show that our approach achieves competitive state-of-the-art recognition accuracy, while providing a rich description of the scene classes in terms of the mined adjectives and prepositions.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134517324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
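The 'preposition' relations mined above can be pictured with toy predicates over region centroids. The coordinate convention (y grows downward) and the tolerance below are illustrative assumptions, not the paper's actual feature definitions.

```python
# Toy spatial predicates over region centroids, in the spirit of the mined
# 'prepositions'. Thresholds are illustrative only.

def above(a, b, tol=5):
    """Region a is above region b: smaller y-centroid (y grows downward)."""
    return a[1] + tol < b[1]

def between(a, b, c):
    """Region a lies vertically between regions b and c."""
    lo, hi = sorted((b[1], c[1]))
    return lo < a[1] < hi

# (x, y) centroids for a toy 'beach' scene
sky, water, sand = (120, 30), (118, 90), (121, 160)
print(above(sky, sand))           # True: "sky above sand"
print(between(water, sky, sand))  # True: "water between sky and sand"
```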
Automatically detecting action units from faces of pain: Comparing shape and appearance features
P. Lucey, J. Cohn, S. Lucey, S. Sridharan, K. Prkachin
Recent psychological research suggests that facial movements are a reliable measure of pain. Automatic detection of facial movements associated with pain would contribute to patient care but is technically challenging. Facial movements may be subtle and accompanied by abrupt changes in head orientation. Active appearance models (AAMs) have proven robust to naturally occurring facial behavior, yet AAM-based efforts to automatically detect action units (AUs) are few. Using image data from patients with rotator-cuff injuries, we describe an AAM-based automatic system that decouples shape and appearance to detect AUs on a frame-by-frame basis. Most current approaches to AU detection use only appearance features. We explored the relative efficacy of shape and appearance for AU detection. Consistent with the experience of human observers, we found specific relationships between action units and types of facial features. Several AUs (e.g., AU 4, 12, and 43) were more discriminable by shape than by appearance, whilst the opposite pattern was found for others (e.g., AU 6, 7, and 10). AU-specific feature sets may yield optimal results.
{"title":"Automatically detecting action units from faces of pain: Comparing shape and appearance features","authors":"P. Lucey, J. Cohn, S. Lucey, S. Sridharan, K. Prkachin","doi":"10.1109/CVPRW.2009.5204279","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204279","url":null,"abstract":"Recent psychological research suggests that facial movements are a reliable measure of pain. Automatic detection of facial movements associated with pain would contribute to patient care but is technically challenging. Facial movements may be subtle and accompanied by abrupt changes in head orientation. Active appearance models (AAM) have proven robust to naturally occurring facial behavior, yet AAM-based efforts to automatically detect action units (AUs) are few. Using image data from patients with rotator-cuff injuries, we describe an AAM-based automatic system that decouples shape and appearance to detect AUs on a frame-by-frame basis. Most current approaches to AU detection use only appearance features. We explored the relative efficacy of shape and appearance for AU detection. Consistent with the experience of human observers, we found specific relationships between action units and types of facial features. Several AU (e.g. AU4, 12, and 43) were more discriminable by shape than by appearance, whilst the opposite pattern was found for others (e.g. AU6, 7 and 10). AU-specific feature sets may yield optimal results.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130690799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
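A schematic of the shape-versus-appearance comparison: one linear classifier per feature view, evaluated per AU. The features below are synthetic stand-ins for AAM shape (landmarks) and appearance (warped pixels); only the comparison pipeline mirrors the study, not its data or classifiers.

```python
# Schematic shape-vs-appearance comparison on synthetic per-frame features.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n = 400
au_present = rng.integers(0, 2, n)   # per-frame AU label (synthetic)

# Stand-ins: for this AU, "shape" carries more signal than "appearance".
shape = rng.normal(size=(n, 10)) + 1.0 * au_present[:, None]
appearance = rng.normal(size=(n, 50)) + 0.3 * au_present[:, None]

for name, X in (("shape", shape), ("appearance", appearance)):
    Xtr, Xte, ytr, yte = train_test_split(X, au_present, random_state=0)
    acc = LinearSVC().fit(Xtr, ytr).score(Xte, yte)
    print(f"{name:10s} accuracy: {acc:.2f}")
```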
Alignment of 3D point clouds to overhead images
R. S. Kaminsky, Noah Snavely, S. Seitz, R. Szeliski
We address the problem of automatically aligning structure-from-motion reconstructions to overhead images, such as satellite images, maps and floor plans, generated from an orthographic camera. We compute the optimal alignment using an objective function that matches 3D points to image edges and imposes free space constraints based on the visibility of points in each camera. We demonstrate the accuracy of our alignment algorithm on several outdoor and indoor scenes using both satellite and floor plan images. We also present an application of our technique, which uses a labeled overhead image to automatically tag the input photo collection with textual information.
{"title":"Alignment of 3D point clouds to overhead images","authors":"R. S. Kaminsky, Noah Snavely, S. Seitz, R. Szeliski","doi":"10.1109/CVPRW.2009.5204180","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204180","url":null,"abstract":"We address the problem of automatically aligning structure-from-motion reconstructions to overhead images, such as satellite images, maps and floor plans, generated from an orthographic camera. We compute the optimal alignment using an objective function that matches 3D points to image edges and imposes free space constraints based on the visibility of points in each camera. We demonstrate the accuracy of our alignment algorithm on several outdoor and indoor scenes using both satellite and floor plan images. We also present an application of our technique, which uses a labeled overhead image to automatically tag the input photo collection with textual information.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116337905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 102
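The core of a point-to-edge alignment objective can be sketched with a distance transform: each candidate transform is scored by the summed distance from the transformed points to the nearest edge pixel. The synthetic edge map and one-parameter grid search below stand in for the paper's real imagery and optimizer, and the free-space visibility term is omitted.

```python
# Minimal point-to-edge alignment score via a distance transform.

import numpy as np
from scipy.ndimage import distance_transform_edt

edges = np.zeros((100, 100), dtype=bool)
edges[20, 10:90] = True                 # synthetic "building wall" edge

# Distance from every pixel to the nearest edge pixel.
dist = distance_transform_edt(~edges)

# Projected SfM points as (row, col), lying 15 rows above the edge.
pts = np.stack([np.full(30, 5.0), np.linspace(15, 85, 30)], axis=1)

def cost(ty):
    """Sum of point-to-edge distances after shifting points down by ty rows."""
    r = np.clip((pts[:, 0] + ty).astype(int), 0, 99)
    c = pts[:, 1].astype(int)
    return dist[r, c].sum()

best = min(range(0, 40), key=cost)
print("best vertical shift:", best)     # 15 moves the points onto the edge
```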
Non-rigid registration between histological and MR images of the prostate: A joint segmentation and registration framework
Yangming Ou, D. Shen, M. Feldman, J. Tomaszeweski, C. Davatzikos
This paper presents a 3D non-rigid registration algorithm between histological and MR images of the prostate with cancer. To compensate for the loss of 3D integrity in the histology sectioning process, a series of 2D histological slices is first reconstructed into a 3D histological volume. After that, the 3D histology-MRI registration is obtained by maximizing (a) landmark similarity and (b) cancer region overlap between the two images. The former aims to capture distortions at the prostate boundary and at internal blob-like structures; the latter aims to capture distortions specifically at cancer regions. Landmark similarity is maximized by an annealing process, in which correspondences between the automatically detected boundary and internal landmarks are iteratively established in a fuzzy-to-deterministic fashion. Cancer region overlap is maximized in a joint cancer segmentation and registration framework, in which the two interleaved problems - segmentation and registration - inform each other in an iterative fashion. Registration accuracy is established by comparing against human-rater-defined landmarks and by comparing with other methods. The ultimate goal of this registration is to warp the histologically defined cancer ground truth into MRI, for a more thorough understanding of the MRI signal characteristics of prostate cancerous tissue, which will promote MRI-based prostate cancer diagnosis in future studies.
{"title":"Non-rigid registration between histological and MR images of the prostate: A joint segmentation and registration framework","authors":"Yangming Ou, D. Shen, M. Feldman, J. Tomaszeweski, C. Davatzikos","doi":"10.1109/CVPRW.2009.5204347","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204347","url":null,"abstract":"This paper presents a 3D non-rigid registration algorithm between histological and MR images of the prostate with cancer. To compensate for the loss of 3D integrity in the histology sectioning process, series of 2D histological slices are first reconstructed into a 3D histological volume. After that, the 3D histology-MRI registration is obtained by maximizing a) landmark similarity and b) cancer region overlap between the two images. The former aims to capture distortions at prostate boundary and internal blob-like structures; and the latter aims to capture distortions specifically at cancer regions. In particular, landmark similarities, the former, is maximized by an annealing process, where correspondences between the automatically-detected boundary and internal landmarks are iteratively established in a fuzzy-to-deterministic fashion. Cancer region overlap, the latter, is maximized in a joint cancer segmentation and registration framework, where the two interleaved problems - segmentation and registration - inform each other in an iterative fashion. Registration accuracy is established by comparing against human-rater-defined landmarks and by comparing with other methods. The ultimate goal of this registration is to warp the histologically-defined cancer ground truth into MRI, for more thoroughly understanding MRI signal characteristics of the prostate cancerous tissue, which will promote the MRI-based prostate cancer diagnosis in the future studies.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123922609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 43
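A 1D caricature of the interleaved scheme (an assumed structure, not the authors' code): registration picks the shift that best overlaps the current segmentation with the fixed image, and segmentation re-estimates its threshold from the aligned intensities, so each step feeds the other.

```python
# 1D toy of alternating segmentation and registration on synthetic signals.

import numpy as np

rng = np.random.default_rng(1)
fixed = np.zeros(200); fixed[80:120] = 1.0     # region in the fixed image
moving = np.zeros(200); moving[95:135] = 1.0   # same region, shifted by +15
moving += 0.1 * rng.normal(size=200)

thresh, shift = 0.5, 0
for _ in range(5):
    seg = moving > thresh                                # segmentation step
    overlaps = [np.sum(fixed * np.roll(seg, -s)) for s in range(-30, 31)]
    shift = -30 + int(np.argmax(overlaps))               # registration step
    aligned = np.roll(moving, -shift)
    inside, outside = aligned[fixed > 0], aligned[fixed == 0]
    thresh = (inside.mean() + outside.mean()) / 2        # refine via alignment
print("recovered shift:", shift)                         # 15
```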
HALF-SIFT: High-Accurate Localized Features for SIFT
Kai Cordes, Oliver Müller, B. Rosenhahn, J. Ostermann
In this paper, the accuracy of feature points detected in images by the scale invariant feature transform (SIFT) is analyzed. It is shown that there is a systematic error in the feature point localization. The systematic error is caused by improper subpixel and subscale estimation, namely interpolation with a parabolic function. To avoid the systematic error, the detection of high-accurate localized features (HALF) is proposed. We present two models for the localization of a feature point in scale-space: a Gaussian and a Difference-of-Gaussians based model function. For evaluation, ground truth image data are synthesized to experimentally demonstrate the systematic error of SIFT and to show that the error is eliminated using HALF. Experiments with natural image data show that the proposed methods increase the accuracy of the feature point positions by 13.9% using the Gaussian model and by 15.6% using the Difference-of-Gaussians model.
{"title":"HALF-SIFT: High-Accurate Localized Features for SIFT","authors":"Kai Cordes, Oliver Müller, B. Rosenhahn, J. Ostermann","doi":"10.1109/CVPRW.2009.5204283","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204283","url":null,"abstract":"In this paper, the accuracy of feature points in images detected by the scale invariant feature transform (SIFT) is analyzed. It is shown that there is a systematic error in the feature point localization. The systematic error is caused by the improper subpel and subscale estimation, an interpolation with a parabolic function. To avoid the systematic error, the detection of high-accurate localized features (HALF) is proposed. We present two models for the localization of a feature point in the scale-space, a Gaussian and a Difference of Gaussians based model function. For evaluation, ground truth image data is synthesized to experimentally prove the systematic error of SIFT and to show that the error is eliminated using HALF. Experiments with natural image data show that the proposed methods increase the accuracy of the feature point positions by 13.9% using the Gaussian and by 15.6% using the Difference of Gaussians model.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124022950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
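The contrast between the two interpolations can be reproduced in a few lines. For three samples around a detected maximum, the parabola vertex gives the SIFT-style subpixel offset; applying the same vertex formula to log-samples fits a Gaussian instead and recovers the offset of a Gaussian-shaped response exactly (the log of a Gaussian is a parabola). This is a standard derivation consistent with the abstract, not the paper's code.

```python
# Three-sample subpixel peak refinement: parabolic vs. Gaussian (log-parabolic).

import numpy as np

def parabolic_offset(a, b, c):
    """Vertex of the parabola through (-1, a), (0, b), (1, c)."""
    return (a - c) / (2.0 * (a - 2.0 * b + c))

def gaussian_offset(a, b, c):
    """Same vertex formula on log-samples: exact for a Gaussian response."""
    la, lb, lc = np.log(a), np.log(b), np.log(c)
    return (la - lc) / (2.0 * (la - 2.0 * lb + lc))

true_offset, sigma = 0.3, 1.2
f = lambda x: np.exp(-(x - true_offset) ** 2 / (2 * sigma ** 2))
a, b, c = f(-1.0), f(0.0), f(1.0)

print("parabola :", parabolic_offset(a, b, c))   # ~0.267, systematically biased
print("gaussian :", gaussian_offset(a, b, c))    # 0.3, exact
```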
Automatic recognition of fingerspelled words in British Sign Language
Stephan Liwicki, M. Everingham
We investigate the problem of recognizing words from video, fingerspelled using the British Sign Language (BSL) fingerspelling alphabet. This is a challenging task since the BSL alphabet involves both hands occluding each other, and contains signs which are ambiguous from the observer's viewpoint. The main contributions of our work include: (i) recognition based on hand shape alone, not requiring motion cues; (ii) robust visual features for hand shape recognition; (iii) scalability to large lexicon recognition with no re-training. We report results on a dataset of 1,000 low-quality webcam videos of 100 words. The proposed method achieves a word recognition accuracy of 98.9%.
{"title":"Automatic recognition of fingerspelled words in British Sign Language","authors":"Stephan Liwicki, M. Everingham","doi":"10.1109/CVPRW.2009.5204291","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204291","url":null,"abstract":"We investigate the problem of recognizing words from video, fingerspelled using the British Sign Language (BSL) fingerspelling alphabet. This is a challenging task since the BSL alphabet involves both hands occluding each other, and contains signs which are ambiguous from the observer's viewpoint. The main contributions of our work include: (i) recognition based on hand shape alone, not requiring motion cues; (ii) robust visual features for hand shape recognition; (iii) scalability to large lexicon recognition with no re-training. We report results on a dataset of 1,000 low quality webcam videos of 100 words. The proposed method achieves a word recognition accuracy of 98.9%.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128666072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 132
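The claimed lexicon scalability follows from the architecture: per-frame letter classification stays fixed, and a new lexicon only changes the string-matching stage. A toy version is below, with the classifier output given rather than trained; collapse and edit_distance are illustrative helpers, not the paper's method.

```python
# Toy lexicon matching over per-frame letter decisions; extending the lexicon
# needs no re-training of the letter classifier.

def edit_distance(s, t):
    """Levenshtein distance with a rolling DP row."""
    d = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        prev, d[0] = d[0], i
        for j, ct in enumerate(t, 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (cs != ct))
    return d[-1]

def collapse(frames):
    """Merge repeated per-frame letter decisions into one letter per run."""
    out = [frames[0]]
    for ch in frames[1:]:
        if ch != out[-1]:
            out.append(ch)
    return "".join(out)

frames = "hhheelllllo"                    # noisy per-frame decisions (given)
lexicon = ["hello", "world", "yellow"]    # extend freely, no re-training
word = min(lexicon, key=lambda w: edit_distance(collapse(frames), w))
print(word)                               # hello
```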
Impact of involuntary subject movement on 3D face scans
Chris Boehnen, P. Flynn
The impact of natural movement/sway while standing still during the capture of a 3D face model for biometric applications has previously been believed to have a negligible effect on biometric performance. Utilizing a newly captured dataset, this paper demonstrates a significant negative impact of such movement while standing. Eliminating movement during the scanning process yields a 0.5 improvement in d' (a measure of the separation between correct- and incorrect-match score distributions) per 3D face region and a noticeable improvement in the match distributions. By comparing these match distributions to those in the FRGC dataset, this paper presents an argument for improving the accuracy of 3D face models by eliminating motion during the capture process.
{"title":"Impact of involuntary subject movement on 3D face scans","authors":"Chris Boehnen, P. Flynn","doi":"10.1109/CVPRW.2009.5204324","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204324","url":null,"abstract":"The impact of natural movement/sway while standing still during the capture of a 3D face model for biometric applications has previously been believed to have a negligible impact on biometric performance. Utilizing a newly captured dataset this paper demonstrates a significant negative impact of standing. A 0.5 improvement in d' (test of correct/incorrect match distribution separation) per 3D face region and noticeable improvement to match distributions are shown to result from eliminating movement during the scanning process. By comparing these match distributions to those in the FRGC dataset this paper presents an argument for improving the accuracy of 3D face models by eliminating motion during the capture process.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125480572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
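For reference, the d' statistic cited above measures the separation between genuine (correct-match) and impostor (incorrect-match) score distributions in pooled standard-deviation units. The sketch below uses synthetic scores, not the paper's data, to show how a 0.5 gap in d' can arise.

```python
# d' on synthetic match-score distributions.

import numpy as np

def d_prime(genuine, impostor):
    m1, m2 = np.mean(genuine), np.mean(impostor)
    v1, v2 = np.var(genuine, ddof=1), np.var(impostor, ddof=1)
    return abs(m1 - m2) / np.sqrt((v1 + v2) / 2.0)

rng = np.random.default_rng(7)
moving = rng.normal(0.70, 0.10, 1000)    # genuine scores, subject swaying
fixed = rng.normal(0.76, 0.10, 1000)     # genuine scores, movement eliminated
impostor = rng.normal(0.50, 0.10, 1000)

print("d' (moving):", round(d_prime(moving, impostor), 2))   # ~2.0
print("d' (fixed) :", round(d_prime(fixed, impostor), 2))    # ~2.6
```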
Scenes vs. objects: A comparative study of two approaches to context based recognition
Andrew Rabinovich, Serge J. Belongie
Contextual models play a very important role in the task of object recognition. Over the years, two kinds of contextual models have emerged: models with contextual inference based on the statistical summary of the scene (we will refer to these as scene-based context models, or SBC), and models representing the context in terms of relationships among objects in the image (object-based context, or OBC). In designing object recognition systems, it is necessary to understand the theoretical and practical properties of such approaches. This work provides an analysis of these models and evaluates two of their representatives using the LabelMe dataset. We demonstrate a considerable margin of improvement using the OBC-style approach.
{"title":"Scenes vs. objects: A comparative study of two approaches to context based recognition","authors":"Andrew Rabinovich, Serge J. Belongie","doi":"10.1109/CVPRW.2009.5204220","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204220","url":null,"abstract":"Contextual models play a very important role in the task of object recognition. Over the years, two kinds of contextual models have emerged: models with contextual inference based on the statistical summary of the scene (we will refer to these as scene based context models, or SBC), and models representing the context in terms of relationships among objects in the image (object based context, or OBC). In designing object recognition systems, it is necessary to understand the theoretical and practical properties of such approaches. This work provides an analysis of these models and evaluates two of their representatives using the LabelMe dataset. We demonstrate a considerable margin of improvement using the OBC style approach.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125565794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
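An OBC model can be caricatured as joint decoding over image segments: per-segment detector scores plus a pairwise co-occurrence bonus, maximized jointly rather than per segment. All scores below are invented for illustration; the point is that context can flip a locally preferred label.

```python
# Minimal object-based-context decoding over two segments.

from itertools import product

labels = ("sky", "water", "boat")
unary = (  # detector score per segment (index) and label (invented)
    {"sky": 2.0, "water": 1.6, "boat": 0.1},   # segment 0: large top region
    {"sky": 0.9, "water": 0.3, "boat": 0.8},   # segment 1: small blob
)
cooc = {("sky", "boat"): 1.0, ("water", "boat"): 1.2, ("sky", "water"): 0.8}

def pair_bonus(a, b):
    """Symmetric co-occurrence bonus for a pair of labels."""
    return cooc.get((a, b), cooc.get((b, a), 0.0))

best = max(product(labels, repeat=2),
           key=lambda L: unary[0][L[0]] + unary[1][L[1]] + pair_bonus(*L))
print(best)   # ('sky', 'boat'): context flips segment 1 from 'sky' to 'boat'
```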
Audio-visual speech synchronization detection using a bimodal linear prediction model
Kshitiz Kumar, Jirí Navrátil, E. Marcheret, V. Libal, G. Ramaswamy, G. Potamianos
In this work, we study the problem of detecting audio-visual (AV) synchronization in video segments containing a speaker in frontal head pose. The problem holds important applications in biometrics, for example spoofing detection, and it constitutes an important step in AV segmentation necessary for deriving AV fingerprints in multimodal speaker recognition. To attack the problem, we propose a time-evolution model for AV features and derive an analytical approach to capture the notion of synchronization between them. We report results on an appropriate AV database, using two types of visual features extracted from the speaker's facial area: geometric ones and features based on the discrete cosine image transform. Our results demonstrate that the proposed approach provides substantially better AV synchrony detection over a baseline method that employs mutual information, with the geometric visual features outperforming the image transform ones.
{"title":"Audio-visual speech synchronization detection using a bimodal linear prediction model","authors":"Kshitiz Kumar, Jirí Navrátil, E. Marcheret, V. Libal, G. Ramaswamy, G. Potamianos","doi":"10.1109/CVPRW.2009.5204303","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204303","url":null,"abstract":"In this work, we study the problem of detecting audio-visual (AV) synchronization in video segments containing a speaker in frontal head pose. The problem holds important applications in biometrics, for example spoofing detection, and it constitutes an important step in AV segmentation necessary for deriving AV fingerprints in multimodal speaker recognition. To attack the problem, we propose a time-evolution model for AV features and derive an analytical approach to capture the notion of synchronization between them. We report results on an appropriate AV database, using two types of visual features extracted from the speaker's facial area: geometric ones and features based on the discrete cosine image transform. Our results demonstrate that the proposed approach provides substantially better AV synchrony detection over a baseline method that employs mutual information, with the geometric visual features outperforming the image transform ones.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123318752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
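The gist of a linear-prediction synchrony test (a sketch consistent with the abstract, not the authors' model): regress audio features on a short window of visual features and use the prediction residual as the score; a time-shifted, out-of-sync track predicts much worse. All signals below are synthetic.

```python
# Least-squares prediction of audio from lagged visual features as a
# synchrony score on synthetic signals.

import numpy as np

rng = np.random.default_rng(3)
T = 500
visual = rng.normal(size=T)                    # e.g., a mouth-opening measure
audio = np.convolve(visual, [0.6, 0.3, 0.1])[:T] + 0.05 * rng.normal(size=T)

def prediction_error(aud, vis, order=3):
    """Mean squared residual of predicting audio from lagged visual samples."""
    n = len(vis)
    X = np.stack([vis[order - 1 - k: n - k] for k in range(order)], axis=1)
    y = aud[order - 1:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.mean((y - X @ coef) ** 2))

print("in sync    :", prediction_error(audio, visual))               # small
print("shifted +10:", prediction_error(np.roll(audio, 10), visual))  # large
```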