
Latest publications: Computer Vision for Interactive and Intelligent Environment (CVIIE'05)

Generic Object Recognizer Design
Pub Date : 2005-11-17 DOI: 10.1109/CVIIE.2005.9
J. Mundy
The problem of generic object recognition is discussed in relation to the current research emphasis in the computer vision community on learning methods for classification. Mutual information is proposed as a tool for identifying the salient features of a class and as a mechanism for constructing class recognizers from an engineering design standpoint. The concept of observability is introduced to define classes that may differ from human concepts but are necessary to achieve high recognition performance.
Citations: 0
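The abstract's central tool, mutual information between a candidate feature and a class label, reduces to a small computation over their joint distribution. Below is a minimal sketch in Python (illustrative names and synthetic data, not from the paper): a feature that agrees with the class most of the time scores high, while an unrelated one scores near zero.

```python
import numpy as np

def mutual_information(feature, label):
    """Estimate I(F; C) in bits between two discrete, equal-length arrays."""
    feature, label = np.asarray(feature), np.asarray(label)
    mi = 0.0
    for f in np.unique(feature):
        p_f = np.mean(feature == f)
        for c in np.unique(label):
            p_c = np.mean(label == c)
            p_fc = np.mean((feature == f) & (label == c))
            if p_fc > 0:
                mi += p_fc * np.log2(p_fc / (p_f * p_c))
    return mi

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, 1000)             # class membership (e.g. car / not-car)
salient = labels ^ (rng.random(1000) < 0.1)   # agrees with the class ~90% of the time
noise = rng.integers(0, 2, 1000)              # feature unrelated to the class
print(mutual_information(salient, labels))    # roughly 0.5 bits
print(mutual_information(noise, labels))      # near zero
```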
Towards Ambient Projection for Intelligent Environments
Pub Date : 2005-11-17 DOI: 10.1109/CVIIE.2005.20
Rahul Sukthankar
Traditional desktop computing paradigms provide a poor interface for interacting with intelligent physical spaces. Although handheld devices are an important platform for interface agents, their displays are inadequate for many pervasive computing tasks and need to be supplemented by larger high-resolution displays. We propose the notion of augmenting indoor intelligent environments with ambient projection, where large numbers of projectors simultaneously illuminate the environment from multiple directions - analogous to the way in which ambient lighting permeates a room. Ambient projection could enable any suitable surface in an environment to be employed as a display device. Using such displays, the intelligent environment could present high-resolution information, proactively alert users who are not carrying handheld devices and annotate objects in the environment without instrumentation. Several challenges must be solved before such projected displays become a practical solution. This paper provides an overview of our research in computer vision for enabling interactive ambient projected displays.
Citations: 12
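Turning an arbitrary planar surface into a display typically requires pre-warping the projected content so it appears rectified despite an oblique projector. The paper does not prescribe an implementation; the sketch below shows one standard building block under assumed conditions: a homography fitted with OpenCV to four made-up corner correspondences between a calibration camera's view of the wall and projector pixel space.

```python
import cv2
import numpy as np

# Where a calibration camera observed the projector's four corner pixels on
# the wall (made-up values), paired with their projector-space coordinates.
projector_pts = np.float32([[0, 0], [1023, 0], [1023, 767], [0, 767]])
observed_pts = np.float32([[80, 60], [950, 90], [920, 700], [60, 680]])

# Homography taking camera observations of the surface to projector pixels.
H, _ = cv2.findHomography(observed_pts, projector_pts)

# Pre-warping content with H makes it land rectified on the surface despite
# the oblique projector; a synthetic frame stands in for real content.
frame = np.full((768, 1024, 3), 32, np.uint8)
cv2.putText(frame, "ambient display", (250, 400),
            cv2.FONT_HERSHEY_SIMPLEX, 2.0, (255, 255, 255), 3)
prewarped = cv2.warpPerspective(frame, H, (1024, 768))
cv2.imwrite("prewarped.png", prewarped)
```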
Representing Actions of Objects in Intelligent Environments
Pub Date : 2005-11-17 DOI: 10.1109/CVIIE.2005.17
T. Syeda-Mahmood
A first step towards an understanding of the semantic content in a video is the reliable detection and recognition of actions performed by objects in the environment. This is a difficult problem due to the enormous variability in an action's appearance when seen from different viewpoints and/or at different times. In this paper we present a novel approach that models actions as specific types of 3D objects. Specifically, we observe that any action can be represented as a generalized cylinder, called the action cylinder. Visualizing actions as objects allows rigid, articulated, and non-rigid actions to all be modeled in a uniform framework.
Citations: 0
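The action-cylinder idea can be pictured by stacking an object's outline from each frame along a time axis, so the sequence of 2D contours sweeps out a surface in (x, y, t). A toy sketch of that construction follows (synthetic contours; not the paper's data or exact representation):

```python
import numpy as np

def action_cylinder(contours):
    """Stack per-frame 2D contours along a time axis into an (x, y, t)
    point cloud -- a crude stand-in for a generalized 'action cylinder'.
    `contours` is a list of (N_i, 2) arrays, one per frame."""
    slices = []
    for t, contour in enumerate(contours):
        contour = np.asarray(contour, dtype=float)
        ts = np.full((len(contour), 1), float(t))
        slices.append(np.hstack([contour, ts]))
    return np.vstack(slices)

# Toy example: a circular outline drifting right over 10 frames, as a
# moving object's silhouette boundary might.
theta = np.linspace(0, 2 * np.pi, 50, endpoint=False)
frames = [np.column_stack([np.cos(theta) + 0.3 * t, np.sin(theta)])
          for t in range(10)]
cloud = action_cylinder(frames)
print(cloud.shape)  # (500, 3): points on the cylinder's surface in (x, y, t)
```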
Integrating Communication with Interaction: Computer Vision Challenges for Interactive and Intelligent Environments
Pub Date : 2005-11-17 DOI: 10.1109/CVIIE.2005.10
J. Cooperstock
Interactive, Intelligent Environments involve a convergence of various research themes, including high-fidelity visualization, communication, gestural expression, and virtualized reality systems. Recent advances in real-time acquisition, transmission, and rendering of multimodal data (e.g. audio, video, haptic) allow for the synthesis of significantly better perceptual representations of a virtual or real (e.g. remote) environment than were previously possible. Furthermore, increased computational power permits the synthesis of a rich responsive media space that responds to a large number of participants engaged in a complex, expressive activity. Unfortunately, current systems tend to concentrate almost exclusively on one aspect or the other, supporting the representation of and interaction with a virtual world, or supporting distributed human communication, but never both. The ideal interactive intelligent environment is one that permits effective distributed human-human communication among large numbers of participants at multiple locations, simultaneously with data visualization capabilities and interaction with dynamic, synthetic objects. A significant challenge for the next generation of such environments is to develop the necessary physical infrastructures and software architectures that combine these capabilities appropriately.
Citations: 1
Calibrating Visual Sensors and Actuators in Distributed Platforms
Pub Date : 2005-11-17 DOI: 10.1109/CVIIE.2005.2
E. Horster, R. Lienhart, Walter Kellermann, J. Bouguet
Many novel multimedia, home entertainment, visual surveillance and health applications use multiple audio-visual sensors and actuators. In this paper we present a novel approach for position and pose calibration of visual sensors and actuators, i.e. cameras and displays, in a distributed network of general purpose computing devices. It complements our work on position calibration of audio sensors and actuators in a distributed computing platform [14]. The approach is suitable for a wide range of possible - even mobile - setups since (a) synchronization is not required, (b) it works automatically, (c) only weak restrictions are imposed on the positions of the cameras and displays, and (d) no upper limit on the number of cameras and displays under calibration is imposed. Corresponding points across different camera images are established automatically and found with subpixel accuracy. Cameras do not have to share one common view. Only a reasonable overlap between camera subgroups is necessary. The method has been successfully tested in numerous multi-camera environments with a varying number of cameras and displays and has proven to work extremely accurately.
Citations: 0
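Once corresponding points between two overlapping cameras are available, their relative pose follows from standard epipolar geometry. The sketch below is not the authors' algorithm; it illustrates that underlying building block with OpenCV's essential-matrix routines on synthetic correspondences and assumed intrinsics.

```python
import cv2
import numpy as np

# Assumed shared intrinsics and synthetic 3D points visible to both cameras.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
rng = np.random.default_rng(1)
pts3d = rng.uniform([-1, -1, 4], [1, 1, 8], (100, 3))

def project(R, t, X):
    x = (K @ (R @ X.T + t)).T
    return (x[:, :2] / x[:, 2:]).astype(np.float32)

a = 0.3                                        # second camera yawed by 0.3 rad
R_true = np.array([[np.cos(a), 0, np.sin(a)],
                   [0, 1, 0],
                   [-np.sin(a), 0, np.cos(a)]])
t_true = np.array([[1.0], [0.0], [0.0]])
pts1 = project(np.eye(3), np.zeros((3, 1)), pts3d)   # stand-in for matched features
pts2 = project(R_true, t_true, pts3d)

# Essential matrix from the correspondences, then the relative pose.
E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
print(np.allclose(R, R_true, atol=1e-2), t.ravel())  # t recovered only up to scale
```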
Situated Observation of Human Activity
Pub Date : 2005-11-17 DOI: 10.1109/CVIIE.2005.18
J. Crowley
Many human activities follow a loosely defined script in which individuals assume roles. Encoding such scripts in a formal representation makes it possible to build systems that observe and understand human activity. In this paper, we first present a conceptual framework in which scripts for human activity are described as scenarios composed of actors and objects within a network of situations. We provide formal definitions for the underlying concepts for situation models, and then propose a layered, component-based software architecture model for constructing systems to observe human activity. Both the conceptual framework and architectural model are illustrated with a system for real-time composition of synchronized audio-video streams for recording activity within a meeting or lecture.
Citations: 12
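One way to picture the paper's situation models is as a small state machine: each situation names the roles entities can fill and the relations that must hold, and observed events move the system through the network. The data structures below are a hypothetical rendering of those concepts, not the authors' formal definitions.

```python
from dataclasses import dataclass, field

@dataclass
class Situation:
    name: str
    roles: set[str]                    # e.g. {"lecturer", "audience"}
    relations: set[str]                # predicates that define the situation
    transitions: dict[str, str] = field(default_factory=dict)  # event -> next

@dataclass
class Scenario:
    situations: dict[str, Situation]
    current: str

    def observe(self, event: str) -> None:
        """Advance through the situation network when an event is observed."""
        nxt = self.situations[self.current].transitions.get(event)
        if nxt is not None:
            self.current = nxt

# Hypothetical lecture script with two situations.
lecture = Scenario(
    situations={
        "presenting": Situation("presenting", {"lecturer", "audience"},
                                {"at(lecturer, podium)"},
                                {"question_raised": "discussing"}),
        "discussing": Situation("discussing", {"lecturer", "questioner"},
                                {"facing(lecturer, questioner)"},
                                {"question_answered": "presenting"}),
    },
    current="presenting",
)
lecture.observe("question_raised")
print(lecture.current)  # discussing
```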
Markerless Motion Capture using Multiple Cameras
Pub Date : 2005-11-17 DOI: 10.1109/CVIIE.2005.13
A. Sundaresan, R. Chellappa
Motion capture has important applications in different areas such as biomechanics, computer animation, and human-computer interaction. Current motion capture methods use passive markers that are attached to different body parts of the subject and are therefore intrusive in nature. In applications such as pathological human movement analysis, these markers may introduce an unknown artifact in the motion, and are, in general, cumbersome. We present computer vision based methods for performing markerless human motion capture. We model the human body as a set of super-quadrics connected in an articulated structure and propose algorithms to estimate the parameters of the model from video sequences. We compute a volume data (voxel) representation from the images and combine a bottom-up approach with a top-down approach guided by our knowledge of the model. We propose a tracking algorithm that uses this model to track human pose. The tracker uses an iterative framework akin to an Iterated Extended Kalman Filter to estimate articulated human motion using multiple cues that combine both spatial and temporal information in a novel manner. We provide preliminary results using data collected from 8-16 cameras. The emphasis of our work is on models and algorithms that are able to scale with respect to the requirement for accuracy. Our ultimate objective is to build an end-to-end system that can integrate the above mentioned components into a completely automated markerless motion capture system.
Citations: 75
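The voxel representation mentioned in the abstract is commonly computed by space carving: a voxel survives only if it projects inside every camera's silhouette of the subject. A minimal sketch of that idea (assumed 3x4 projection matrices and boolean masks; not the authors' exact pipeline):

```python
import numpy as np

def carve_voxels(grid_pts, cameras, silhouettes):
    """Keep voxels whose projections fall inside every silhouette.
    `cameras` holds 3x4 projection matrices; `silhouettes` are (H, W) bool masks."""
    keep = np.ones(len(grid_pts), dtype=bool)
    homog = np.hstack([grid_pts, np.ones((len(grid_pts), 1))])
    for P, mask in zip(cameras, silhouettes):
        x = homog @ P.T                              # project all voxels at once
        u = (x[:, 0] / x[:, 2]).round().astype(int)
        v = (x[:, 1] / x[:, 2]).round().astype(int)
        h, w = mask.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(grid_pts), dtype=bool)
        hit[inside] = mask[v[inside], u[inside]]
        keep &= hit                                  # intersect across cameras
    return grid_pts[keep]

# Toy demo: one camera looking down +z at a square silhouette.
K = np.array([[32.0, 0, 32], [0, 32.0, 32], [0, 0, 1]])
P = K @ np.hstack([np.eye(3), np.array([[0.0], [0.0], [3.0]])])
mask = np.zeros((64, 64), dtype=bool)
mask[16:48, 16:48] = True
grid = np.stack(np.meshgrid(*[np.linspace(-1, 1, 16)] * 3,
                            indexing="ij"), -1).reshape(-1, 3)
print(len(grid), "->", len(carve_voxels(grid, [P], [mask])))
```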
Model-Based 3D Object Tracking Using an Extended-Extended Kalman Filter and Graphics Rendered Measurements
Pub Date : 2005-11-17 DOI: 10.1109/CVIIE.2005.14
Hua Yang, G. Welch
This paper presents a model-based 3D object tracking system that uses an improved Extended Kalman filter (EKF) with graphics rendering as the measurement function. During tracking, features are automatically selected from the input images. For each camera, an estimated observation and multiple perturbed observations are rendered for the object. Corresponding features are extracted from the sample images, and their estimated/perturbed measurements are acquired. These sample measurements and the real measurements of the features are then sent to an extended EKF (EEKF). Finally, the EEKF uses the sample measurements to compute high order approximations of the nonlinear measurement functions, and updates the state estimate of the object in an iterative form. The system is scalable to different types of renderable models and measurable features. We present results showing that the approach can be used to track a rigid object, from multiple views, in real-time.
Citations: 8
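The key move, rendering perturbed observations around the state estimate, lets the filter treat the graphics pipeline as an opaque measurement function. The sketch below substitutes a plain first-order finite-difference Jacobian for the paper's higher-order approximation, with a trivial stand-in renderer, just to show the shape of such an update.

```python
import numpy as np

def ekf_update_rendered(x, P, z, R, render, eps=1e-4):
    """EKF measurement update where the measurement function is an opaque
    renderer; the Jacobian comes from rendering perturbed states."""
    z_hat = render(x)
    H = np.zeros((len(z_hat), len(x)))
    for i in range(len(x)):                # one perturbed render per state dim
        dx = np.zeros_like(x)
        dx[i] = eps
        H[:, i] = (render(x + dx) - z_hat) / eps
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ (z - z_hat)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Trivial stand-in "renderer": observes the state's 2D position directly.
render = lambda s: np.array([s[0], s[1]])
x, P = np.zeros(2), np.eye(2)
z, R = np.array([1.0, 0.5]), 0.01 * np.eye(2)
x, P = ekf_update_rendered(x, P, z, R, render)
print(x)   # state pulled toward the measurement
```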
Integrating Motion and Illumination Models for 3D Tracking
Pub Date : 2005-11-17 DOI: 10.1109/CVIIE.2005.11
A. Roy-Chowdhury, Yilei Xu
One of the persistent challenges in computer vision has been tracking objects under varying lighting conditions. In this paper we present a method for estimation of 3D motion of a rigid object from a monocular video sequence under arbitrary changes in the illumination conditions under which the video was captured. This is achieved by alternately estimating motion and illumination parameters using a generative model for integrating the effects of motion, illumination and structure within a unified mathematical framework. The motion is represented in terms of translation and rotation of the object centroid, and the illumination is represented using a spherical harmonics linear basis. The method does not assume any model for the variation of the illumination conditions - lighting can change slowly or drastically. For the multi-camera tracking scenario, we propose a new photometric constraint that is valid over the overlapping field of view between two cameras. This is similar in nature to the well-known epipolar constraint, except that it relates the photometric parameters, and can provide an additional constraint for illumination invariant multi-camera tracking. We demonstrate the effectiveness of our tracking algorithm on single and multi-camera video sequences under severe changes of lighting conditions.
Citations: 3
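Under the abstract's linear model, a frame of the object is a linear combination of its nine spherical-harmonics basis images, so the lighting coefficients follow from least squares once the basis is known. A sketch of that single step (synthetic basis; the alternating motion-estimation half of the method is omitted):

```python
import numpy as np

def estimate_lighting(basis_images, frame):
    """Least-squares spherical-harmonics lighting coefficients: the frame is
    modeled as a linear combination of the object's SH basis images.
    Shapes: basis (9, H, W), frame (H, W)."""
    B = basis_images.reshape(len(basis_images), -1).T   # pixels x 9
    y = frame.ravel()
    coeffs, *_ = np.linalg.lstsq(B, y, rcond=None)
    return coeffs

# Synthetic check: build a frame from known coefficients and recover them.
rng = np.random.default_rng(2)
basis = rng.random((9, 32, 32))
true_l = rng.random(9)
frame = np.tensordot(true_l, basis, axes=1)
print(np.allclose(estimate_lighting(basis, frame), true_l))  # True
```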
PICO: Privacy through Invertible Cryptographic Obscuration
Pub Date : 2005-11-17 DOI: 10.1109/CVIIE.2005.16
T. Boult
Significant research progress has been made in intelligent imaging systems, surveillance and biometrics, improving robustness, increasing performance and decreasing cost. As a result, deployment of surveillance and intelligent video systems is booming, increasing their impact on privacy. For many, networked intelligent video systems, especially video surveillance and biometrics, epitomize the invasion of privacy by an Orwellian "big brother". While tens of millions in government funding have been spent on research improving video surveillance, virtually none has been invested in technologies to enhance privacy or effectively balance privacy and security. This paper presents an example that demonstrates how, by using and adapting cryptographic ideas and combining them with intelligent video processing, technological approaches can provide solutions addressing these critical trade-offs, potentially improving both security and privacy. After reviewing previous research in privacy-improving technology in video systems, the paper presents cryptographically invertible obscuration. This is an application of encryption techniques to improve the privacy aspects while allowing general surveillance to continue and allowing full access (i.e. violation of privacy) only with use of a decryption key, maintained by a court or other third party.
Citations: 136
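The invertible-obscuration idea can be illustrated with off-the-shelf symmetric encryption: scramble only the privacy-sensitive pixels, keep the ciphertext, and let a key holder restore them exactly. The sketch below uses the Python cryptography package's Fernet as a stand-in cipher; the region coordinates and key-escrow handling are illustrative, not the paper's scheme.

```python
import numpy as np
from cryptography.fernet import Fernet

key = Fernet.generate_key()            # escrowed with a court or other third party
cipher = Fernet(key)

frame = np.random.default_rng(3).integers(0, 256, (480, 640, 3), dtype=np.uint8)
y0, y1, x0, x1 = 100, 200, 250, 350    # hypothetical detected face region

region = frame[y0:y1, x0:x1].copy()
token = cipher.encrypt(region.tobytes())   # ciphertext stored alongside the video
frame[y0:y1, x0:x1] = 0                    # public stream shows a blanked region

# Only a key holder can invert the obscuration, bit-exactly.
restored = np.frombuffer(cipher.decrypt(token), dtype=np.uint8)
assert np.array_equal(restored.reshape(region.shape), region)
```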