
Latest publications from the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

Tracking multiple pedestrians in real-time using kinematics
S. Apewokin, B. Valentine, M. R. Bales, L. Wills, D. S. Wills
We present an algorithm for real-time tracking of multiple pedestrians in a dynamic scene. The algorithm is targeted at embedded systems and reduces computational and storage costs by using an inexpensive kinematic tracking model with only fixed-point arithmetic representations. Our algorithm leverages the observation that pedestrians in a dynamic scene tend to move with uniform speed over a small number of consecutive frames. We use a multimodal background modeling technique to accurately segment the foreground (moving people) from the background. We then use connectivity analysis to identify blobs in the foreground and calculate the center of mass of each blob. Finally, we establish correspondence between the center of mass of each blob in the current frame and the center-of-mass information gathered from the two immediately preceding frames. We evaluate our algorithm on a real outdoor video sequence taken with an inexpensive webcam. Our implementation successfully tracks each pedestrian from frame to frame in real time. Our algorithm performs well in challenging situations resulting from occlusion and crowded conditions, running on an eBox-2300 Thin Client VESA PC.
{"title":"Tracking multiple pedestrians in real-time using kinematics","authors":"S. Apewokin, B. Valentine, M. R. Bales, L. Wills, D. S. Wills","doi":"10.1109/CVPRW.2008.4563149","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563149","url":null,"abstract":"We present an algorithm for real-time tracking of multiple pedestrians in a dynamic scene. The algorithm is targeted for embedded systems and reduces computational and storage costs by using an inexpensive kinematic tracking model with only fixed-point arithmetic representations. Our algorithm leverages from the observation that pedestrians in a dynamic scene tend to move with uniform speed over a small number of consecutive frames. We use a multimodal background modeling technique to accurately segment the foreground (moving people) from the background. We then use connectivity analysis to identify blobs in the foreground and calculate the center of mass of each blob. Finally, we establish correspondence between the center of mass of each blob in the current frame with center of mass information gathered from the two immediately preceding frames. We evaluate our algorithm on a real outdoor video sequence taken with an inexpensive webcam. Our implementation successfully tracks each pedestrian from frame to frame in real-time. Our algorithm performs well in challenging situations resulting from occlusion and crowded conditions, running on an eBox-2300 Thin Client VESA PC.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129846784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 14
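As a rough illustration of the correspondence step described in the abstract, here is a minimal Python/NumPy sketch of constant-velocity prediction from the two preceding frames followed by greedy centroid matching. It uses floating point rather than the paper's fixed-point arithmetic, and all names (`predict_positions`, `match_blobs`, `max_dist`) are ours, not the authors'.

```python
import numpy as np

def predict_positions(prev2, prev1):
    """Constant-velocity prediction: extrapolate each track's next
    centroid from its centers of mass in the two preceding frames."""
    return prev1 + (prev1 - prev2)

def match_blobs(predicted, current, max_dist=30.0):
    """Greedily assign current-frame blob centroids to the predicted
    track positions, nearest first per track."""
    matches, free = {}, set(range(len(current)))
    for t, p in enumerate(predicted):
        if not free:
            break
        dists = {j: np.linalg.norm(current[j] - p) for j in free}
        j = min(dists, key=dists.get)
        if dists[j] <= max_dist:
            matches[t] = j
            free.remove(j)
    return matches  # track index -> blob index in the current frame

# Toy example: two pedestrians moving with roughly uniform speed.
prev2 = np.array([[10.0, 10.0], [100.0, 50.0]])
prev1 = np.array([[14.0, 10.0], [97.0, 52.0]])
current = np.array([[94.1, 54.2], [18.2, 9.9]])
print(match_blobs(predict_positions(prev2, prev1), current))  # {0: 1, 1: 0}
```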
Improving RANSAC for fast landmark recognition
Pablo Márquez-Neila, Jacobo Garcia Miro, J. M. Buenaposada, L. Baumela
We introduce a procedure for recognizing and locating planar landmarks for mobile robot navigation, based on the detection and recognition of a set of interest points. We use RANSAC to fit a homography and locate the landmark. Our main contribution is the introduction of a geometric constraint that reduces the number of RANSAC iterations by discarding minimal subsets. In the experiments conducted, we conclude that this constraint improves RANSAC performance, reducing the number of iterations by about 35% and 75% for affine and projective cameras, respectively.
{"title":"Improving RANSAC for fast landmark recognition","authors":"Pablo Márquez-Neila, Jacobo Garcia Miro, J. M. Buenaposada, L. Baumela","doi":"10.1109/CVPRW.2008.4563138","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563138","url":null,"abstract":"We introduce a procedure for recognizing and locating planar landmarks for mobile robot navigation, based in the detection and recognition of a set of interest points. We use RANSAC for fitting a homography and locating the landmark. Our main contribution is the introduction of a geometrical constraint that reduces the number of RANSAC iterations by discarding minimal subsets. In the experiments conducted we conclude that this constraint increases RANSAC performance by reducing in about 35% and 75% the number of iterations for affine and projective cameras, respectively.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129900544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 20
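The sketch below shows where such a pre-check slots into a RANSAC homography loop: a cheap geometric test rejects a minimal 4-point subset before the costly fit. The sidedness-preservation test used here is a stand-in of ours, not the paper's exact constraint.

```python
import numpy as np

def side(p, a, b):
    """Sign telling which side of the directed line a->b the point p lies on."""
    return np.sign((b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]))

def plausible_subset(src, dst):
    """Cheap pre-check on a minimal 4-point subset: a homography of a
    planar landmark should preserve, for each point, the side of the
    line through two of the others. Subsets that violate this are
    discarded before fitting. (A stand-in for the paper's constraint.)"""
    triples = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
    return all(side(src[k], src[i], src[j]) == side(dst[k], dst[i], dst[j])
               for i, j, k in triples)

def fit_homography(src, dst):
    """Direct linear transform from four correspondences."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    return np.linalg.svd(np.array(A))[2][-1].reshape(3, 3)

def ransac_homography(src, dst, iters=500, thresh=3.0, seed=0):
    rng = np.random.default_rng(seed)
    best_H, best_inliers = None, 0
    for _ in range(iters):
        s = rng.choice(len(src), 4, replace=False)
        if not plausible_subset(src[s], dst[s]):
            continue  # rejected without fitting: this is where time is saved
        H = fit_homography(src[s], dst[s])
        proj = (H @ np.c_[src, np.ones(len(src))].T).T
        proj = proj[:, :2] / proj[:, 2:3]
        inliers = int(np.sum(np.linalg.norm(proj - dst, axis=1) < thresh))
        if inliers > best_inliers:
            best_H, best_inliers = H, inliers
    return best_H, best_inliers
```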
A stable optic-flow based method for tracking colonoscopy images
Jianfei Liu, K. Subramanian, T. Yoo, R. V. Uitert
In this paper, we focus on the robustness and stability of our algorithm for plotting the position of an endoscopic camera (during a colonoscopy procedure) on the corresponding pre-operative CT scan of the patient. The colon has few topological landmarks, in contrast to bronchoscopy images, where a number of registration algorithms have taken advantage of features such as anatomical marks or bifurcations. Our method estimates the camera motion from the optic flow computed from the information contained in the video stream. Optic-flow computation is notoriously susceptible to errors in estimating the motion field. Our method relies on the following features to counter this: (1) we use a small but reliable set of feature points (a sparse optic-flow field) to determine the spatio-temporal scale at which to perform optic-flow computation in each frame of the sequence; (2) the chosen scales are used to compute a more accurate dense optic-flow field, which is used to compute qualitative parameters relating to the main motion direction; and (3) the sparse optic-flow field and the main motion parameters are then combined to estimate the camera parameters. A mathematical analysis of our algorithm is presented to illustrate the stability of our method, along with a comparison with existing motion estimation algorithms. We present preliminary results of using this algorithm on both a virtual colonoscopy image sequence and a colon phantom image sequence.
{"title":"A stable optic-flow based method for tracking colonoscopy images","authors":"Jianfei Liu, K. Subramanian, T. Yoo, R. V. Uitert","doi":"10.1109/CVPRW.2008.4562990","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4562990","url":null,"abstract":"In this paper, we focus on the robustness and stability of our algorithm to plot the position of an endoscopic camera (during a colonoscopy procedure) on the corresponding pre-operative CT scan of the patient. The colon has few topological landmarks, in contrast to bronchoscopy images, where a number of registration algorithms have taken advantage of features such as anatomical marks or bifurcations. Our method estimates the camera motion from the optic-flow computed from the information contained in the video stream. Optic-flow computation is notoriously susceptible to errors in estimating the motion field. Our method relies on the following features to counter this, (1) we use a small but reliable set of feature points (sparse optic-flow field) to determine the spatio-temporal scale at which to perform optic-flow computation in each frame of the sequence, (2) the chosen scales are used to compute a more accurate dense optic flow field, which is used to compute qualitative parameters relating to the main motion direction, and (3) the sparse optic-flow field and the main motion parameters are then combined to estimate the camera parameters. A mathematical analysis of our algorithm is presented to illustrate the stability of our method, as well as comparison to existing motion estimation algorithms. We present preliminary results of using this algorithm on both a virtual colonoscopy image sequence, as well as a colon phantom image sequence.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130333766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 22
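For readers unfamiliar with the sparse-flow front end such a pipeline builds on, here is a minimal sketch using OpenCV's pyramidal Lucas-Kanade tracker. It covers only step (1) of the abstract, a small but reliable set of tracked feature points; the paper's spatio-temporal scale selection and camera-parameter estimation are not reproduced.

```python
import cv2
import numpy as np

def sparse_flow(prev_gray, gray, max_corners=200):
    """Track a small but reliable set of feature points between two
    grayscale frames with pyramidal Lucas-Kanade."""
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                  qualityLevel=0.01, minDistance=8)
    nxt, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    ok = status.ravel() == 1
    return pts[ok].reshape(-1, 2), nxt[ok].reshape(-1, 2)

def main_motion_direction(p0, p1):
    """Qualitative main motion parameter: the median flow vector is a
    robust summary even when some individual matches are wrong."""
    return np.median(p1 - p0, axis=0)
```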
Visual map matching and localization using a global feature map
O. Pink
This paper presents a novel method to support environmental perception of mobile robots through the use of a global feature map. While typical approaches to simultaneous localization and mapping (SLAM) mainly rely on an on-board camera for mapping, our approach uses geographically referenced aerial or satellite images to build a map in advance. The current position on the map is determined by matching features from the on-board camera to the global feature map. The feature-matching problem is posed as a standard point pattern matching problem, and a solution using the iterative closest point method is given. The proposed algorithm is designed for use in a street vehicle and uses lane markings as features, but it can be adapted to almost any other type of feature that is visible in aerial images. Our approach allows the robot position to be estimated with higher precision than purely GPS-based localization, while at the same time providing information about the environment far beyond the current field of view.
{"title":"Visual map matching and localization using a global feature map","authors":"O. Pink","doi":"10.1109/CVPRW.2008.4563135","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563135","url":null,"abstract":"This paper presents a novel method to support environmental perception of mobile robots by the use of a global feature map. While typical approaches to simultaneous localization and mapping (SLAM) mainly rely on an on-board camera for mapping, our approach uses geographically referenced aerial or satellite images to build a map in advance. The current position on the map is determined by matching features from the on-board camera to the global feature map. The problem of feature matching is posed as a standard point pattern matching problem and a solution using the iterative closest point method is given. The proposed algorithm is designed for use in a street vehicle and uses lane markings as features, but can be adapted to almost any other type of feature that is visible in aerial images. Our approach allows for estimating the robot position at a higher precision than by a purely GPS-based localization, while at the same time providing information about the environment far beyond the current field of view.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126836194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 84
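To make the point pattern matching step concrete, here is a minimal 2D iterative-closest-point sketch in the spirit the abstract describes: observed feature points are rigidly aligned to the global map by alternating nearest-neighbor association and an SVD-based rigid fit. All names are ours, and the paper's actual formulation may differ.

```python
import numpy as np

def icp_2d(src, dst, iters=20):
    """Minimal 2D ICP: rigidly align observed feature points (src,
    e.g. lane markings seen by the on-board camera) to the global
    feature map (dst)."""
    R, t = np.eye(2), np.zeros(2)
    cur = np.asarray(src, dtype=float).copy()
    dst = np.asarray(dst, dtype=float)
    for _ in range(iters):
        # 1. closest map point for each observed point
        d = np.linalg.norm(cur[:, None, :] - dst[None, :, :], axis=2)
        nn = dst[np.argmin(d, axis=1)]
        # 2. best rigid transform via SVD of the cross-covariance
        mc, mn = cur.mean(axis=0), nn.mean(axis=0)
        U, _, Vt = np.linalg.svd((cur - mc).T @ (nn - mn))
        Ri = Vt.T @ U.T
        if np.linalg.det(Ri) < 0:   # guard against reflections
            Vt[-1] *= -1
            Ri = Vt.T @ U.T
        ti = mn - Ri @ mc
        cur = cur @ Ri.T + ti
        R, t = Ri @ R, Ri @ t + ti  # accumulate: map_pt ~ R @ src_pt + t
    return R, t
```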
Experiments on visual loop closing using vocabulary trees
Ankita Kumar, J. Tardif, Roy Anati, Kostas Daniilidis
In this paper we study the problem of visual loop closing for long trajectories in an urban environment. We use GPS positioning only to narrow down the search area and use pre-built vocabulary trees to find the best matching image in this search area. Geometric consistency is then used to prune out the bad matches. We compare several vocabulary trees on a sequence of 6.5 kilometers. We experiment with hierarchical k-means-based trees as well as extremely randomized trees and compare results obtained using five different trees. We obtain the best results using extremely randomized trees. After enforcing geometric consistency, the matched images look promising for structure-from-motion applications.
{"title":"Experiments on visual loop closing using vocabulary trees","authors":"Ankita Kumar, J. Tardif, Roy Anati, Kostas Daniilidis","doi":"10.1109/CVPRW.2008.4563140","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563140","url":null,"abstract":"In this paper we study the problem of visual loop closing for long trajectories in an urban environment. We use GPS positioning only to narrow down the search area and use pre-built vocabulary trees to find the best matching image in this search area. Geometric consistency is then used to prune out the bad matches. We compare several vocabulary trees on a sequence of 6.5 kilometers. We experiment with hierarchical k-means based trees as well as extremely randomized trees and compare results obtained using five different trees. We obtain the best results using extremely randomized trees. After enforcing geometric consistency the matched images look promising for structure from motion applications.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114304637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 16
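Here is a minimal sketch of one of the compared structures, a hierarchical k-means vocabulary tree: descriptors are quantized root to leaf, and an image becomes a histogram over leaf words that can be compared for retrieval. This omits the paper's extremely randomized trees, TF-IDF weighting, and geometric verification; the class and parameter names are ours.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

class VocabTree:
    """Hierarchical k-means vocabulary tree: branching factor k,
    maximum depth `depth`."""

    def __init__(self, descriptors, k=4, depth=3):
        self.k, self.depth = k, depth
        self.nodes = {}  # path (tuple of child indices) -> child centroids
        self._build(np.asarray(descriptors, dtype=float), path=())

    def _build(self, data, path):
        if len(path) == self.depth or len(data) < self.k:
            return
        centroids, labels = kmeans2(data, self.k, minit='++')
        self.nodes[path] = centroids
        for c in range(self.k):
            self._build(data[labels == c], path + (c,))

    def leaf(self, d):
        """Quantize one descriptor root-to-leaf; the path is the word."""
        path = ()
        while path in self.nodes:
            c = self.nodes[path]
            path = path + (int(np.argmin(np.linalg.norm(c - d, axis=1))),)
        return path

    def histogram(self, descriptors):
        """Bag-of-visual-words histogram for one image; compare these
        (e.g. TF-IDF weighted) to rank candidate loop closures."""
        h = {}
        for d in descriptors:
            w = self.leaf(d)
            h[w] = h.get(w, 0) + 1
        return h
```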
Full orientation invariance and improved feature selectivity of 3D SIFT with application to medical image analysis
S. Allaire, John J. Kim, S. Breen, D. Jaffray, V. Pekar
This paper presents a comprehensive extension of the Scale Invariant Feature Transform (SIFT), originally introduced in 2D, to volumetric images. While tackling the significant computational effort required by such multiscale processing of large data volumes, our implementation addresses two important mathematical issues related to the 2D-to-3D extension. It includes efficient steps to filter out extracted point candidates that have low contrast or are poorly localized along edges or ridges. In addition, it achieves, for the first time, full 3D orientation invariance of the descriptors, which is essential for 3D feature matching. An application of this technique is demonstrated for the feature-based automated registration and segmentation of clinical datasets in the context of radiation therapy.
{"title":"Full orientation invariance and improved feature selectivity of 3D SIFT with application to medical image analysis","authors":"S. Allaire, John J. Kim, S. Breen, D. Jaffray, V. Pekar","doi":"10.1109/CVPRW.2008.4563023","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563023","url":null,"abstract":"This paper presents a comprehensive extension of the Scale Invariant Feature Transform (SIFT), originally introduced in 2D, to volumetric images. While tackling the significant computational efforts required by such multiscale processing of large data volumes, our implementation addresses two important mathematical issues related to the 2D-to-3D extension. It includes efficient steps to filter out extracted point candidates that have low contrast or are poorly localized along edges or ridges. In addition, it achieves, for the first time, full 3D orientation invariance of the descriptors, which is essential for 3D feature matching. An application of this technique is demonstrated to the feature-based automated registration and segmentation of clinical datasets in the context of radiation therapy.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114805382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 123
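The candidate-filtering step the abstract mentions can be illustrated with a short sketch: reject low-contrast extrema of the 3D difference-of-Gaussians volume, then reject edge/ridge responses by bounding the spread of the eigenvalues of the local 3×3 Hessian, a 3D analogue of Lowe's 2D principal-curvature ratio test. The thresholds and the exact criterion here are our assumptions, not the paper's.

```python
import numpy as np

def keep_keypoint(contrast, hessian, contrast_thresh=0.03, r=10.0):
    """Filter a 3D keypoint candidate.

    contrast: DoG value at the candidate voxel.
    hessian:  3x3 Hessian of the DoG volume at that voxel.
    Keep the candidate only if it has sufficient contrast and its
    three principal curvatures are of comparable magnitude (i.e. it
    is not stretched along an edge or ridge)."""
    if abs(contrast) < contrast_thresh:
        return False
    lam = np.abs(np.linalg.eigvalsh(hessian))
    return lam.min() > 0 and lam.max() / lam.min() < r
```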
Towards understanding what makes 3D objects appear simple or complex
S. Sukumar, D. Page, A. Koschan, M. Abidi
Humans perceive some objects as more complex than others, and learning or describing a particular object is directly related to its judged complexity. Towards the goal of understanding why the geometry of some 3D objects appears more complex than that of others, we conducted a psychophysical study and identified contributing attributes. Our experiments conclude that surface variation, symmetry, part count, decomposability into simpler parts, intricate details, and topology are six significant dimensions that influence 3D visual shape complexity. With that knowledge, we present a method of quantifying complexity and show that the informational aspect of Shannon's theory agrees with the human notion of shape complexity.
{"title":"Towards understanding what makes 3D objects appear simple or complex","authors":"S. Sukumar, D. Page, A. Koschan, M. Abidi","doi":"10.1109/CVPRW.2008.4562975","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4562975","url":null,"abstract":"Humans perceive some objects more complex than others and learning or describing a particular object is directly related to the judged complexity. Towards the goal of understanding why the geometry of some 3D objects appear more complex than others, we conducted a psychophysical study and identified contributing attributes. Our experiments conclude that surface variation, symmetry, part count, simpler part decomposability, intricate details and topology are six significant dimensions that influence 3D visual shape complexity. With that knowledge, we present a method of quantifying complexity and show that the informational aspect of Shannonpsilas theory agrees with the human notion of shape complexity.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121445787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 21
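A Shannon-style complexity score of the kind the abstract alludes to can be sketched in a few lines: histogram a per-vertex surface measure over the mesh and take the entropy of the resulting distribution. The choice of measure and binning below is our assumption, not the authors' published quantity.

```python
import numpy as np

def shape_complexity(surface_measure, bins=64):
    """Shannon entropy of a per-vertex surface measure (e.g. curvature
    or surface variation). Regular, flat shapes concentrate the
    probability mass (low entropy); intricate shapes spread it
    (high entropy)."""
    hist, _ = np.histogram(surface_measure, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]                       # 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log2(p)))
```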
CUDA cuts: Fast graph cuts on the GPU
Vibhav Vineet, P J Narayanan
Graph cuts have become a powerful and popular optimization tool for energies defined over an MRF and have found applications in image segmentation, stereo vision, image restoration, etc. The maxflow/mincut algorithm used to compute graph cuts is computationally heavy. The best-reported implementation of graph cuts takes over 100 milliseconds even on images of size 640×480 and cannot be used for real-time applications or when iterated applications are needed. The commodity Graphics Processor Unit (GPU) has recently emerged as an economical and fast computation co-processor. In this paper, we present an implementation of the push-relabel algorithm for graph cuts on the GPU. We can perform over 60 graph cuts per second on 1024×1024 images and over 150 graph cuts per second on 640×480 images on an Nvidia 8800 GTX. The time for each complete graph cut is about 1 millisecond when only a few weights change from the previous graph, as on dynamic graphs resulting from videos. The CUDA code, with a well-defined interface, can be downloaded for anyone's use.
{"title":"CUDA cuts: Fast graph cuts on the GPU","authors":"Vibhav Vineet, P J Narayanan","doi":"10.1109/CVPRW.2008.4563095","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563095","url":null,"abstract":"Graph cuts has become a powerful and popular optimization tool for energies defined over an MRF and have found applications in image segmentation, stereo vision, image restoration, etc. The maxflow/mincut algorithm to compute graph-cuts is computationally heavy. The best-reported implementation of graph cuts takes over 100 milliseconds even on images of size 640times480 and cannot be used for real-time applications or when iterated applications are needed. The commodity Graphics Processor Unit (GPU) has emerged as an economical and fast computation co-processor recently. In this paper, we present an implementation of the push-relabel algorithm for graph cuts on the GPU. We can perform over 60 graph cuts per second on 1024times1024 images and over 150 graph cuts per second on 640times480 images on an Nvidia 8800 GTX. The time for each complete graph-cut is about 1 millisecond when only a few weights change from the previous graph, as on dynamic graphs resulting from videos. The CUDA code with a well-defined interface can be downloaded for anyonepsilas use.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131366559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 291
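For readers who have not seen push-relabel before, here is a small CPU reference sketch of the algorithm the paper parallelizes (on the GPU, the push and relabel operations run as per-pixel CUDA kernels over the MRF grid). This plain-Python version is for exposition only and assumes the adjacency structure contains a (possibly zero-capacity) entry for every edge in either direction.

```python
from collections import deque

def push_relabel_maxflow(capacity, s, t):
    """Generic FIFO push-relabel max-flow on a dict-of-dicts graph."""
    n = len(capacity)
    flow = {u: {v: 0 for v in capacity[u]} for u in capacity}
    height = {u: 0 for u in capacity}
    excess = {u: 0 for u in capacity}
    height[s] = n
    active = deque()
    for v, c in capacity[s].items():      # saturate all source edges
        flow[s][v] = c
        flow[v][s] = -c
        excess[v] = c
        if c > 0 and v != t:
            active.append(v)
    while active:
        u = active.popleft()
        while excess[u] > 0:              # discharge u completely
            pushed = False
            for v in capacity[u]:
                residual = capacity[u][v] - flow[u][v]
                if residual > 0 and height[u] == height[v] + 1:
                    d = min(excess[u], residual)   # push d units u -> v
                    flow[u][v] += d
                    flow[v][u] -= d
                    excess[u] -= d
                    excess[v] += d
                    if v not in (s, t) and excess[v] == d:
                        active.append(v)           # v just became active
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:                # no admissible edge: relabel
                height[u] = 1 + min(height[v] for v in capacity[u]
                                    if capacity[u][v] - flow[u][v] > 0)
    return sum(flow[s][v] for v in capacity[s])

# Tiny 4-node example (s=0, t=3); expected max flow / min cut is 4.
cap = {0: {1: 3, 2: 2, 3: 0}, 1: {0: 0, 2: 1, 3: 2},
       2: {0: 0, 1: 0, 3: 2}, 3: {0: 0, 1: 0, 2: 0}}
print(push_relabel_maxflow(cap, 0, 3))  # 4
```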
Automatic facial expression recognition for intelligent tutoring systems
J. Whitehill, M. Bartlett, J. Movellan
This project explores the use of facial expression for automated feedback in teaching. We show how automatic real-time facial expression recognition can be effectively used to estimate the difficulty level, as perceived by an individual student, of a delivered lecture. We also show that facial expression is predictive of an individual student's preferred rate of curriculum presentation at each moment in time. On a video lecture viewing task, training on less than two minutes of recorded facial expression data and testing on a separate validation set, our system predicted the subjects' self-reported difficulty scores with a mean accuracy of 0.42 (Pearson R) and their preferred viewing speeds with a mean accuracy of 0.29. Our techniques are fully automatic and have potential applications for both intelligent tutoring systems (ITS) and standard classroom environments.
{"title":"Automatic facial expression recognition for intelligent tutoring systems","authors":"J. Whitehill, M. Bartlett, J. Movellan","doi":"10.1109/CVPRW.2008.4563182","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563182","url":null,"abstract":"This project explores the idea of facial expression for automated feedback in teaching. We show how automatic realtime facial expression recognition can be effectively used to estimate the difficulty level, as perceived by an individual student, of a delivered lecture. We also show that facial expression is predictive of an individual studentpsilas preferred rate of curriculum presentation at each moment in time. On a video lecture viewing task, training on less than two minutes of recorded facial expression data and testing on a separate validation set, our system predicted the subjectspsila self-reported difficulty scores with mean accuracy of 0:42 (Pearson R) and their preferred viewing speeds with mean accuracy of 0:29. Our techniques are fully automatic and have potential applications for both intelligent tutoring systems (ITS) and standard classroom environments.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"42 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134024172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 91
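The prediction setup can be illustrated with a small sketch: regress time-averaged outputs of per-frame expression detectors against self-reported difficulty and score with Pearson R, the accuracy measure quoted in the abstract. The feature layout and the least-squares regressor here are our assumptions; the paper's actual pipeline may differ.

```python
import numpy as np

def fit_difficulty_model(X, y):
    """Least-squares regression from expression features to difficulty.

    X: (subjects, channels) time-averaged expression detector outputs.
    y: (subjects,) self-reported difficulty scores."""
    A = np.c_[X, np.ones(len(X))]             # append a bias column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def pearson_r(pred, y):
    """Correlation between predictions and ground truth (Pearson R)."""
    return float(np.corrcoef(pred, y)[0, 1])
```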
Spectral minutiae: A fixed-length representation of a minutiae set
Hai-yun Xu, R. Veldhuis, T. Kevenaar, A. Akkermans, A. Bazen
Minutiae, which are the endpoints and bifurcations of fingerprint ridges, allow a very discriminative classification of fingerprints. However, a minutiae set is an unordered set, and the minutiae locations suffer from various deformations such as translation, rotation, and scaling. In this paper, we introduce a novel method to represent a minutiae set as a fixed-length feature vector that is invariant to translation and in which rotation and scaling become translations, so that they can be easily compensated for. By applying the spectral minutiae representation, we can combine the fingerprint recognition system with a template protection scheme, which requires a fixed-length feature vector. This paper also presents two spectral minutiae matching algorithms and shows experimental results.
{"title":"Spectral minutiae: A fixed-length representation of a minutiae set","authors":"Hai-yun Xu, R. Veldhuis, T. Kevenaar, A. Akkermans, A. Bazen","doi":"10.1109/CVPRW.2008.4563120","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563120","url":null,"abstract":"Minutiae, which are the endpoints and bifurcations of fingerprint ridges, allow a very discriminative classification of fingerprints. However, a minutiae set is an unordered set and the minutiae locations suffer from various deformations such as translation, rotation and scaling. In this paper, we introduce a novel method to represent a minutiae set as a fixed-length feature vector, which is invariant to translation, and in which rotation and scaling become translations, so that they can be easily compensated for. By applying the spectral minutiae representation, we can combine the fingerprint recognition system with a template protection scheme, which requires a fixed-length feature vector. This paper also presents two spectral minutiae matching algorithms and shows experimental results.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133987448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 54
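The key idea, a fixed-length, translation-invariant spectrum in which rotation and scaling become translations, can be sketched directly: evaluate the magnitude of the 2D Fourier transform of the minutiae point set on a polar-logarithmic frequency grid. Dropping the phase removes translation; on this grid, a rotation shifts the angle axis and a scaling shifts the log-radius axis circularly. Grid sizes and the frequency band below are our assumptions, and the paper presents further variants we do not reproduce.

```python
import numpy as np

def spectral_minutiae(minutiae, n_radius=32, n_angle=64,
                      f_min=0.02, f_max=0.6):
    """Fixed-length spectral representation of a minutiae location set.

    minutiae: (n, 2) array of (x, y) minutiae locations.
    Returns a length n_radius * n_angle magnitude spectrum."""
    xy = np.asarray(minutiae, dtype=float)
    # log-spaced radii and uniformly spaced angles (half plane, since
    # the spectrum of a real point set is symmetric)
    radii = f_min * (f_max / f_min) ** (np.arange(n_radius) / (n_radius - 1))
    angles = np.pi * np.arange(n_angle) / n_angle
    fx = radii[:, None] * np.cos(angles)[None, :]
    fy = radii[:, None] * np.sin(angles)[None, :]
    # |sum_k exp(-2*pi*i*(fx*x_k + fy*y_k))| at every grid frequency
    phase = -2j * np.pi * (fx[..., None] * xy[:, 0] + fy[..., None] * xy[:, 1])
    return np.abs(np.exp(phase).sum(axis=-1)).ravel()
```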