
2003 Conference on Computer Vision and Pattern Recognition Workshop: Latest Publications

Conformal Rectification of Omnidirectional Stereo Pairs
Pub Date : 2003-06-16 DOI: 10.1109/CVPRW.2003.10082
Christopher Geyer, Kostas Daniilidis
A pair of stereo images are said to be rectified if corresponding image points have the same y-coordinate in their respective images. In this paper we consider the rectification of two omnidirectional cameras, specifically two parabolic catadioptric cameras. Such systems consist of a parabolic mirror and an orthographically projecting lens. We show that if the image coordinates are represented as a point z in the complex plane, then the rectification is specified by coth⁻¹(z). This rectification is shown to be conformal, in that it is locally distortionless, and furthermore, it is unique up to scale and transformation. We show an experiment in which two real images have been rectified and a stereo matching performed.
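The coth⁻¹ mapping can be evaluated directly on complex image coordinates via the identity coth⁻¹(z) = 0.5 ln((z + 1)/(z - 1)). The numpy sketch below applies it; the assumption that the input points are already in normalized, calibrated coordinates is ours, not the paper's calibration procedure.

```python
import numpy as np

def rectify_parabolic(points_xy):
    """Map parabolic catadioptric image points through w = coth^-1(z).

    Each (x, y) point is treated as the complex number z = x + iy, assumed to
    be expressed in normalized, calibrated coordinates (an assumption of this
    sketch). coth^-1(z) = 0.5 * ln((z + 1) / (z - 1)), which is conformal
    away from the branch points z = +1 and z = -1.
    """
    z = points_xy[:, 0] + 1j * points_xy[:, 1]
    w = 0.5 * np.log((z + 1.0) / (z - 1.0))
    return np.column_stack([w.real, w.imag])

# After rectification, corresponding points in a stereo pair should share a y-coordinate.
print(rectify_parabolic(np.array([[0.3, 0.7], [1.8, -0.4]])))
```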
Citations: 43
Handwritten Amharic Bank Check Recognition Using Hidden Markov Random Field
Pub Date : 2003-06-16 DOI: 10.1109/CVPRW.2003.10027
W. Alemu, S. Fuchs
Amharic, a working language in Ethiopia, has its own writing system, which is totally different from that of Latin-alphabet-based languages. Amharic handwriting recognition is challenging due to the huge number of symbols, significant inter-class similarity and also intra-class variability. In this paper the application of the Hidden Markov Random Field (HMRF) to handwriting recognition of the legal amount field of Amharic bank checks is presented. The three main contributions of this paper are the following. First, a new feature extraction technique is used which tries to extract natural features as perceived by human beings. The features extracted by this technique show a significant performance improvement. Second, a classification technique that estimates the likelihood using a method known as pseudo-marginal probability is developed. The third contribution is the application of contextual information based on the syntactical structure of Amharic checks. Such context information is important in the recognition process because even humans fail to recognize symbols correctly without any context. A noticeable difference is observed between results obtained with and without the application of contextual information. On the whole, despite the huge inter-class similarity and intra-class variability of handwritten Amharic characters, attractive results are obtained.
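As a rough illustration of how syntactic context can rescore recognizer outputs (this is a generic Viterbi decode over a toy grammar, not the paper's HMRF or its pseudo-marginal likelihood), the sketch below combines invented per-position word-class likelihoods for a legal-amount field with an invented transition prior.

```python
import numpy as np

# Hypothetical per-position likelihoods from a recognizer: rows = positions in
# the legal-amount field, columns = candidate word classes (values are made up).
CLASSES = ["digit-word", "tens-word", "hundred-word", "currency-word"]
likelihood = np.array([
    [0.40, 0.35, 0.20, 0.05],
    [0.30, 0.30, 0.25, 0.15],
    [0.10, 0.15, 0.20, 0.55],
])

# Toy syntactic prior: transition scores between consecutive word classes.
transition = np.array([
    [0.2, 0.4, 0.3, 0.1],
    [0.4, 0.1, 0.3, 0.2],
    [0.4, 0.3, 0.1, 0.2],
    [0.3, 0.3, 0.3, 0.1],
])

def decode_with_context(lik, trans):
    """Viterbi decoding: pick the class sequence maximizing likelihood times context prior."""
    n, k = lik.shape
    score = np.log(lik[0]).copy()
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        cand = score[:, None] + np.log(trans) + np.log(lik[t])[None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    seq = [int(score.argmax())]
    for t in range(n - 1, 0, -1):
        seq.append(int(back[t][seq[-1]]))
    return [CLASSES[i] for i in reversed(seq)]

print(decode_with_context(likelihood, transition))
```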
Citations: 9
Reckless motion estimation from omnidirectional image and inertial measurements
Pub Date : 2003-06-16 DOI: 10.1109/CVPRW.2003.10073
Dennis W. Strelow, Sanjiv Singh
Two approaches to improving the accuracy of camera motion estimation from image sequences are the use of omnidirectional cameras, which combine a conventional camera with a convex mirror that magnifies the field of view, and the use of both image and inertial measurements, which are highly complementary. In this paper, we describe optimal batch algorithms for estimating motion and scene structure from either conventional or omnidirectional images, with or without inertial data. We also present a method for motion estimation from inertial data and the tangential components of image projections. Tangential components are identical across a wide range of conventional and omnidirectional projection models, so the resulting method does not require any accurate projection model. Because this method discards half of the projection data (i.e., the radial components) and can operate with a projection model that may grossly mismodel the actual camera behavior, we call the method "reckless" motion estimation, but we show that the camera positions and scene structure estimated using this method can be quite accurate.
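As a sketch of why the tangential component is model-independent (our illustration, not the authors' estimator): radially symmetric projection models, conventional or catadioptric, place a point's image along the same azimuth about the image center and differ only in the radial distance, so the azimuth can be used without an accurate projection model.

```python
import numpy as np

def tangential_coordinate(points_xy, center=(0.0, 0.0)):
    """Return the tangential (angular) and radial components of image projections.

    For a radially symmetric projection model, the azimuth about the image
    center does not depend on the radial mapping; the radial distance does,
    and it is the part that "reckless" estimation discards.
    """
    p = np.asarray(points_xy, dtype=float) - np.asarray(center, dtype=float)
    theta = np.arctan2(p[:, 1], p[:, 0])   # tangential component (model-independent)
    r = np.hypot(p[:, 0], p[:, 1])         # radial component (model-dependent)
    return theta, r

theta, r = tangential_coordinate(np.array([[120.0, 80.0], [-30.0, 55.0]]), center=(64.0, 48.0))
print(theta, r)
```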
Citations: 13
Carving Prior Manifolds Using Inequalities
Pub Date : 2003-06-16 DOI: 10.1109/CVPRW.2003.10064
M. Eriksson, S. Carlsson
Prior information learned from training data is used increasingly in image analysis and computer vision. However, the high dimensionality of the parameter spaces and the complexity of the probability distributions often make the exact learning of priors an impossible problem, requiring an excessive amount of training data that is seldom available in practice. In this paper we propose a weaker form of prior estimation which tries to learn the boundaries of impossible events from examples. This is equivalent to estimating the support of the prior distribution, or the manifold of possible events. The idea is to model the set of possible events by algebraic inequalities. Learning proceeds by selecting those inequalities that show a consistent sign when applied to the training data set. Every such inequality "carves" out a region of impossible events in the parameter space. The manifold of possible events estimated in this way will in general represent the qualitative properties of the events. We give examples of this in the problems of restoring handwritten characters and automatically tracked body locations.
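The selection step can be stated concretely: evaluate a pool of candidate algebraic functions on the training examples and keep only those whose sign never changes. In the sketch below the candidate pool (random quadratic forms) and the toy data are our assumptions for illustration, not the authors' choice of inequalities.

```python
import numpy as np

rng = np.random.default_rng(0)

def carve_inequalities(train, n_candidates=200, degree=2):
    """Keep candidate polynomial functions that have a consistent sign on the data.

    Each kept function f, oriented so that f(x) > 0 on all training examples,
    defines the inequality f(x) > 0; its complement is "carved away" as an
    impossible region of the parameter space.
    """
    def features(x):
        # Polynomial features up to `degree`: 1, x_i, and x_i * x_j.
        feats = [np.ones(len(x)), *x.T]
        d = x.shape[1]
        if degree >= 2:
            for i in range(d):
                for j in range(i, d):
                    feats.append(x[:, i] * x[:, j])
        return np.column_stack(feats)

    phi = features(train)
    kept = []
    for _ in range(n_candidates):
        w = rng.standard_normal(phi.shape[1])
        values = phi @ w
        if np.all(values > 0) or np.all(values < 0):
            kept.append(w if values[0] > 0 else -w)
    return np.array(kept)

# Toy training set: points near a circle of radius ~1 (made-up data).
angles = rng.uniform(0, 2 * np.pi, 100)
data = np.column_stack([np.cos(angles), np.sin(angles)]) + 0.05 * rng.standard_normal((100, 2))
print(f"kept {len(carve_inequalities(data))} sign-consistent inequalities")
```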
Citations: 1
Deformable Model Based Shape Analysis Stone Tool Application
Pub Date : 2003-06-16 DOI: 10.1109/CVPRW.2003.10010
Kyoungju Park, A. Nowell, Dimitris N. Metaxas
This paper introduces a method to measure the average shape of handaxes and characterize deviations from this average shape by taking into account both internal and external information. In the field of Paleolithic archaeology, standardization and symmetry can be two important concepts. For axially symmetrical shapes such as handaxes, it is possible to introduce a simple, appropriate shape representation. We adapt a parameterized deformable-model-based approach to allow flexibility of shape coverage and analyze the similarity with a few compact parameters. Moreover, a hierarchical fitting method ensures stability while measuring global and local shape features step by step. Our model incorporates a physics-based framework so as to deform due to forces exerted by boundary data sets.
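A minimal sketch of the physics-based fitting idea: model points are pushed by an external force toward the boundary data and by an internal force that keeps the contour smooth. The explicit update, step sizes, and toy data below are assumptions for illustration and not the paper's parameterized deformable model.

```python
import numpy as np

def deform_step(model_pts, boundary_pts, alpha=0.5, beta=0.3):
    """One explicit update of a closed deformable contour.

    External force: each model point is pulled toward its nearest boundary
    sample. Internal force: each point is pulled toward the midpoint of its
    neighbours, which keeps the contour smooth.
    """
    diffs = boundary_pts[None, :, :] - model_pts[:, None, :]
    nearest = diffs[np.arange(len(model_pts)), np.argmin((diffs ** 2).sum(-1), axis=1)]
    laplacian = 0.5 * (np.roll(model_pts, 1, axis=0) + np.roll(model_pts, -1, axis=0)) - model_pts
    return model_pts + alpha * nearest + beta * laplacian

# Toy example: deform a unit circle toward a larger boundary (made-up data).
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
model = np.column_stack([np.cos(t), np.sin(t)])
boundary = 1.3 * np.column_stack([np.cos(t), np.sin(t)])
for _ in range(20):
    model = deform_step(model, boundary)
print(np.linalg.norm(model, axis=1).mean())   # mean radius approaches ~1.3
```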
Citations: 7
Text Processing Method for E-learning Videos
Pub Date : 2003-06-16 DOI: 10.1109/CVPRW.2003.10024
Jun Sun, Yukata Katsuyama, S. Naoi
E-learning has received more and more attention in recent years. The abundant text information in E-learning videos is very valuable for information indexing, searching and other applications. In order to effectively extract the text from E-learning videos, a text processing method is proposed in this paper. The method is composed of two parts: text change frame detection and text extraction from images. The purpose of text change frame detection is to remove the redundant frames from the video and reduce the total processing time. A new text extraction algorithm is proposed to extract the text areas in the text change frames for further recognition. Experiments on lecture videos demonstrate the good performance of our method.
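A rough sketch of the first stage, text change frame detection, as a frame-differencing filter: a frame is kept only when it differs enough from the last kept frame. The grayscale mean-absolute-difference measure and the threshold are assumptions of this sketch, not the paper's detector.

```python
import numpy as np

def select_text_change_frames(frames, threshold=12.0):
    """Drop redundant frames, keeping only those that differ from the last kept one.

    `frames` is an iterable of grayscale images (2D uint8 arrays). A frame is
    kept when its mean absolute difference from the previously kept frame
    exceeds `threshold`, which is where slide or caption changes show up.
    """
    kept, last = [], None
    for idx, frame in enumerate(frames):
        frame = frame.astype(np.float32)
        if last is None or np.mean(np.abs(frame - last)) > threshold:
            kept.append(idx)
            last = frame
    return kept

# Toy example: three identical frames, then a "new slide" (made-up data).
blank = np.zeros((48, 64), dtype=np.uint8)
slide = np.full((48, 64), 200, dtype=np.uint8)
print(select_text_change_frames([blank, blank, blank, slide, slide]))  # -> [0, 3]
```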
Citations: 3
Profile-based Pottery Reconstruction
Pub Date : 2003-06-16 DOI: 10.1109/CVPRW.2003.10007
M. Kampel, Robert Sablatnig
A major obstacle to the broader use of 3D object reconstruction and modeling is the extent of manual intervention needed. Such interventions are currently extensive and exist throughout every phase of a 3D reconstruction project: collection of images, image management, establishment of sensor position and image orientation, extracting the geometric information describing an object, and merging geometric, texture and semantic data. We present a fully automated approach to pottery reconstruction based on the fragment profile, which is the cross-section of the fragment in the direction of the rotational axis of symmetry. We demonstrate the method and give results on synthetic and real data.
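Because the profile is the cross-section along the axis of rotational symmetry, a complete vessel surface can be regenerated by sweeping the profile around that axis. The sketch below does this for an invented profile; it illustrates the representation only, not the authors' fragment orientation and profile extraction pipeline.

```python
import numpy as np

def surface_of_revolution(profile_rz, n_angles=90):
    """Sweep a profile curve of (radius, height) samples around the z-axis.

    Returns an (n_profile, n_angles, 3) array of 3D vertices that can be
    meshed by connecting neighbouring grid cells.
    """
    r = profile_rz[:, 0][:, None]
    z = profile_rz[:, 1][:, None]
    theta = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)[None, :]
    x = r * np.cos(theta)
    y = r * np.sin(theta)
    zz = np.broadcast_to(z, x.shape)
    return np.stack([x, y, zz], axis=-1)

# Invented pot profile: radius as a function of height.
heights = np.linspace(0.0, 1.0, 20)
radii = 0.3 + 0.2 * np.sin(np.pi * heights)
vertices = surface_of_revolution(np.column_stack([radii, heights]))
print(vertices.shape)   # (20, 90, 3)
```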
Citations: 60
A New Class of Mirrors for Wide-Angle Imaging
Pub Date : 2003-06-16 DOI: 10.1109/CVPRW.2003.10069
M. Srinivasan
Conventional mirrors for panoramic imaging usually capture circular images. As these images are difficult to interpret visually, they are often remapped digitally into a rectangular image in which one axis represents azimuth and the other elevation. This paper describes a class of mirrors that perform the capture as well as the remapping, thus eliminating the need for computational resources. They provide uniform resolution in azimuth and elevation, and can be designed to make full use of a camera's imaging surface.
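For reference, the digital remapping that such mirrors are designed to make unnecessary is a polar-to-rectangular unwarp of the circular image, with azimuth on one axis and a radially encoded elevation on the other. A nearest-neighbour version under assumed inner and outer radii is sketched below.

```python
import numpy as np

def unwrap_panorama(circular_img, center, r_inner, r_outer, out_w=360, out_h=90):
    """Remap a circular panoramic image into an azimuth-by-elevation rectangle.

    Columns of the output sample azimuth angles; rows sample radii between
    r_inner and r_outer (a stand-in for elevation). Nearest-neighbour lookup.
    """
    cy, cx = center
    az = np.linspace(0.0, 2.0 * np.pi, out_w, endpoint=False)[None, :]
    rad = np.linspace(r_inner, r_outer, out_h)[:, None]
    src_y = np.clip(np.round(cy + rad * np.sin(az)).astype(int), 0, circular_img.shape[0] - 1)
    src_x = np.clip(np.round(cx + rad * np.cos(az)).astype(int), 0, circular_img.shape[1] - 1)
    return circular_img[src_y, src_x]

# Toy circular image (made-up data).
img = np.random.default_rng(0).integers(0, 256, size=(480, 480), dtype=np.uint8)
pano = unwrap_panorama(img, center=(240, 240), r_inner=60, r_outer=230)
print(pano.shape)   # (90, 360)
```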
Citations: 23
Virtual Scene Control Using Human Body Postures
Pub Date : 2003-06-16 DOI: 10.1109/CVPRW.2003.10054
S. Yonemoto, R. Taniguchi
This paper describes a vision-based 3D real-virtual interaction which enables realistic avatar motion control, and in which the virtual camera is controlled by the body posture of the user. The human motion analysis method is implemented by blob tracking. A physically constrained motion synthesis method is implemented to generate realistic motion from a limited number of blobs. Our framework utilizes virtual scene contexts as a priori knowledge. In order to make the virtual scene more realistic, beyond the limitations of real-world sensing, we use a framework that augments the reality in the virtual scene by simulating various events of the real world. Concretely, we suppose that a virtual environment can provide action information for the avatar. Third-person viewpoint control coupled with body postures is also realized to directly access virtual objects.
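A minimal illustration of the blob-tracking front end (a generic connected-components pass with scipy, not the paper's implementation): segment foreground pixels, label the regions, and report blob centroids that a posture model could consume.

```python
import numpy as np
from scipy import ndimage

def track_blobs(mask, min_area=20):
    """Return centroids (row, col) of connected foreground regions in a binary mask."""
    labels, n = ndimage.label(mask)
    centroids = []
    for lbl in range(1, n + 1):
        if int((labels == lbl).sum()) >= min_area:
            centroids.append(ndimage.center_of_mass(mask, labels, lbl))
    return centroids

# Toy binary mask with two "body part" blobs (made-up data).
mask = np.zeros((60, 80), dtype=bool)
mask[10:20, 10:22] = True     # e.g. a head blob
mask[30:50, 40:60] = True     # e.g. a torso blob
print(track_blobs(mask))       # approx [(14.5, 15.5), (39.5, 49.5)]
```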
Citations: 1
Real Time Face Detection and Facial Expression Recognition: Development and Applications to Human Computer Interaction.
Pub Date : 2003-06-16 DOI: 10.1109/CVPRW.2003.10057
M. Bartlett, G. Littlewort, Ian R. Fasel, J. Movellan
Computer-animated agents and robots bring a social dimension to human-computer interaction and force us to think in new ways about how computers could be used in daily life. Face-to-face communication is a real-time process operating at a time scale on the order of 40 milliseconds. The level of uncertainty at this time scale is considerable, making it necessary for humans and machines to rely on sensory-rich perceptual primitives rather than slow symbolic inference processes. In this paper we present progress on one such perceptual primitive. The system automatically detects frontal faces in the video stream and codes them with respect to 7 dimensions in real time: neutral, anger, disgust, fear, joy, sadness, surprise. The face finder employs a cascade of feature detectors trained with boosting techniques [15, 2]. The expression recognizer receives image patches located by the face detector. A Gabor representation of the patch is formed and then processed by a bank of SVM classifiers. A novel combination of AdaBoost and SVMs enhances performance. The system was tested on the Cohn-Kanade dataset of posed facial expressions [6]. Generalization performance to new subjects was evaluated with a 7-way forced choice. Most interestingly, the outputs of the classifier change smoothly as a function of time, providing a potentially valuable representation to code facial expression dynamics in a fully automatic and unobtrusive manner. The system has been deployed on a wide variety of platforms including Sony's Aibo pet robot, ATR's RoboVie, and CU animator, and is currently being evaluated for applications including automatic reading tutors and assessment of human-robot interaction.
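A hedged sketch of the Gabor-into-SVM-bank idea: filter a face patch with a small Gabor bank, pool the responses, and feed them to linear SVMs. The filter parameters, crude pooling, random toy data, and use of scikit-learn's LinearSVC are stand-ins; the paper's AdaBoost-selected features and its SVM combination are not reproduced here.

```python
import numpy as np
from sklearn.svm import LinearSVC

EXPRESSIONS = ["neutral", "anger", "disgust", "fear", "joy", "sadness", "surprise"]

def gabor_kernel(size=15, sigma=3.0, theta=0.0, wavelength=6.0):
    """Build one real Gabor kernel: a cosine carrier under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr ** 2 + yr ** 2) / (2.0 * sigma ** 2)) * np.cos(2.0 * np.pi * xr / wavelength)

def gabor_features(patch, n_orientations=8):
    """Filter a patch with a small Gabor bank and return crudely pooled responses."""
    feats = []
    for k in range(n_orientations):
        kern = gabor_kernel(theta=k * np.pi / n_orientations)
        resp = np.fft.ifft2(np.fft.fft2(patch) * np.fft.fft2(kern, patch.shape)).real
        feats.append(np.abs(resp).mean())   # the real system keeps dense Gabor magnitudes
    return np.array(feats)

# Toy training data: random 48x48 "face patches" with random labels (made-up data).
rng = np.random.default_rng(0)
X = np.array([gabor_features(rng.standard_normal((48, 48))) for _ in range(70)])
labels = rng.integers(0, len(EXPRESSIONS), size=70)
clf = LinearSVC().fit(X, labels)            # one-vs-rest bank of linear SVMs
print(EXPRESSIONS[int(clf.predict(X[:1])[0])])
```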
Citations: 566