Active contours using a potential field
D. M. Honea, W. Snyder, G. Bilbro
Pub Date: 2002-12-10 · DOI: 10.1109/ICPR.2002.1048413
In most implementations of active contours (snakes), the evolution of the snake depends only on image characteristics in the immediate neighborhood of the current snake points. This is true even when there is little edge data available in the current neighborhood, and even when the boundary of interest may be some distance away in the image. This paper proposes a vector potential field at each point in the image, derived from the "pull" exerted by all edge points in the image; the pull exerted by a given edge point on a pixel is inversely proportional to the square of the distance between them. This potential field acts as a force, and snake points are moved based on the force at their current location, rather than moving to minimize energy at a candidate position. The resulting algorithm allows edges to influence snake evolution earlier and from a greater distance, and yields faster and better convergence to the final boundary under a variety of image characteristics.
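The inverse-square attraction described above can be sketched as follows. This is an illustrative reconstruction under our own naming, not the authors' implementation:

```python
import numpy as np

def edge_pull_field(edge_points, shape):
    """Sum the inverse-square 'pull' of every edge point at every pixel.

    Returns the (y, x) components of the force field; each edge point
    attracts a pixel along the line joining them with magnitude 1/d^2.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    fy = np.zeros(shape)
    fx = np.zeros(shape)
    for ey, ex in edge_points:
        dy = ey - ys
        dx = ex - xs
        d2 = (dy * dy + dx * dx).astype(float)
        d2[d2 == 0] = np.inf          # an edge exerts no pull on itself
        d = np.sqrt(d2)
        fy += dy / (d * d2)           # unit direction (dy/d) times 1/d^2
        fx += dx / (d * d2)
    return fy, fx
```

A snake point at (y, x) would then be displaced by a step proportional to (fy[y, x], fx[y, x]) at its current location, rather than by local energy minimisation.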
Topologies on the planar orthogonal grid
R. Klette
Pub Date: 2002-12-10 · DOI: 10.1109/ICPR.2002.1048312
Reviews two different topological spaces for the orthogonal planar grid, the Alexandroff-Hopf and the Wyse topology. We show isomorphy and homeomorphy between different spaces which are used or applicable in image analysis.
Hierarchical interpretation of human activities using competitive learning
H. Wechsler, Zoran Duric, Fayin Li
Pub Date: 2002-12-10 · DOI: 10.1109/ICPR.2002.1048308
In this paper we describe a method of learning hierarchical representations for describing and recognizing gestures expressed as one- and two-arm movements, using competitive learning methods. At the lowest level of the hierarchy, atomic motions ("letters") corresponding to flow fields computed from successive color image frames are derived using Learning Vector Quantization (LVQ). At the intermediate level, the atomic motions are clustered into actions ("words") using homogeneity criteria. The highest level combines actions into activities ("sentences") using proximity-driven clustering. We demonstrate the feasibility and robustness of our approach on real color-image sequences, each consisting of several hundred frames of dynamic one- and two-arm movements.
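The "letters" stage uses LVQ, which is a standard competitive-learning algorithm. A minimal LVQ1 sketch, assuming vectorised flow-field descriptors as input (parameter values and names are illustrative, not the paper's):

```python
import numpy as np

def lvq1_fit(X, y, n_codes_per_class=2, lr=0.1, epochs=20, seed=0):
    """Minimal LVQ1: pull the nearest prototype toward a sample of the
    same class, push it away otherwise."""
    X = np.asarray(X, float)
    rng = np.random.default_rng(seed)
    protos, labels = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        idx = rng.choice(len(Xc), n_codes_per_class, replace=False)
        protos.append(Xc[idx])                  # init prototypes from data
        labels += [c] * n_codes_per_class
    P, L = np.vstack(protos), np.array(labels)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            j = np.argmin(((P - X[i]) ** 2).sum(axis=1))   # nearest prototype
            sign = 1.0 if L[j] == y[i] else -1.0
            P[j] += sign * lr * (X[i] - P[j])
    return P, L

def lvq1_predict(P, L, X):
    """Label each sample with the class of its nearest prototype."""
    d = ((np.asarray(X, float)[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)
    return L[np.argmin(d, axis=1)]
```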
Towards log-polar fixation for mobile robots - analysis of corner tracking on the log-polar camera
A. Yeung, N. Barnes
Pub Date: 2002-12-10 · DOI: 10.1109/ICPR.2002.1048299
Fixating on objects is fundamental to active vision tasks such as reaching, navigation and docking. Most techniques have been designed for space-invariant cameras. This research proposes a new method for corner tracking to facilitate point fixation for a mobile robot using a foveated camera. When the target point is in the centre of the image (the fovea), its position can be accurately tracked at high resolution. At the same time, the periphery has a reduced pixel count, reducing the image processing computation compared to a uniform camera with the same field of view. If the target point suddenly moves into the periphery, it still appears in the lower-resolution part of the image, and coarser control can bring it back into the fovea. Our experimental results demonstrate the stability of the proposed method, and the performance of our implementation is adequate for real-time tracking applications.
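The foveated geometry can be illustrated with a simple log-polar mapping. A sketch with grid parameters of our own choosing (the paper's actual sensor geometry may differ):

```python
import numpy as np

def to_log_polar(y, x, cy, cx, n_rings=32, n_wedges=64, r_max=100.0):
    """Map an image point to (ring, wedge) log-polar coordinates around a
    fixation centre (cy, cx); the ring index grows with log-eccentricity,
    so the fovea gets many rings and the periphery few."""
    dy, dx = y - cy, x - cx
    r = np.hypot(dy, dx)
    theta = np.arctan2(dy, dx) % (2 * np.pi)
    ring = 0 if r < 1 else int(np.log(r) / np.log(r_max) * n_rings)
    wedge = int(theta / (2 * np.pi) * n_wedges)
    return min(ring, n_rings - 1), min(wedge, n_wedges - 1)
```

Points near the fixation centre map to the innermost rings and so are represented at full resolution; a point that jumps far into the periphery still lands in some outer ring, which is what allows the coarse corrective control described above.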
A POCS-based method for reducing artifacts in BDCT compressed images
J. Zou, Hong Yan
Pub Date: 2002-12-10 · DOI: 10.1109/ICPR.2002.1048286
The theory of projection onto convex sets (POCS) is applied to reduce blocking artifacts in compressed images coded by the block discrete cosine transform (BDCT). The image before compression is modeled by a triangular mesh, which forms the basis of a proposed smoothness constraint set. The mesh is constructed by dividing each block into a set of triangles. The proposed method outperforms four existing methods both subjectively and objectively.
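The POCS framework itself can be illustrated independently of the paper's particular constraint sets: cyclically projecting onto convex sets converges to a point in their intersection. A toy sketch (the two sets here are ours, not the paper's smoothness and quantization sets):

```python
import numpy as np

def pocs(x0, projections, iters=100):
    """Generic POCS iteration: cyclically project onto each convex set.
    When the sets intersect, the iterates converge to a point in the
    intersection."""
    x = np.asarray(x0, float)
    for _ in range(iters):
        for proj in projections:
            x = proj(x)
    return x

# Toy constraint sets: the box {x : x_i >= 1} and the ball {x : ||x|| <= 2}.
project_box = lambda x: np.maximum(x, 1.0)
project_ball = lambda x: x if np.linalg.norm(x) <= 2 else 2 * x / np.linalg.norm(x)
```

In the paper's setting one projection would enforce mesh-based smoothness and another the known DCT quantization constraints, with the deblocked image as the limit point.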
Feasibility of Hough-transform-based iris localisation for real-time-application
Klaus D. Tönnies, F. Behrens, Melanie Aurnhammer
Pub Date: 2002-12-10 · DOI: 10.1109/ICPR.2002.1048486
We present a fast method for locating iris features in frontal face images based on the Hough transform. It consists of an initial iris detection step and a tracking step that reuses iris features from the initialisation to speed up computation. The purpose of this research was to evaluate the feasibility of the method for tracking at 200 frames per second or higher. The processing speed of the prototype implementation on a 266 MHz Pentium II PC is approximately 6 seconds for initial iris detection and about 0.05 seconds for each tracking step. Further speed-up using faster equipment seems feasible. The algorithm was applied to images of subjects taken under normal room lighting conditions. Tests showed robustness with respect to shadowing and partial occlusion of the iris. The localisation error was below two pixels. Accuracy for tracking was within one pixel. Reducing the number of pixels processed in the tracking step by 90% caused only a modest degradation of the results.
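The circular Hough transform underlying such iris detectors can be sketched as a centre-voting accumulator for a known radius (an illustration of the general technique, not the authors' implementation):

```python
import numpy as np

def hough_circle_centres(edge_points, radius, shape, n_angles=90):
    """Vote for circle centres at a known radius: each edge point votes
    for every centre lying `radius` away from it, so a circle's true
    centre collects votes from all of its edge points."""
    acc = np.zeros(shape, dtype=int)
    angles = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)
    for ey, ex in edge_points:
        cy = np.rint(ey - radius * np.sin(angles)).astype(int)
        cx = np.rint(ex - radius * np.cos(angles)).astype(int)
        ok = (cy >= 0) & (cy < shape[0]) & (cx >= 0) & (cx < shape[1])
        np.add.at(acc, (cy[ok], cx[ok]), 1)   # unbuffered accumulation
    return acc
```

In practice the iris radius is not known exactly, so the accumulator gains a radius dimension; restricting votes to a small region around the previous position is what makes the tracking step so much cheaper than the initial detection.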
Membership authentication in dynamic face groups
Shaoning Pang, Daijin Kim, S. Bang
Pub Date: 2002-12-10 · DOI: 10.1109/ICPR.2002.1048317
Presents a method to authenticate an individual's membership in a group without revealing the individual's identity and without restricting how the membership of the group may change. The method authenticates membership and is robust to variations in both the group size and the group membership.
Visual contour tracking based on sequential importance sampling/resampling algorithm
P. Li, Tianwen Zhang
Pub Date: 2002-12-10 · DOI: 10.1109/ICPR.2002.1048366
The condensation algorithm can deal with non-Gaussian, nonlinear visual contour tracking in a unified way. Despite its simple implementation and generality, it has two main limitations. The first is that the sampling stage does not take advantage of the new measurements; as a result of this inefficient sampling strategy, the algorithm needs a large number of samples to represent the posterior distribution of the state. The second is that, in the selection step, resampling may introduce the problem of sample impoverishment. To address these two problems, we present an improved visual tracker based on an importance sampling/resampling algorithm. A Gaussian density for each sample is adopted as the sub-optimal importance proposal distribution, which can steer the samples towards regions of high likelihood by taking the latest observations into account. We also adopt an effective-sample-size criterion to determine whether resampling is necessary. Experiments with real image sequences show that the performance of the new algorithm improves considerably for tracking in visual clutter.
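The effective-sample-size criterion, together with a standard low-variance resampling step, can be sketched as follows (an illustration of the general particle-filter machinery; the paper's Gaussian proposal distribution is not reproduced here):

```python
import numpy as np

def effective_sample_size(weights):
    """N_eff = 1 / sum(w_i^2) for normalised weights; it falls toward 1
    as the weights degenerate, signalling that resampling is needed."""
    w = np.asarray(weights, float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

def systematic_resample(weights, rng):
    """Draw len(weights) indices proportional to the weights using a
    single uniform offset (low-variance systematic resampling)."""
    w = np.asarray(weights, float)
    w = w / w.sum()
    n = len(w)
    positions = (rng.random() + np.arange(n)) / n
    return np.searchsorted(np.cumsum(w), positions)
```

A tracker following the criterion above would resample only when effective_sample_size(w) drops below a threshold such as half the particle count, avoiding needless impoverishment when the weights are still well spread.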
Automatic target detection using PMMW and LADAR imagery
M. R. Stevens, M. Snorrason, Sengvieng Amphay
Pub Date: 2002-12-10 · DOI: 10.1109/ICPR.2002.1048385
The need for air-to-ground missiles with autonomous target acquisition (ATA) seekers is in large part driven by the failure of pilot-guided bombs in cloudy conditions (as demonstrated in Kosovo). Passive-millimeter wave (PMMW) sensors have the ability to see through clouds; in fact, they tend to show metallic objects (such as mobile ground targets) in high contrast regardless of weather conditions. However, their resolution is very low when compared with other popular ATA sensors such as laser radar (LADAR). We present an ATA algorithm suite that combines the superior target detection potential of PMMW with the high-quality segmentation and recognition abilities of LADAR. Preliminary detection and segmentation results are presented for a set of image pairs of military vehicles that were collected for this project using an 89 GHz, 18-inch-aperture PMMW sensor and a 1.06 μm very-high-resolution LADAR.
Knowledge-based numeric open caption recognition for live sportscast
Si-Hun Sung, Woo-Sung Chun
Pub Date: 2002-12-10 · DOI: 10.1109/ICPR.2002.1048429
A knowledge-based numeric open caption recognition method is proposed that can recognize numeric captions generated by a character generator (CG) and automatically superimpose a modified caption using the recognized text, but only when a valid numeric caption appears in a specific target region of a live sportscast scene produced by another broadcasting station. In the proposed method, mesh features are extracted from an enhanced binary image as feature vectors, and valuable information is then recovered from the numeric image by recognizing the characters with a multilayer perceptron (MLP) network. The result is verified using a knowledge-based rule set designed for more stable and reliable output, and the modified information is then displayed on screen by the CG. MLB EyeCaption, based on the proposed algorithm, has already been used in regular Major League Baseball (MLB) programs broadcast live over a Korean nationwide TV network and has drawn a favorable response from Korean viewers.
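Mesh features of the kind described can be sketched as per-cell foreground densities over a grid laid on the binarised character image (the grid size here is illustrative, not the paper's):

```python
import numpy as np

def mesh_features(binary_img, grid=(4, 4)):
    """Split a binary character image into a grid of cells and use each
    cell's foreground-pixel density as one feature, giving a fixed-length
    vector suitable as MLP input."""
    h, w = binary_img.shape
    gy, gx = grid
    feats = np.empty(gy * gx)
    for i in range(gy):
        for j in range(gx):
            cell = binary_img[i * h // gy:(i + 1) * h // gy,
                              j * w // gx:(j + 1) * w // gx]
            feats[i * gx + j] = cell.mean()   # fraction of foreground pixels
    return feats
```

Each segmented digit image yields one such vector, which the MLP classifies; the knowledge-based rule set then vets the classifier output (e.g. for plausible score sequences) before the caption is re-rendered by the CG.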