A Novel Clustering-Based Method for Adaptive Background Segmentation
S. Indupalli, M. Ali, B. Boufama. The 3rd Canadian Conference on Computer and Robot Vision (CRV'06), 7 June 2006. DOI: 10.1109/CRV.2006.5

Abstract: This paper presents a new histogram-based method for dynamic background modeling using a sequence of images extracted from video. In particular, a k-means clustering technique is used to identify the foreground objects. Because of its shadow resistance and discriminative properties, we have used images in the HSV color space instead of the traditional RGB color space. The experimental results on real images are very encouraging: we were able to retrieve perfect backgrounds in simple scenes and very good backgrounds in very complex scenes. Furthermore, our method is very fast and, after optimization, could be used in real-time applications.
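The per-pixel clustering idea behind such a background model can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the choice of k = 2 clusters, and the deterministic farthest-point initialisation are all assumptions; the idea is simply that, over time, a pixel's most populated cluster of HSV samples is its background colour.

```python
import numpy as np

def kmeans(samples, k=2, iters=10):
    """Plain k-means on an (n, d) array; returns (labels, centroids)."""
    samples = np.asarray(samples, dtype=float)
    # Deterministic farthest-point initialisation (an assumption here).
    centroids = [samples[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(samples - c, axis=1) for c in centroids], axis=0)
        centroids.append(samples[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(iters):
        d = np.linalg.norm(samples[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = samples[labels == j].mean(axis=0)
    return labels, centroids

def estimate_background(frames_hsv, k=2):
    """frames_hsv: (T, H, W, 3) stack of HSV frames.
    For each pixel, cluster its temporal samples and keep the mean of the
    most populated cluster as the background colour for that pixel."""
    T, H, W, _ = frames_hsv.shape
    bg = np.empty((H, W, 3))
    for y in range(H):
        for x in range(W):
            labels, cents = kmeans(frames_hsv[:, y, x, :], k)
            counts = np.bincount(labels, minlength=k)
            bg[y, x] = cents[counts.argmax()]
    return bg
```

Pixels where a foreground object passes through briefly still recover the background, because the transient samples form the smaller cluster.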
Reflection Stereo - Novel Monocular Stereo using a Transparent Plate
M. Shimizu, M. Okutomi. The 3rd Canadian Conference on Computer and Robot Vision (CRV'06), 7 June 2006. DOI: 10.1109/CRV.2006.59

Abstract: This paper proposes a simple and novel single-camera depth estimation method using images reflected by a single transparent parallel planar plate. The transparent plate both reflects and transmits the incident light at its front surface. The transmitted light is then reflected at the rear surface and transmitted back into the air through the front surface. These two light paths create an overlapped image comprising two shifted images. The overlapped image can be regarded as a stereo pair obtained from a narrow-baseline stereo rig. The constraint relating these stereo images is presented. The distance to the object can be derived by finding correspondences along the constraint lines using the autocorrelation function of the overlapped image. This paper presents experimental results obtained using an actual system with a transparent acrylic plate.
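Because the two overlapped copies differ only by a small shift, that shift can be recovered from a secondary peak of the autocorrelation. The toy 1-D sketch below is my own illustration of the principle, not the paper's method: the function name, the synthetic signal, and the 0.5 reflectance ratio are assumptions, and the actual system searches in 2-D along the derived constraint lines.

```python
import numpy as np

def dominant_shift(signal, max_shift=50):
    """Estimate the shift between the two overlapped copies in a 1-D slice
    of a reflection-stereo image via the autocorrelation's secondary peak."""
    s = signal - signal.mean()
    ac = np.correlate(s, s, mode='full')[len(s) - 1:]  # lags 0 .. len-1
    ac = ac / ac[0]                                    # normalise by lag 0
    # Skip lag 0 (always the global maximum) and take the strongest lag.
    return int(np.argmax(ac[1:max_shift + 1])) + 1
```

Given the plate thickness, refractive index, and viewing geometry, the recovered shift maps to a depth; that mapping is exactly the constraint the paper derives.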
Image Inpainting and Segmentation using Hierarchical Level Set Method
Xiaojun Du, D. Cho, T. D. Bui. The 3rd Canadian Conference on Computer and Robot Vision (CRV'06), 7 June 2006. DOI: 10.1109/CRV.2006.41

Abstract: Image inpainting is an artistic procedure to recover a damaged painting or picture. In this paper, we propose a novel approach for image inpainting in which the Mumford-Shah (MS) model and the level set method are employed to estimate the image structure of the damaged region. This combination has been successfully used in the image segmentation problem. Compared to some other inpainting methods, the MS model approach can detect and preserve edges in the inpainted areas. We propose a fast and efficient algorithm that achieves both inpainting and segmentation. In previous work on the MS model, only one or two level set functions are used to segment an image. While this works well on some simple images, detailed edges cannot be detected in complicated images. Although multiple level set functions can be used to segment an image into many regions, the traditional approach requires extensive computation and its solutions depend on the location of the initial curves. Our approach utilizes a faster hierarchical level set method and guarantees convergence independent of the initial conditions. Because both the main structure and the detailed edges are detected, the approach preserves detailed edges in the inpainted area. Experimental results demonstrate the advantage of our method.
The McGill Object Detection Suite
Donovan H. Parks, M. Levine. The 3rd Canadian Conference on Computer and Robot Vision (CRV'06), 7 June 2006. DOI: 10.1109/CRV.2006.75

Abstract: Evaluation of object detection systems requires a set of test images with objects in heterogeneous scenes. Unfortunately, existing publicly available object databases provide few, if any, test images suitable for evaluating object detection systems. Here we present the McGill Object Detection Suite (MODS), a software package for creating test sets suitable for evaluating object detection systems. These test sets are created by superimposing objects from existing publicly available object databases onto heterogeneous backgrounds. The MODS is capable of creating test sets focusing on pose, scale, illumination, occlusion, or noise. This software package is being made publicly available to aid the computer vision community by providing standard test sets which will allow object detection systems to be systematically compared and characterized.
Building Local Safety Maps for a Wheelchair Robot using Vision and Lasers
A. Murarka, Joseph Modayil, B. Kuipers. The 3rd Canadian Conference on Computer and Robot Vision (CRV'06), 7 June 2006. DOI: 10.1109/CRV.2006.20

Abstract: To be useful as a mobility assistant for a human driver, an intelligent robotic wheelchair must be able to distinguish between safe and hazardous regions in its immediate environment. We present a hybrid method using laser range-finders and vision for building local 2D metrical maps that incorporate safety information (called local safety maps). Laser range-finders are used for localization and mapping of obstacles in the 2D laser plane, and vision is used for detection of hazards and other obstacles in 3D space. The hazards and obstacles identified by vision are projected into the travel plane of the robot and combined with the laser map to construct the local 2D safety map. The main contributions of this work are (i) the definition of a local 2D safety map, (ii) a hybrid method for building the safety map, and (iii) a method for removing noise from dense stereo data using motion.
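The projection-and-fusion step can be sketched as below. This is hypothetical: the grid parameters, the cell labels, and the max-severity fusion rule are my assumptions, not the paper's exact scheme; it only shows how vision-detected 3D hazards and a laser obstacle map can be combined in one 2D grid.

```python
import numpy as np

FREE, OBSTACLE, HAZARD = 0, 1, 2  # per-cell labels; higher = more severe

def project_to_grid(points_xyz, resolution=0.1, size=100):
    """Project 3D hazard points (robot frame, metres) into a 2D grid by
    dropping the z coordinate. The grid is size x size cells of
    `resolution` metres, centred on the robot."""
    cells = np.zeros((size, size), dtype=int)
    half = size // 2
    for x, y, _z in points_xyz:
        i = int(round(y / resolution)) + half
        j = int(round(x / resolution)) + half
        if 0 <= i < size and 0 <= j < size:
            cells[i, j] = HAZARD
    return cells

def fuse(laser_map, vision_map):
    """Combine per-cell labels from the two sensors; the more severe
    label wins, so a hazard is never masked by free space."""
    return np.maximum(laser_map, vision_map)
```

A planner would then treat any non-FREE cell as untraversable when choosing the wheelchair's path.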
An Edge Preserving Locally Adaptive Anti-aliasing Zooming Algorithm with Diffused Interpolation
Munib Arshad Chughtai, N. Khattak. The 3rd Canadian Conference on Computer and Robot Vision (CRV'06), 7 June 2006. DOI: 10.1109/CRV.2006.8

Abstract: In this paper the problem of producing an enlarged image from a given digital image (zooming) is addressed. Different image interpolation techniques are used for image enlargement. During interpolation, it is difficult to preserve details and smooth the data at the same time without introducing spurious artifacts (i.e., aliasing); a complete and definitive solution to this problem is still an open issue. Although there are some well-known methods, e.g., Parket [14] and Sakamote [16], this paper proposes a method that handles discontinuities and luminance variations in a sequence of nonlinear iteration steps. Pixels near the edges are diffused into the edge in a way that greatly reduces aliasing, and the method completes within limited computational resources. The proposed method preserves edges, smooths the image, and at the same time controls the aliasing effect.
3D Face Reconstruction from Stereo Video
U. Park, Anil K. Jain. The 3rd Canadian Conference on Computer and Robot Vision (CRV'06), 7 June 2006. DOI: 10.1109/CRV.2006.1

Abstract: Face processing in video is receiving substantial attention due to its importance in many security-related applications. A video provides rich information about a face (multiple frames and temporal coherence) that can be utilized, in conjunction with 3D face models when available, to establish a subject's identity. We propose a 3D face modeling method that reconstructs a user-specific model from a generic 3D face model and two video frames of the user. The user-specific 3D face model can be enrolled into the 3D face database at the enrollment stage for use in the later identification process. The reconstruction process can also be applied to the probe data at the recognition stage, where the 3D face model reconstructed from the probe face is used to generate an optimal view and lighting for recognition. The advantage of utilizing the reconstructed 3D face model is demonstrated by face recognition experiments with 15 probe subjects against a gallery database containing 100 subjects.
Local Stereo Matching with Segmentation-based Outlier Rejection
M. Gerrits, P. Bekaert. The 3rd Canadian Conference on Computer and Robot Vision (CRV'06), 7 June 2006. DOI: 10.1109/CRV.2006.49

Abstract: We present a new window-based stereo matching algorithm which focuses on robust outlier rejection during aggregation. The main difficulty for window-based methods lies in determining the best window shape and size for each pixel. Working from the assumption that depth discontinuities occur at colour boundaries, we segment the reference image and consider all window pixels outside the image segment that contains the pixel under consideration as outliers and greatly reduce their weight in the aggregation process. We developed a variation on the recursive moving average implementation to keep processing times independent from window size. Together with a robust matching cost and the combination of the left and right disparity maps, this gives us a robust local algorithm that approximates the quality of global techniques without sacrificing the speed and simplicity of window-based aggregation.
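The aggregation rule can be sketched like this. It is a simplified illustration: the outlier weight value, the brute-force window loop, and the function names are assumptions; the paper additionally uses a recursive moving-average implementation for speed and combines left and right disparity maps.

```python
import numpy as np

def aggregate_costs(cost, segments, radius=3, outlier_weight=0.05):
    """cost: (H, W, D) per-pixel, per-disparity matching costs.
    segments: (H, W) integer segment labels of the reference image.
    For each pixel, average costs over a square window, giving full weight
    to window pixels in the centre pixel's segment and a strongly reduced
    weight to the rest, which are treated as likely outliers."""
    H, W, D = cost.shape
    out = np.zeros_like(cost)
    pad = radius
    cpad = np.pad(cost, ((pad, pad), (pad, pad), (0, 0)), mode='edge')
    spad = np.pad(segments, pad, mode='edge')
    for y in range(H):
        for x in range(W):
            win_c = cpad[y:y + 2 * pad + 1, x:x + 2 * pad + 1]
            win_s = spad[y:y + 2 * pad + 1, x:x + 2 * pad + 1]
            w = np.where(win_s == segments[y, x], 1.0, outlier_weight)
            out[y, x] = (w[..., None] * win_c).sum(axis=(0, 1)) / w.sum()
    return out

def wta_disparity(aggregated):
    """Winner-take-all: pick the disparity with the lowest aggregated cost."""
    return aggregated.argmin(axis=2)
```

Near a segment boundary, the down-weighted cross-segment pixels barely influence the average, so the disparity estimate does not bleed across the depth discontinuity.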
A Pixel-Weighting Method for Discriminating Objects of Different Sizes in an Image Captured from a Single Camera
Mook-Kwang Park, Namsu Moon, Sang-Gyu Ryu, Jeongpyo Kong, Yongjin Lee, Wangjin Mun. The 3rd Canadian Conference on Computer and Robot Vision (CRV'06), 7 June 2006. DOI: 10.1109/CRV.2006.6

Abstract: A novel method of pixel-weighting is proposed to calculate the size of a detected object in an image captured using a single camera. The calculated object size does not vary significantly regardless of the location of the object in an image, which allows it to be effectively utilized in a vision-based surveillance sensing algorithm as a meaningful feature for discriminating human intruders from other objects. Experimental results show the feasibility of the proposed method.
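The idea can be illustrated with a toy weight map. The numbers and the linear per-row weighting are entirely hypothetical (the paper does not specify this scheme); the sketch only shows how location-dependent pixel weights can make a perspective-distorted size measure comparable across image positions.

```python
import numpy as np

def weighted_size(mask, weights):
    """mask: (H, W) boolean foreground blob.
    weights: (H, W) per-pixel weights compensating for perspective
    (pixels far from the camera count more). Returns a size measure
    that is roughly invariant to where the object appears."""
    return float((mask * weights).sum())

def row_weights(H, W, near=1.0, far=4.0):
    """A hypothetical weight map: rows near the top of the image (far from
    the camera) are weighted more, decreasing linearly from `far` to `near`."""
    w = np.linspace(far, near, H)[:, None]
    return np.broadcast_to(w, (H, W)).copy()
```

With such a map, a distant object covering few pixels and the same object nearby covering many pixels yield similar weighted sizes, so a single size threshold can separate humans from small animals anywhere in the frame.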
Autonomous Learning of Object Appearances using Colour Contour Frames
Per-Erik Forssén, A. Moe. The 3rd Canadian Conference on Computer and Robot Vision (CRV'06), 7 June 2006. DOI: 10.1109/CRV.2006.17

Abstract: In this paper we make use of the idea that a robot can autonomously discover objects and learn their appearances by poking and prodding at interesting parts of a scene. To make the resulting object recognition ability more robust and discriminative, we replace the previously used colour histogram features with an invariant texture-patch method. The texture patches are extracted in a similarity-invariant frame constructed from short colour contour segments. We demonstrate the robustness of our invariant frames with a repeatability test under general homography transformations of a planar scene. Through the repeatability test, we find that defining the frame using ellipse segments instead of lines, where appropriate, improves repeatability. We also apply the developed features to autonomous learning of object appearances, and show how the learned objects can be recognised under out-of-plane rotation and scale changes.