Efficient acquisition of human existence priors from motion trajectories
Pub Date: 2009-06-20 | DOI: 10.2197/ipsjtcva.2.145
H. Habe, Hidehito Nakagawa, M. Kidode
This paper reports a method for acquiring the prior probability of human existence from past human trajectories and image color. Such priors play an important role in human detection as well as in scene understanding. The proposed method is based on the assumption that a person can appear again in an area where he or she existed in the past. To acquire the priors efficiently, a high prior probability is assigned to areas having the same color as past human trajectories. A particle filter is used to represent the prior probability, so a complex prior can be represented with only a few parameters. Experiments confirm that the proposed method acquires the prior probability efficiently and that the obtained prior enables highly accurate human detection.

Generative hierarchical models for image analysis
Pub Date: 2009-06-20 | DOI: 10.1109/CVPRW.2009.5204335
S. Geman
A probabilistic grammar for the groupings and labeling of parts and objects, when taken together with pose and part-dependent appearance models, constitutes a generative scene model and a Bayesian framework for image analysis. To the extent that the generative model generates features, as opposed to pixel intensities, the "inverse" or "posterior distribution" on interpretations given images is based on incomplete information; feature vectors are generally insufficient to recover the original intensities. I will argue for fully generative scene models, meaning models that in principle generate actual digital pictures. I will outline an approach to the construction of fully generative models through an extension of context-sensitive grammars and a re-formulation of the popular template models for image fragments. Mostly I will focus on the problem of constructing pixel-level appearance models. I will propose an approach based on image-fragment templates, as introduced by Ullman and others, though rather than using the correlation between a template and a given image patch as an extracted feature, the templates are reformulated to generate the pixel intensities of the patch itself.
{"title":"Generative hierarchical models for image analysis","authors":"S. Geman","doi":"10.1109/CVPRW.2009.5204335","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204335","url":null,"abstract":"A probabilistic grammar for the groupings and labeling of parts and objects, when taken together with pose and part-dependent appearance models, constitutes a generative scene model and a Bayesian framework for image analysis. To the extent that the generative model generates features, as opposed to pixel intensities, the \"inverse\" or \"posterior distribution\" on interpretations given images is based on incomplete information; feature vectors are generally insufficient to recover the original intensities. I will argue for fully generative scene models, meaning models that in principle generate actual digital pictures. I will outline an approach to the construction of fully generative models through an extension of context-sensitive grammars and a re-formulation of the popular template models for image fragments. Mostly I will focus on the problem of constructing pixel-level appearance models. I will propose an approach based on image-fragment templates, as introduced by Ullman and others. However, rather than using a correlation between a template and a given image patch as an extracted feature.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132012316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Head and gaze dynamics in visual attention and context learning
Pub Date: 2009-06-20 | DOI: 10.1109/CVPRW.2009.5204215
A. Doshi, M. Trivedi
Future intelligent environments and systems may need to interact with humans while simultaneously analyzing events and critical situations. Assisted living, advanced driver assistance systems, and intelligent command-and-control centers are just a few of the cases where human interactions play a critical role in situation analysis. In particular, the behavior or body language of the human subject may be a strong indicator of the context of the situation. In this paper we demonstrate how the interaction of a human observer's head pose and eye gaze behaviors can provide significant insight into the context of an event. Such semantic data derived from human behaviors can be used to help interpret and recognize an ongoing event. We present examples from driving and intelligent meeting rooms to support these conclusions, and demonstrate how to use these techniques to improve contextual learning.
{"title":"Head and gaze dynamics in visual attention and context learning","authors":"A. Doshi, M. Trivedi","doi":"10.1109/CVPRW.2009.5204215","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204215","url":null,"abstract":"Future intelligent environments and systems may need to interact with humans while simultaneously analyzing events and critical situations. Assistive living, advanced driver assistance systems, and intelligent command-and-control centers are just a few of these cases where human interactions play a critical role in situation analysis. In particular, the behavior or body language of the human subject may be a strong indicator of the context of the situation. In this paper we demonstrate how the interaction of a human observer's head pose and eye gaze behaviors can provide significant insight into the context of the event. Such semantic data derived from human behaviors can be used to help interpret and recognize an ongoing event. We present examples from driving and intelligent meeting rooms to support these conclusions, and demonstrate how to use these techniques to improve contextual learning.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129690818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Cortical enhanced tissue segmentation of neonatal brain MR images acquired by a dedicated phased array coil
Pub Date: 2009-06-20 | DOI: 10.1109/CVPRW.2009.5204348
F. Shi, P. Yap, Yong Fan, Jie-Zhi Cheng, L. Wald, G. Gerig, Weili Lin, D. Shen
The acquisition of high-quality MR images of neonatal brains is largely hampered by their characteristically small head size and low tissue contrast. As a result, subsequent image processing and analysis, especially brain tissue segmentation, are often hindered. To overcome this problem, a dedicated phased-array neonatal head coil is utilized to improve MR image quality by effectively combining images obtained from 8 coil elements without lengthening data acquisition time. In addition, a subject-specific atlas-based tissue segmentation algorithm is developed specifically for delineating fine structures in the acquired neonatal brain MR images. The proposed tissue segmentation method first enhances the sheet-like cortical gray matter (GM) structures in neonatal images with a Hessian filter to generate a cortical GM prior. This prior is then combined with our neonatal population atlas to form a cortical enhanced hybrid atlas, which we refer to as the subject-specific atlas. Various experiments are conducted to compare the proposed method with manual segmentation results, as well as with two additional population-atlas-based segmentation methods. Results show that the proposed method segments the neonatal brain with the highest accuracy among the three methods.
{"title":"Cortical enhanced tissue segmentation of neonatal brain MR images acquired by a dedicated phased array coil","authors":"F. Shi, P. Yap, Yong Fan, Jie-Zhi Cheng, L. Wald, G. Gerig, Weili Lin, D. Shen","doi":"10.1109/CVPRW.2009.5204348","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204348","url":null,"abstract":"The acquisition of high quality MR images of neonatal brains is largely hampered by their characteristically small head size and low tissue contrast. As a result, subsequent image processing and analysis, especially for brain tissue segmentation, are often hindered. To overcome this problem, a dedicated phased array neonatal head coil is utilized to improve MR image quality by effectively combing images obtained from 8 coil elements without lengthening data acquisition time. In addition, a subject-specific atlas based tissue segmentation algorithm is specifically developed for the delineation of fine structures in the acquired neonatal brain MR images. The proposed tissue segmentation method first enhances the sheet-like cortical gray matter (GM) structures in neonatal images with a Hessian filter for generation of cortical GM prior. Then, the prior is combined with our neonatal population atlas to form a cortical enhanced hybrid atlas, which we refer to as the subject-specific atlas. Various experiments are conducted to compare the proposed method with manual segmentation results, as well as with additional two population atlas based segmentation methods. Results show that the proposed method is capable of segmenting the neonatal brain with the highest accuracy, compared to other two methods.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129251040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Geometric Sequence (GS) imaging with Bayesian smoothing for optical and capacitive imaging sensors
Pub Date: 2009-06-20 | DOI: 10.1109/CVPRW.2009.5205205
K. Sengupta, F. Porikli
In this paper, we introduce a novel technique called geometric sequence (GS) imaging for low-power, lightweight tracking in human-computer interface design. The imaging sensor is programmed to capture the scene with a train of packets, where each packet consists of a few images. The delay or baseline associated with consecutive image pairs in a packet follows a fixed ratio, as in a geometric sequence. The image pair with the shorter baseline or delay captures fast motion, while the image pair with the larger baseline or delay captures slow motion. Given an image packet, the motion confidence maps computed from the slow and fast image pairs are fused into a single map. Next, we use a Bayesian update scheme to compute the motion hypothesis probability map given the information from prior packets, and estimate the motion from this probability map. The GS imaging system reliably tracks slow movements as well as fast movements, a feature that is important in realizing applications such as touchpad-type systems. Compared to continuous imaging with a short delay between consecutive pairs, the GS imaging technique enjoys several advantages: the overall power consumption and CPU load are significantly lower. We present results in the domain of optical-camera-based human-computer interface (HCI) applications, as well as for capacitive fingerprint-sensor-based touchpad systems.
{"title":"Geometric Sequence (GS) imaging with Bayesian smoothing for optical and capacitive imaging sensors","authors":"K. Sengupta, F. Porikli","doi":"10.1109/CVPRW.2009.5205205","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5205205","url":null,"abstract":"In this paper, we introduce a novel technique called geometric sequence (GS) imaging, specifically for the purpose of low power and light weight tracking in human computer interface design. The imaging sensor is programmed to capture the scene with a train of packets, where each packet constitutes a few images. The delay or the baseline associated with consecutive image pairs in a packet follows a fixed ratio, as in a geometric sequence. The image pair with shorter baseline or delay captures fast motion, while the image pair with larger baseline or delay captures slow motion. Given an image packet, the motion confidence maps computed from the slow and the fast image pairs are fused into a single map. Next, we use a Bayesian update scheme to compute the motion hypotheses probability map, given the information of prior packets. We estimate the motion from this probability map. The GS imaging system reliably tracks slow movements as well as fast movements, a feature that is important in realizing applications such as a touchpad type system. Compared to continuous imaging with short delay between consecutive pairs, the GS imaging technique enjoys several advantages. The overall power consumption and the CPU load are significantly low. We present results in the domain of optical camera based human computer interface (HCI) applications, as well as for capacitive fingerprint imaging sensor based touch pad systems.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124683947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Face recognition at-a-distance based on sparse-stereo reconstruction
Pub Date: 2009-06-20 | DOI: 10.1109/CVPRW.2009.5204301
H. Rara, S. Elhabian, Asem M. Ali, Mike Miller, T. Starr, A. Farag
We describe a framework for face recognition at a distance based on sparse-stereo reconstruction. We develop a 3D acquisition system that consists of two CCD stereo cameras mounted on pan-tilt units with an adjustable baseline. We first detect the facial region and extract its landmark points, which are used to initialize an AAM mesh-fitting algorithm. The fitted mesh vertices provide point correspondences between the left and right images of a stereo pair; stereo-based reconstruction is then used to infer the 3D positions of the mesh vertices. We perform experiments on the use of different features extracted from these vertices for face recognition. The cumulative match characteristic (CMC) curves generated using the proposed framework confirm the feasibility of the proposed approach for long-distance recognition of human faces relative to the state of the art.
{"title":"Face recognition at-a-distance based on sparse-stereo reconstruction","authors":"H. Rara, S. Elhabian, Asem M. Ali, Mike Miller, T. Starr, A. Farag","doi":"10.1109/CVPRW.2009.5204301","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204301","url":null,"abstract":"We describe a framework for face recognition at a distance based on sparse-stereo reconstruction. We develop a 3D acquisition system that consists of two CCD stereo cameras mounted on pan-tilt units with adjustable baseline. We first detect the facial region and extract its landmark points, which are used to initialize an AAM mesh fitting algorithm. The fitted mesh vertices provide point correspondences between the left and right images of a stereo pair; stereo-based reconstruction is then used to infer the 3D information of the mesh vertices. We perform experiments regarding the use of different features extracted from these vertices for face recognition. The cumulative rank curves (CMC), which are generated using the proposed framework, confirms the feasibility of the proposed work for long distance recognition of human faces with respect to the state-of-the-art.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130841135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Geo-location inference from image content and user tags
Pub Date: 2009-06-20 | DOI: 10.1109/CVPRW.2009.5204168
Andrew C. Gallagher, D. Joshi, Jie Yu, Jiebo Luo
Associating images with their geographic locations has been increasingly pursued in the computer vision community in recent years. In recent work, large collections of geotagged images were found to be helpful in estimating the geo-locations of query images by simple visual nearest-neighbor search. In this paper, we leverage user tags along with image content to infer the geo-location. Our model builds upon the fact that the visual content and user tags of pictures can provide significant hints about their geo-locations. Using a large collection of over a million geotagged photographs, we build location probability maps of user tags over the entire globe. These maps reflect the picture-taking and tagging behaviors of thousands of users from all over the world, and reveal interesting tag map patterns. Visual content matching is performed using multiple feature descriptors, including tiny images, color histograms, GIST features, and bags of textons. The combination of visual content matching and local tag probability maps forms a strong geo-inference engine. Large-scale experiments show significant improvements over purely visual content-based geo-location inference.
{"title":"Geo-location inference from image content and user tags","authors":"Andrew C. Gallagher, D. Joshi, Jie Yu, Jiebo Luo","doi":"10.1109/CVPRW.2009.5204168","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204168","url":null,"abstract":"Associating image content with their geographic locations has been increasingly pursued in the computer vision community in recent years. In a recent work, large collections of geotagged images were found to be helpful in estimating geo-locations of query images by simple visual nearest-neighbors search. In this paper, we leverage user tags along with image content to infer the geo-location. Our model builds upon the fact that the visual content and user tags of pictures can provide significant hints about their geo-locations. Using a large collection of over a million geotagged photographs, we build location probability maps of user tags over the entire globe. These maps reflect the picture-taking and tagging behaviors of thousands of users from all over the world, and reveal interesting tag map patterns. Visual content matching is performed using multiple feature descriptors including tiny images, color histograms, GIST features, and bags of textons. The combination of visual content matching and local tag probability maps forms a strong geo-inference engine. Large-scale experiments have shown significant improvements over pure visual content-based geo-location inference.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132684282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Modeling and exploiting the spatio-temporal facial action dependencies for robust spontaneous facial expression recognition
Pub Date: 2009-06-20 | DOI: 10.1109/CVPRW.2009.5204263
Yan Tong, Jixu Chen, Q. Ji
Facial actions provide various types of messages in human communication. Recognizing spontaneous facial actions, however, is very challenging due to subtle facial deformation, frequent head movements, and ambiguous and uncertain facial motion measurements. As a result, current research in facial action recognition is limited to posed facial actions, often in frontal view. Spontaneous facial action is characterized by rigid head movements and nonrigid facial muscular movements. More importantly, it is the spatiotemporal interactions among the rigid and nonrigid facial motions that produce a meaningful and natural facial display. Recognizing this fact, we introduce a probabilistic facial action model based on a dynamic Bayesian network (DBN) to simultaneously and coherently capture rigid and nonrigid facial motions, their spatiotemporal dependencies, and their image measurements. Advanced machine learning methods are introduced to learn the probabilistic facial action model from both training data and prior knowledge. Facial action recognition is accomplished through probabilistic inference by systematically integrating measurements of facial motions with the facial action model. Experiments show that the proposed system yields significant improvements in recognizing spontaneous facial actions.
{"title":"Modeling and exploiting the spatio-temporal facial action dependencies for robust spontaneous facial expression recognition","authors":"Yan Tong, Jixu Chen, Q. Ji","doi":"10.1109/CVPRW.2009.5204263","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204263","url":null,"abstract":"Facial action provides various types of messages for human communications. Recognizing spontaneous facial actions, however, is very challenging due to subtle facial deformation, frequent head movements, and ambiguous and uncertain facial motion measurements. As a result, current research in facial action recognition is limited to posed facial actions and often in frontal view.Spontaneous facial action is characterized by rigid head movements and nonrigid facial muscular movements. More importantly, it is the spatiotemporal interactions among the rigid and nonrigid facial motions that produce a meaningful and natural facial display. Recognizing this fact, we introduce a probabilistic facial action model based on a dynamic Bayesian network (DBN) to simultaneously and coherently capture rigid and nonrigid facial motions, their spatiotemporal dependencies, and their image measurements. Advanced machine learning methods are introduced to learn the probabilistic facial action model based on both training data and prior knowledge. Facial action recognition is accomplished through probabilistic inference by systemically integrating measurements official motions with the facial action model. Experiments show that the proposed system yields significant improvements in recognizing spontaneous facial actions.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114296754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Dominance detection in face-to-face conversations
Pub Date: 2009-06-20 | DOI: 10.1109/CVPRW.2009.5204267
Sergio Escalera, R. M. Martinez, Jordi Vitrià, P. Radeva, M. Anguera
Dominance refers to the level of influence a person has in a conversation. Dominance is an important research area in social psychology, but the problem of its automatic estimation is a very recent topic in the contexts of social and wearable computing. In this paper, we focus on dominance detection from visual cues. We estimate the correlation among observers who categorize the dominant people in a set of face-to-face conversations. Different dominance indicators from gestural communication are defined, manually annotated, and compared to the observers' opinions. Moreover, the considered indicators are automatically extracted from video sequences and learned using binary classifiers. Results from the three analyses show a high correlation and allow the categorization of dominant people in public-discussion video sequences.
{"title":"Dominance detection in face-to-face conversations","authors":"Sergio Escalera, R. M. Martinez, Jordi Vitrià, P. Radeva, M. Anguera","doi":"10.1109/CVPRW.2009.5204267","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204267","url":null,"abstract":"Dominance is referred to the level of influence a person has in a conversation. Dominance is an important research area in social psychology, but the problem of its automatic estimation is a very recent topic in the contexts of social and wearable computing. In this paper, we focus on dominance detection from visual cues. We estimate the correlation among observers by categorizing the dominant people in a set of face-to-face conversations. Different dominance indicators from gestural communication are defined, manually annotated, and compared to the observers opinion. Moreover, the considered indicators are automatically extracted from video sequences and learnt by using binary classifiers. Results from the three analysis shows a high correlation and allows the categorization of dominant people in public discussion video sequences.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122516710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Synchronization and rolling shutter compensation for consumer video camera arrays
Pub Date: 2009-06-20 | DOI: 10.1109/CVPRW.2009.5204340
D. Bradley, B. Atcheson, Ivo Ihrke, W. Heidrich
Two major obstacles to the use of consumer camcorders in computer vision applications are the lack of synchronization hardware and the use of a "rolling" shutter, which introduces a temporal shear in the video volume. We present two simple approaches that solve both the rolling-shutter shear and the synchronization problem at the same time. The first approach is based on strobe illumination, while the second employs a subframe warp along optical flow vectors. In our experiments, we have used the proposed methods to effectively remove temporal shear and synchronize up to 16 consumer-grade camcorders in multiple geometric configurations.
{"title":"Synchronization and rolling shutter compensation for consumer video camera arrays","authors":"D. Bradley, B. Atcheson, Ivo Ihrke, W. Heidrich","doi":"10.1109/CVPRW.2009.5204340","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204340","url":null,"abstract":"Two major obstacles to the use of consumer camcorders in computer vision applications are the lack of synchronization hardware, and the use of a \"rolling\" shutter, which introduces a temporal shear in the video volume. We present two simple approaches for solving both the rolling shutter shear and the synchronization problem at the same time. The first approach is based on strobe illumination, while the second employs a subframe warp along optical flow vectors. In our experiments we have used the proposed methods to effectively remove temporal shear, and synchronize up to 16 consumer-grade camcorders in multiple geometric configurations.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130933674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}