A combinatorial method for tracing objects using semantics of their shape
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759716
C. Diegert
We present a shape-first approach to finding automobiles and trucks in overhead images and include results from our analysis of an image from the Overhead Imaging Research Dataset (OIRDS) [1]. For the OIRDS, our shape-first approach traces candidate vehicle outlines by exploiting knowledge about an overhead image of a vehicle: a vehicle's outline fits into a rectangle, this rectangle is sized to allow vehicles to use local roads, and rectangles from two different vehicles are disjoint. Our shape-first approach can efficiently process high-resolution overhead imaging over wide areas to provide tips and cues for human analysts, or for subsequent automatic processing using machine learning or other analysis based on color, tone, pattern, texture, size, and/or location (shape first). In fact, computationally intensive structural, syntactic, and statistical analysis may become possible when a shape-first workflow sends a list of specific tips and cues down a processing pipeline rather than sending the whole of the wide-area imaging information. This data flow may fit well when bandwidth is limited between an imaging sensor and the computers delivering ad hoc image exploitation. As expected, our early computational experiments find that the shape-first processing stage appears to reliably detect rectangular shapes from vehicles. More intriguing is that our computational experiments with six-inch GSD OIRDS benchmark images show that the shape-first stage can be efficient, and that candidate vehicle locations corresponding to features that do not include vehicles are unlikely to trigger tips and cues. We found that stopping with just the shape-first list of candidate vehicle locations, and then solving a weighted, maximal independent vertex set problem to resolve conflicts among candidate vehicle locations, often correctly traces the vehicles in an OIRDS scene.
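The abstract does not spell out the conflict-resolution step in code; as a rough illustration, a greedy weighted heuristic over a conflict graph of overlapping candidate rectangles (the candidate weights, rectangle format, and greedy strategy are assumptions, not the authors' exact solver) might look like:

```python
def overlaps(a, b):
    """Axis-aligned rectangles (x0, y0, x1, y1); True if their interiors intersect."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

def resolve_candidates(candidates):
    """candidates: list of (weight, rect). Returns a conflict-free subset, chosen
    greedily by weight -- an approximation to the weighted maximal independent
    vertex set formulation described in the abstract."""
    kept = []
    for weight, rect in sorted(candidates, key=lambda c: c[0], reverse=True):
        if all(not overlaps(rect, k) for _, k in kept):
            kept.append((weight, rect))
    return kept

# toy example: two overlapping candidates and one disjoint one
cands = [(0.9, (10, 10, 30, 50)), (0.6, (12, 12, 32, 52)), (0.8, (100, 10, 120, 50))]
print(resolve_candidates(cands))
```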
{"title":"A combinatorial method for tracing objects using semantics of their shape","authors":"C. Diegert","doi":"10.1109/AIPR.2010.5759716","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759716","url":null,"abstract":"We present a shape-first approach to finding automobiles and trucks in overhead images and include results from our analysis of an image from the Overhead Imaging Research Dataset [1]. For the OIRDS, our shape-first approach traces candidate vehicle outlines by exploiting knowledge about an overhead image of a vehicle: a vehicle's outline fits into a rectangle, this rectangle is sized to allow vehicles to use local roads, and rectangles from two different vehicles are disjoint. Our shape-first approach can efficiently process high-resolution overhead imaging over wide areas to provide tips and cues for human analysts, or for subsequent automatic processing using machine learning or other analysis based on color, tone, pattern, texture, size, and/or location (shape first). In fact, computationally-intensive complex structural, syntactic, and statistical analysis may be possible when a shape-first work flow sends a list of specific tips and cues down a processing pipeline rather than sending the whole of wide area imaging information. This data flow may fit well when bandwidth is limited between computers delivering ad hoc image exploitation and an imaging sensor. As expected, our early computational experiments find that the shape-first processing stage appears to reliably detect rectangular shapes from vehicles. More intriguing is that our computational experiments with six-inch GSD OIRDS benchmark images show that the shape-first stage can be efficient, and that candidate vehicle locations corresponding to features that do not include vehicles are unlikely to trigger tips and cues. We found that stopping with just the shape-first list of candidate vehicle locations, and then solving a weighted, maximal independent vertex set problem to resolve conflicts among candidate vehicle locations, often correctly traces the vehicles in an OIRDS scene.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128839726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Object identification in dynamic environment using sensor fusion
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759682
K. S. Nagla, M. Uddin, Dilbag Singh, Rajeev Kumar
Multisensor data fusion is highly applicable in robotics applications because the relationships among objects and events change due to changes in the robot's orientation, snags in sensory information, sensor range, environmental conditions, etc. High-level and low-level image processing in machine vision are widely used to investigate object identification in complex applications. Due to the limitations of vision technology, it is still difficult to identify objects in certain environments. A new technique for object identification using sonar sensor fusion is proposed. This paper explains the computational account of data fusion using Bayesian and neural network methods to recognize the shape of objects in a dynamic environment.
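As a rough sketch of the Bayesian fusion idea described here (the shape classes, prior, and assumption of independent sensor likelihoods are illustrative, not taken from the paper):

```python
import numpy as np

def bayes_fuse(prior, likelihoods):
    """Fuse independent sensor likelihoods with a prior over shape classes.
    prior: (K,) array; likelihoods: list of (K,) arrays, one per sensor."""
    posterior = prior.astype(float).copy()
    for lik in likelihoods:
        posterior *= lik
    return posterior / posterior.sum()

classes = ["plane", "corner", "edge", "cylinder"]   # illustrative shape classes
prior = np.full(len(classes), 0.25)
sonar_lik = np.array([0.5, 0.2, 0.2, 0.1])          # hypothetical sensor outputs
vision_lik = np.array([0.4, 0.1, 0.1, 0.4])
post = bayes_fuse(prior, [sonar_lik, vision_lik])
print(dict(zip(classes, post.round(3))))
```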
{"title":"Object identification in dynamic environment using sensor fusion","authors":"K. S. Nagla, M. Uddin, Dilbag Singh, Rajeev Kumar","doi":"10.1109/AIPR.2010.5759682","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759682","url":null,"abstract":"Multisensor data fusion is highly applicable in robotics applications because the relationships among objects and events changes due to the change in orientation of robot, snag in sensory information, sensor range and environmental conditions etc. High level and low level image processing in machine vision are widely involved to investigate object identification in complex application. Due to the limitations of vision technology still it is difficult to identify the objects in certain environments. A new technique of object identification using sonar sensor fusion has been proposed. This paper explains the computational account of the data fusion using Bayesian and neural network to recognize the shape of object in the dynamic environment.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126416323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rock image segmentation using watershed with shape markers
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759719
A. Amankwah, C. Aldrich
We propose a method for the creation of object markers used in watershed segmentation of rock images. First, we use adaptive thresholding to segment the rock image, since a rock particle's local background often differs from surrounding particle regions. Object markers are then extracted using the compactness of objects and adaptive morphological reconstruction. The choice of compactness as a feature is motivated by the fact that crushed rocks tend to have rounded shapes. Experimental results comparing the segmented images show that the performance of our algorithm is superior to most standard methods of watershed segmentation. We also show that the proposed algorithm is more robust in the estimation of fines in rock samples than traditional methods.
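A minimal sketch of such a marker-creation pipeline in Python with scikit-image, assuming compactness is measured as 4πA/P² and substituting a simple connected-component filter for the authors' adaptive morphological reconstruction:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import threshold_local
from skimage.measure import label, regionprops
from skimage.segmentation import watershed

def shape_marker_watershed(gray, block_size=51, compactness_min=0.5):
    """Rough sketch: adaptive threshold -> keep compact blobs as markers -> watershed."""
    # 1. adaptive (local) threshold, since the local background varies between particles
    binary = gray > threshold_local(gray, block_size)
    # 2. keep connected components that are compact enough (4*pi*A / P^2 close to 1 for round blobs)
    lbl = label(binary)
    markers = np.zeros_like(lbl)
    next_id = 1
    for r in regionprops(lbl):
        if r.perimeter > 0 and 4 * np.pi * r.area / r.perimeter ** 2 >= compactness_min:
            markers[lbl == r.label] = next_id
            next_id += 1
    # 3. marker-controlled watershed on the inverted distance transform
    distance = ndi.distance_transform_edt(binary)
    return watershed(-distance, markers, mask=binary)
```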
{"title":"Rock image segmentation using watershed with shape markers","authors":"A. Amankwah, C. Aldrich","doi":"10.1109/AIPR.2010.5759719","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759719","url":null,"abstract":"We propose a method for the creation of object markers used in watershed segmentation of rock images. First, we use adaptive thresholding to segment the rock image since rock particles local background is often different from surrounding particle regions. Object markers are then extracted using the compactness of objects and adaptive morphological reconstruction. The choice of the feature compactness is motivated by the fact that crushed rocks tend to have rounded shapes. Experimental results after comparing the segmented images show that the performance of our algorithm is superior to most standard methods of watershed segmentation. We also show that the proposed algorithm was more robust in the estimation of fines in rock samples than the traditional methods.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115252359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A probabilistic framework for unsupervised evaluation and ranking of image segmentations
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759690
M. Jaber, S. R. Vantaram, E. Saber
In this paper, a Bayesian Network (BN) framework for unsupervised evaluation of image segmentation quality is proposed. This image understanding algorithm utilizes a set of given Segmentation Maps (SMs) ranging from under-segmented to over-segmented results for a target image, to identify the semantically meaningful ones and rank the SMs according to their applicability in image processing and computer vision systems. Images acquired from the Berkeley segmentation dataset along with their corresponding SMs are used to train and test the proposed algorithm. Low-level local and global image features are employed to define an optimal BN structure and to estimate the inference between its nodes. Furthermore, given several SMs of a test image, the optimal BN is utilized to estimate the probability that a given map is the most favorable segmentation for that image. The algorithm is evaluated on a separate set of images (none of which are included in the training set) wherein the ranked SMs (according to their probabilities of being acceptable segmentation as estimated by the proposed algorithm) are compared to the ground-truth maps generated by human observers. The Normalized Probabilistic Rand (NPR) index is used as an objective metric to quantify our algorithm's performance. The proposed algorithm is designed to serve as a pre-processing module in various bottom-up image processing frameworks such as content-based image retrieval and region-of-interest detection.
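A hedged sketch of the ranking step, with simple stand-in features and a placeholder probability model in place of the trained Bayesian network (feature choices and the `prob_model` callable are assumptions, not the paper's method):

```python
import numpy as np
from skimage.filters import sobel

def sm_features(image, seg_map):
    """Simple low-level features for one segmentation map (illustrative stand-ins
    for the local/global features used to build the BN)."""
    n_regions = len(np.unique(seg_map))
    edges = sobel(image.astype(float))
    # boundary pixels of the segmentation map
    boundary = np.zeros_like(seg_map, dtype=bool)
    boundary[:-1, :] |= seg_map[:-1, :] != seg_map[1:, :]
    boundary[:, :-1] |= seg_map[:, :-1] != seg_map[:, 1:]
    edge_agreement = edges[boundary].mean() if boundary.any() else 0.0
    return np.array([n_regions, edge_agreement])

def rank_segmentations(image, seg_maps, prob_model):
    """prob_model(features) -> P(map is an acceptable segmentation); stands in for BN inference."""
    scored = [(prob_model(sm_features(image, sm)), sm) for sm in seg_maps]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored
```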
{"title":"A probabilistic framework for unsupervised evaluation and ranking of image segmentations","authors":"M. Jaber, S. R. Vantaram, E. Saber","doi":"10.1109/AIPR.2010.5759690","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759690","url":null,"abstract":"In this paper, a Bayesian Network (BN) framework for unsupervised evaluation of image segmentation quality is proposed. This image understanding algorithm utilizes a set of given Segmentation Maps (SMs) ranging from under-segmented to over-segmented results for a target image, to identify the semantically meaningful ones and rank the SMs according to their applicability in image processing and computer vision systems. Images acquired from the Berkeley segmentation dataset along with their corresponding SMs are used to train and test the proposed algorithm. Low-level local and global image features are employed to define an optimal BN structure and to estimate the inference between its nodes. Furthermore, given several SMs of a test image, the optimal BN is utilized to estimate the probability that a given map is the most favorable segmentation for that image. The algorithm is evaluated on a separate set of images (none of which are included in the training set) wherein the ranked SMs (according to their probabilities of being acceptable segmentation as estimated by the proposed algorithm) are compared to the ground-truth maps generated by human observers. The Normalized Probabilistic Rand (NPR) index is used as an objective metric to quantify our algorithm's performance. The proposed algorithm is designed to serve as a pre-processing module in various bottom-up image processing frameworks such as content-based image retrieval and region-of-interest detection.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122921052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Use of remote sensing to screen earthen levees
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759704
J. Aanstoos, K. Hasan, C. O'Hara, S. Prasad, Lalitha Dabbiru, Majid Mahrooghy, R. Nóbrega, Matthew A. Lee, B. Shrestha
Multi-polarized L-band Synthetic Aperture Radar is investigated for its potential to screen earthen levees for weak points. Various feature detection and classification algorithms are tested for this application, including both radiometric and textural methods such as grey-level co-occurrence matrix and wavelet features.
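A sketch of how such radiometric/textural features might be extracted in Python, assuming scikit-image for the grey-level co-occurrence matrix and PyWavelets for the wavelet sub-bands (the quantization level, offsets, and feature list are illustrative, not the paper's exact feature set):

```python
import numpy as np
import pywt
from skimage.feature import graycomatrix, graycoprops  # spelled 'greycomatrix' in older scikit-image

def texture_features(patch, levels=32):
    """GLCM and wavelet-energy features for one SAR image patch."""
    # quantize the patch to a small number of grey levels for the GLCM
    q = np.floor(levels * (patch - patch.min()) / (np.ptp(patch) + 1e-9)).astype(np.uint8)
    q = np.clip(q, 0, levels - 1)
    glcm = graycomatrix(q, distances=[1], angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=levels, symmetric=True, normed=True)
    feats = [graycoprops(glcm, p).mean() for p in ("contrast", "homogeneity", "energy", "correlation")]
    # single-level wavelet decomposition; use sub-band energies as texture features
    _, (cH, cV, cD) = pywt.dwt2(patch.astype(float), "db2")
    feats += [float(np.mean(c ** 2)) for c in (cH, cV, cD)]
    return np.array(feats)
```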
{"title":"Use of remote sensing to screen earthen levees","authors":"J. Aanstoos, K. Hasan, C. O'Hara, S. Prasad, Lalitha Dabbiru, Majid Mahrooghy, R. Nóbrega, Matthew A. Lee, B. Shrestha","doi":"10.1109/AIPR.2010.5759704","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759704","url":null,"abstract":"Multi-polarized L-band Synthetic Aperture Radar is investigated for its potential to screen earthen levees for weak points. Various feature detection and classification algorithms are tested for this application, including both radiometric and textural methods such as grey-level co-occurrence matrix and wavelet features.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126345594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Material detection with a CCD polarization imager
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759710
V. Gruev, Rob Perkins, Timothy York
We present a novel polarization image sensor created by monolithically integrating aluminum nanowire optical filters with a CCD imaging array. The CCD polarization image sensor is composed of 1000 by 1000 imaging elements with a 7.4 μm pixel pitch. The image sensor has a dynamic range of 65 dB and a signal-to-noise ratio of 60 dB. The CCD array is covered with an array of pixel-pitch-matched nanowire polarization filters with four different orientations offset by 45°. Raw polarization data is presented to a DSP board at 40 frames per second, where the degree and angle of polarization are computed. The final polarization results are presented in a false-color representation. The imaging sensor is used to detect the index of refraction of several flat surfaces.
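The degree and angle of linear polarization computed on the DSP board presumably follow the standard Stokes-parameter relations for four polarizer orientations; a sketch under that assumption:

```python
import numpy as np

def polarization_from_four(i0, i45, i90, i135):
    """Degree and angle of linear polarization from intensities measured behind
    polarizers at 0, 45, 90 and 135 degrees (standard Stokes formulation; a sketch
    of the on-board computation, not the sensor's actual firmware)."""
    i0, i45, i90, i135 = (np.asarray(x, dtype=float) for x in (i0, i45, i90, i135))
    s0 = 0.5 * (i0 + i45 + i90 + i135)           # total intensity
    s1 = i0 - i90                                 # 0/90 difference
    s2 = i45 - i135                               # 45/135 difference
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, 1e-9)
    aop = 0.5 * np.arctan2(s2, s1)                # radians
    return dolp, aop
```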
{"title":"Material detection with a CCD polarization imager","authors":"V. Gruev, Rob Perkins, Timothy York","doi":"10.1109/AIPR.2010.5759710","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759710","url":null,"abstract":"We present a novel polarization image sensor by monolithically integrating aluminum nanowire optical filters with CCD imaging array. The CCD polarization image sensor is composed of 1000 by 1000 imaging elements with 7.4μm pixel pitch. The image sensor has a dynamic range of 65dB and signal-to-noise ratio of 60dB. The CCD array is covered with an array of pixel-pitch matched nanowire polarization filters with four different orientations offset by 45°. Raw polarization data is presented to a DSP board at 40 frames per second, where degree and angle of polarization is computed. The final polarization results are presented in false color representation. The imaging sensor is used to detect the index of refraction of several flat surfaces.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125798785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A system and method for auto-correction of first order lens distortion
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759696
Jonathan Fry, M. Pusateri
In multispectral imaging systems, correction for lens distortion is required to allow pixel-by-pixel fusion techniques to be applied. While correction of optical aberration can be extended to higher-order terms, for many systems a first-order correction is sufficient to achieve the desired results. In producing a multispectral imaging system in production quantities, the process of producing the corrections needs to be largely automated, as each lens requires its own corrections. We discuss the application of an auto-correction and bench-sighting method to a dual-band imaging system. In principle, we wish to image a dual-band target and completely determine the lens distortion parameters for the given optics. We begin with a scale-preserving, radial, first-order lens distortion model; this model allows the horizontal field of view to be determined independently of the distortion. It has the benefits of simple parameterization and the ability to correct the mild to moderate distortion that may be expected of production optics. The correction process starts with imaging a dual-band target. A feature extraction algorithm is applied to the imagery from both bands to generate a large number of correlated feature points. Using the feature points, we derive an over-determined system of equations; the solution to this system yields the distortion parameters for the lens. Using these parameters, an interpolation map can be generated that is unique to the lenses involved. The interpolation map is used in real time to correct the distortion while preserving the horizontal field-of-view constraint on the system.
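A sketch of the least-squares fit and the interpolation map under a first-order radial model, assuming the distortion centre and the ideal target point locations are known (the paper's actual solver and its scale-preserving, field-of-view constraint are not reproduced):

```python
import numpy as np

def fit_k1(ideal_pts, observed_pts, center):
    """Least-squares fit of a first-order radial model r_obs = r * (1 + k1 * r**2).
    One linear equation per matched feature point: k1 * r**3 = r_obs - r."""
    center = np.asarray(center, dtype=float)
    ru = np.linalg.norm(ideal_pts - center, axis=1)
    rd = np.linalg.norm(observed_pts - center, axis=1)
    A = ru ** 3
    b = rd - ru
    return float(A @ b / (A @ A))

def undistort_map(shape, center, k1):
    """Build the per-pixel sampling map used to correct the distorted image;
    the map can then be applied with scipy.ndimage.map_coordinates or cv2.remap."""
    h, w = shape
    cx, cy = float(center[0]), float(center[1])
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    r2 = (xx - cx) ** 2 + (yy - cy) ** 2
    scale = 1.0 + k1 * r2
    # each output pixel samples the distorted image at the radially scaled location
    return cx + (xx - cx) * scale, cy + (yy - cy) * scale
```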
{"title":"A system and method for auto-correction of first order lens distortion","authors":"Jonathan Fry, M. Pusateri","doi":"10.1109/AIPR.2010.5759696","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759696","url":null,"abstract":"In multispectral imaging systems, correction for lens distortion is required to allow pixel by pixel fusion techniques to be applied. While correction of optical aberration can be extended to higher order terms, for many systems, a first order correction is sufficient to achieve desired results. In producing a multispectral imaging system in production quantities, the process of producing the corrections needs to be largely automated as each lens will require its own corrections. We discuss an auto-correction and bench sighting method application to a dual band imaging system. In principle, we wish to image a dual band target and completely determine the lens distortion parameters for the given optics. We begin with a scale-preserving, radial, first-order lens distortion model; this model allows the horizontal field of view to be determined independently of the distortion. It has the benefits of simple parameterization and the ability to correct mild to moderate distortion that may be expected of production optics. The correction process starts with imaging a dual band target. A feature extraction algorithm is applied to the imagery from both bands to generate a large number of correlated feature points. Using the feature points, we derive an over-determined system of equations; the solution to this system yields the distortion parameters for the lens. Using these parameters, an interpolation map can be generated unique to the lenses involved. The interpolation map is used in real-time to correct the distortion while preserving the horizontal field of view constraint on the system.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126875181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gray-level co-occurrence matrices as features in edge enhanced images
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759705
Peter J. Costianes, Joseph B. Plock
In 1973, Haralick, Shanmugam, and Dinstein published a paper in the IEEE Transactions on Systems, Man, and Cybernetics which proposed using Gray-Level Co-occurrence Matrices (GLCM) as a basis to define 2-D texture [1]. Over 14 different texture measures were defined using the GLCM. In images with n grey levels, the size of the GLCM is n × n, which, for large n such as n = 256, puts a large computational load on the process; the approach is also best suited to pixel distributions that are rather stochastic in nature. Such features as entropy, variance, correlation, etc. were proposed using the GLCM. When attempting to provide feature measures for man-made targets, most of the information contained in the target is carried by its edge distribution. Previous approaches form an edge outline of the target and then use techniques such as Fourier descriptors to represent the target. However, in that case, extra steps need to be taken to assure that the edge outline is continuous, or gaps in the outline must somehow be dealt with when creating the Fourier coefficients for the feature vector. This paper presents an approach using the GLCM where the gray-scale image is put through edge enhancement using any one of several edge operators. The resultant image is a binary image. For each point in the edge image, a 2×2 GLCM is created by placing an n × n window centered on the point and using the n² neighboring points to build the GLCM. This window should be sufficiently large to enclose the target of interest, and the GLCM created provides the elements needed to define the features for the edge-enhanced target. All software was created in MATLAB [2] using MATLAB functions.
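A sketch of the per-pixel 2×2 GLCM construction described above, assuming a binary edge image and an illustrative window size and pixel offset:

```python
import numpy as np

def local_binary_glcm(edge_img, point, win=33, offset=(0, 1)):
    """2x2 grey-level co-occurrence matrix built from an n-by-n window of a binary
    edge image centred on `point`. Window size and pixel offset are illustrative."""
    r, c = point
    half = win // 2
    w = edge_img[max(r - half, 0):r + half + 1, max(c - half, 0):c + half + 1].astype(np.uint8)
    dr, dc = offset
    a = w[:w.shape[0] - dr, :w.shape[1] - dc]   # reference pixels
    b = w[dr:, dc:]                             # neighbours at the chosen offset
    glcm = np.zeros((2, 2), dtype=float)
    for i in (0, 1):
        for j in (0, 1):
            glcm[i, j] = np.count_nonzero((a == i) & (b == j))
    return glcm / glcm.sum()
```

From the normalized 2×2 matrix, Haralick-style measures such as energy (the sum of squared entries) or entropy follow directly.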
{"title":"Gray-level co-occurrence matrices as features in edge enhanced images","authors":"Peter J. Costianes, Joseph B. Plock","doi":"10.1109/AIPR.2010.5759705","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759705","url":null,"abstract":"In 1973, Haralick, Shanmugam, and Dinstein published a paper in the IEEE Transactions on Systems, Man, and Cybernetics which proposed using Gray-Level Cooccurrence Matrices (GLCM) as a basis to define 2-D texture1. Over 14 different texture measures were defined using GLCM. In images with n × n grey levels, the size of the GLCM would be n × n which, for large n such as n=256, put a large computational load on the process and was also best suited for pixel distributions that were rather stochastic in nature. Such features as entropy, variance, correlation, etc. were proposed using the GLCM. When attempting to provide feature measures for man-made targets, most of the information contained in the target is contained by its edge distribution. Previous approaches form an edge outline of the target and then use some techniques such as Fourier descriptors to represent the target. However, in this case, extra steps need to be taken in order to assure that the edge outline is continuous or gaps in the outline somehow are dealt with when creating the Fourier coefficients for the feature vector. This paper presents an approach using GLCM where the gray scale image is put through an edge enhancement using any one of several edge operators. The resultant image is a binary image. For each point in the edge image, a 2×2 GLCM is created by placing an n × n window centered around the point and using the n2 neighboring points to build the GLCM's. This window should be sufficiently large to enclose the target of interest and the GLCM created provides the elements needed to define the features for the edge enhanced target. All software was created in Matlab2 using Matlab functions.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"207 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115162296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Head pose estimation from images using Canonical Correlation Analysis
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759706
J. Foytik, V. Asari, M. Youssef, R. Tompkins
Head pose estimation, though a trivial task for the human visual system, remains a challenging problem for computer vision systems. The task requires identifying the modes of image variance that directly pertain to pose changes, while generalizing across face identity and mitigating other image variances. Conventional methods such as Principal Component Analysis (PCA) fail to identify the true relationship between the observed space and the pose variable, while supervised methods such as Linear Discriminant Analysis (LDA) neglect the continuous nature of pose variation and take a discrete multi-class approach. We present a method for estimating head pose using Canonical Correlation Analysis (CCA), where pose variation is regarded as a continuous variable and is represented by a manifold in feature space. The proposed technique directly identifies the underlying dimension that maximizes correlation between the observed image and pose variable. It is shown to increase estimation accuracy and provide a more compact image representation that better captures pose features. Additionally, an enhanced version of the system is proposed that utilizes Gabor filters for providing pose sensitive input to the correlation based system. The preprocessed input serves to increase the overall accuracy of the pose estimation system. The accuracy of the techniques is evaluated using the Pointing '04 and CUbiC FacePix(30) pose varying face databases and is shown to produce a lower estimation error when compared to both PCA and LDA based methods.
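A minimal sketch of the CCA regression idea using scikit-learn, with a simple least-squares map from the canonical image variate back to yaw standing in for the paper's estimator (Gabor pre-filtering is omitted; feature vectors and pose labels are assumed given):

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def fit_pose_cca(X_train, yaw_train, n_components=1):
    """Fit CCA between image feature vectors and a continuous pose variable (yaw),
    then regress yaw on the canonical image variate."""
    cca = CCA(n_components=n_components)
    cca.fit(X_train, yaw_train.reshape(-1, 1))
    u = cca.transform(X_train)                       # canonical image variate
    A = np.column_stack([u, np.ones(len(u))])        # linear map back to yaw
    coef, *_ = np.linalg.lstsq(A, yaw_train, rcond=None)
    return cca, coef

def predict_yaw(cca, coef, X_test):
    u = cca.transform(X_test)
    return np.column_stack([u, np.ones(len(u))]) @ coef
```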
{"title":"Head pose estimation from images using Canonical Correlation Analysis","authors":"J. Foytik, V. Asari, M. Youssef, R. Tompkins","doi":"10.1109/AIPR.2010.5759706","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759706","url":null,"abstract":"Head pose estimation, though a trivial task for the human visual system, remains a challenging problem for computer vision systems. The task requires identifying the modes of image variance that directly pertain to pose changes, while generalizing across face identity and mitigating other image variances. Conventional methods such as Principal Component Analysis (PCA) fail to identify the true relationship between the observed space and the pose variable, while supervised methods such as Linear Discriminant Analysis (LDA) neglect the continuous nature of pose variation and take a discrete multi-class approach. We present a method for estimating head pose using Canonical Correlation Analysis (CCA), where pose variation is regarded as a continuous variable and is represented by a manifold in feature space. The proposed technique directly identifies the underlying dimension that maximizes correlation between the observed image and pose variable. It is shown to increase estimation accuracy and provide a more compact image representation that better captures pose features. Additionally, an enhanced version of the system is proposed that utilizes Gabor filters for providing pose sensitive input to the correlation based system. The preprocessed input serves to increase the overall accuracy of the pose estimation system. The accuracy of the techniques is evaluated using the Pointing '04 and CUbiC FacePix(30) pose varying face databases and is shown to produce a lower estimation error when compared to both PCA and LDA based methods.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123657198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Intelligent management of multiple sensors for enhanced situational awareness
Pub Date: 2010-10-01 | DOI: 10.1109/AIPR.2010.5759715
Eric D. Nelson, J. Irvine
Wide area motion imagery (WAMI) offers the promise of persistent surveillance over large regions. However, the combination of lower frame rate and coarser spatial resolution found in most WAMI systems can limit the ability to track multiple targets. One way to address this limitation is to employ the wide-area sensor in concert with one or more high resolution sensors. We have developed a capability called Sensor Management for Adaptive Reconnaissance and Tracking (SMART), for tasking an arbitrary number of high-fidelity assets, working with the WAMI sensor to maximize situational awareness based on a prevailing set of conditions and target priorities. We present a simulation framework for exploring performance of various sensor management strategies and present the findings from an initial set of experiments.
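A toy sketch of priority-driven asset tasking, purely illustrative and not the SMART scheduler itself (the track and asset fields are assumptions):

```python
def assign_assets(tracks, assets):
    """Greedy tasking: give each high-fidelity asset the highest-priority track it can service.
    tracks: list of dicts with 'id', 'priority', 'position'
    assets: list of dicts with 'id', 'position', 'range'"""
    def dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    tasking, taken = {}, set()
    for track in sorted(tracks, key=lambda t: t["priority"], reverse=True):
        for asset in assets:
            if asset["id"] not in taken and dist(asset["position"], track["position"]) <= asset["range"]:
                tasking[asset["id"]] = track["id"]
                taken.add(asset["id"])
                break
    return tasking
```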
{"title":"Intelligent management of multiple sensors for enhanced situational awareness","authors":"Eric D. Nelson, J. Irvine","doi":"10.1109/AIPR.2010.5759715","DOIUrl":"https://doi.org/10.1109/AIPR.2010.5759715","url":null,"abstract":"Wide area motion imagery (WAMI) offers the promise of persistent surveillance over large regions. However, the combination of lower frame rate and coarser spatial resolution found in most WAMI systems can limit the ability to track multiple targets. One way to address this limitation is to employ the wide-area sensor in concert with one or more high resolution sensors. We have developed a capability called Sensor Management for Adaptive Reconnaissance and Tracking (SMART), for tasking an arbitrary number of high-fidelity assets, working with the WAMI sensor to maximize situational awareness based on a prevailing set of conditions and target priorities. We present a simulation framework for exploring performance of various sensor management strategies and present the findings from an initial set of experiments.","PeriodicalId":128378,"journal":{"name":"2010 IEEE 39th Applied Imagery Pattern Recognition Workshop (AIPR)","volume":"66 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123524259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}