Detecting hand-palm orientation and hand shapes for sign language gesture recognition using 3D images
L. K. Phadtare, R. Kushalnagar, N. Cahill
Pub Date: 2012-11-01 | DOI: 10.1109/WNYIPW.2012.6466652
Automatic gesture recognition, specifically for the purpose of understanding sign language, can be an important aid in communicating with the deaf and hard-of-hearing. Recognition of sign languages requires understanding of various linguistic components such as palm orientation, hand shape, hand location, and facial expression. We propose a method and system to estimate the palm orientation and hand shape of a signer. Our system uses a Microsoft Kinect to capture color and depth images of the signer. It analyzes the depth data corresponding to the hand region, fits a plane to these points, and defines the normal to this plane as the orientation of the palm. It then uses 3-D shape context to determine the hand shape by comparing it to example shapes in a database. The estimated palm orientation was found to be correct across varying poses. The shape context method for hand shape classification correctly identified 20 of 30 test hand shapes; the remaining 10 were matched to different but very similar shapes.
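A minimal sketch of the plane-fitting step described above, assuming the hand region has already been segmented into an (N, 3) array of 3-D points (the function and array names are illustrative, not the authors' code):

```python
import numpy as np

def palm_orientation(hand_points: np.ndarray) -> np.ndarray:
    """Return the unit normal of the least-squares plane through the points."""
    centered = hand_points - hand_points.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]
    return normal / np.linalg.norm(normal)

# Example: points scattered near the plane z = 0 give a normal close to (0, 0, 1).
rng = np.random.default_rng(0)
pts = np.column_stack([rng.uniform(-1, 1, 500),
                       rng.uniform(-1, 1, 500),
                       rng.normal(0, 0.01, 500)])
print(palm_orientation(pts))
```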
{"title":"Detecting hand-palm orientation and hand shapes for sign language gesture recognition using 3D images","authors":"L. K. Phadtare, R. Kushalnagar, N. Cahill","doi":"10.1109/WNYIPW.2012.6466652","DOIUrl":"https://doi.org/10.1109/WNYIPW.2012.6466652","url":null,"abstract":"Automatic gesture recognition, specifically for the purpose of understanding sign language, can be an important aid in communicating with the deaf and hard-of-hearing. Recognition of sign languages requires understanding of various linguistic components such as palm orientation, hand shape, hand location and facial expression. We propose a method and system to estimate the palm orientation and the hand shape of a signer. Our system uses Microsoft Kinect to capture color and the depth images of a signer. It analyzes the depth data corresponding to the hand point region and fits plane to this data and defines the normal to this plane as the orientation of the palm. Then it uses 3-D shape context to determine the hand shape by comparing it to example shapes in the database. Palm orientation of the hand was found to be correct in varying poses. The shape context method for hand shape classification was found to identify 20 test hand shapes correctly and 10 shapes were matched to other but very similar shapes.","PeriodicalId":218110,"journal":{"name":"2012 Western New York Image Processing Workshop","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117139377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A study of the use of SIMD instructions for two image processing algorithms
E. Welch, D. Patru, E. Saber, K. Bengtson
Pub Date: 2012-11-01 | DOI: 10.1109/WNYIPW.2012.6466650
Most image processing algorithms are parallelizable, i.e., the calculation of one pixel does not depend on another. SIMD architectures, including Intel's WMMX and SSE and ARM's NEON, can exploit this fact by processing multiple pixels at a time, which can result in significant speedups. This study investigates the use of NEON SIMD instructions for two image processing algorithms. The algorithms are altered to process four pixels at a time, for which a theoretical speedup factor of four can be achieved. In addition, parts of the original implementation are replaced with inline functions or modified at the assembly level. Experimental benchmark data shows the actual execution speed to be two to three times faster than the original reference. These results demonstrate that SIMD instructions can significantly speed up image processing algorithms through proper code manipulation.
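The paper's NEON code is not reproduced here; as a hedged illustration of the same idea in Python, the snippet below contrasts a one-pixel-at-a-time loop with a vectorized NumPy version whose compiled kernels process many pixels per instruction (the function names and the brightness operation are invented for the example):

```python
import time

import numpy as np

img = np.random.randint(0, 256, (1080, 1920), dtype=np.uint8)

def brighten_scalar(im, delta):
    """One pixel per iteration (the 'reference implementation' style)."""
    out = im.astype(np.int16)
    for i in range(im.shape[0]):
        for j in range(im.shape[1]):
            out[i, j] = min(out[i, j] + delta, 255)
    return out.astype(np.uint8)

def brighten_vector(im, delta):
    """Whole arrays per operation; NumPy's kernels use SIMD internally."""
    return np.minimum(im.astype(np.int16) + delta, 255).astype(np.uint8)

small = img[:64, :64]  # keep the scalar demo quick
t0 = time.perf_counter(); a = brighten_scalar(small, 40); t1 = time.perf_counter()
t2 = time.perf_counter(); b = brighten_vector(small, 40); t3 = time.perf_counter()
assert np.array_equal(a, b)
print(f"scalar: {t1 - t0:.5f} s, vectorized: {t3 - t2:.5f} s")
```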
{"title":"A study of the use of SIMD instructions for two image processing algorithms","authors":"E. Welch, D. Patru, E. Saber, K. Bengtson","doi":"10.1109/WNYIPW.2012.6466650","DOIUrl":"https://doi.org/10.1109/WNYIPW.2012.6466650","url":null,"abstract":"Most image processing algorithms are parallelizable, i.e. the calculation of one pixel does not affect another one. SIMD architectures, including Intel's WMMX and SSE and ARM's NEON, can exploit this fact by processing multiple pixels at a time, which can result in significant speedups. This study investigates the use of NEON SIMD instructions for two image processing algorithms. The latter are altered to process four pixels at a time, for which a theoretical speedup factor of four can be achieved. In addition, parts of the original implementation have been replaced with inline functions or modified at assembly code level. Experimental benchmark data shows the actual execution speed to be between two to three times higher than the original reference. These results prove that SIMD instructions can significantly speedup image processing algorithms through proper code manipulations.","PeriodicalId":218110,"journal":{"name":"2012 Western New York Image Processing Workshop","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131338106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Video detection anomaly via low-rank and sparse decompositions
Lam Tran, C. Navasca, Jiebo Luo
Pub Date: 2012-11-01 | DOI: 10.1109/WNYIPW.2012.6466649
In this paper, we propose a method for anomaly detection in surveillance video in a tensor framework. We treat a video as a tensor and use stable PCA to decompose it into two tensors: a low-rank tensor consisting of the background pixels and a sparse tensor consisting of the foreground pixels. The sparse tensor is then analyzed to detect anomalies. The proposed method is a one-shot framework for determining which frames in a video are anomalous.
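As a hedged sketch of the low-rank plus sparse split, the code below implements the standard matrix form of robust PCA (principal component pursuit via inexact ALM, following Candes et al., 2011), with the video unfolded so that each column is one vectorized frame; the authors' tensor-based stable-PCA formulation may differ in detail:

```python
import numpy as np

def shrink(x, tau):
    """Soft-thresholding operator."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def rpca(M, n_iter=200, tol=1e-7):
    """Decompose M ~ L (low rank, background) + S (sparse, foreground)."""
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))
    mu = m * n / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(n_iter):
        # Singular-value thresholding step for the low-rank term.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = U @ np.diag(shrink(sig, 1.0 / mu)) @ Vt
        # Soft-thresholding step for the sparse term.
        S = shrink(M - L + Y / mu, lam / mu)
        Y += mu * (M - L - S)
        if np.linalg.norm(M - L - S) / (np.linalg.norm(M) + 1e-12) < tol:
            break
    return L, S

# Usage: frames of shape (T, H, W) -> matrix of shape (H*W, T).
# video = ...; M = video.reshape(video.shape[0], -1).T
# L, S = rpca(M)   # columns of S hold per-frame foreground residuals
```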
{"title":"Video detection anomaly via low-rank and sparse decompositions","authors":"Lam Tran, C. Navasca, Jiebo Luo","doi":"10.1109/WNYIPW.2012.6466649","DOIUrl":"https://doi.org/10.1109/WNYIPW.2012.6466649","url":null,"abstract":"In this paper, we purpose a method for anomaly detection in surveillance video in a tensor framework. We treat a video as a tensor and utilize a stable PCA to decompose it into two tensors, the first tensor is a low rank tensor that consists of background pixels and the second tensor is a sparse tensor that consists of the foreground pixels. The sparse tensor is then analyzed to detect anomaly. The proposed method is a one-shot framework to determine frames that are anomalous in a video.","PeriodicalId":218110,"journal":{"name":"2012 Western New York Image Processing Workshop","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124248192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lightweight, low-cost, side-mounted mobile eye tracking system
A. K. A. Hong, J. Pelz, J. Cockburn
Pub Date: 2012-11-01 | DOI: 10.1109/WNYIPW.2012.6466645
Commercial mobile eye tracking systems are readily available, but they are costly and complex. They have the additional disadvantage that the eye cameras are placed directly in the subject's field of view in order to obtain a clear frontal view of the eye. We propose a lightweight, low-cost, side-mounted mobile eye tracking system that uses side-view eye images to estimate the subject's gaze. Cameras are mounted on the side of the head, with curved mirrors splitting the captured frames into scene and eye images. A hybrid algorithm using both feature-based and appearance-based models is designed to accommodate this novel system. Image sequences consisting of 4339 frames from seven subjects were analyzed by the algorithm, resulting in a successful gaze estimation rate of 95.7%.
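The abstract does not detail the hybrid algorithm; as one illustrative feature-based component such a pipeline might include, this sketch estimates a pupil center in a hypothetical grayscale side-view eye image as the centroid of its darkest region (the threshold and minimum pixel count are guesses, not values from the paper):

```python
import numpy as np

def pupil_center(eye_gray: np.ndarray, thresh: int = 40):
    """Return the (row, col) centroid of dark pixels, or None if too few."""
    mask = eye_gray < thresh          # the pupil is the darkest region
    ys, xs = np.nonzero(mask)
    if len(ys) < 20:                  # too few pixels to trust the estimate
        return None
    return float(ys.mean()), float(xs.mean())
```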
{"title":"Lightweight, low-cost, side-mounted mobile eye tracking system","authors":"A. K. A. Hong, J. Pelz, J. Cockburn","doi":"10.1109/WNYIPW.2012.6466645","DOIUrl":"https://doi.org/10.1109/WNYIPW.2012.6466645","url":null,"abstract":"Commercial mobile eye tracking systems are readily available, but are costly and complex. They have an additional disadvantage in that the eye cameras are placed directly in the field of view of the subject in order to obtain a clear frontal view of the eye. We propose a lightweight, low-cost, side-mounted mobile eye tracking system that uses side-view eye images to estimate the gaze of the subject. Cameras are mounted on the side of the head using curved mirrors to split the captured frames into scene and eye images. A hybrid algorithm using both feature-based models and appearance-based models is designed to accommodate this novel system. Image sequences, consisting of 4339 frames from seven subjects are analyzed by the algorithm, resulting in a successful gaze estimation rate of 95.7%.","PeriodicalId":218110,"journal":{"name":"2012 Western New York Image Processing Workshop","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116804628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-view action classification using sparse representations on Motion History Images
S. Azary, A. Savakis
Pub Date: 2012-11-01 | DOI: 10.1109/WNYIPW.2012.6466646
Multi-view action classification is an important component of real-world applications such as automatic surveillance and sports analysis. Motion History Images capture the location and direction of motion in a scene, and sparse representations provide a compact encoding of high-dimensional signals. In this paper, we propose a multi-view action classification algorithm based on sparse representations of spatio-temporal actions encoded as Motion History Images. We find that this approach is effective for multi-view action classification, and experiments on the i3DPost Multi-view Dataset achieve high classification rates.
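For readers unfamiliar with Motion History Images, the classic Bobick-Davis update rule that underlies them can be written in a few lines; this is a generic sketch assuming per-frame binary motion masks (e.g. from frame differencing), not the authors' implementation:

```python
import numpy as np

def update_mhi(mhi: np.ndarray, motion_mask: np.ndarray, tau: float = 30.0):
    """Set moving pixels to tau; decay all other pixels by 1, floored at 0."""
    decayed = np.maximum(mhi - 1.0, 0.0)
    return np.where(motion_mask, tau, decayed)

# Usage over a clip: brighter pixels moved more recently, so the MHI encodes
# both where motion occurred and, via its intensity gradient, its direction.
# mhi = np.zeros((H, W))
# for mask in motion_masks:
#     mhi = update_mhi(mhi, mask)
```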
{"title":"Multi-view action classification using sparse representations on Motion History Images","authors":"S. Azary, A. Savakis","doi":"10.1109/WNYIPW.2012.6466646","DOIUrl":"https://doi.org/10.1109/WNYIPW.2012.6466646","url":null,"abstract":"Multi-view action classification is an important component of real world applications such as automatic surveillance and sports analysis. Motion History Images capture the location and direction of motion in a scene and sparse representations provide a compact representation of high dimensional signals. In this paper, we propose a multi-view action classification algorithm based on sparse representation of spatio-temporal action representations using motion history images. We find that this approach is effective at multi-view action classification and experiments with the i3DPost Multi-view Dataset achieve high classification rates.","PeriodicalId":218110,"journal":{"name":"2012 Western New York Image Processing Workshop","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131105302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efficient SMQT features for SNoW-based classification on face detection and character recognition tasks
Y. Artan, A. Burry, V. Kozitsky, P. Paul
Pub Date: 2012-11-01 | DOI: 10.1109/WNYIPW.2012.6466644
Face detection using local successive mean quantization transform (SMQT) features and the sparse network of winnows (SNoW) classifier has received interest in the computer vision community due to its success under varying illumination conditions. Recent work has also demonstrated the effectiveness of this classification technique for character recognition tasks. However, the heavy storage requirements of the SNoW classifier necessitate efficient techniques for reducing storage and computational costs. This study shows that a SNoW classifier built with only a limited number of distinguishing SMQT features provides performance comparable to the original dense SNoW classifier. Initial results on the well-known CMU-MIT facial image database and a private character database demonstrate the effectiveness of the proposed method.
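A compact sketch of the SMQT itself, which recursively splits a patch's pixels around the mean of the current subset and emits one output bit per level (illustrative code, not the paper's; the patch contents and level count are assumptions):

```python
import numpy as np

def smqt(patch: np.ndarray, levels: int = 3) -> np.ndarray:
    """Return an integer code per pixel with `levels` bits."""
    out = np.zeros(patch.shape, dtype=np.int32)

    def split(idx, level):
        if level == levels or idx[0].size == 0:
            return
        vals = patch[idx]
        above = vals > vals.mean()            # split around the subset mean
        out[idx] |= above.astype(np.int32) << (levels - 1 - level)
        split(tuple(a[above] for a in idx), level + 1)
        split(tuple(a[~above] for a in idx), level + 1)

    split(np.nonzero(np.ones_like(patch, dtype=bool)), 0)
    return out

# Example on a 4x4 patch: codes depend only on local rank structure,
# which is what makes SMQT features insensitive to illumination gain/bias.
print(smqt(np.arange(16).reshape(4, 4)))
```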
{"title":"Efficient SMQT features for snow-based classification on face detection and character recognition tasks","authors":"Y. Artan, A. Burry, V. Kozitsky, P. Paul","doi":"10.1109/WNYIPW.2012.6466644","DOIUrl":"https://doi.org/10.1109/WNYIPW.2012.6466644","url":null,"abstract":"Face detection using local successive mean quantization transform (SMQT) features and the sparse network of winnows (SNoW) classifier has received interest in the computer vision community due to its success under varying illumination conditions. Recent work has also demonstrated the effectiveness of this classification technique for character recognition tasks. However, heavy storage requirements of the SNoW classifier necessitate the development of efficient techniques to reduce storage and computational requirements. This study shows that the SNoW classifier built with only a limited number of distinguishing SMQT features provides comparable performance to the original dense snow classifier. Initial results using the well-known CMU-MIT facial image database and a private character database are used to demonstrate the effectiveness of the proposed method.","PeriodicalId":218110,"journal":{"name":"2012 Western New York Image Processing Workshop","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122101169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A multi-channel approach for segmentation of solar corona in images from the Solar Dynamics Observatory
S. Suresh, R. Dube
Pub Date: 2012-11-01 | DOI: 10.1109/WNYIPW.2012.6466653
We present a multi-channel segmentation scheme to identify different features of the solar corona, such as coronal holes, active regions, and the quiet sun, especially in ultraviolet and extreme-ultraviolet images. In contrast to common techniques, our approach uses both image intensity and the relative contribution of each wavelength. The approach is illustrated using images taken by the AIA telescopes onboard the SDO (Solar Dynamics Observatory) mission. The technique applies a nearest-neighbor classifier followed by a Moore-neighbor tracing algorithm to find boundaries and track the regions of interest. The method requires less computation time than the commonly used fuzzy-logic methods, and it is robust in the sense that it performs equally well in both the central and limb regions of the solar disc.
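A hedged sketch of the per-pixel feature construction and nearest-neighbor rule implied above: each pixel is described by its overall intensity plus each channel's relative contribution, then assigned the class of its nearest labeled training vector (the channel count and labeled seed pixels are assumed inputs, not the paper's):

```python
import numpy as np

def pixel_features(cube: np.ndarray) -> np.ndarray:
    """cube: (H, W, C) multi-wavelength stack -> (H*W, C+1) feature matrix."""
    flat = cube.reshape(-1, cube.shape[-1]).astype(float)
    total = flat.sum(axis=1, keepdims=True) + 1e-9
    relative = flat / total              # per-channel fractional contribution
    return np.hstack([total, relative])  # [intensity, fractions...]

def nn_classify(features, train_feats, train_labels):
    """Assign each pixel the label of its nearest training feature vector."""
    d = ((features[:, None, :] - train_feats[None, :, :]) ** 2).sum(axis=-1)
    return train_labels[d.argmin(axis=1)]
```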
{"title":"A multi-channel approach for segmentation of solar corona in images from the solar dynamics observatory","authors":"S. Suresh, R. Dube","doi":"10.1109/WNYIPW.2012.6466653","DOIUrl":"https://doi.org/10.1109/WNYIPW.2012.6466653","url":null,"abstract":"We present a multi-channel segmentation scheme to identify different features of the solar corona, such as coronal holes, active regions and the quiet sun (especially in the ultraviolet and extreme ultraviolet images). In contrast to common techniques, we use an approach that uses image intensity and relative contribution of each of the wavelengths. This approach is illustrated by using the images taken by the AIA telescopes onboard of the SDO mission. This technique incorporates a nearest-neighbor based classifier followed by Moore-neighbor tracing algorithm to find the boundaries and track the regions of interest. This method requires less computation time as compared to the commonly used fuzzy logic methods and is robust in the sense it performs equally well in both the central and limb regions of the solar disc.","PeriodicalId":218110,"journal":{"name":"2012 Western New York Image Processing Workshop","volume":"4 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116826289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Motion tracking for realtime, offline image stabilization with limited hardware
J. Sheaffer, M. Moore, M. Bobrov, A. Webster, M. Torres
Pub Date: 2012-11-01 | DOI: 10.1109/WNYIPW.2012.6466643
We present a work-in-progress system for tracking small objects with limited features to produce stabilized, successive images for frame-to-frame analysis in a specialized hand-held digital microscopy environment with limited resolution and processing capability and soft real-time requirements. Our system tracks dirt or imperfections as small as about 10 μm across the end face of a fiber optic cable under a moving camera. It must locate three distinct features and track each of them from frame to frame. The measured feature positions are used to calculate transformation matrices relative to a selected basis image, moving each image in the stack into the coordinate frame of the basis image so that focus stacking can be performed on the set of images. All of this must be completed in under a second on a low-power, hand-held device.
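Three point correspondences determine a unique affine transform, so the alignment step can be posed as a small linear solve; the sketch below (plain NumPy, not the authors' code) recovers the 2x3 matrix that maps tracked feature positions in the current frame onto their positions in the basis frame:

```python
import numpy as np

def affine_from_3pts(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """src, dst: (3, 2) point arrays -> 2x3 affine matrix mapping src to dst."""
    A = np.hstack([src, np.ones((3, 1))])   # homogeneous source points
    # Solve A @ M.T = dst for the 2x3 matrix M (one row per output coordinate).
    M = np.linalg.solve(A, dst).T
    return M

# Toy correspondences: each frame is then warped with its matrix into the
# basis frame's coordinates before focus stacking.
src = np.array([[10.0, 10.0], [200.0, 15.0], [20.0, 180.0]])
dst = np.array([[12.0, 11.0], [203.0, 17.0], [21.0, 183.0]])
print(affine_from_3pts(src, dst))
```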
{"title":"Motion tracking for realtime, offline image stabilization with limited hardware","authors":"J. Sheaffer, M. Moore, M. Bobrov, A. Webster, M. Torres","doi":"10.1109/WNYIPW.2012.6466643","DOIUrl":"https://doi.org/10.1109/WNYIPW.2012.6466643","url":null,"abstract":"We present a work-in-progress system for tracking small objects with limited features to produce stabilized, successive images for frame-to-frame analysis in a specialized, hand-held, digital microscopy environment with limited resolution and processing capability and soft real-time requirements. Our system is able to track dirt or imperfections as small as about 10 μm across the end face of a fiber optic cable under a moving camera. It must locate three distinct features and track each of them from frame to frame. The measured positions of the features are used to calculate transformation matrices relative to a selected basis image and move each image in the image stack into the coordinate frame of the basis image so that we can perform focus stacking on the set of images. All of this must be completed in under a second on a low-power, hand-held device.","PeriodicalId":218110,"journal":{"name":"2012 Western New York Image Processing Workshop","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127833651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Color is not a metric space: implications for pattern recognition, machine learning, and computer vision
Thomas B. Kinsman, M. Fairchild, J. Pelz
Pub Date: 2012-11-01 | DOI: 10.1109/WNYIPW.2012.6466642
Using a metric feature space for pattern recognition, data mining, and machine learning greatly simplifies the mathematics because distances are preserved under rotation and translation in feature space. A metric space also provides a “ruler”, an absolute measure of how different two feature vectors are. In the computer vision community, color can easily be mistreated as a metric distance. This paper serves as an introduction to why working in a non-metric space is a challenge, and details why color is not a valid Euclidean distance metric.
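A small numeric illustration of the claim, using the standard sRGB-to-CIELAB conversion (the specific color pairs are invented for the example): two pairs with identical Euclidean distance in RGB show clearly different perceptual (Lab) differences, so equal RGB "distances" do not correspond to equal color differences:

```python
import numpy as np

def srgb_to_lab(rgb):
    """Standard sRGB (0-255) -> CIELAB under a D65 white, written out explicitly."""
    rgb = np.asarray(rgb, dtype=float) / 255.0
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    M = np.array([[0.4124, 0.3576, 0.1805],
                  [0.2126, 0.7152, 0.0722],
                  [0.0193, 0.1192, 0.9505]])
    xyz = (M @ lin) / np.array([0.9505, 1.0, 1.089])   # normalize by the white point
    f = np.where(xyz > 0.008856, np.cbrt(xyz), 7.787 * xyz + 16.0 / 116.0)
    return np.array([116.0 * f[1] - 16.0,
                     500.0 * (f[0] - f[1]),
                     200.0 * (f[1] - f[2])])

def dist(a, b):
    return float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))

pair1 = ([0, 0, 20], [0, 0, 70])       # two dark blues
pair2 = ([0, 200, 0], [0, 250, 0])     # two bright greens
print(dist(*pair1), dist(*pair2))      # identical RGB distances: 50.0 and 50.0
print(dist(srgb_to_lab(pair1[0]), srgb_to_lab(pair1[1])),   # roughly 40 in Lab
      dist(srgb_to_lab(pair2[0]), srgb_to_lab(pair2[1])))   # roughly 24 in Lab
```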
{"title":"Color is not a metric space implications for pattern recognition, machine learning, and computer vision","authors":"Thomas B. Kinsman, M. Fairchild, J. Pelz","doi":"10.1109/WNYIPW.2012.6466642","DOIUrl":"https://doi.org/10.1109/WNYIPW.2012.6466642","url":null,"abstract":"Using a metric feature space for pattern recognition, data mining, and machine learning greatly simplifies the mathematics because distances are preserved under rotation and translation in feature space. A metric space also provides a “ruler”, or absolute measure of how different two feature vectors are. In the computer vision community color can easily be miss-treated as a metric distance. This paper serves as an introduction to why using a non-metric space is a challenge, and provides details of why color is not a valid Euclidean distance metric.","PeriodicalId":218110,"journal":{"name":"2012 Western New York Image Processing Workshop","volume":"207 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132053030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hyperspectral linear unmixing: Quantitative evaluation of novel target design and edge unmixing technique
D. S. Goldberg, J. Kerekes, K. Canham
Pub Date: 2012-11-01 | DOI: 10.1109/WNYIPW.2012.6466651
Remotely sensed hyperspectral images (HSI) have the potential to provide large amounts of information about a scene. HSI, in this context, are images of the Earth collected at a spatial resolution of 1 m to 30 m in dozens to hundreds of contiguous narrow spectral bands, so that each pixel is a vector of data. Spectral unmixing is one application that can exploit this large amount of information. Because a single pixel contains a mixture of material spectra, unmixing is used to retrieve each material's spectral profile and its fractional abundance in each pixel. Unmixing was applied to images acquired during an airborne hyperspectral collection at the Rochester Institute of Technology in 2010, with 1 m resolution and a 390 nm to 2450 nm spectral range. The goal of our experiment was to quantitatively evaluate unmixing results by introducing a novel unmixing target. In addition, a single-band edge unmixing technique is introduced, with preliminary experiments showing a mean unmixing fraction error of less than 10%. The results of these methods helped in the design of future collection experiments.
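A minimal linear-unmixing sketch for reference (not the paper's edge technique): each pixel spectrum is modeled as a nonnegative combination of endmember spectra, and per-pixel fractional abundances are recovered with nonnegative least squares; the endmember matrix below is a toy assumption:

```python
import numpy as np
from scipy.optimize import nnls

def unmix(pixel: np.ndarray, endmembers: np.ndarray) -> np.ndarray:
    """pixel: (B,) spectrum; endmembers: (B, K) matrix -> (K,) abundances."""
    ab, _ = nnls(endmembers, pixel)
    s = ab.sum()
    return ab / s if s > 0 else ab   # normalize to fractional abundances

# Two-band toy example: a pixel that is 30% material A and 70% material B.
E = np.array([[1.0, 0.2],
              [0.1, 0.9]])           # columns are endmember spectra
pix = 0.3 * E[:, 0] + 0.7 * E[:, 1]
print(unmix(pix, E))                 # approximately [0.3, 0.7]
```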