This paper proposes a novel scheme for face recognition that matches visible-light images against depth images. In the proposed technique, we adopt Partial Least Squares (PLS) to model the correlation mapping between the 2D and 3D modalities. A considerable performance improvement is observed compared to using Canonical Correlation Analysis (CCA). To further improve performance, a fusion scheme based on PLS and CCA is advocated. We evaluate the proposed approach on a popular face dataset, FRGC V2.0. Experimental results demonstrate that the proposed scheme is an effective approach to 2D-3D face recognition.
{"title":"A New Approach for 2D-3D Heterogeneous Face Recognition","authors":"Xiaolong Wang, V. Ly, G. Guo, C. Kambhamettu","doi":"10.1109/ISM.2013.58","DOIUrl":"https://doi.org/10.1109/ISM.2013.58","url":null,"abstract":"This paper proposes a novel scheme for face recognition from visible images to depth images. In our proposed technique, we adopt Partial Least Square (PLS) to handle correlation mapping between 2D to 3D. A considerable performance improvement is observed compared to using Canonical Correlation Analysis (CCA). To further improve the performance, a fusion scheme based on PLS and CCA is advocated. We evaluate the advocated approach on a popular face dataset-FRGCV2.0. Experimental results demonstrate that the proposed scheme is an effective approach to perform 2D-3D face recognition.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"13 1","pages":"301-304"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87826532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An eye blink is the quick action of closing and reopening the eyelids. Eye blink detection has a wide range of applications in human-computer interaction and human vision health care research. Existing approaches to eye blink detection are often ill-suited to resource-limited platforms such as Smart Glasses, which have a limited energy supply and typically cannot afford strong imaging and computational capabilities. In this paper, we present an efficient and robust eye blink detection method for Smart Glasses. Our method first employs an eigen-eye approach to detect closed eyes in individual video frames. It then learns eye blink patterns from the closed-eye detection results and detects eye blinks using a Gradient Boosting method. Finally, it uses a non-maximum suppression algorithm to remove repeated detections of the same eye-blink action across consecutive video frames. Experiments with our prototype smart glasses, equipped with a low-power camera and an embedded processor, show accurate detection (more than 96% accuracy) on video frames as small as 16 × 12 pixels at 96 fps, which enables a number of applications in health care, driving safety, and human-computer interaction.
{"title":"Eye Blink Detection for Smart Glasses","authors":"Hoang Le, Thanh Dang, Feng Liu","doi":"10.1109/ISM.2013.59","DOIUrl":"https://doi.org/10.1109/ISM.2013.59","url":null,"abstract":"Eye blink is a quick action of closing and opening of the eyelids. Eye blink detection has a wide range of applications in human computer interaction and human vision health care research. Existing approaches to eye blink detection often cannot suit well resource-limited eye blink detection platforms like Smart Glasses, which have limited energy supply and typically cannot afford strong imaging and computational capabilities. In this paper, we present an efficient and robust eye blink detection method for Smart Glasses. Our method first employs an eigen-eye approach to detect closing-eye in individual video frames. Our method then learns eye blink patterns based on the closing-eye detection results and detects eye blinks using a Gradient Boosting method. Our method further uses a non-maximum suppression algorithm to remove repeated detection of the same eye-blink action among consecutive video frames. Experiments with our prototyped smart glasses equipped with a low-power camera and an embedded processor show an accurate detection result (with more than 96% accuracy) on video frames of a small size of 16 × 12 at 96 fps, which enables a number of applications in health care, driving safety, and human computer interaction.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"29 1","pages":"305-308"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73580508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
User satisfaction is what service providers aim for in order to reduce churn, promote new services, and improve ARPU (Average Revenue Per User). In this work, a novel hybrid assessment technique is presented. It refines known mathematical models for quality assessment using both context information and subjective tests. The model is then enriched with new features such as content characteristics, device type, and network status, and compared to the state of the art. The effect of application parameters (startup time and buffering ratio) on user-perceived quality is also analyzed in this article.
{"title":"A Hybrid Contextual User Perception Model for Streamed Video Quality Assessment","authors":"M. Diallo, N. Maréchal, H. Afifi","doi":"10.1109/ISM.2013.104","DOIUrl":"https://doi.org/10.1109/ISM.2013.104","url":null,"abstract":"Users' satisfaction is the service providers' aim to reduce the churn, promote new services and improve ARPU (Average Revenue per User). In this work, a novel hybrid assessment technique is presented. It refines known mathematical models for quality assessment using both context information and subjectives tests. The model is then enriched with new features such as content characteristics, device type and network status, and compared to the state of the art. The effect of application parameters (startup time and buffering ratio) on user perceived quality is also analyzed in this article.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"65 1","pages":"518-519"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83225670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
There have been some innovative research studies on detecting the Most Popular Route (MPR) using GPS devices in order to support tourists who travel in an unfamiliar area. The MPR is the route along which the most moving objects travel among all possible routes. Current MPR detection methods do not take into account the time at which the trajectories were measured; however, road conditions vary depending on the time period of the day. Therefore, the detected MPR may not be an appropriate route outside the time period in which it was derived. The aim of this study is to propose a new method to detect the MPR that takes the time period of trajectory measurement into account. In addition to the new method, a "Popularity Measure" is proposed to verify the suitability of the detected MPR. The MPRs detected by the existing and the proposed methods are compared and evaluated from the viewpoint of this popularity measure.
{"title":"Detection of Most Popular Routes and Effective Time Segments Using Trajectory Distributions","authors":"Kazuma Ito, Hung-Hsuan Huang, K. Kawagoe","doi":"10.1109/ISM.2013.107","DOIUrl":"https://doi.org/10.1109/ISM.2013.107","url":null,"abstract":"There have been some innovative research studies for detecting the Most Popular Route (MRP) using GPS devices in order to support tourists who travel in an unfamiliar area. The MPR is a route on which many moving objects move the most among the entire possible routes. Current MRP detection methods do not take into account the time of trajectory measurement, however, road conditions vary depending on a time zone. Therefore, the detected MRP may not be an appropriate route that was defined outside of the certain time zone. The aim of this study is to propose a new method to detect the MRP which is capable of considering a time zone of trajectory measurement. In addition to the new method, \"Popularity Measure\" is proposed in order to verify the suitability of the detected MRP. The detected MRP using the existing and proposed method are evaluated by compared from a viewpoint of this popularity measures.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"18 1","pages":"530-531"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84789450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present a compression algorithm and a streaming protocol designed for streaming of computer-desktop graphics. The encoder has low memory requirements and can be broken into a large number of independent contexts with a high degree of data locality. The encoder also uses only simple arithmetic, which makes it amenable to hardware or highly parallel software implementation. The decoder is trivial and requires no memory, which makes it suitable for use on devices with limited computing capabilities. The streaming protocol runs over UDP and has its own unique error recovery mechanism specifically designed for interactive applications.
{"title":"A Simple Desktop Compression and Streaming System","authors":"I. Hadžić, Hans C. Woithe, Martin D. Carroll","doi":"10.1109/ISM.2013.65","DOIUrl":"https://doi.org/10.1109/ISM.2013.65","url":null,"abstract":"We present a compression algorithm and a streaming protocol designed for streaming of computer-desktop graphics. The encoder has low memory requirements and can be broken into a large number of independent contexts with a high degree of data locality. The encoder also uses only simple arithmetic, which makes it amenable to hardware or highly parallel software implementation. The decoder is trivial and requires no memory, which makes it suitable for use on devices with limited computing capabilities. The streaming protocol runs over UDP and has its own unique error recovery mechanism specifically designed for interactive applications.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"11 1","pages":"339-346"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86663587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Content-based image retrieval (CBIR) is the problem of retrieving images using an image-content query. This problem is investigated in many applications, such as human identification, embedding information into real-world objects, and life-logging. Research on CBIR has shown that local image features defined at image keypoints, such as SIFT, SURF, and LBP, are effective for fast and occlusion-robust image retrieval. In CBIR using local features, it is clear that not all features are necessary for retrieval: distinctive features have stronger discriminative power than commonly observed features, and some local features are fragile under observation distortions. This paper presents an importance measure, based on diverse density, that represents both the robustness and the distinctiveness of a local feature. According to this measure, we can reduce the number of local features indexed for each database entry. Experiments show that a database with reduced local-feature indices performs better than a database using all local features as indices.
{"title":"Keypoint Reduction for Smart Image Retrieval","authors":"K. Yuasa, T. Wada","doi":"10.1109/ISM.2013.67","DOIUrl":"https://doi.org/10.1109/ISM.2013.67","url":null,"abstract":"Content-based image retrieval (CBIR) is an image retrieval problem with image-content query. This problem is investigated in many applications, such as, human identification, information embedding to real-world objects, life-log, and so on. Through many researches on CBIR, local image features, such as SIFT, SURF, and LBP, defined on image key points are proved to be effective for fast and occlusion-robust image retrieval. In CBIR using local features, it is clear that not all features are necessary for image retrieval. That is, distinctive features have stronger discrimination power than commonly observed features. Also, some local features are fragile against observation distortions. This paper presents an importance measure representing both the robustness and the distinctiveness of a local feature based on diverse density. According to this measure, we can reduce the number of local features related to each database entry. Through some experiments, database having reduced local feature indices performs better than database using all local features as indices.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"52 1","pages":"351-358"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88952400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jan Vanvinkenroye, Christoph Grüninger, C. Heine, T. Richter
In a recent publication, we described the design and implementation of a web-based programming lab (ViPLab) targeted at undergraduate Engineering and Mathematics courses. We conducted a survey with one learning group using the web-based tools and a control group working with a traditional setup based on an editor and compiler. This work provides a quantitative analysis of the user feedback, experience, and learning success. The survey shows that the web-based installation is as efficient as the classical tools, while Windows users prefer the web-based tool chain over the editor/compiler installation on Linux. This justifies the use of web-based installations in beginner programming courses, provided the learning target is programming itself and not a particular tool chain.
{"title":"A Quantitative Analysis of a Virtual Programming Lab","authors":"Jan Vanvinkenroye, Christoph Grüninger, C. Heine, T. Richter","doi":"10.1109/ISM.2013.88","DOIUrl":"https://doi.org/10.1109/ISM.2013.88","url":null,"abstract":"We implemented a survey with one learning group using the web-based tools and a control group working with a traditional setup based on editor and compiler. In a recent publication, we described the design and implementation of a web-based programming lab (ViPLab) targeted at undergraduate Engineering and Mathematics courses. This work provides a quantitative analysis of the user feedback, experience and learning success. The survey shows that web-based installations are as efficient as classical tools, while Windows users prefer the web-based chain over the editor/compiler installation on Linux. This justifies the use of web-based installations in programming beginner courses, if the learning target focuses on programming and not a particular tool chain.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"61 1","pages":"457-461"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80731895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Visual matching for tracking and recognition, for example in video indexing, often uses image features measured at multiple resolutions. As a tracked object moves away from the camera, appearing progressively smaller, the higher resolutions consecutively become unavailable for matching, causing step changes in the similarity or “match score” of the tracked object. If several candidate matches (hypotheses) are maintained for a tracked region, this effect causes a bias toward larger region hypotheses that match at one extra resolution relative to even slightly smaller hypotheses. The effect is subtle and appears intermittent because it occurs only around a specific discrete set of object sizes. We describe the problem and the class of visual matching methods that it affects, and propose a solution. We present experimental results from a real video indexing system to illustrate both the problem and the effectiveness of the proposed solution.
{"title":"Resolution Control for Size Bias Elimination in Multi-resolution Visual Matching","authors":"S. Clippingdale","doi":"10.1109/ISM.2013.87","DOIUrl":"https://doi.org/10.1109/ISM.2013.87","url":null,"abstract":"Visual matching for tracking and recognition, for example in video indexing, often uses image features measured at multiple resolutions. As a tracked object moves away from the camera, appearing progressively smaller, the higher resolutions consecutively become unavailable for matching, causing step changes in the similarity or “match score” of the tracked object. If several candidate matches (hypotheses) are maintained for a tracked region, this effect causes a bias toward larger region hypotheses that match at one extra resolution relative to even slightly smaller hypotheses. The effect is subtle and appears intermittent because it occurs only around a specific discrete set of object sizes. We describe the problem and the class of visual matching methods that it affects, and propose a solution. We present experimental results from a real video indexing system to illustrate both the problem and the effectiveness of the proposed solution.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"76 1","pages":"451-456"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83857076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Despite the recent success of extensive co-segmentation studies, existing methods still have limitations in accommodating multiple-foreground, large-scale, high-variability image sets, as well as in their suitability for parallel implementation. To address this, this paper proposes a flexible method, governed by bi-harmonic distance, for the robust and coherent segmentation of overlapping/similar content co-existing in an image group, which is independent of supervised learning and any other user-specified prior. The central idea is the novel integration of bi-harmonic distance metric design and multi-level deformable graph generation for multi-level clustering, which gives rise to a host of unique advantages: accommodating multiple-foreground images, respecting both the local structures and the global semantics of images, being more robust and accurate, and being convenient for parallel acceleration. The pipeline of our method involves intrinsic content-coherence measurement, superpixel-assisted bottom-up clustering, and cross-image optimization based on multi-level deformable graph clustering. We conduct extensive experiments on the iCoseg benchmark and the Oxford flower datasets, and make comprehensive evaluations to demonstrate the superiority of our method via comparison with state-of-the-art methods on the MSRC database.
{"title":"Unsupervised Co-segmentation of Complex Image Set via Bi-harmonic Distance Governed Multi-level Deformable Graph Clustering","authors":"Jizhou Ma, Shuai Li, A. Hao, Hong Qin","doi":"10.1109/ISM.2013.16","DOIUrl":"https://doi.org/10.1109/ISM.2013.16","url":null,"abstract":"Despite the recent success of extensive co-segmentation studies, they still suffer from limitations in accommodating multiple-foreground, large-scale, high-variability image set, as well as their underlying capability for parallel implementation. To improve, this paper proposes a bi-harmonic distance governed flexible method for the robust coherent segmentation of the overlapping/similar contents co-existing in image group, which is independent of supervised learning and any other user-specified prior. The central idea is the novel integration of bi-harmonic distance metric design and multi-level deformable graph generation for multi-level clustering, which gives rise to a host of unique advantages: accommodating multiple-foreground images, respecting both local structures and global semantics of images, being more robust and accurate, and being convenient for parallel acceleration. Critical pipeline of our method involves intrinsic content-coherent measuring, super-pixel assisted bottom-up clustering, and multi-level deformable graph clustering based cross-image optimization. We conduct extensive experiments on the iCoseg benchmark and Oxford flower datasets, and make comprehensive evaluations to demonstrate the superiority of our method via comparison with state-of-the-art methods collected in the MSRC database.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"50 1","pages":"38-45"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83976155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tamara Seybold, Christian Keimel, Marion Knopp, W. Stechele
The development and tuning of denoising algorithms is usually based on readily processed test images that are artificially degraded with additive white Gaussian noise (AWGN). While AWGN allows us to easily generate test data in a repeatable manner, it does not reflect the noise characteristics of a real digital camera. Realistic camera noise is signal-dependent and spatially correlated due to the demosaicking step required to obtain full-color images. Hence, its characteristics are fundamentally different from AWGN. Using such unrealistic data to test, optimize, and compare denoising algorithms may lead to incorrect parameter tuning or suboptimal choices in research on denoising algorithms. In this paper, we therefore propose an approach to evaluate denoising algorithms with respect to realistic camera noise: we describe a new camera noise model that includes the full processing chain of a single-sensor camera. We determine the visual quality of noisy and denoised test sequences using a subjective test with 18 participants. We show that the noise characteristics have a significant effect on visual quality. Quality metrics, which are required to compare denoising results, are applied, and we evaluate the performance of 10 full-reference metrics and one no-reference metric on our realistic test data. We conclude that a more realistic noise model should be used in future research to improve the quality estimation of digital images and videos and to improve research on denoising algorithms.
{"title":"Towards an Evaluation of Denoising Algorithms with Respect to Realistic Camera Noise","authors":"Tamara Seybold, Christian Keimel, Marion Knopp, W. Stechele","doi":"10.1109/ISM.2013.39","DOIUrl":"https://doi.org/10.1109/ISM.2013.39","url":null,"abstract":"The development and tuning of denoising algorithms is usually based on readily processed test images that are artificially degraded with additive white Gaussian noise (AWGN). While AWGN allows us to easily generate test data in a repeatable manner, it does not reflect the noise characteristics in a real digital camera. Realistic camera noise is signal-dependent and spatially correlated due to the demosaicking step required to obtain full-color images. Hence, the noise characteristic is fundamentally different from AWGN. Using such unrealistic data to test, optimize and compare denoising algorithms may lead to incorrect parameter tuning or sub optimal choices in research on denoising algorithms. In this paper, we therefore propose an approach to evaluate denoising algorithms with respect to realistic camera noise: we describe a new camera noise model that includes the full processing chain of a single sensor camera. We determine the visual quality of noisy and denoised test sequences using a subjective test with 18 participants. We show that the noise characteristics have a significant effect on visual quality. Quality metrics, which are required to compare denoising results, are applied, and we evaluate the performance of 10 full-reference metrics and one no-reference metric with our realistic test data. We conclude that a more realistic noise model should be used in future research to improve the quality estimation of digital images and videos and to improve the research on denoising algorithms.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"29 1","pages":"203-210"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83582587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}