"A Simple Desktop Compression and Streaming System," I. Hadžić, Hans C. Woithe, Martin D. Carroll. DOI: 10.1109/ISM.2013.65, pp. 339-346.

We present a compression algorithm and a streaming protocol designed for streaming of computer-desktop graphics. The encoder has low memory requirements and can be broken into a large number of independent contexts with a high degree of data locality. The encoder also uses only simple arithmetic, which makes it amenable to hardware or highly parallel software implementation. The decoder is trivial and requires no memory, which makes it suitable for use on devices with limited computing capabilities. The streaming protocol runs over UDP and has its own error-recovery mechanism specifically designed for interactive applications.
"Speeded-Up Video Summarization Based on Local Features," Javier Iparraguirre, C. Delrieux. DOI: 10.1109/ISM.2013.70, pp. 370-373.

Digital video has become a very popular medium in many contexts, with an ever-expanding range of applications and uses, and the amount of available video data is growing almost without limit. For this reason, video summarization continues to attract a wide spectrum of research efforts. In this work we present a novel video summarization technique based on tracking local features across consecutive frames. Our approach operates in the uncompressed domain and requires only a small window of consecutive frames, so it can process the video stream directly and produce results on the fly. We tested our implementation on standard datasets and compared the results with the most recent published work in the field. The results show that our proposal produces summaries of similar quality to the best published approaches, with the additional advantage of processing the stream directly in the uncompressed domain.
"A Hybrid Contextual User Perception Model for Streamed Video Quality Assessment," M. Diallo, N. Maréchal, H. Afifi. DOI: 10.1109/ISM.2013.104, pp. 518-519.

User satisfaction is central to service providers' efforts to reduce churn, promote new services, and improve ARPU (Average Revenue Per User). In this work, a novel hybrid assessment technique is presented. It refines known mathematical models for quality assessment using both context information and subjective tests. The model is then enriched with new features, such as content characteristics, device type, and network status, and compared to the state of the art. The effect of application parameters (startup time and buffering ratio) on user-perceived quality is also analyzed.
"A New Approach for 2D-3D Heterogeneous Face Recognition," Xiaolong Wang, V. Ly, G. Guo, C. Kambhamettu. DOI: 10.1109/ISM.2013.58, pp. 301-304.

This paper proposes a novel scheme for face recognition from visible images to depth images. In our technique, we adopt Partial Least Squares (PLS) to learn the correlation mapping between the 2D and 3D modalities, and we observe a considerable performance improvement over Canonical Correlation Analysis (CCA). To further improve performance, a fusion scheme based on PLS and CCA is advocated. We evaluate the approach on a popular face dataset, FRGC V2.0. Experimental results demonstrate that the proposed scheme is an effective approach to 2D-3D face recognition.
"Detection of Most Popular Routes and Effective Time Segments Using Trajectory Distributions," Kazuma Ito, Hung-Hsuan Huang, K. Kawagoe. DOI: 10.1109/ISM.2013.107, pp. 530-531.

There have been several innovative studies on detecting the Most Popular Route (MPR) from GPS data in order to support tourists traveling in an unfamiliar area. The MPR is the route, among all possible routes, along which the most moving objects travel. Current MPR detection methods do not take the time of trajectory measurement into account; however, road conditions vary with the time segment of the day, so a detected MPR may not be an appropriate route outside the time segment in which it was defined. The aim of this study is to propose a new MPR detection method that considers the time segment of trajectory measurement. In addition, a "Popularity Measure" is proposed to verify the suitability of the detected MPR. The MPRs detected by the existing method and the proposed method are evaluated and compared from the viewpoint of this popularity measure.
"Keypoint Reduction for Smart Image Retrieval," K. Yuasa, T. Wada. DOI: 10.1109/ISM.2013.67, pp. 351-358.

Content-based image retrieval (CBIR) is the problem of retrieving images using an image-content query. It arises in many applications, such as human identification, embedding information in real-world objects, and life-logging. Much CBIR research has shown that local image features defined at image keypoints, such as SIFT, SURF, and LBP, are effective for fast, occlusion-robust image retrieval. In CBIR using local features, not all features are necessary for retrieval: distinctive features have stronger discriminative power than commonly observed ones, and some local features are fragile to observation distortions. This paper presents an importance measure, based on diverse density, that represents both the robustness and the distinctiveness of a local feature. Using this measure, we can reduce the number of local features indexed for each database entry. Experiments show that a database with reduced local-feature indices performs better than one that indexes all local features.
"A Cross-Stack Predictive Control Framework for Multimedia Applications," Guangyi Cao, A. Ravindran, S. Kamalasadan, B. Joshi, A. Mukherjee. DOI: 10.1109/ISM.2013.77, pp. 403-404.

We demonstrate a novel cross-stack, control-theoretic approach to designing a predictive controller that automatically tracks changes in the multimedia workload in order to maintain a desired application-quality metric while minimizing power consumption.
"Low Complexity Video Encoding and High Complexity Decoding for UAV Reconnaissance and Surveillance," Malavika Bhaskaranand, J. Gibson. DOI: 10.1109/ISM.2013.34, pp. 163-170.

Conventional video compression schemes such as H.264/AVC use a high-complexity encoder with block motion estimation (ME) and a low-complexity, low-latency decoder. However, unmanned aerial vehicle (UAV) reconnaissance and surveillance applications require low-complexity encoders but can accommodate high-complexity decoders. Moreover, the video sequences in these applications often exhibit primarily global motion, due to the known movement of the UAV and camera mounts. Motivated by this scenario, we propose and investigate a low-complexity encoder with global-motion-based frame prediction and no block ME. For fly-over videos, our encoder achieves more than a 40% bit-rate savings over an H.264 encoder with ME block size restricted to 8 × 8, and at lower complexity. We also develop a high-complexity decoder based on Kalman filtering along motion trajectories and show average PSNR improvements of up to 0.5 dB with respect to a classic low-complexity decoder.
"Improving Computational Efficiency of 3D Point Cloud Reconstruction from Image Sequences," Chih-Hsiang Chang, N. Kehtarnavaz. DOI: 10.1109/ISM.2013.101, pp. 510-513.

Levenberg-Marquardt optimization, which is computationally expensive, is normally used in 3D point-cloud reconstruction from image sequences. This paper presents a two-stage camera-pose estimation approach in which an initial camera pose is obtained in the first stage and refined in the second stage. The approach requires neither Levenberg-Marquardt optimization nor LU matrix decomposition for computing the projection matrix, thus providing more computationally efficient 3D point-cloud reconstruction than existing approaches. Results obtained on real video sequences indicate that the introduced approach yields lower re-projection errors as well as faster 3D point-cloud reconstruction.
"Nested Event Model for Multimedia Narratives," Ricardo Rios M. do Carmo, L. Soares, M. Casanova. DOI: 10.1109/ISM.2013.26, pp. 106-113.

The proliferation of multimedia narratives has contributed to what is known as the "crisis of choice", which demands much more active participation on the part of the user who consumes multimedia content. To address this issue, one strategy is to offer users efficient search mechanisms, sometimes based on ontologies. However, one may argue that such mechanisms are often based on abstractions that do not adequately capture the essential aspects of multimedia narratives. This paper proposes a conceptual model for specifying multimedia narratives that overcomes this limitation. The model is based on the notion of event and is therefore called the Nested Event Model (NEMo). The paper also includes a complete example to illustrate the use of the model.