This paper presents a model for automatic summarization of videos recorded by wearable cameras. The proposed model detects various user activities by computing the transform of matching image features among video frames. Four basic types of user activities are proposed, including "moving closer/farther", "panning", "making a turn", and "rotation". Different summarization techniques are provided for different activity types, and a wearable video sequence can be summarized as a compact set of panoramic images. The user activity analysis is based solely on image analysis, without resorting to information from other sensors. Experimental results on a 19-minute video sequence demonstrate the effectiveness of the proposed model.
{"title":"Summarization of Wearable Videos Based on User Activity Analysis","authors":"R. Katpelly, Tiecheng Liu, Chin-Tser Huang","doi":"10.1109/ISM.2007.16","DOIUrl":"https://doi.org/10.1109/ISM.2007.16","url":null,"abstract":"This paper presents a model for automatic summarization of videos recorded by wearable cameras. The proposed model detects various user activities by computing the transform of matching image features among video frames. Four basic types of user activities are proposed, including \"moving closer/farther\", \"panning\", \"making a turn\", and \"rotation\". Different summarization techniques are provided for different activity types, and a wearable video sequence can be summarized as a compact set of panoramic images. The user activity analysis is based solely on image analysis, without resorting to information from other sensors. Experimental results on a 19-minute video sequence demonstrate the effectiveness of the proposed model.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115733482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
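The activity detection above rests on estimating an inter-frame transform from matched image features. As an illustrative sketch only (the function names, thresholds, and the choice of a similarity transform are assumptions, not the authors' implementation), a least-squares transform fitted to point correspondences already separates zoom-like, rotational, and panning camera motion:

```python
import numpy as np

def estimate_similarity(src, dst):
    """Least-squares similarity transform dst ~ s*R(theta)*src + t, solved in
    closed form via the complex representation z' = a*z + b."""
    zs = src[:, 0] + 1j * src[:, 1]
    zd = dst[:, 0] + 1j * dst[:, 1]
    zs_c, zd_c = zs - zs.mean(), zd - zd.mean()
    a = np.vdot(zs_c, zd_c) / np.vdot(zs_c, zs_c)  # sum(conj(zs)*zd)/sum(|zs|^2)
    t = zd.mean() - a * zs.mean()
    return np.abs(a), np.angle(a), np.array([t.real, t.imag])

def classify_motion(scale, theta, trans, frame_diag,
                    s_thr=0.05, r_thr=np.radians(5.0)):
    """Map the dominant transform component to an activity type.
    All thresholds here are invented for the example."""
    if abs(np.log(scale)) > s_thr:
        return "moving closer/farther"      # zoom-like scale change
    if abs(theta) > r_thr:
        return "rotation"                   # in-plane camera roll
    if np.hypot(*trans) > 0.01 * frame_diag:
        return "panning"                    # dominant translation
    return "stationary"
```

A pure translation of the matched features yields scale 1 and angle 0, so only the translation test fires, which is exactly the "panning" case.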
With the rapid advances in mobile communication technologies, QR codes captured by embedded camera devices have been used as a new input interface. However, previous works on extracting a QR code from an image do not consider a non-uniform background. In this paper, we implement applications of QR codes and propose an efficient algorithm to extract a QR code from a non-uniform background. In contrast with prior works, our approach achieves higher accuracy in QR-code recognition and is more practical for use in a mobile information environment.
{"title":"A General Scheme for Extracting QR Code from a Non-uniform Background in Camera Phones and Applications","authors":"Yu-Hsuan Chang, Chung-Hua Chu, Ming-Syan Chen","doi":"10.1109/ISM.2007.26","DOIUrl":"https://doi.org/10.1109/ISM.2007.26","url":null,"abstract":"With the rapid advances in mobile communication technologies, QR codes captured by embedded camera devices have been used as a new input interface. However, previous works on extracting a QR code from an image do not consider a non-uniform background. In this paper, we implement applications of QR codes and propose an efficient algorithm to extract a QR code from a non-uniform background. In contrast with prior works, our approach achieves higher accuracy in QR-code recognition and is more practical for use in a mobile information environment.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121373170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
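The non-uniform-background problem is that no single global threshold separates dark QR modules from a background whose brightness varies across the image. A standard remedy (a sketch of the general idea, not this paper's specific algorithm) is local adaptive binarization against the mean of a neighborhood, computed in O(1) per pixel with an integral image:

```python
import numpy as np

def adaptive_threshold(gray, win=15, offset=5):
    """Mark a pixel as 'dark' (a candidate QR module) if it lies `offset`
    below the mean of its win x win neighborhood (win must be odd).
    Robust to smooth illumination gradients that defeat a global threshold."""
    h, w = gray.shape
    pad = win // 2
    g = np.pad(gray.astype(np.float64), pad, mode='edge')
    # integral image with a leading zero row/column for window sums
    ii = np.pad(g.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))
    s = (ii[win:, win:] - ii[:-win, win:]
         - ii[win:, :-win] + ii[:-win, :-win])   # window sums, shape (h, w)
    local_mean = s / (win * win)
    return gray < local_mean - offset            # True = dark module
```

On a synthetic illumination ramp, a dark patch on the bright side is detected while the naturally dark side is left alone, which is the behavior a global threshold cannot deliver.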
Hang Liu, Mingquan Wu, Dekai Li, S. Mathur, K. Ramaswamy, Liqiao Han, D. Raychaudhuri
We report the implementation experience and experimental evaluation of a staggered adaptive forward error correction (FEC) system for video multicast over wireless LANs. In the system, the parity packets generated by a cross-packet FEC code are transmitted at a time delay from the original video packets, i.e., staggercasting the video stream and the FEC stream in different multicast groups. The delay provides temporal diversity to improve the robustness of video multicast, in particular enabling clients to correct burst packet loss using FEC and to achieve seamless handoff. A wireless client dynamically joins the FEC multicast groups based upon its channel conditions and handoff events. We have implemented the system, including the streaming server and the client proxy. A novel software architecture is designed to integrate the FEC functionality in the clients without requiring changes to the existing video player software. We conduct extensive experiments to investigate the impact of the FEC overhead and of the delay between the video stream and the FEC stream on the video quality under different interference levels and mobile handoff durations. The efficacy of the staggered adaptive FEC system in improving video multicast quality is demonstrated in a real system implementation.
{"title":"A Staggered FEC System for Seamless Handoff in Wireless LANs: Implementation Experience and Experimental Study","authors":"Hang Liu, Mingquan Wu, Dekai Li, S. Mathur, K. Ramaswamy, Liqiao Han, D. Raychaudhuri","doi":"10.1109/ISM.2007.29","DOIUrl":"https://doi.org/10.1109/ISM.2007.29","url":null,"abstract":"We report the implementation experience and experimental evaluation of a staggered adaptive forward error correction (FEC) system for video multicast over wireless LANs. In the system, the parity packets generated by a cross-packet FEC code are transmitted at a time delay from the original video packets, i.e., staggercasting the video stream and the FEC stream in different multicast groups. The delay provides temporal diversity to improve the robustness of video multicast, in particular enabling clients to correct burst packet loss using FEC and to achieve seamless handoff. A wireless client dynamically joins the FEC multicast groups based upon its channel conditions and handoff events. We have implemented the system, including the streaming server and the client proxy. A novel software architecture is designed to integrate the FEC functionality in the clients without requiring changes to the existing video player software. We conduct extensive experiments to investigate the impact of the FEC overhead and of the delay between the video stream and the FEC stream on the video quality under different interference levels and mobile handoff durations. The efficacy of the staggered adaptive FEC system in improving video multicast quality is demonstrated in a real system implementation.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127159368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
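The simplest instance of a cross-packet FEC code is a single XOR parity packet over a block of k source packets; the paper's system uses a stronger code and an adaptive join policy, so the sketch below (all names invented) shows only the recovery principle. Staggercasting then just means this parity packet is transmitted some delay d after its block, on a separate multicast group, so it survives an outage that wiped out the originals:

```python
from functools import reduce

def xor_bytes(a, b):
    """XOR two equal-size packets byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(block):
    """One XOR parity packet over a block of k equal-size source packets:
    any ONE lost packet in the block can be rebuilt from the survivors."""
    return reduce(xor_bytes, block)

def recover(received, parity):
    """received: the block with exactly one entry replaced by None (lost).
    XOR of all survivors plus the parity reproduces the missing packet."""
    lost = received.index(None)
    survivors = [p for p in received if p is not None]
    out = list(received)
    out[lost] = reduce(xor_bytes, survivors + [parity])
    return out
```

A client that missed one packet of a block during handoff can wait for the delayed parity packet and rebuild the loss without any retransmission.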
This paper demonstrates the design of a fault-tolerant ubiquitous multimodal multimedia (MM) computing system. Our system selects the appropriate media and modalities based on the user's context and profile. The requirements analysis is undertaken by considering the quality attributes desired by different stakeholders. We use the Attribute-Driven Design method in the requirements analysis and the Architecture Tradeoff Analysis Method in evaluating the system architecture. We conduct all tests and explain the results through a stochastic colored Petri net specification, citing the pre- and postconditions of each scenario.
{"title":"Analysis of a New Ubiquitous Multimodal Multimedia Computing System","authors":"A. Ramdane-Cherif, M. D. Hina, C. Tadj, N. Lévy","doi":"10.1109/ISM.2007.45","DOIUrl":"https://doi.org/10.1109/ISM.2007.45","url":null,"abstract":"This paper demonstrates the design of a fault-tolerant ubiquitous multimodal multimedia (MM) computing system. Our system selects the appropriate media and modalities based on the user's context and profile. The requirements analysis is undertaken by considering the quality attributes desired by different stakeholders. We use the Attribute-Driven Design method in the requirements analysis and the Architecture Tradeoff Analysis Method in evaluating the system architecture. We conduct all tests and explain the results through a stochastic colored Petri net specification, citing the pre- and postconditions of each scenario.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115458193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In video databases, each video contains temporal and spatial relationships between content objects. One of the well-known video indexing strategies is the 3D C-string strategy. However, it cannot handle the case where an object appears and then disappears more than once. To solve this problem, in this paper, we propose an indexing strategy called the temporal UID matrix. Based on the original 13 spatial relationships proposed by the 2D C-string and our three new spatial relationships, we can derive the temporal relationships from the sequence of spatial relationships. Therefore, although our proposed strategy builds an index only for spatial relationships, it can still answer all three kinds of video queries, i.e., spatial, temporal, and spatio-temporal queries. Our simulation study shows that the proposed strategy is more efficient for video searching than the 3D C-string strategy.
{"title":"A Temporal UID Matrix Strategy for Indexing Video Databases","authors":"Ye-In Chang, Wei-Horng Yeh, Jiun-Rung Chen, Youcheng Chen","doi":"10.1109/ISM.2007.9","DOIUrl":"https://doi.org/10.1109/ISM.2007.9","url":null,"abstract":"In video databases, each video contains temporal and spatial relationships between content objects. One of the well-known video indexing strategies is the 3D C-string strategy. However, it cannot handle the case where an object appears and then disappears more than once. To solve this problem, in this paper, we propose an indexing strategy called the temporal UID matrix. Based on the original 13 spatial relationships proposed by the 2D C-string and our three new spatial relationships, we can derive the temporal relationships from the sequence of spatial relationships. Therefore, although our proposed strategy builds an index only for spatial relationships, it can still answer all three kinds of video queries, i.e., spatial, temporal, and spatio-temporal queries. Our simulation study shows that the proposed strategy is more efficient for video searching than the 3D C-string strategy.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120955197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
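The temporal UID matrix itself is too involved for a short sketch, but the key capability it adds over 3D C-string — an object that appears, disappears, and re-appears — reduces to recovering a *list* of appearance intervals from the per-frame presence sequence, after which coarse temporal relations between objects can be derived per interval pair. The helper names below are illustrative, not from the paper:

```python
def appearance_intervals(present):
    """Half-open frame intervals during which an object is visible.
    Handles objects that disappear and re-appear (the case a single
    start/end pair, as in 3D C-string, cannot represent)."""
    intervals, start = [], None
    for t, p in enumerate(present):
        if p and start is None:
            start = t                       # object just appeared
        elif not p and start is not None:
            intervals.append((start, t))    # object just disappeared
            start = None
    if start is not None:
        intervals.append((start, len(present)))
    return intervals

def temporal_relation(i1, i2):
    """Coarse temporal relation between two half-open frame intervals."""
    (s1, e1), (s2, e2) = i1, i2
    if e1 <= s2:
        return "before"
    if e2 <= s1:
        return "after"
    return "overlaps"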
Chen-Yu Chen, Jia-Ching Wang, Jhing-Fa Wang, Y. Hu
An event-based segmentation method for sports videos is presented. A motion entropy criterion is employed to characterize the level of intensity of relevant object motion in individual frames of a video sequence. The resulting motion entropy curve is then approximated with a piecewise linear model using a time-series change-point detection algorithm based on a homoscedastic error model. It is observed that interesting sports events are correlated with specific patterns of the piecewise linear model. A set of classification rules is then derived empirically from these observations. By applying these rules to the motion entropy curve, one is able to segment the corresponding video sequence into individual sections, each consisting of a semantically relevant event. The proposed method is tested on six hours of sports videos, including basketball, soccer, and tennis. Excellent experimental results are observed.
{"title":"Event-Based Segmentation of Sports Video Using Motion Entropy","authors":"Chen-Yu Chen, Jia-Ching Wang, Jhing-Fa Wang, Y. Hu","doi":"10.1109/ISM.2007.17","DOIUrl":"https://doi.org/10.1109/ISM.2007.17","url":null,"abstract":"An event-based segmentation method for sports videos is presented. A motion entropy criterion is employed to characterize the level of intensity of relevant object motion in individual frames of a video sequence. The resulting motion entropy curve is then approximated with a piecewise linear model using a time-series change-point detection algorithm based on a homoscedastic error model. It is observed that interesting sports events are correlated with specific patterns of the piecewise linear model. A set of classification rules is then derived empirically from these observations. By applying these rules to the motion entropy curve, one is able to segment the corresponding video sequence into individual sections, each consisting of a semantically relevant event. The proposed method is tested on six hours of sports videos, including basketball, soccer, and tennis. Excellent experimental results are observed.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124504648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
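The abstract does not give the exact entropy formula; one plausible and common realization (an assumption, not the paper's definition) is the Shannon entropy of the motion-vector orientation histogram of a frame — coherent motion such as a camera pan concentrates in few bins (low entropy), while chaotic multi-object motion near a goalmouth spreads across bins (high entropy):

```python
import numpy as np

def motion_entropy(mvs, n_bins=8):
    """Shannon entropy (bits) of the orientation histogram of one frame's
    motion vectors.  mvs: (N, 2) array of (dx, dy) motion vectors."""
    ang = np.arctan2(mvs[:, 1], mvs[:, 0])            # direction of each MV
    hist, _ = np.histogram(ang, bins=n_bins, range=(-np.pi, np.pi))
    p = hist / hist.sum()
    p = p[p > 0]                                       # 0*log(0) := 0
    return float(-(p * np.log2(p)).sum())
```

Plotting this value over time gives the motion entropy curve that the change-point detector then approximates piecewise linearly.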
J. Tao, A. Shahbahrami, B. Juurlink, R. Buchty, Wolfgang Karl, S. Vassiliadis
The 2D DWT consists of two 1D DWTs applied in both directions: horizontal filtering processes the rows, followed by vertical filtering, which processes the columns. It is well known that a straightforward implementation of the vertical filtering shows quite different performance for various working-set sizes. The only reasonable explanation for this is the access behavior of the cache memory. As is known, vertical filtering causes mapping conflicts in the cache when the working-set size is a power of two. However, it is not clear how this conflict forms and whether cache problems exist for other data sizes. Such knowledge is the basis for efficient code optimization. In order to acquire this knowledge and to identify optimization potential more accurately, we apply a cache visualization tool to examine the runtime cache activities of the vertical implementation. We find that, besides mapping conflicts, vertical filtering also shows a large number of capacity misses. More specifically, the visualization tool allows us to determine the parameters of the optimization strategies, which guarantees their feasibility. Our initial experimental results on several different architectures show a gain of up to 215% in execution time compared to an already optimized baseline implementation.
{"title":"Optimizing Cache Performance of the Discrete Wavelet Transform Using a Visualization Tool","authors":"J. Tao, A. Shahbahrami, B. Juurlink, R. Buchty, Wolfgang Karl, S. Vassiliadis","doi":"10.1109/ISM.2007.12","DOIUrl":"https://doi.org/10.1109/ISM.2007.12","url":null,"abstract":"The 2D DWT consists of two 1D DWTs applied in both directions: horizontal filtering processes the rows, followed by vertical filtering, which processes the columns. It is well known that a straightforward implementation of the vertical filtering shows quite different performance for various working-set sizes. The only reasonable explanation for this is the access behavior of the cache memory. As is known, vertical filtering causes mapping conflicts in the cache when the working-set size is a power of two. However, it is not clear how this conflict forms and whether cache problems exist for other data sizes. Such knowledge is the basis for efficient code optimization. In order to acquire this knowledge and to identify optimization potential more accurately, we apply a cache visualization tool to examine the runtime cache activities of the vertical implementation. We find that, besides mapping conflicts, vertical filtering also shows a large number of capacity misses. More specifically, the visualization tool allows us to determine the parameters of the optimization strategies, which guarantees their feasibility. Our initial experimental results on several different architectures show a gain of up to 215% in execution time compared to an already optimized baseline implementation.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129966630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
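The power-of-two mapping conflict can be made concrete with set-index arithmetic. The model below uses a deliberately tiny 4 KiB direct-mapped cache (64-byte lines, 64 sets) so the numbers stay readable — an assumption for illustration, not the caches measured in the paper; real L1 caches show the same pattern at larger sizes. Walking one column of a row-major image whose row size is a power of two lands every access in the same cache set, while padding each row by one cache line spreads the column across all sets:

```python
def cache_set(addr, line=64, n_sets=64):
    """Set index of a byte address in a direct-mapped cache with
    line-size `line` bytes and `n_sets` sets (4 KiB total here)."""
    return (addr // line) % n_sets

def column_sets(n_rows, row_bytes, col_byte=0, line=64, n_sets=64):
    """Distinct cache sets touched when vertical filtering walks one
    column of a row-major 2D array with rows of `row_bytes` bytes."""
    return {cache_set(r * row_bytes + col_byte, line, n_sets)
            for r in range(n_rows)}
```

With one set receiving every access, each row evicts the line the previous row loaded — exactly the conflict-miss storm the visualization tool exposes; the padded layout is the classic fix.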
Huai-Che Lee, C. Chang, Jui-Shiang Chao, Wei-Te Lin
We present a hybrid motion response model for presenting dynamic character behavior during character interactions in computer games. The method seamlessly integrates motion-captured data with physics-based character simulation. The result preserves the desired character motion styles and allows physically realistic responsive behaviors for lifelike character presentation. Based on studies in human balance control and responsive behavior, we designed a mechanism to generate character balancing behaviors for a common computer fighting-game scenario. The results demonstrate the capabilities of this mechanism as a set of building blocks for presenting more complex responsive behaviors in computer games.
{"title":"Realistic Character Motion Response in Computer Fighting Game","authors":"Huai-Che Lee, C. Chang, Jui-Shiang Chao, Wei-Te Lin","doi":"10.1109/ISM.2007.30","DOIUrl":"https://doi.org/10.1109/ISM.2007.30","url":null,"abstract":"We present a hybrid motion response model for presenting dynamic character behavior during character interactions in computer games. The method seamlessly integrates motion-captured data with physics-based character simulation. The result preserves the desired character motion styles and allows physically realistic responsive behaviors for lifelike character presentation. Based on studies in human balance control and responsive behavior, we designed a mechanism to generate character balancing behaviors for a common computer fighting-game scenario. The results demonstrate the capabilities of this mechanism as a set of building blocks for presenting more complex responsive behaviors in computer games.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132812467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As webcams become an important factor in the PC environment, many camera-based communication techniques have been developed. Among them, gesture-based communication is attracting attention. In this paper, we propose a real-time interactive shadow avatar (RISA) which can express facial emotions that change in response to the user's gestures. The avatar's shape is a virtual shadow constructed from a real-time sampled picture of the user's shape. Several predefined facial animations are overlaid on the face area of the virtual shadow, according to the type of hand gesture. We use the background subtraction method to separate the virtual shadow, and a simplified region-based tracking method is adopted for tracking hand positions and detecting hand gestures. In order to achieve a smooth change of emotions, we use a refined morphing method which uses many more frames than traditional dynamic emoticons. Through our experiments, we found that the accuracy was higher when there was enough distance between the camera and the user than when they were very close. We have found RISA to be very useful in simple online chatting and PC game environments, and it was also highlighted in a real media art exhibition.
{"title":"RISA: A Real-Time Interactive Shadow Avatar","authors":"Yangmi Lim, Jinsu Kim, Jinseok Chae","doi":"10.1109/ISM.2007.47","DOIUrl":"https://doi.org/10.1109/ISM.2007.47","url":null,"abstract":"As webcams become an important factor in the PC environment, many camera-based communication techniques have been developed. Among them, gesture-based communication is attracting attention. In this paper, we propose a real-time interactive shadow avatar (RISA) which can express facial emotions that change in response to the user's gestures. The avatar's shape is a virtual shadow constructed from a real-time sampled picture of the user's shape. Several predefined facial animations are overlaid on the face area of the virtual shadow, according to the type of hand gesture. We use the background subtraction method to separate the virtual shadow, and a simplified region-based tracking method is adopted for tracking hand positions and detecting hand gestures. In order to achieve a smooth change of emotions, we use a refined morphing method which uses many more frames than traditional dynamic emoticons. Through our experiments, we found that the accuracy was higher when there was enough distance between the camera and the user than when they were very close. We have found RISA to be very useful in simple online chatting and PC game environments, and it was also highlighted in a real media art exhibition.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"118 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126264060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
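RISA's two image-processing ingredients — background subtraction to cut out the virtual shadow, and region-based tracking of hand positions — can be sketched in a few lines (a minimal grayscale version with invented names and thresholds, not the system's implementation):

```python
import numpy as np

def shadow_mask(frame, background, thr=30):
    """Foreground mask (the 'virtual shadow'): pixels whose absolute
    difference from a pre-captured background frame exceeds thr.
    Both inputs are grayscale uint8 arrays of the same shape."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > thr

def region_centroid(mask):
    """Centroid (x, y) of a foreground region — the simplest region-based
    tracker for a hand blob; returns None if the region is empty."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return (float(xs.mean()), float(ys.mean()))
```

Tracking the centroid of the mask restricted to a hand region over time gives the hand trajectory from which gestures are detected.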
Yang-Ta Kao, T. Shih, Hsing-Ying Zhong, Liang-Kuang Dai
Aged motion pictures may contain different types of defects, such as spikes and scratch lines. Several detection mechanisms have been proposed to find scratches. However, it is hard to restore aged films precisely if the continuity of image properties in the temporal domain is not considered. This paper proposes a technique for restoring aged films based on simultaneously considering the outermost patches/pixels that are going to be inpainted. The inpainting order plays an important role in human visual perception. By eliminating the inpainting-order problem, the restored films have good quality compared to earlier approaches. Our technique can be applied to the restoration of aged films.
{"title":"Scratch Line Removal on Aged Films","authors":"Yang-Ta Kao, T. Shih, Hsing-Ying Zhong, Liang-Kuang Dai","doi":"10.1109/ISM.2007.20","DOIUrl":"https://doi.org/10.1109/ISM.2007.20","url":null,"abstract":"Aged motion pictures may contain different types of defects, such as spikes and scratch lines. Several detection mechanisms have been proposed to find scratches. However, it is hard to restore aged films precisely if the continuity of image properties in the temporal domain is not considered. This paper proposes a technique for restoring aged films based on simultaneously considering the outermost patches/pixels that are going to be inpainted. The inpainting order plays an important role in human visual perception. By eliminating the inpainting-order problem, the restored films have good quality compared to earlier approaches. Our technique can be applied to the restoration of aged films.","PeriodicalId":129680,"journal":{"name":"Ninth IEEE International Symposium on Multimedia (ISM 2007)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116664696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
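The paper's contribution is the order-free spatial inpainting for scratch lines; for the *spike* defects it also mentions, the classic temporal remedy illustrates why temporal continuity matters: a spike exists in one frame only, so the temporal median of adjacent frames supplies the true value (a baseline sketch, not the paper's method — scratch lines persist across frames and need spatial inpainting instead):

```python
import numpy as np

def remove_spikes(frames, thr=40):
    """Replace pixels that deviate strongly from the temporal median of a
    3-frame window.  Spikes (dust, blotches) appear in a single frame, so
    the neighbouring frames provide the uncorrupted value.
    frames: (T, H, W) uint8 array; first and last frames are left as-is."""
    out = frames.copy()
    for t in range(1, len(frames) - 1):
        med = np.median(frames[t - 1:t + 2], axis=0)
        spike = np.abs(frames[t].astype(np.int16) - med) > thr
        out[t][spike] = med[spike]
    return out
```

The detection threshold trades missed spikes against blurring genuine fast motion, which is exactly where temporal-only methods stop and spatial inpainting like the paper's takes over.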