Construction of multimedia map database using urban city images
Pub Date : 2001-08-22  DOI: 10.1109/ICME.2001.1237930
Yu Han, Haomin Jin, M. Sakauchi
In this paper, we present a multimedia map database system. The system is mainly employed for the acquisition, storage, and retrieval of real-world urban images with viewpoint information. Here, the images and the viewpoint information are assumed to be collected from mobile multimedia terminal users. Using the viewpoint information in the database system, an "Extended Visual Map" is proposed for presenting the images according to the actual viewpoint of any user; a position information service is also implemented that retrieves map information from building images. To realize these objectives, the features of the regions containing a building in every real-world urban image are extracted and compared automatically by the proposed system.
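The abstract does not specify which region features are compared; as a rough illustration of the extract-and-compare step it describes, the sketch below matches building regions by color-histogram intersection. The feature choice and all function names are assumptions for illustration, not the authors' method.

```python
import numpy as np

def region_histogram(region: np.ndarray, bins: int = 8) -> np.ndarray:
    """Normalized joint RGB histogram of a building region (H x W x 3, uint8)."""
    hist, _ = np.histogramdd(region.reshape(-1, 3).astype(float),
                             bins=(bins,) * 3, range=((0, 256),) * 3)
    return hist.ravel() / hist.sum()

def histogram_intersection(h1: np.ndarray, h2: np.ndarray) -> float:
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return float(np.minimum(h1, h2).sum())

def best_match(query_region: np.ndarray, database_regions: list) -> int:
    """Index of the stored building region most similar to the query region."""
    q = region_histogram(query_region)
    scores = [histogram_intersection(q, region_histogram(r))
              for r in database_regions]
    return int(np.argmax(scores))
```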
{"title":"Construction of multimedia map database using urban city images","authors":"Yu Han, Haomin Jin, M. Sakauchi","doi":"10.1109/ICME.2001.1237930","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237930","url":null,"abstract":"In this paper, we present a multimedia map database system. The system is mainly employed for acquisition, storage and retrieval of the real-world urban images with viewpoint information. Here, the images and the viewpoint information are assumed to be collected from the mobile multimedia terminal users. By using the viewpoint information in the database system, an \"Extended Visual Map\" is proposed for presenting the images according to the actual viewpoint of any user; also a position information service is implemented to retrieval the map information by building images. To realize these objectives, the features of the regions containing a building in every real-world urban image are extracted and compared automatically by the proposed system.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127024159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial arrangement of color flows for video retrieval
Pub Date : 2001-08-22  DOI: 10.1109/ICME.2001.10006
A. Bimbo, E. Vicario, P. Pala
Retrieval of video based on high-level content requires models that map low-level perceptual features into high-level semantic concepts. Commercials are a video category in which the link between low-level perceptual features and high-level semantics is stressed, since the way colors are chosen and modified throughout a spot creates a large part of the message. In this paper, we propose a model for the representation and comparison of video content based on the spatial arrangement of color flows. A model for representing and comparing spatial relationships between extended sets of pixels in a 3D space is introduced by building on the concept of weighted walkthroughs. Results of preliminary experiments are reported for a library of video commercials.
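For readers unfamiliar with the underlying formalism: weighted walkthroughs characterize the directional relationship between two pixel sets by the fraction of pixel pairs falling in each relative quadrant, or octant in the 3D (x, y, time) volume used here. The brute-force toy below illustrates that idea only; it ignores the axis-aligned cases and the efficient computation handled in the actual model.

```python
import numpy as np
from itertools import product

def weighted_walkthroughs_3d(A: np.ndarray, B: np.ndarray) -> dict:
    """Directional weights between two pixel sets in (x, y, t).

    A and B are integer coordinate arrays of shape (n, 3) and (m, 3).
    Returns a dict mapping each octant sign pattern, e.g. (1, -1, 1),
    to the fraction of pairs (a, b) with b in that octant relative to a.
    Pairs aligned on any axis are simply dropped in this toy version.
    """
    weights = {s: 0 for s in product((-1, 1), repeat=3)}
    total = 0
    for a in A:
        d = B - a                      # displacement of every b relative to a
        d = d[np.all(d != 0, axis=1)]  # drop axis-aligned pairs
        for s in np.sign(d).astype(int):
            weights[tuple(s)] += 1
            total += 1
    return {s: w / total for s, w in weights.items()} if total else weights
```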
{"title":"Spatial arrangement of color flows for video retrieval","authors":"A. Bimbo, E. Vicario, P. Pala","doi":"10.1109/ICME.2001.10006","DOIUrl":"https://doi.org/10.1109/ICME.2001.10006","url":null,"abstract":"Retrieval of video based on high-level content requires that models are developed to map low level perceptual features into high level semantic concepts. Commercials are a video category where the link between low level perceptual features and high level semantics is stressed, since the way colors are chosen and modified throughout a spot create a large part of the message. In this paper, we propose a model for representation and comparison of video content based on the spatial arrangement of color flows. A model for representation and comparison of spatial relationships between extended sets of pixels in a 3D space is introduced by developingon the concept of weighted walkthroughs. Results of preliminary experiments are reported for a library of video commercials.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125279661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dual auto-regressive modelling approach to Gaussian process identification
Pub Date : 2001-08-22  DOI: 10.1109/ICME.2001.1237906
Yiu-ming Cheung
By modelling sources as a multivariate auto-regressive (AR) process, we have recently presented a dual AR modelling approach for identifying temporal sources in independent component analysis (ICA) (Cheung et al. 2000, Cheung and Xu 1999 & 2001). However, the algorithms we have proposed so far for this approach are only suitable for the case in which the residual term of the AR source process is non-Gaussian white noise. In this paper, we further study the Gaussian case, for which a maximum-likelihood based algorithm is presented and experimentally demonstrated.
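For intuition on the Gaussian case: when an AR(p) source x_t = a_1 x_{t-1} + ... + a_p x_{t-p} + e_t has a Gaussian white residual e_t, maximizing the conditional likelihood over the coefficients reduces to ordinary least squares. The sketch below fits a single AR source this way; it illustrates the modelling assumption only and is not the paper's dual-AR ICA algorithm.

```python
import numpy as np

def fit_ar_ml(x: np.ndarray, p: int):
    """Conditional maximum-likelihood (= least-squares) fit of an AR(p) model.

    Returns the coefficients a_1..a_p and the residual variance.
    """
    # Row t of the design matrix holds the p most recent past samples.
    X = np.column_stack([x[p - k:-k] for k in range(1, p + 1)])
    y = x[p:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a, float((y - X @ a).var())

# Example: recover the coefficients of a known AR(2) process.
rng = np.random.default_rng(0)
x = np.zeros(5000)
for t in range(2, len(x)):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.normal()
a_hat, var_hat = fit_ar_ml(x, p=2)
print(a_hat)  # approximately [0.6, -0.3]
```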
{"title":"Dual auto-regressive modelling approach to Gaussian process identification","authors":"Yiu-ming Cheung","doi":"10.1109/ICME.2001.1237906","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237906","url":null,"abstract":"By modelling sources as a multivariate auto-regressive (AR) process, we have recently presented a dual AR modelling approach to identify temporal sources in independent component analysis (ICA) (Cheung et al. 2000, Cheung and Xu 1999 & 2001). However, our proposed existing algorithms for this approach are only suitable for the case that the residual term of the AR source process is non-Gaussian white noise. In this paper, we further study the Gaussian case, whereby a maximum-likelihood based algorithm is presented and experimentally demonstrated.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"216 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122994487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Classification of raw material sports videos for broadcasting using color and edge features
Pub Date : 2001-08-22  DOI: 10.1109/ICME.2001.1237940
M. Mukunoki, M. Bertini, J. Assfalg, A. Bimbo
The authors discuss a method for classifying raw material sports videos for broadcasting. Because raw material sports videos are often unedited, knowledge about edited videos cannot be applied to them. The authors use color and edge features and evaluate whether sports videos can be classified with those features. A "player" class and an "audience" class, apart from the individual sport classes, are also introduced to improve the classification results.
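The abstract does not detail the exact descriptors, so the sketch below pairs a coarse color histogram with an edge-orientation histogram and assigns a frame to the nearest class centroid. Both the feature layout and the centroid classifier are illustrative assumptions, not the authors' evaluated configuration.

```python
import numpy as np

def frame_features(frame: np.ndarray, color_bins: int = 4, edge_bins: int = 8):
    """Concatenated color and edge-orientation histograms for one RGB frame."""
    color_hist, _ = np.histogramdd(frame.reshape(-1, 3).astype(float),
                                   bins=(color_bins,) * 3, range=((0, 256),) * 3)
    gray = frame.mean(axis=2)
    gy, gx = np.gradient(gray)                      # simple edge gradients
    edge_hist, _ = np.histogram(np.arctan2(gy, gx), bins=edge_bins,
                                range=(-np.pi, np.pi), weights=np.hypot(gx, gy))
    f = np.concatenate([color_hist.ravel(), edge_hist])
    return f / np.linalg.norm(f)

def classify(frame: np.ndarray, class_centroids: dict) -> str:
    """Pick the class ('soccer', 'player', 'audience', ...) whose mean
    feature vector is nearest to the frame's features."""
    f = frame_features(frame)
    return min(class_centroids,
               key=lambda c: np.linalg.norm(f - class_centroids[c]))
```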
{"title":"Classification of raw material sports videos for broadcasting using color and edge features","authors":"M. Mukunoki, M. Bertini, J. Assfalg, A. Bimbo","doi":"10.1109/ICME.2001.1237940","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237940","url":null,"abstract":"The authors discuss the method to classify raw material sports videos for broadcasting. Because the raw material sports videos sometimes do not get edited, one cannot use the knowledge on edited videos. The authors use the color and edge features and evaluate whether one can classify the sports videos with those features. Also introduced is the \"player\" and \"audience\" class - apart from each sport class - to improve the classification results.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123739951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TVMS: the visual multicast system
Pub Date : 2001-08-22  DOI: 10.1109/ICME.2001.1237788
H. Kung
The Internet explosion is driving the need for multiple-stream multimedia presentation systems, which provide multiple users with smooth multimedia services such as movie distribution and telemedicine. These kinds of visual systems are composed of multiple media streams, such as video, audio, text, and image streams, whose data volume is large. Therefore, it is a real challenge to achieve smooth multiple-stream multimedia presentations over the Internet, which usually provides insufficient network bandwidth. In this paper, we propose and develop a multiple-stream multimedia middleware named TVMS (The Visual Multicast System). TVMS (i) provides a flexible authoring tool that allows users to author a multiple-stream multimedia presentation in a multicast environment and (ii) achieves smooth multimedia presentations with a temporal control mechanism. This paper describes the major considerations and techniques involved in the design and implementation of TVMS. System developers can incorporate TVMS to develop multiple-stream multimedia presentation systems in a multicast environment.
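The abstract names but does not describe the temporal control mechanism. A common way to realize one is timestamp-driven playout: every media unit is released against a shared presentation clock so that independently arriving streams stay mutually synchronized. The sketch below illustrates that generic idea and is not TVMS's actual design.

```python
import heapq
import time

def playout(units):
    """Release media units from several streams in presentation-time order.

    `units` is an iterable of (presentation_time_s, stream_id, payload)
    tuples, possibly supplied out of order across streams.
    """
    queue = list(units)
    heapq.heapify(queue)                  # order by presentation timestamp
    start = time.monotonic()
    while queue:
        ts, stream_id, payload = heapq.heappop(queue)
        delay = ts - (time.monotonic() - start)
        if delay > 0:                     # wait until this unit is due
            time.sleep(delay)
        print(f"t={ts:5.2f}s  render {stream_id}: {payload}")

playout([(0.0, "video", "frame 0"), (0.0, "audio", "chunk 0"),
         (0.5, "text", "caption 1"), (1.0, "video", "frame 30")])
```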
A new requantization method for MPEG-1 to MPEG-4 transcoder
Pub Date : 2001-08-22  DOI: 10.1109/ICME.2001.1237643
Seong-cheol Heo, Kwang-deok Seo, Kyu-Chan Roh, Jae-Kyoon Kim
In this paper, we propose a new requantization method for a DCT-domain transcoder from MPEG-1 to MPEG-4. For rate control of the heterogeneous transcoder, we use requantization. Conventional requantization methods for homogeneous transcoders cannot be used directly for the heterogeneous transcoder because of the mismatch in quantization parameters between the MPEG-1 and MPEG-4 syntaxes and the difference in compression efficiency between MPEG-1 and MPEG-4. To solve these problems, we propose a new requantization method for the MPEG-1 to MPEG-4 transcoder. It consists of the conventional R-Q model with a simple feedback loop and an adjustment of quantization parameters so that the output rate of MPEG-4 follows the target rate of MPEG-1. Transcoding simulations show that the proposed requantization generates an output rate much closer to the target rate than conventional requantization and yields superior image quality.
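To make the requantization step concrete: a DCT coefficient quantized with step Q1 by the MPEG-1 coder is reconstructed and quantized again with a step Q2 chosen by the rate controller, and a feedback loop nudges Q2 whenever the produced bits drift from the target. The sketch below uses plain uniform quantizers and a crude rate proxy; it is a generic illustration, not the paper's R-Q model or the MPEG quantizer matrices.

```python
import numpy as np

def requantize(levels: np.ndarray, q1: float, q2: float) -> np.ndarray:
    """Map DCT quantization levels from step q1 to step q2."""
    coeffs = levels * q1                      # reconstruct coefficient values
    return np.round(coeffs / q2).astype(int)  # quantize again with the new step

def transcode_blocks(blocks, q1, q2, target_bits_per_block, gain=0.1):
    """Requantize blocks while feedback steers q2 toward the target rate.

    The bit cost of a block is crudely approximated by its number of
    nonzero levels; a real transcoder would consult its R-Q model here.
    """
    out = []
    for levels in blocks:
        new_levels = requantize(levels, q1, q2)
        bits = np.count_nonzero(new_levels)
        q2 *= 1.0 + gain * (bits - target_bits_per_block) / target_bits_per_block
        out.append(new_levels)
    return out, q2
```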
{"title":"A new requantization method for MPEG-1 to MPEG-4 transcoder","authors":"Seong-cheol Heo, Kwang-deok Seo, Kyu-Chan Roh, Jae-Kyoon Kim","doi":"10.1109/ICME.2001.1237643","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237643","url":null,"abstract":"In this paper, we propose a new requantization method for the transcoder from MPEG-1 to MPEG-4 in DCT-domain. For rate control of heterogeneous transcoder, we use the requantization method. Conventional requantization methods for the homogeneous transcoder cannot be used directly for the heterogeneous transcoder due to the mismatch in quantization parameters between MPEG-1 and MPEG-4 syntax and the difference of compression efficiency between MPEG-1 and MPEG-4. In order to solve these problems, we propose a new requantization method for MPEG-1 to MPEG-4 transcoder. It consists of the conventional R-Q model with a simple feedback and an adjustment of quantization parameters with which the output rate of MPEG-4 follows the target rate of MPEG-1. By a simulation of transcoding, it is shown that the proposed requantization can generate output rate much closer to the target rate than the conventional requantization and it gives the superior image quality to the conventional requantization.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130103763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Transparent and robust audio watermarking with a new echo embedding technique
Pub Date : 2001-08-22  DOI: 10.1109/ICME.2001.1237720
H. Oh, Hyun Wook Kim, J. Seok, Jinwoo Hong, D. Youn
A new echo embedding technique for audio watermarking is proposed. The proposed method makes it possible to embed large-energy echoes without deteriorating the host audio quality, so that the watermark is robust to common signal processing modifications and resistant to tampering. Practical issues such as parameter tuning, as well as further schemes for robust detection under attacks, are also described. Subjective and objective evaluations confirmed that the proposed method improves robustness without perceptible distortion.
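For background on echo embedding in general (the abstract does not disclose the authors' specific scheme): a bit is hidden by adding a faint delayed copy of the signal, y[n] = x[n] + a*x[n-d], with the delay d chosen from two values to encode 0 or 1, and detection looks for the echo's peak in the real cepstrum at the two candidate delays. The sketch below implements this textbook baseline only.

```python
import numpy as np

def embed_bit(x: np.ndarray, bit: int, d0=100, d1=150, alpha=0.3) -> np.ndarray:
    """Add an echo at delay d0 (bit 0) or d1 (bit 1) samples."""
    d = d1 if bit else d0
    y = x.copy()
    y[d:] += alpha * x[:-d]
    return y

def detect_bit(y: np.ndarray, d0=100, d1=150) -> int:
    """Decide the bit from the real-cepstrum peaks at the candidate delays."""
    cepstrum = np.fft.irfft(np.log(np.abs(np.fft.rfft(y)) + 1e-12))
    return int(cepstrum[d1] > cepstrum[d0])

rng = np.random.default_rng(1)
host = rng.normal(size=8192)              # stand-in for one audio frame
assert detect_bit(embed_bit(host, 0)) == 0
assert detect_bit(embed_bit(host, 1)) == 1
```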
{"title":"Transparent and robust audio watermarking with a new echo embedding technique","authors":"H. Oh, Hyun Wook Kim, J. Seok, Jinwoo Hong, D. Youn","doi":"10.1109/ICME.2001.1237720","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237720","url":null,"abstract":"A new echo embedding technique in audio watermarking is proposed. The proposed method enables one to embed large energy echoes while the host audio quality is not deteriorated, so that it is robust to common signal processing modifications and resistant to tampering. Practical issues such as parameter tuning and further schemes for robust detection under attacks are also described. Subjective and objective evaluations confirmed that the proposed method could improve the robustness without perceptible distortion.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124562926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summarization of news speech with unknown topic boundary
Pub Date : 2001-08-22  DOI: 10.1109/ICME.2001.1237795
S. Takao, T. Haru, Y. Ariki
TV viewers want to grasp the contents of a news program in a short time, given the increasing number of news channels. Conventional summarization methods based on extracting the important sentences from each topic in the news speech are insufficient, because the important sentences cannot always be extracted from each topic when the topic boundaries are unknown. To solve this problem, we propose a method for summarizing TV news programs by segmenting the news speech into topics and then extracting the important sentence from each topic.
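The abstract leaves the segmentation and extraction criteria unspecified. A common baseline is to cut a topic boundary wherever lexical cohesion between adjacent sentences drops (in the spirit of TextTiling) and then keep the sentence with the highest term-frequency mass from each segment; the sketch below follows that assumed baseline on transcribed sentences.

```python
from collections import Counter
import math

def cosine(c1: Counter, c2: Counter) -> float:
    dot = sum(c1[w] * c2[w] for w in c1)
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def summarize(sentences, boundary_threshold: float = 0.1):
    """One sentence per topic, with topic boundaries found on the fly."""
    bags = [Counter(s.lower().split()) for s in sentences]
    # Cut a topic boundary wherever cohesion between neighbors drops.
    segments, current = [], [0]
    for i in range(1, len(sentences)):
        if cosine(bags[i - 1], bags[i]) < boundary_threshold:
            segments.append(current)
            current = []
        current.append(i)
    segments.append(current)
    # From each segment, keep the sentence with the highest term overlap.
    summary = []
    for seg in segments:
        topic = Counter()
        for i in seg:
            topic.update(bags[i])
        summary.append(max(seg, key=lambda i: sum(topic[w] for w in bags[i])))
    return [sentences[i] for i in summary]
```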
{"title":"Summarization of news speech with unknown topic boundary","authors":"S. Takao, T. Haru, Y. Ariki","doi":"10.1109/ICME.2001.1237795","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237795","url":null,"abstract":"TV viewers want to grasp the contents of the news program in a short time due to the increasing number of news channels. Conventional summarization methods based on extraction of the important sentences from each topic included in the news speech is insufficient because the important sentences can not always be extracted from each topic due to unknown topic boundary. To solve this problem, in this paper, we propose a summarization method of TV news program by segmenting the news speech into topics and then extracting the important sentence from each topic.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"192 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121181865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The role of streaming in interactive multimedia documents dissemination
Pub Date : 2001-08-22  DOI: 10.1109/ICME.2001.10003
R. Pitkänen, M. Vazirgiannis, George C. Polyzos
Multimedia applications require the handling of continuous media and support for a variety of media objects with their temporal and spatial relationships. The paper focuses on the importance of streaming for the quality of service of multimedia document playout in distributed environments. We describe two Java-based client-server systems for WWW-enabled delivery of Interactive Multimedia Documents (IMDs) supporting a high level of interaction and distribution of scenario and media. We performed a series of tests on the client-server systems on both local area networks and wide area networks. The framework uses the RTP/RTCP protocols for delivery of continuous streams over the network. The experiments show and quantify the positive effects of streaming on the quality of IMD presentation.
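For readers unfamiliar with RTP: each media chunk travels with a fixed 12-byte header carrying a sequence number and a media timestamp, which is what lets a receiver reorder packets and schedule playout. The sketch below packs that header as laid out in RFC 3550; it is background illustration, not code from the systems described in the paper.

```python
import struct

def rtp_packet(payload: bytes, seq: int, timestamp: int, ssrc: int,
               payload_type: int = 96, marker: bool = False) -> bytes:
    """Prepend the fixed 12-byte RTP header (RFC 3550) to a media payload."""
    vpxcc = 2 << 6                # version 2, no padding/extension, zero CSRCs
    m_pt = (int(marker) << 7) | payload_type
    header = struct.pack("!BBHII", vpxcc, m_pt, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload

# 40 ms of video on the 90 kHz RTP clock advances the timestamp by 3600 ticks.
pkt = rtp_packet(b"frame-data", seq=1, timestamp=3600, ssrc=0x1234ABCD)
assert len(pkt) == 12 + len(b"frame-data")
```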
{"title":"The role of streaming in interactive multimedia documents dissemination","authors":"R. Pitkänen, M. Vazirgiannis, George C. Polyzos","doi":"10.1109/ICME.2001.10003","DOIUrl":"https://doi.org/10.1109/ICME.2001.10003","url":null,"abstract":"Multimedia applications require handling of continuous media and support of a variety of media objects with their temporal and spatial relationships. The paper focuses on the importance of streaming usage for the quality of service for multimedia document playout in distributed environments. We describe two Java-based client-server systems for WWW-enabled delivery of Interactive Multimedia Documents (IMDs) supporting a high level of interaction and distribution of scenario and media. We performed series of tests on client-server systems on both Local Area Networks and on Wide Area Networks. The framework has used RTP/RTCP protocols for delivery of continuous streams over the network. The experiments show and quantify the positive effects of streaming on the quality of IMD presentation.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116263752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The best of two worlds: merging virtual and real for face to face collaboration
Pub Date : 2001-08-22  DOI: 10.1109/ICME.2001.1237858
Desney S. Tan, I. Poupyrev, M. Billinghurst, H. Kato, H. Regenbrecht, N. Tetsutani
In its simplest form, reality is merely information that is presented or acquired. Mixed Reality (MR) is built around the integration of real-world physical information and computer-generated virtual information. We do not use the term augmented reality (AR) because we view the merging of the two worlds as a symbiosis, with desirable properties of each accentuated and complementing the other, rather than the enhancement of one with the other. Collaborative MR allows multiple participants to simultaneously share a physical space while being surrounded by a virtual space that is registered with the physical one. Because the MR world inherits the properties of both real and virtual worlds, it is rich with social context, spatial cues, and tangible objects from the real world as well as flexible digital information from the virtual. We believe that Mixed Reality is a medium, largely unexplored, but very well suited for face-to-face collaboration.
{"title":"The best of two worlds: merging virtual and real for face to face collaboration","authors":"Desney S. Tan, I. Poupyrev, M. Billinghurst, H. Kato, H. Regenbrecht, N. Tetsutani","doi":"10.1109/ICME.2001.1237858","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237858","url":null,"abstract":"In its simplest form, reality is merely information that is presented or acquired. Mixed Reality (MR) is built around the integration of real world physical and computer generated virtual information. We do not use the term augmented reality (AR) because we view the merging of both worlds as a symbiosis, with desirable properties from each accentuated and complementing each other, rather than the enhancement of one with the other. Collaborative MR allows multiple participants to simultaneously share a physical space while being surrounded by a virtual space that is registered with the physical. Because the MR world inherits the properties of real and virtual worlds, it is rich with social context, spatial cues, and tangible objects from the real world as well as flexible digital information from virtual. We believe that Mixed Reality is a medium, largely unexplored, but very well suited for face-to-face collaboration.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121487873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}