Takeshi Nagamine, A. Jaimes, Kengo Omura, K. Hirata
We present a system based on a new memory-cue paradigm for retrieving meeting video scenes. The system graphically represents important memory retrieval cues such as the room layout and the participants' faces and seating positions. Queries are formulated dynamically: as the user graphically manipulates the cues, the query results are shown. Our system (1) helps users easily express the cues they recall about a particular meeting, and (2) helps users remember new cues for meeting video retrieval. We discuss the experiments that motivated this new approach, the implementation, and future work.
"A visuospatial memory cue system for meeting video retrieval," Takeshi Nagamine, A. Jaimes, Kengo Omura, K. Hirata. MULTIMEDIA '04, Oct. 10, 2004. doi:10.1145/1027527.1027699
A potential security problem in frequency-domain video encryption is that seemingly trivial information, such as the distribution of DCT coefficients, may leak secret information. To demonstrate this problem, we performed a successful attack on encrypted video using the distribution of DCT coefficients. Based on the weaknesses discovered, we then propose a novel video encryption algorithm that works on run-length coded data. It remedies the identified security problems while preserving high efficiency and the adaptability to cooperate with compression schemes.
"Enhancing security of frequency domain video encryption," Zheng Liu, Xue Li, Zhao Yang Dong. MULTIMEDIA '04, Oct. 10, 2004. doi:10.1145/1027527.1027597
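The run-length-level idea can be sketched as follows. This is an illustrative toy, not the paper's cipher: a keystream flips the signs of the coded levels while leaving the run lengths untouched, so the run-length structure (and hence the compressed size and compatibility with the compression scheme) is preserved. The hash-chain keystream here is an assumption for the sketch.

```python
import hashlib

def keystream(key: bytes, n: int) -> list:
    # Simple SHA-256 hash-chain keystream (illustrative only).
    out, counter = [], 0
    while len(out) < n:
        out.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return out[:n]

def encrypt_run_length(pairs, key: bytes):
    """Encrypt (run, level) pairs by flipping level signs with a keystream.

    Runs are left untouched, so the run-length structure -- and the
    compressed size -- is preserved. Flipping twice restores the data,
    so the same function also decrypts.
    """
    ks = keystream(key, len(pairs))
    return [(run, -level if ks[i] & 1 else level)
            for i, (run, level) in enumerate(pairs)]
```

Because sign flipping is an involution, decryption is the same call with the same key, which keeps the decoder simple.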
ePic is an integrated presentation authoring and playback system that makes it easy to use a wide range of devices installed in one or multiple multimedia venues.
"An EPIC enhanced meeting environment," Qiong Liu, F. Zhao, John Doherty, Don Kimber. MULTIMEDIA '04, Oct. 10, 2004. doi:10.1145/1027527.1027743
Web image search engines have become important tools for organizing digital images on the Web. However, most commercial search engines still use a list presentation, and little effort has been devoted to improving their usability. How to present image search results more intuitively and effectively remains an open question. In this demo, we present iFind, a scalable Web image search engine that integrates two kinds of search-result browsing interfaces. User study results show that our interfaces are superior to traditional interfaces.
"Intuitive and effective interfaces for WWW image search engines," Zhiwei Li, Xing Xie, Hao Liu, Xiaoou Tang, Mingjing Li, Wei-Ying Ma. MULTIMEDIA '04, Oct. 10, 2004. doi:10.1145/1027527.1027697
This paper presents an algorithm for automatically extracting significant motion trajectories in sports videos. Our approach consists of four stages: global motion estimation, motion blob detection, trajectory evolution, and trajectory refinement. Global motion is estimated from the motion vectors in the compressed video using an iterative algorithm with robust outlier rejection. A statistical hypothesis test is carried out within the Block Rejection Map (BRM), a by-product of the global motion estimation, to detect motion blobs. Trajectory evolution is the process in which each motion blob is either appended to an existing trajectory or treated as the beginning of a new trajectory, based on its distance to an adaptive trajectory description. Finally, the extracted motion trajectories are refined using a Kalman filter. Experimental results on both indoor and outdoor sports videos demonstrate the effectiveness and efficiency of the proposed method.
"Automatic extraction of motion trajectories in compressed sports videos," Haoran Yi, D. Rajan, L. Chia. MULTIMEDIA '04, Oct. 10, 2004. doi:10.1145/1027527.1027599
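The final refinement stage can be illustrated with a standard constant-velocity Kalman filter over 2-D trajectory points; this is a generic sketch, not the paper's tuned filter, and the process/measurement noise levels `q` and `r` are assumed values.

```python
import numpy as np

def kalman_smooth(points, q=1e-2, r=1.0):
    """Refine a noisy 2-D trajectory with a constant-velocity Kalman filter.

    State is [x, y, vx, vy]; each input point (x, y) is one measurement.
    """
    F = np.array([[1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], float)      # constant-velocity transition
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], float)      # we observe position only
    Q, R = q * np.eye(4), r * np.eye(2)
    x = np.array([points[0][0], points[0][1], 0.0, 0.0])
    P = np.eye(4)
    refined = []
    for z in points:
        x, P = F @ x, F @ P @ F.T + Q                 # predict
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain
        x = x + K @ (np.asarray(z, float) - H @ x)    # update with measurement
        P = (np.eye(4) - K @ H) @ P
        refined.append((x[0], x[1]))
    return refined
```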
In this paper, we propose a method for organizing Web image search results to facilitate user browsing. We formalize this problem as a salient image region pattern extraction problem. Given the images returned by a Web search engine, we first segment them into homogeneous regions and quantize the environmental regions into image codewords. Salient codeword "phrases" are then extracted and ranked by a regression model learned from human-labeled training data. According to the salient "phrases", images are assigned to different clusters, with the image nearest each centroid serving as the entry for its cluster. Satisfactory experimental results show the effectiveness of the proposed method.
"Grouping web image search result," Xin-Jing Wang, Wei-Ying Ma, Qi-Cai He, Xing Li. MULTIMEDIA '04, Oct. 10, 2004. doi:10.1145/1027527.1027632
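The "nearest-to-centroid entry" step can be sketched as follows, assuming cluster labels have already been assigned (e.g. from the salient "phrases") and each image has a feature vector; both inputs are hypothetical stand-ins for the paper's pipeline.

```python
import numpy as np

def cluster_entries(features, labels):
    """For each cluster, pick the image nearest the centroid as its entry.

    features: (n, d) array of per-image feature vectors.
    labels:   cluster index of each image.
    Returns {cluster_id: index of entry image}.
    """
    features = np.asarray(features, float)
    entries = {}
    for c in set(labels):
        idx = np.flatnonzero(np.asarray(labels) == c)
        centroid = features[idx].mean(axis=0)
        d = np.linalg.norm(features[idx] - centroid, axis=1)
        entries[c] = int(idx[np.argmin(d)])   # most central image
    return entries
```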
In this paper, we present the Range Multicast protocol and the implemented prototype. We propose to demonstrate the system at the 2004 ACM Multimedia Conference.
"Range multicast routers for large-scale deployment of multimedia application," Ning Jiang, Y. H. Ho, K. Hua. MULTIMEDIA '04, Oct. 10, 2004. doi:10.1145/1027527.1027558
We present Flavor, a formal language for audio-visual object representation developed to describe any coded multimedia data in formats such as GIF, JPEG, and MPEG. The language comes with a translator that generates C++/Java code from a Flavor description, and the generated code can include bitstream reading, writing, and tracing methods. Since Version 5.0, the translator has been enhanced to also support XML: the generated C++/Java code can include a method for producing XML documents that correspond to the bitstreams described by Flavor, and the description can also be used to generate a corresponding XML schema. Additionally, a software tool is provided for converting the XML representation of multimedia data back into bitstream form.
"Flavor: a formal language for audio-visual object representation," A. Eleftheriadis, Danny Hong. MULTIMEDIA '04, Oct. 10, 2004. doi:10.1145/1027527.1027717
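The kind of code Flavor's translator produces can be imagined as a bit-level reader plus a per-syntax-element parse routine. The sketch below is a hand-written Python analogue, not the translator's actual output, and the toy header fields (marker, width, height) are hypothetical.

```python
class BitReader:
    """Minimal MSB-first bitstream reader, similar in spirit to the
    get() methods that a Flavor description would generate."""

    def __init__(self, data: bytes):
        self.data, self.pos = data, 0  # pos is a bit offset

    def read(self, nbits: int) -> int:
        """Read nbits, most significant bit first, as an integer."""
        val = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            val = (val << 1) | ((byte >> (7 - self.pos % 8)) & 1)
            self.pos += 1
        return val

def parse_toy_header(data: bytes) -> dict:
    """Parse a hypothetical header (8-bit marker, 4-bit width, 4-bit
    height) into a dict that could then be serialized to XML."""
    r = BitReader(data)
    return {"marker": r.read(8), "width": r.read(4), "height": r.read(4)}
```

The dict mirrors the XML round-trip the paper describes: each parsed field could become an XML element, and the same field layout drives re-serialization to bits.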
This paper investigates the issues in polyphonic popular song retrieval. The problems that we consider include singing voice extraction, melodic curve representation, and database indexing. Initially, polyphonic songs are decomposed into singing voices and instrument sounds in both the time and frequency domains based on SVM and ICA. The extracted singing voices are represented as two melodic curves that model the statistical mean and neighborhood similarity of notes. To speed up the matching between songs and a query, we further adopt the proportional transportation distance to index the songs as vantage point trees. Encouraging results have been obtained through experiments.
"Indexing and matching of polyphonic songs for query-by-singing system," Tat-Wan Leung, C. Ngo. MULTIMEDIA '04, Oct. 10, 2004. doi:10.1145/1027527.1027598
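A vantage point tree works for any metric distance, which is why it pairs well with distances like the proportional transportation distance used here. The sketch below is a generic VP-tree, not the paper's implementation: pick a vantage point, split the remaining items at the median distance, and prune subtrees with the triangle inequality during search.

```python
import random

def build_vptree(items, dist):
    """Build a vantage point tree: choose a vantage point, then split the
    rest by median distance into inside/outside subtrees."""
    items = list(items)
    if not items:
        return None
    vp = items.pop(random.randrange(len(items)))
    if not items:
        return {"vp": vp, "mu": 0.0, "in": None, "out": None}
    mu = sorted(dist(vp, it) for it in items)[len(items) // 2]  # median radius
    return {"vp": vp, "mu": mu,
            "in":  build_vptree([it for it in items if dist(vp, it) < mu], dist),
            "out": build_vptree([it for it in items if dist(vp, it) >= mu], dist)}

def nn_search(node, q, dist, best=None):
    """Exact nearest-neighbour search with triangle-inequality pruning.

    Returns (distance, item)."""
    if node is None:
        return best
    d = dist(q, node["vp"])
    if best is None or d < best[0]:
        best = (d, node["vp"])
    # Search the likelier side first; visit the other only if it can
    # still contain a closer point than the current best.
    first, second = ("in", "out") if d < node["mu"] else ("out", "in")
    best = nn_search(node[first], q, dist, best)
    if abs(d - node["mu"]) < best[0]:
        best = nn_search(node[second], q, dist, best)
    return best
```

Because only the distance function is touched, swapping Euclidean distance for a melodic-curve distance leaves the index code unchanged.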
In this paper, we propose an iterative similarity propagation approach that exploits the inter-relationships between Web images and their textual annotations for image retrieval. By treating Web images as one type of object and their surrounding texts as another, and constructing the link structure between them via webpage analysis, we can iteratively reinforce the similarities between images. The basic idea is that if two objects of the same type are both related to one object of another type, these two objects are similar; likewise, if two objects of the same type are related to two different but similar objects of another type, then to some extent these two objects are also similar. The goal of our method is to fully exploit the mutual reinforcement between images and their textual annotations. Our experiments on 10,628 images crawled from the Web show that the proposed approach can significantly improve Web image retrieval performance.
"Multi-model similarity propagation and its application for web image retrieval," Xin-Jing Wang, Wei-Ying Ma, Gui-Rong Xue, Xing Li. MULTIMEDIA '04, Oct. 10, 2004. doi:10.1145/1027527.1027746
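The mutual-reinforcement idea can be sketched with a SimRank-style iteration over the bipartite image-text link graph; this is an illustrative stand-in, and the paper's exact update rule, normalization, and damping factor may differ.

```python
import numpy as np

def propagate_similarity(links, iters=10, alpha=0.8):
    """SimRank-style similarity propagation over an image-text link graph.

    links: (n_images, n_texts) 0/1 array; links[i, j] = 1 if image i is
    linked to surrounding text j. Returns (image_sim, text_sim).
    """
    links = np.asarray(links, float)
    # Row-normalize link weights in each direction (degree >= 1 guard).
    A = links / np.maximum(links.sum(1, keepdims=True), 1)      # image -> text
    B = links.T / np.maximum(links.T.sum(1, keepdims=True), 1)  # text -> image
    Si, St = np.eye(links.shape[0]), np.eye(links.shape[1])
    for _ in range(iters):
        Si_next = alpha * A @ St @ A.T   # images are similar if their texts are
        St_next = alpha * B @ Si @ B.T   # texts are similar if their images are
        np.fill_diagonal(Si_next, 1.0)   # every object is fully similar to itself
        np.fill_diagonal(St_next, 1.0)
        Si, St = Si_next, St_next
    return Si, St
```

Each pass pushes similarity across the links in both directions, which is exactly the "two objects related to similar objects are themselves similar" rule stated above.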