MULTIMEDIA '04: latest papers

Practical voltage scaling for mobile multimedia devices
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027737
Wanghong Yuan, K. Nahrstedt
This paper presents the design, implementation, and evaluation of a practical voltage scaling (PDVS) algorithm for mobile devices primarily running multimedia applications. PDVS seeks to minimize the total energy of the whole device while meeting multimedia timing requirements. To do this, PDVS extends traditional real-time scheduling by deciding at what speed to execute, in addition to when to execute which applications. PDVS makes these decisions based on the discrete speed levels of the CPU, the total power of the device at different speeds, and the probability distribution of the CPU demand of multimedia applications. We have implemented PDVS in the Linux kernel and evaluated it on an HP laptop. Our experimental results show that PDVS saves energy substantially without affecting multimedia performance. It saves 14.4% to 37.2% of energy compared to scheduling algorithms without voltage scaling, and up to 10.4% compared to previous voltage scaling algorithms that assume an ideal CPU with continuous speeds and a cubic power-speed relationship.
Citations: 65
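The speed-selection idea described in the abstract above can be illustrated with a minimal Python sketch. It is not the PDVS algorithm itself: it picks, for a single periodic job, the discrete CPU speed that minimizes whole-device energy while still finishing within the period. The speed levels, power figures, idle power, and the function name `pick_speed` are all hypothetical and chosen only for illustration, and the demand is a single number (for example a high percentile of observed demand) rather than the full probability distribution the paper uses.

```python
from typing import Optional, Sequence, Tuple


def pick_speed(
    speeds_mhz: Sequence[float],
    device_power_w: Sequence[float],
    idle_power_w: float,
    demand_mcycles: float,
    period_s: float,
) -> Optional[Tuple[float, float]]:
    """Return (speed_mhz, energy_joules) for the cheapest feasible speed.

    Energy is counted for the whole device over one period: busy power
    while the job runs, idle power for the rest. Returns None when no
    speed level can finish the job within the period.
    """
    best: Optional[Tuple[float, float]] = None
    for speed, power in zip(speeds_mhz, device_power_w):
        run_time = demand_mcycles / speed  # Mcycles / MHz = seconds
        if run_time > period_s:
            continue  # this speed would miss the deadline
        energy = power * run_time + idle_power_w * (period_s - run_time)
        if best is None or energy < best[1]:
            best = (speed, energy)
    return best


# Hypothetical platform: three speed levels and made-up power numbers.
SPEEDS = [300.0, 600.0, 1000.0]   # MHz
POWER = [10.0, 14.0, 25.0]        # watts, whole device while busy
IDLE = 8.0                        # watts, whole device while idle

print(pick_speed(SPEEDS, POWER, IDLE, demand_mcycles=6.0, period_s=0.03))
print(pick_speed(SPEEDS, POWER, IDLE, demand_mcycles=12.0, period_s=0.03))
```

With these made-up numbers the lowest speed wins for the light job, while the heavier job needs the middle level because the lowest one would miss the deadline. Since the energy term uses total device power rather than CPU power alone, the slowest feasible speed is not guaranteed to be the cheapest in general.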
Nonparametric motion model with applications to camera motion pattern classification
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027603
Ling-yu Duan, Mingliang Xu, Q. Tian, Changsheng Xu
Motion information is a powerful cue for visual perception. In the context of video indexing and retrieval, motion content serves as a useful source for compact video representation. There is a large body of literature on parametric motion models. However, it is hard to secure a proper parametric assumption in a wide range of video scenarios. Diverse camera shots and frequent occurrences of bad optical flow estimation motivate us to develop nonparametric motion models. In this paper, we employ the mean shift procedure to propose a novel nonparametric motion representation. With this compact representation, various motion characterization tasks can be achieved by machine learning. Such a learning mechanism can not only capture the domain-independent parametric constraints, but also acquire the domain-dependent knowledge to tolerate the influence of bad dense optical flow vectors or block-based MPEG motion vector fields (MVF). The proposed nonparametric motion model has been applied to camera motion pattern classification on 23,191 MVFs extracted from the MPEG-7 dataset.
Citations: 11
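The mean shift procedure named in the abstract above can be sketched in a few lines of Python. This is a generic mode-seeking routine with a flat kernel, not the authors' motion representation: starting from an initial guess, it repeatedly moves to the average of the samples inside a window until it stops moving. The sample motion vectors, the bandwidth, and the name `mean_shift_mode` are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, float]


def mean_shift_mode(points: Sequence[Vec], start: Vec, bandwidth: float,
                    max_iter: int = 100, tol: float = 1e-6) -> Vec:
    """Climb from `start` to a local density mode using a flat kernel."""
    x, y = start
    for _ in range(max_iter):
        # Samples that fall inside the current window.
        near = [(px, py) for px, py in points
                if math.hypot(px - x, py - y) <= bandwidth]
        if not near:
            break  # no samples in the window, nothing to shift towards
        nx = sum(p[0] for p in near) / len(near)
        ny = sum(p[1] for p in near) / len(near)
        shift = math.hypot(nx - x, ny - y)
        x, y = nx, ny
        if shift < tol:
            break  # converged
    return (x, y)


# Hypothetical motion vectors: a dominant pan to the right plus two outliers.
vectors: List[Vec] = [(5.0, 0.0), (5.2, 0.1), (4.8, -0.1), (5.0, 0.0),
                      (-3.0, 7.0), (9.0, -8.0)]
mode = mean_shift_mode(vectors, start=(4.0, 0.5), bandwidth=2.0)
print(mode)
```

Because the outliers never fall inside the window, they do not pull the estimate away from the dominant motion, which is the robustness property that makes a nonparametric approach attractive for noisy motion vector fields.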
Incremental semi-supervised subspace learning for image retrieval
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027530
Xiaofei He
Subspace learning techniques are widespread in pattern recognition research. They include Principal Component Analysis (PCA), Locality Preserving Projection (LPP), etc. These techniques are generally unsupervised, which allows them to model data in the absence of labels or categories. In a relevance-feedback-driven image retrieval system, user-provided information can be used to better describe the intrinsic semantic relationships between images. In this paper, we propose a semi-supervised subspace learning algorithm which incrementally learns an adaptive subspace by preserving the semantic structure of the image space, based on user interactions in a relevance-feedback-driven query-by-example system. Our algorithm is capable of accumulating knowledge from users, which could result in new feature representations for images in the database so that the system's future retrieval performance can be enhanced. Experiments on a large collection of images have shown the effectiveness and efficiency of our proposed algorithm.
Citations: 96
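As background for the abstract above, here is a minimal sketch of PCA, one of the unsupervised subspace techniques it names. It is not the semi-supervised incremental algorithm the paper proposes. The sketch finds the first principal component with power iteration in plain Python; the function name `first_principal_component` and the toy feature vectors are assumptions for illustration.

```python
import math
from typing import List, Sequence


def first_principal_component(data: Sequence[Sequence[float]],
                              iters: int = 200) -> List[float]:
    """Unit vector along the direction of largest variance (power iteration)."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    # Scatter matrix, which equals the covariance up to a constant factor.
    scatter = [[sum(r[i] * r[j] for r in centered) for j in range(d)]
               for i in range(d)]
    # Simple start vector; it would fail if exactly orthogonal to the answer.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(scatter[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # the data has no variance at all
        v = [x / norm for x in w]
    return v


# Hypothetical 2-D image features that lie on the line y = 2x.
features = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
print(first_principal_component(features))
```

Projecting each feature vector onto the returned direction gives a one-dimensional representation; a supervised or semi-supervised method would additionally use labels or user feedback when choosing the direction.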
Real-time background music monitoring based on content-based retrieval
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027550
Yoshiharu Suga, N. Kosugi, M. Morimoto
In this paper, we describe music monitoring in TV broadcasting based on content-based retrieval. Short segments of the audio signal are extracted sequentially from the TV broadcast and used as retrieval keys; each key is matched against a music DB that stores a large number of musical pieces, so that the musical pieces are identified one after another. In this way, we are able to carry out music monitoring. Realizing music monitoring has three requirements: robustness against non-stationary noise, real-time retrieval from a large-scale music DB, and high granularity of the retrieval key. To achieve robustness against non-stationary noise, we propose a partially similar retrieval method, which improves retrieval accuracy by using the moments in which no superfluous noise is produced while non-stationary noise is present. To achieve real-time retrieval from a large-scale music DB, we adopt a coarse-to-fine strategy and propose a spectral peaks hashing method, which performs high-speed refining by using hashing. The hash value is calculated from the frequency channel numbers of the spectral peaks. To achieve high granularity of the retrieval key, it is necessary to solve the problem of retrieval accuracy degradation associated with heightening the granularity. To improve this accuracy, we propose a detection-by-continuity method, which uses music continuity. Moreover, by using music continuity to correct the starting point and the terminal point of a musical piece in TV broadcasting, the retrieval accuracy is improved further. To evaluate the effectiveness of the proposed methods, we performed experiments using a music DB that stores over 28,000 musical pieces (over 1800 hours) and TV broadcasting audio signals containing music and background music (BGM). The granularity of the retrieval key was set at about 0.5 seconds. Through these experiments, we verified that music monitoring was possible for over 90% of the total time of music and BGM used in TV broadcasting, and that real-time processing was possible.
Citations: 6
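The abstract above says the hash value is computed from the frequency channel numbers of the spectral peaks. A minimal Python sketch of that idea follows. The exact hash function is not given in the abstract, so using the sorted tuple of the strongest peak channels as the key is an assumption, as are the function names and the toy spectra.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Key = Tuple[int, ...]


def peak_channels(frame: Sequence[float], k: int = 3) -> Key:
    """Channel indices of the k strongest local maxima in one spectrum frame."""
    peaks = [i for i in range(1, len(frame) - 1)
             if frame[i] > frame[i - 1] and frame[i] >= frame[i + 1]]
    peaks.sort(key=lambda i: frame[i], reverse=True)
    return tuple(sorted(peaks[:k]))  # order-independent hash key


def build_index(db: Dict[str, List[Sequence[float]]]) -> Dict[Key, List[Tuple[str, int]]]:
    """Map each peak-channel key to the (piece, frame number) pairs that produce it."""
    index: Dict[Key, List[Tuple[str, int]]] = defaultdict(list)
    for piece_id, frames in db.items():
        for t, frame in enumerate(frames):
            index[peak_channels(frame)].append((piece_id, t))
    return index


def candidates(index: Dict[Key, List[Tuple[str, int]]],
               query_frame: Sequence[float]) -> List[Tuple[str, int]]:
    """Coarse step: fetch only the pieces whose peaks hash to the same key."""
    return index.get(peak_channels(query_frame), [])


# Hypothetical magnitude spectra with eight frequency channels each.
music_db = {
    "piece_a": [[0, 5, 0, 3, 0, 9, 0, 0]],
    "piece_b": [[0, 0, 7, 0, 0, 0, 4, 0]],
}
index = build_index(music_db)
print(candidates(index, [0, 4, 0, 2, 0, 8, 0, 1]))
```

Because the key depends on where the peaks are rather than on their absolute magnitudes, a query whose levels differ from the stored piece can still hit the same bucket; a finer similarity check would then run only on the returned candidates.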
GURU: a multimedia distance-learning framework for users with disabilities
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027698
Vidhya Balasubramanian, N. Venkatasubramanian
GURU is a distance-learning environment that renders multimedia information to users with disabilities in an accessible manner. It is an implementation framework developed as part of an effort to provide accessible multimedia information to end users with perceptual (visual and auditory), cognitive or motor impairments. GURU is based on the MPEG-4 standard, and it modifies MP4 content and the presentation of the different objects in the scene dynamically based on users' visual, auditory and motor abilities. This paper briefly describes the implementation of the prototype framework and illustrates sample adaptations as implemented in this framework.
Citations: 1
Interactive manipulation of replay speed while listening to speech recordings
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027645
Wolfgang Hürst, T. Lauer, Georg Götz
Today's interfaces for time-scaled audio replay have limitations, especially regarding highly interactive tasks such as skimming and searching, which require quick temporary speed changes. Motivated by this shortcoming, we introduce a new interaction technique for speech skimming based on the so-called rubber-band metaphor. We propose an "elastic" audio slider which is especially useful for temporary manipulation of replay speed and which integrates seamlessly into standard interface designs. The feasibility of this concept is proven by an initial user study.
Citations: 9
Nonparametric motion model
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027700
Ling-yu Duan, Mingliang Xu, Q. Tian, Changsheng Xu
Motion information is a powerful cue for visual perception. In the context of video indexing and retrieval, motion content serves as a useful source for compact video representation. There is a large body of literature on parametric motion models. However, it is hard to secure a proper parametric assumption in a wide range of video scenarios. Diverse camera shots and frequent occurrences of improper optical flow estimation or block matching motivate us to develop nonparametric motion models. In this demonstration, we present a novel nonparametric motion model. The unique features mainly include: 1) instead of computationally expensive and vulnerable parametric regression, our proposed model bases the motion characterization on the classification of motion patterns; 2) we employ machine learning to capture the knowledge of recognizing camera motion patterns from bad motion vector fields (MVF); and 3) with mean shift filtering, our proposed motion representation elegantly incorporates the spatial-range information for noise removal and discontinuity-preserving smoothing of MVF. Promising results have been achieved on two tasks: 1) camera motion pattern recognition on 23,191 MVFs and 2) recognition of the intensity of motion activity on 622 video segments culled from the MPEG-7 dataset.
Citations: 2
A semi-naïve Bayesian method incorporating clustering with pair-wise constraints for auto image annotation
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027605
Wanjun Jin, Rui Shi, Tat-Seng Chua
We propose a novel approach for auto image annotation. In our approach, we first perform the segmentation of images into regions, followed by clustering of regions, before learning the relationship between concepts and region clusters using the set of training images with pre-assigned concepts. The main focus of this paper is two-fold. First, in the learning stage, we perform clustering of regions into region clusters by incorporating pair-wise constraints which are derived by considering the language model underlying the annotations assigned to training images. Second, in the annotation stage, we employ a semi-naïve Bayes model to compute the posterior probability of concepts given the region clusters. Experimental results show that our proposed system utilizing these two strategies outperforms the state-of-the-art techniques in annotating large image collections.
Citations: 24
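The annotation stage described in the abstract above computes the posterior probability of concepts given region clusters. The Python sketch below shows the simpler plain naïve Bayes version of that computation, in which region clusters are treated as independent given the concept; the paper's semi-naïve model relaxes that independence assumption and is not reproduced here. The training pairs, cluster ids, smoothing constant, and function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[int], Sequence[str]]  # (region cluster ids, concepts)


def train(samples: Sequence[Sample]):
    """Count how often each concept and each (concept, cluster) pair occurs."""
    concept_count: Counter = Counter()
    cluster_count: Dict[str, Counter] = defaultdict(Counter)
    for clusters, concepts in samples:
        for concept in concepts:
            concept_count[concept] += 1
            for cluster in clusters:
                cluster_count[concept][cluster] += 1
    return concept_count, cluster_count


def posterior(model, clusters: Sequence[int], num_clusters: int,
              alpha: float = 1.0) -> Dict[str, float]:
    """P(concept | clusters) under naive Bayes with Laplace smoothing."""
    concept_count, cluster_count = model
    total = sum(concept_count.values())
    scores: Dict[str, float] = {}
    for concept, n in concept_count.items():
        p = n / total  # prior P(concept)
        denom = sum(cluster_count[concept].values()) + alpha * num_clusters
        for cluster in clusters:
            p *= (cluster_count[concept][cluster] + alpha) / denom
        scores[concept] = p
    z = sum(scores.values())
    return {concept: s / z for concept, s in scores.items()}


# Hypothetical training images: region cluster ids with their annotations.
training: List[Sample] = [
    ([0, 1], ["sky"]), ([0, 2], ["sky"]),
    ([3, 4], ["grass"]), ([3, 1], ["grass"]),
]
model = train(training)
print(posterior(model, clusters=[0], num_clusters=5))
```

An image would then be annotated with the concepts that receive the highest posterior; the smoothing constant keeps unseen concept and cluster combinations from forcing a probability of zero.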
A comparative study on attributed relational graph matching algorithms for perceptual 3-D shape descriptor in MPEG-7
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027686
Duck Hoon Kim, I. Yun, Sang Uk Lee
Nowadays, the demand for user-friendly querying interfaces such as query-by-sketch and query-by-editing is an important issue in content-based retrieval systems for 3-D object databases. In MPEG-7 in particular, the P3DS (Perceptual 3-D Shape) descriptor has been developed to provide user-friendly querying, which cannot be covered by an existing international standard for the description and browsing of 3-D object databases. Since the P3DS descriptor is based on the part-based representation of a 3-D object, it is a kind of attributed relational graph (ARG), so ARG matching naturally serves as the core procedure for similarity matching of the P3DS descriptor. In this paper, given a P3DS database built from the corresponding 3-D object database, we focus on investigating the pros and cons of the target ARG matching algorithms. To provide objective evidence for our conclusion, we have conducted experiments on a database of 480 3-D objects in 33 categories, evaluated in terms of the bull's eye performance, the average normalized modified retrieval rate, and the precision/recall curve.
Citations: 11
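Two of the evaluation measures mentioned in the abstract above are easy to illustrate in Python. The sketch computes precision and recall at a cut-off, plus a bull's eye score under one commonly used definition: the fraction of the relevant items found in the top 2N results, where N is the number of relevant items. The abstract does not spell out the exact protocol the authors used, so that definition, the function names, and the toy ranking are assumptions.

```python
from typing import Sequence, Set, Tuple


def precision_recall_at_k(ranked: Sequence[str], relevant: Set[str],
                          k: int) -> Tuple[float, float]:
    """Precision and recall over the first k results (k must be positive)."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k, hits / len(relevant)


def bulls_eye(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Share of relevant items retrieved within the top 2 * len(relevant)."""
    window = 2 * len(relevant)
    hits = sum(1 for item in ranked[:window] if item in relevant)
    return hits / len(relevant)


# Hypothetical ranking returned for one query; a, b, c share its category.
ranking = ["a", "x", "b", "y", "c", "z"]
same_category = {"a", "b", "c"}

print(precision_recall_at_k(ranking, same_category, k=3))
print(bulls_eye(ranking, same_category))
```

Sweeping `k` from 1 to the length of the ranking and plotting the resulting pairs gives a precision/recall curve for that query; averaging over all queries gives the curve for a matching algorithm.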
When code is content: experiments with a whistling machine
Pub Date : 2004-10-10 DOI: 10.1145/1027527.1027761
M. Böhlen, J. Rinker
The Universal Whistling Machine (U.W.M.) senses the presence of people in its vicinity and attracts them with a signature whistle. Given a response whistle, U.W.M. counters with its own composition, based on a time-frequency analysis of the original.
Citations: 1