Telling Stories with MyLifeBits
J. Gemmell, Aleks Aris, Roger Lueder
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521726
User-authored stories will always be the best stories, and authoring tools will continue to be developed. However, a digital lifetime capture permits storytelling via a lightweight markup structure, combined with location, sensor and usage data. In this paper, we describe support in the MyLifeBits system for such an approach, along with some simple authoring tools.
A method for extracting a musical unit to phrase music data in the compressed domain of TwinVQ audio compression
Motohiro Nakanishi, M. Kobayakawa, M. Hoshi, Tadashi Ohmori
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521491
A method for phrasing music data into meaningful musical pieces (e.g., bars and phrases) is an important function for analyzing music data. To realize this function, we propose a method for extracting a unit of music data (a musical unit) in the compressed domain of TwinVQ audio compression (MPEG-4 audio). Our key idea is to extract a musical unit from a sequence of autocorrelation coefficients computed in the encoding step of TwinVQ audio compression. We call this sequence the "autocorrelation sequence r". We use the k-th autocorrelation sequence r_k (k = 1, 2, ..., 20) of the music data to extract a musical unit. First, we calculate the j_k-th autocorrelation coefficient a_k^(j_k) of the k-th autocorrelation sequence r_k (j_k = 38, 39, ..., 208; k = 1, 2, ..., 20). Second, to detect the peak in the sequence (a_k^(38), a_k^(39), ..., a_k^(208)), a Laplacian filter is applied to the sequence. We then obtain the order p_k at which the maximum differential coefficient is attained. Finally, we compute the musical unit using p_k. To evaluate the performance of our extraction method, we collected 64 pieces of music data and obtained autocorrelation sequences by applying the TwinVQ encoder to each piece. We then applied our extraction algorithm to each autocorrelation sequence. The experimental results show very good performance in extracting musical units for phrasing music data.
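The peak-detection step above can be sketched in a few lines. The toy autocorrelation sequence and the second-difference form of the Laplacian filter are illustrative assumptions, not the authors' exact formulation:

```python
def laplacian_peak(seq):
    """Locate a peak in a 1-D sequence via a discrete Laplacian
    (second difference); a strongly negative response marks a peak.
    Illustrative sketch, not the paper's exact filter."""
    responses = [seq[i - 1] - 2 * seq[i] + seq[i + 1]
                 for i in range(1, len(seq) - 1)]
    # The order with the extreme (most negative) response is the candidate.
    i = min(range(len(responses)), key=lambda i: responses[i])
    return i + 1  # offset for the skipped first element

# Toy autocorrelation sequence with a single sharp peak at index 4.
seq = [0.1, 0.2, 0.3, 0.5, 0.9, 0.5, 0.3, 0.2, 0.1]
print(laplacian_peak(seq))  # → 4
```

In the paper's setting the input would be the coefficients (a_k^(38), ..., a_k^(208)) taken from the TwinVQ encoder, and the returned order would play the role of p_k.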
Fast Motion Estimation by Motion Vector Merging Procedure for H.264
Kai-Chung Hou, Mei-Juan Chen, Ching-Ting Hsu
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521703
In this paper, a fast motion estimation algorithm for variable block sizes, based on a motion vector merging procedure, is proposed for H.264. The motion vectors of adjacent small blocks are merged to predict the motion vectors of larger blocks, reducing computation. Experimental results show that our proposed method has lower computational complexity than the full search, fast full search and fast motion estimation of the H.264 reference software JM93, with a slight quality decrease and a small bit-rate increase.
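The merging idea can be illustrated as follows. The component-wise median rule and the example vectors are assumptions for illustration, not the merging rule actually used in the paper:

```python
def merge_motion_vectors(mvs):
    """Merge the motion vectors of adjacent small blocks into one
    predictor for the enclosing larger block, using a component-wise
    median (upper median for even counts). Illustrative choice only."""
    xs = sorted(mv[0] for mv in mvs)
    ys = sorted(mv[1] for mv in mvs)
    mid = len(mvs) // 2
    return (xs[mid], ys[mid])

# Four 8x8-block vectors predicting the enclosing 16x16 block's vector;
# the median damps the outlier (8, 7).
print(merge_motion_vectors([(2, 1), (3, 1), (2, 2), (8, 7)]))  # → (3, 2)
```

Starting the larger block's search from such a merged predictor, rather than from scratch, is what saves computation relative to a full search.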
A User-Oriented Multimodal-Interface Framework for General Content-Based Multimedia Retrieval
Jinchang Ren, T. Vlachos, V. Argyriou
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521547
A user-oriented multimodal interface (MMI) framework is proposed. Given the complexity of media connotations and the uncertainty of users' demands, content-based retrieval has intrinsic requirements on MMI for effective media-content interactions. By integrating knowledge-based conduction, learning of semantic concepts, natural language processing and analysis of user profiles, our framework can establish a solid basis for the design and implementation of general CBR systems satisfying extensibility, condensability and interoperability.
Robust learning-based TV commercial detection
Xiansheng Hua, Lie Lu, HongJiang Zhang
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521382
A robust learning-based TV commercial detection approach is proposed in this paper. First, a set of basic features that help distinguish commercials from general programming is analyzed. Then, a series of context-based features, which are more effective for identifying commercials, is derived from these basic features. Next, each shot is classified as commercial or general programming based on these features by a pre-trained SVM classifier. Finally, the detection results are refined by scene grouping and some heuristic rules. Experiments on around 10 hours of TV recordings of various genres show that the proposed scheme identifies commercial blocks with relatively high detection accuracy.
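The step of deriving context-based features from basic per-shot features can be illustrated with a simple neighbourhood average. The windowing rule and the feature values below are illustrative assumptions, not the paper's actual features:

```python
def context_features(basic, window=1):
    """Derive a context-based feature for each shot by averaging a basic
    per-shot feature over a window of neighbouring shots (an illustrative
    stand-in for the paper's context features)."""
    out = []
    for i in range(len(basic)):
        lo, hi = max(0, i - window), min(len(basic), i + window + 1)
        neigh = basic[lo:hi]
        out.append(sum(neigh) / len(neigh))
    return out

# Basic per-shot feature (e.g. a cut-rate score); commercial blocks tend
# to show sustained bursts, which the context feature smooths out.
print(context_features([0, 0, 6, 6, 6, 0]))  # → [0.0, 2.0, 4.0, 6.0, 4.0, 3.0]
```

Feature vectors built this way per shot would then be fed to the pre-trained SVM classifier mentioned in the abstract.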
Urban Traffic Control: A Streaming Multimedia Approach
C. Palau, M. Esteve, J. Martínez, B. Molina, I. Pérez-Llopis
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521499
Urban traffic control systems have based their technological infrastructure on advanced analog closed-circuit television (CCTV) systems and point-to-point links, resulting in poorly scalable and very expensive systems. The main goal of an urban traffic monitoring system is to capture, send, play and distribute video information from the streets of a city. The ongoing digitization of video networks, together with research in streaming media, has led vendors to offer proprietary hardware and software solutions that create a strong dependency among their customers. The existence of open standards for video encoding and protocols for streaming media transmission over IP networks has led us to propose this system. This work presents an open urban traffic control system whose design is based on a COTS philosophy for hardware and software, as well as open-source and standardized protocols. The proposed system is a suitable solution in terms of scalability, cost, interoperability and performance for traffic control systems. Furthermore, its architecture can easily be adapted to other video applications and tools.
Performance of Multiple Description Coding in Sensor Networks with Finite Buffers
E. Baccaglini, G. Barrenetxea, B. Beferull-Lozano
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521707
Sensor networks are usually dense networks whose diversity can be exploited to overcome failures. In this paper, we study the use of multiple description techniques in sensor networks where failures are due to the practical constraint of finite buffers in the sensors, instead of the link failures traditionally considered in previous research. Although from a theoretical point of view the use of more descriptions usually provides better performance, we show experimentally that this is not the case in practice once real constraints are introduced, such as finite buffers and the header information necessary for any real application. Our main result is that the optimal number of descriptions, in terms of average distortion, decreases as the fraction of header information increases for a given buffer size.
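The qualitative main result, that the optimal number of descriptions shrinks as header overhead grows, can be reproduced in a toy model. The exponential distortion law, the independence of losses, and all numbers below are our assumptions, not the paper's analysis:

```python
import math

def expected_distortion(n, total_bits, header_bits, p_loss, c=1e-3):
    """Toy model (our assumption, not the paper's): a bit budget is split
    across n descriptions, each paying a fixed header cost and being lost
    independently with probability p_loss; distortion decays exponentially
    in the payload actually received."""
    payload = max(total_bits / n - header_bits, 0.0)
    # E[ prod_i exp(-c * payload * received_i) ] under independent losses.
    return ((1 - p_loss) * math.exp(-c * payload) + p_loss) ** n

def best_n(total_bits, header_bits, p_loss):
    """Number of descriptions minimizing expected distortion (n = 1..8)."""
    return min(range(1, 9),
               key=lambda n: expected_distortion(n, total_bits,
                                                 header_bits, p_loss))

# Tripling the per-description header shifts the optimum toward
# fewer descriptions, mirroring the paper's qualitative finding.
print(best_n(8000, 500, 0.2), best_n(8000, 1500, 0.2))  # → 4 2
```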
Emotional Speech Classification Using Gaussian Mixture Models and the Sequential Floating Forward Selection Algorithm
D. Ververidis, Constantine Kotropoulos
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521717
Emotional speech classification can be treated as a supervised learning task where the statistical properties of emotional speech segments are the features and the emotional styles form the labels. The Akaike criterion is used to automatically estimate the number of Gaussian densities that model the probability density function of the emotional speech features. A procedure for reducing the computational burden of cross-validation in the sequential floating forward selection algorithm is proposed; it applies the t-test to the probability of correct classification of the Bayes classifier designed for various feature sets. For the Bayes classifier, the sequential floating forward selection algorithm is found to yield a probability of correct classification 3% higher than that of the sequential forward selection algorithm, whether or not gender information is taken into account. The experimental results indicate that utterances from isolated words and sentences are more emotionally colored than those from paragraphs. Without gender information, the probability of correct classification for the Bayes classifier reaches its maximum when the probability density function of the emotional speech features extracted from the aforementioned utterances is modeled as a mixture of two Gaussian densities.
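Akaike-based selection of the number of Gaussian densities can be sketched as follows. The log-likelihoods and parameter counts below are made-up illustrations; a real system would obtain them by fitting a GMM of each order (e.g. via EM) to the speech features:

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L; lower is better."""
    return 2 * n_params - 2 * log_likelihood

def select_components(fits):
    """Pick the mixture order with minimal AIC. `fits` maps a number of
    Gaussian components to (log-likelihood, parameter count); the values
    below are illustrative, not measurements from the paper."""
    return min(fits, key=lambda m: aic(*fits[m]))

# The fit improves with more components, but AIC penalizes the extra
# parameters, so the richest model does not automatically win.
fits = {1: (-1200.0, 3), 2: (-1100.0, 7), 3: (-1099.0, 11)}
print(select_components(fits))  # → 2
```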
An overview of technologies for e-meeting and e-lecture
B. Erol, Ying Li
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521593
Over the past few years, with the rapid adoption of broadband communication and advances in multimedia content capture and delivery, Web-based meetings and lectures, also referred to as e-meetings and e-lectures, have become popular among businesses and academic institutions because of their cost savings and their support for self-paced education and convenient content access and retrieval. In fact, technological achievements in the capture, analysis, access, and delivery of e-meeting and e-lecture media have already resulted in several working systems that are currently in regular use. This paper gives an overview of existing work and the state of the art in these two research areas, which are bound to affect the way we teach, learn, and collaborate.
Current and Emerging Topics in Sports Video Processing
Xinguo Yu, D. Farin
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521476
Sports video processing is an interesting research topic, since the clearly defined rules of sports provide rich domain knowledge for analysis. It is also interesting because many specialized applications for sports video processing are emerging. This paper gives an overview of sports video research, describing both basic algorithmic techniques and applications.