2005 IEEE International Conference on Multimedia and Expo: Latest Papers

Comparing Feature Sets for Acted and Spontaneous Speech in View of Automatic Emotion Recognition

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521463

Thurid Vogt, E. André

We present a data-mining experiment on feature selection for automatic emotion recognition. Starting from more than 1000 features derived from pitch, energy and MFCC time series, the features most relevant with respect to the data are selected by removing correlated features. The features selected for acted and realistic emotions are analyzed and show significant differences. All features are computed automatically, and we also contrast automatic with manual units of analysis. A higher degree of automation did not prove to be a disadvantage in terms of recognition accuracy.
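The abstract says only that correlated features are removed; it does not give the procedure. As a rough illustration of the idea, the sketch below uses a greedy pairwise Pearson filter. The function names, the 0.9 threshold and the toy feature values are all assumptions made for this example, not details from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant feature: treat as uncorrelated
    return cov / (sx * sy)

def prune_correlated(features, threshold=0.9):
    """Greedily keep features whose |r| with every kept feature is below threshold.

    `features` maps a feature name to its list of values over the samples.
    The threshold is a hypothetical choice for illustration.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Toy data (invented): the second feature is a scaled copy of the first.
features = {
    "pitch_mean": [1.0, 2.0, 3.0, 4.0],
    "pitch_mean_scaled": [2.0, 4.0, 6.0, 8.0],
    "energy_max": [1.0, -1.0, -1.0, 1.0],
}
selected = prune_correlated(features)
print(selected)
```

A filter of this kind depends on the order in which features are visited, so a real pipeline would normally rank features by relevance before pruning.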

Citations: 257

Using rhetorical annotations for generating video documentaries

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521610

S. Bocconi, F. Nack, L. Hardman

We use rhetorical annotations to specify a generation process that can assemble meaningful video sequences with a communicative goal and an argumentative progression. Our annotation schema encodes the verbal information contained in the audio channel, identifying the claims the interviewees make and the argumentation structures they use to make those claims. Based on this schema, we construct a semantic graph which is traversed by rhetoric-based strategies selecting video segments. The selected video segments are edited to form a meaningful video sequence.

Citations: 22

Evaluation of the Interleaved Source Coding (ISC) Under Packet Correlation

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521711

Jin Young Lee, H. Radha

Network impairments such as delay and packet losses have a severe impact on the presentation quality of many predictive video sources. Prior research has sought to develop packet-loss-resilient coding methods that overcome such impairments for real-time streaming applications. Interleaved source coding (ISC) is one such error-resilient coding method; it is based on an optimum interleaving of predictive video coded frames transmitted over a single erasure channel. ISC employs a Markov decision process (MDP) and a corresponding dynamic programming algorithm to identify the optimal interleaving pattern for a given channel model and transmitting sequence. ISC has been shown to significantly improve the overall quality of a predictive video coded stream over a lossy channel without complex modifications to standard video coders. In this paper, ISC is evaluated over channels with memory. In particular, we analyze the impact of packet correlation in the popular Gilbert model on ISC-based packet video over a wide range of packet loss probabilities. Simulations show that ISC outperforms the traditional method as either the loss rate increases or the packet correlation decreases.
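For readers unfamiliar with the Gilbert model named in this abstract, the sketch below simulates a two-state Markov loss channel in which every packet sent in the "bad" state is lost. The abstract does not give the paper's parameterization, so the mapping from (loss rate, correlation) to transition probabilities is the common textbook form and is an assumption here, as are all function and parameter names.

```python
import random

def gilbert_params(loss_rate, correlation):
    """Map (stationary loss rate, packet correlation) to transition probabilities.

    p = P(good -> bad), q = P(bad -> good). With this mapping the stationary
    loss probability is p / (p + q) = loss_rate and 1 - p - q = correlation.
    """
    p = loss_rate * (1.0 - correlation)
    q = (1.0 - loss_rate) * (1.0 - correlation)
    return p, q

def simulate_losses(n, loss_rate, correlation, seed=0):
    """Return a list of n booleans; True means the packet was lost."""
    rng = random.Random(seed)
    p, q = gilbert_params(loss_rate, correlation)
    lost = rng.random() < loss_rate  # draw the initial state from the stationary distribution
    trace = []
    for _ in range(n):
        trace.append(lost)
        if lost:
            lost = rng.random() >= q  # stay in the bad state with probability 1 - q
        else:
            lost = rng.random() < p   # enter the bad state with probability p
    return trace

trace = simulate_losses(10_000, loss_rate=0.1, correlation=0.5)
print(f"empirical loss rate: {sum(trace) / len(trace):.3f}")
```

Setting `correlation=0` reduces the channel to independent losses, while values closer to 1 produce longer loss bursts at the same average loss rate.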

Citations: 2

IP Multicast Video Broadcasting System with User Authentication

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521639

Hiroki Onishi, T. Satoh, T. Uehara, K. Yamaoka

This report describes a pay broadcasting system for the Internet. This system would enable tens of thousands of people to access an identical video stream simultaneously. In the proposed system, contents are broadcast to all terminals using IP multicast. Contents are encrypted so that legitimate users can decode them with a private key and session keys. As a key management scheme, the Tracing Traitor scheme is adopted because it offers advantages in scalability. The system can also embed digital watermarks, which act as a psychological deterrent to illegal copying and distribution of copyrighted contents. Finally, the implementation of an application system is described, and efficient broadcasting of contents with this system is demonstrated.

Citations: 0

Quality-Temporal Transcoder Driven by the Jerkiness

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521705

G. Iacovoni, S. Morsa, R. Felice

We propose a new DCT-based homogeneous video transcoding architecture that relies on both quality and temporal reduction techniques. The frame layer control is driven by a new indicator, the jerkiness, which represents the user's perception of the movement in a video stream. The proposed transcoder can meet the constraints of real-time communication and has been extensively tested under different conditions.

Citations: 15

Automatic Segmentation of Home Videos

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521347

Y. Zhai, M. Shah

Temporal video segmentation is one of the fundamental tasks in video processing, understanding and management. In this paper, we present an automatic method for segmenting home videos into temporal logical units. We have developed a statistical framework using the Markov chain Monte Carlo (MCMC) technique. The temporal scene boundaries are detected by maximizing the posterior probability of the model parameters, which comprise the number of scenes and their boundary locations. The proposed method has been demonstrated on several home videos, and high accuracy has been obtained.

Citations: 8

Hidden auxiliary media channels in audio signals by perceptually insignificant component replacement

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521349

Tim D. Jackson, Francis F. Li, Keith Yates

This paper proposes a method for the formation of an auxiliary media channel within a host signal. Using a psychoacoustic frequency masking model, perceptually insignificant subband components of the host audio signal are identified and removed. The auxiliary channel data are placed in the empty subbands in the host signal and scaled to a level below the audible threshold. An implementation is given along with results suggesting that the proposed method can effectively hide an auxiliary media channel in a normal audio signal without degrading the perceived sound quality.

Citations: 0

Speaker Independent Speech Emotion Recognition by Ensemble Classification

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521560

Björn Schuller, S. Reiter, R. Müller, M. Al-Hames, M. Lang, G. Rigoll

Emotion recognition is growing into an important factor in future media retrieval and man-machine interfaces. However, even human judges often have trouble recognizing another person's emotion, especially that of a stranger. In this work we strive to recognize emotion independently of the speaker, concentrating on the speech channel. The relevance of individual acoustic features is a critical point, which we address by filter-based gain ratio calculation starting from a basis of 276 features. Because optimizing a minimum feature set as a whole generally saves more extraction effort, we furthermore apply an SVM-SFFS wrapper-based search. For a more robust estimation we also integrate spoken content information by a Bayesian net analysis of ASR outputs. Overall classification is realized in an early feature fusion by stacked ensembles of diverse base classifiers. Tests were run on a database of 3,947 movie and automotive interaction dialog turns from 35 speakers. Remarkable overall performance can be reported in the discrimination of the seven discrete emotions named in the MPEG-4 standard with added neutrality.
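The gain ratio mentioned in this abstract is a standard filter criterion: the information gain of a feature divided by the entropy of the feature's own value distribution. The sketch below computes it for an already discretized feature. The abstract does not say how the acoustic features were discretized, so that step is left out, and the function names and toy data are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(feature_values, labels):
    """Information gain of the feature about the labels, divided by split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    # Expected label entropy after splitting the samples on the feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0

# Toy data (invented): one feature separates the classes, the other does not.
labels = ["sad", "sad", "joy", "joy"]
informative = ["low", "low", "high", "high"]
uninformative = ["a", "b", "a", "b"]
print(gain_ratio(informative, labels), gain_ratio(uninformative, labels))
```

Ranking features by this score and keeping the top entries is one way to perform the filter step before a wrapper search.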

Citations: 156

Comparative evaluation of Web image search engines for multimedia applications

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521641

Keon Stevenson, C. Leung

While text-oriented document searching is relatively mature on the Internet, image searching, which requires much more than text matching, lags significantly behind. The use of image search engines significantly enlarges the scope of images accessible to users. This paper provides an understanding of current technologies in image searching on the Internet, and points to future areas of improvement for multimedia applications. We develop a systematic set of image queries to assess the competence and performance of the major image search engines. We find that current technology is only able to deliver an average precision of around 42% and an average recall of around 12%, while the best performers are capable of producing over 70% for precision and around 27% for recall. The reasons for such differences, and mechanisms for search improvement, are also indicated.
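The precision and recall figures quoted in this abstract follow the usual retrieval definitions: precision is the fraction of returned images that are relevant, and recall is the fraction of relevant images that were returned. A minimal sketch with invented image identifiers (the paper's actual queries and result sets are not given here):

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, treating results as sets."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: 5 images returned, 8 relevant images exist, 2 overlap.
retrieved = ["img1", "img2", "img3", "img4", "img5"]
relevant = ["img1", "img3", "img7", "img8", "img9", "img10", "img11", "img12"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Averaging these two numbers over a set of queries gives per-engine figures of the kind reported above.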

Citations: 31

A Reversible Watermarking Scheme for JPEG-2000 Compressed Images

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521362

S. Emmanuel, C. K. Heng, A. Das

In this paper, we present a novel reversible watermarking scheme for authenticating JPEG/JPEG-2000 coded images. Since the watermarking scheme is reversible, the exact original image can be recovered from the watermarked image. The watermarking scheme makes use of finite state machine principles. The proposed scheme is asymmetric, as the watermark extraction key is different from its embedding key. The algorithm is implemented and tested for its visual quality, compression overhead, execution time overhead and payload capacity. It is found that the algorithm has high visual quality, high payload capacity, low compression overhead and low execution time overhead.

Citations: 8
