Speech-Based Visual Concept Learning Using WordNet
Xiaodan Song, Ching-Yung Lin, Ming-Ting Sun
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521627
Modeling visual concepts with supervised or unsupervised machine learning is becoming increasingly important for video semantic indexing, retrieval, and filtering applications. Videos naturally contain multimodal data such as audio, speech, visual content, and text, which are combined to infer the overall semantic concepts. In the literature, however, most research has been conducted within a single domain. In this paper we propose an unsupervised technique that uses WordNet to build context-independent keyword lists for modeling desired visual concepts. Furthermore, we propose an extended speech-based visual concept (ESVC) model that reorders and extends these keyword lists by supervised learning based on multimodal annotation. Experimental results show that the context-independent models achieve performance comparable to conventional supervised learning algorithms, and that the ESVC model improves on a state-of-the-art speech-based video concept detection algorithm by about 53% and 28.4% on two testing subsets of the TRECVID 2003 corpus.
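The keyword-list construction can be sketched as a walk over lexical relations. This is a minimal illustration only: a tiny hand-coded relation map stands in for WordNet (in practice one would query synsets and hyponym links through an interface such as NLTK's wordnet corpus reader), and the `LEXICON` entries are invented for the example.

```python
# Hypothetical miniature lexicon standing in for WordNet:
# concept -> (synonyms, hyponyms)
LEXICON = {
    "vehicle": (["conveyance"], ["car", "truck", "boat"]),
    "car": (["auto", "automobile"], ["taxi", "jeep"]),
    "boat": (["watercraft"], ["canoe"]),
}

def build_keyword_list(seed, max_depth=2):
    """Collect synonyms and hyponyms reachable from `seed` up to max_depth."""
    keywords, frontier = {seed}, [(seed, 0)]
    while frontier:
        term, depth = frontier.pop()
        if depth >= max_depth or term not in LEXICON:
            continue
        synonyms, hyponyms = LEXICON[term]
        for related in synonyms + hyponyms:
            if related not in keywords:
                keywords.add(related)
                frontier.append((related, depth + 1))
    return sorted(keywords)

print(build_keyword_list("vehicle"))
```

A real system would then score these context-independent keywords against speech transcripts; the ESVC step of the paper reorders the list with supervised multimodal annotation.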
Reliable video communication with multi-path streaming using MDC
I. Lee, L. Guan
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521522
Video streaming demands high data rates and imposes hard delay constraints, which raises several challenges on today's packet-based, best-effort Internet. In this paper, we propose an efficient multiple-description coding (MDC) technique based on video frame sub-sampling and cubic-spline interpolation to provide spatial diversity, such that no additional buffering delay or storage is required. We also analyze the frame dropping rate due to packet loss and drifting error under the multi-path streaming environment.
Supporting rights checking in an MPEG-21 Digital Item Processing environment
F. D. Keukelaere, T. DeMartini, Jeroen Bekaert, R. Walle
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521608
Within the world of multimedia, the new MPEG-21 standard is currently under development. The purpose of this new standard is to create an open framework for multimedia delivery and consumption. MPEG-21 masters the multitude of content and metadata types by standardizing the declaration of digital items in an XML-based format. In addition to standardizing the declaration of digital items, MPEG-21 also standardizes Digital Item Processing, which enables the declaration of suggested uses of digital items. The Rights Expression Language and Rights Data Dictionary parts of MPEG-21 enable the declaration of the rights (permitted interactions) users are given to digital items. In this paper, we describe how rights checking can be realized in an environment in which interactions with digital items are declared through Digital Item Processing. We demonstrate how rights checking can be done when "critical" digital item base operations are called, and how rights context information can be gathered by tracking during the execution of digital item methods.
Evaluating keypoint methods for content-based copyright protection of digital images
Larry Huston, R. Sukthankar, Yan Ke
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521614
This paper evaluates the effectiveness of keypoint methods for content-based protection of digital images. These methods identify a set of "distinctive" regions (termed keypoints) in an image and encode them using descriptors that are robust to expected image transformations. To determine whether a particular image was derived from a protected image, the keypoints for both images are generated and their descriptors matched. We describe a comprehensive set of experiments examining how keypoint methods cope with three real-world challenges: (1) loss of keypoints due to cropping; (2) matching failures caused by approximate nearest-neighbor indexing schemes; and (3) degraded descriptors due to significant image distortions. While keypoint methods perform very well in general, this paper identifies cases where their accuracy degrades.
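The match-and-decide pipeline can be sketched with a nearest-neighbour ratio test over descriptors. Everything concrete here is an assumption for illustration: descriptors are toy 4-D vectors rather than SIFT features, the matcher is exact brute force rather than the approximate indexing the paper studies, and the thresholds are arbitrary.

```python
import math

def ratio_matches(query, protected, ratio=0.8):
    """Indices of query descriptors whose best match clearly beats the 2nd best."""
    matched = []
    for qi, q in enumerate(query):
        d = sorted(math.dist(q, p) for p in protected)
        if len(d) >= 2 and d[0] < ratio * d[1]:     # Lowe-style ratio test
            matched.append(qi)
    return matched

def is_derived(query, protected, min_matches=2):
    """Declare derivation when enough distinctive matches survive."""
    return len(ratio_matches(query, protected)) >= min_matches

protected = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
derived   = [(0.9, 0.1, 0, 0), (0, 1.1, 0, 0)]   # mildly distorted copies
unrelated = [(5, 5, 5, 5)]
print(is_derived(derived, protected), is_derived(unrelated, protected))
```

The paper's three failure modes map directly onto this sketch: cropping removes entries from `query`, approximate indexing can return the wrong neighbour in the ratio test, and heavy distortion pushes `d[0]` past the ratio threshold.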
Hybrid speaker tracking in an automated lecture room
Cha Zhang, Y. Rui, Li-wei He, M. Wallick
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521365
We present a hybrid speaker tracking scheme based on a single pan/tilt/zoom (PTZ) camera in an automated lecture capturing system. Because the camera's video resolution is higher than the required output resolution, we frame the output video as a sub-region of the camera's input video. This allows us to track the speaker both digitally and mechanically: digital tracking has the advantage of being smooth, while mechanical tracking can cover a wide area, so hybrid tracking combines the benefits of both. In addition, we present an intelligent pan/zoom selection scheme to improve the aesthetics of the captured lecture scene.
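The hybrid policy can be sketched in one dimension: slide the output crop across the sensor for small speaker motion, and command the mechanical head only when the crop would leave the sensor. The resolutions, the 1-D geometry, and the simple clamping rule are illustrative assumptions, not the paper's parameters.

```python
SENSOR_W, CROP_W = 1280, 640    # assumed sensor and output widths (pixels)

def track_step(crop_x, cam_pan, speaker_x):
    """Center the crop on speaker_x; fall back to mechanical pan at the edges."""
    desired = speaker_x - CROP_W // 2
    if 0 <= desired <= SENSOR_W - CROP_W:
        return desired, cam_pan                     # smooth digital pan suffices
    # clamp the crop and hand the residual motion to the mechanical head
    clamped = min(max(desired, 0), SENSOR_W - CROP_W)
    return clamped, cam_pan + (desired - clamped)

crop, pan = 320, 0
crop, pan = track_step(crop, pan, speaker_x=700)    # stays digital
crop, pan = track_step(crop, pan, speaker_x=1200)   # mechanical pan kicks in
print(crop, pan)
```

Keeping motion digital whenever possible is what makes the output smooth; the mechanical move is reserved for the large excursions a fixed crop cannot cover.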
Proactive Energy Optimization Algorithms for Wavelet-Based Video Codecs on Power-Aware Processors
V. Akella, M. Schaar, W. Kao
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521486
We propose a systematic technique for characterizing the workload of a video decoder at a given time and transforming the shape of that workload to optimize the utilization of a critical resource without increasing the distortion incurred in the process. We call this approach proactive resource management. We illustrate our techniques on the problem of minimizing the energy consumed while decoding a video sequence on a programmable processor that supports multiple voltages and frequencies. We evaluate two heuristics for the underlying optimization problem, which yield 50% to 92% improvements in energy savings compared to techniques that do not use dynamic adaptation.
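The voltage/frequency selection underlying such schemes can be sketched as follows: per frame, pick the lowest operating point that still meets the decoding deadline, with dynamic energy scaling roughly as V² times the cycle count. The operating points, workloads, and deadline below are invented for illustration and are not the paper's heuristics or numbers.

```python
# Assumed (voltage in V, frequency in MHz) pairs, sorted by performance
OPERATING_POINTS = [(0.9, 200), (1.1, 400), (1.3, 600)]

def pick_point(cycles_m, deadline_ms):
    """Lowest point whose frequency decodes `cycles_m` Mcycles in time."""
    for v, f in OPERATING_POINTS:
        if cycles_m / f * 1000.0 <= deadline_ms:
            return v, f
    return OPERATING_POINTS[-1]                 # best effort if infeasible

def energy(workloads_m, deadline_ms=33.0):
    """Relative dynamic energy (~ V^2 * cycles) over a frame sequence."""
    return sum(pick_point(w, deadline_ms)[0] ** 2 * w for w in workloads_m)

frames = [4.0, 10.0, 18.0]                      # Mcycles per frame (made up)
adaptive = energy(frames)
fixed = sum(1.3 ** 2 * w for w in frames)       # always run at the top point
print(round(1 - adaptive / fixed, 2))           # fractional energy saving
```

The paper's contribution goes beyond this greedy per-frame rule: it reshapes the workload itself so the frequency schedule has more slack to exploit.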
WA-TV: Webifying and Augmenting Broadcast Content for Next-Generation Storage TV
H. Miyamori, Qiang Ma, Katsumi Tanaka
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521716
A method is proposed for viewing broadcast content that converts TV programs into Web content and integrates the results with complementary information retrieved from the Internet. Converting programs into Web pages lets viewers skim a program for an overview and easily explore particular scenes. Integrating complementary information lets programs be viewed efficiently as value-added content. An intuitive, user-friendly browsing interface lets the user easily change the level of detail displayed for the integrated information by zooming. Preliminary testing of a prototype system for next-generation storage TV, "WA-TV", validated the proposed method.
An audio spread-spectrum data hiding system with an informed embedding strategy adapted to a Wiener filtering based receiver
C. Baras, N. Moreau
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521598
One application of audio data hiding and watermarking systems is to use the audio signal as a transmission channel for binary information. Such a system should ensure reliable, robust transmission under various channel perturbations while keeping the computational cost low enough for real-time applications. In this paper, we present a hybrid spread-spectrum data hiding system that combines two reference systems from the state of the art: one based on a real-time receiver, and one based on an informed embedding strategy with maximized robustness to additive perturbations. Experimental results assess the efficiency of the system in terms of (1) transmission reliability, which is significantly improved over the reference systems, and (2) computational cost, which makes real-time reception feasible for broadcast applications with off-line embedding.
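The bare spread-spectrum channel at the heart of such systems can be sketched as follows: each bit modulates a pseudo-noise carrier added to the host signal at low amplitude, and the receiver correlates against the same carrier. The Wiener-filtering receiver and the informed embedding strategy that distinguish the paper are deliberately omitted; all signals and parameters here are synthetic.

```python
import random

def pn_sequence(length, seed=7):
    """Pseudo-noise carrier of +/-1 chips, reproducible from the seed."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(host, bit, carrier, alpha=0.5):
    """Add the carrier, signed by the bit, at low amplitude alpha."""
    s = 1.0 if bit else -1.0
    return [h + alpha * s * c for h, c in zip(host, carrier)]

def detect(received, carrier):
    """Correlate with the carrier; the sign of the sum recovers the bit."""
    return sum(r * c for r, c in zip(received, carrier)) > 0.0

carrier = pn_sequence(256)
rng = random.Random(1)
host = [rng.uniform(-1, 1) for _ in range(256)]     # stand-in for audio
print(detect(embed(host, True, carrier), carrier),
      detect(embed(host, False, carrier), carrier))
```

The correlation gain (alpha times the carrier length) dominates the host's own correlation with the carrier, which is why even a weak, inaudible embedding decodes reliably; the paper's informed embedding pushes this margin further for a fixed perceptual budget.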
A HMM-Embedded Unsupervised Learning to Musical Event Detection
Sheng Gao, Yongwei Zhu
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521428
In this paper, an HMM-embedded unsupervised learning approach is proposed to detect music events by grouping similar segments of the music signal. The approach clusters segments based on the similarity of their spectral as well as temporal structures, which is not easily achieved with traditional similarity measures. Together with a Bayesian information criterion, the approach obtains a suitable event set that regularizes the complexity of the model structure; the natural product is a set of music events modeled by HMMs. Our experimental analyses show that the detected musical events are more perceptually meaningful and more consistent than those from KL-distance based clustering, and the learned events match better with our experience in spectrogram reading. The approach is further evaluated on a music identification task: the identification error rate is reduced to 1.57%, a 56.3% relative error rate reduction compared with a system trained using the KL-distance clustering method.
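The role BIC plays in choosing the event-set size can be sketched on a much smaller problem: score each candidate cluster count by model likelihood minus a complexity penalty and keep the best. The paper applies this to HMM-modeled segments; here a toy 1-D k-means with a spherical-Gaussian likelihood stands in, and the data and parameter counting are illustrative assumptions.

```python
import math, statistics

def kmeans_1d(data, k, iters=20):
    """Tiny 1-D k-means; initial centers spread over the sorted data."""
    centers = sorted(data)[::max(1, len(data) // k)][:k]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in data:
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            groups[nearest].append(x)
        centers = [statistics.mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

def bic(data, k):
    """Gaussian log-likelihood minus the usual (params/2)*log(N) penalty."""
    centers, groups = kmeans_1d(data, k)
    sse = sum((x - c) ** 2 for g, c in zip(groups, centers) for x in g)
    var = max(sse / len(data), 1e-9)
    loglik = -0.5 * len(data) * (math.log(2 * math.pi * var) + 1)
    n_params = 2 * k                  # assumed: one mean + one variance term each
    return loglik - 0.5 * n_params * math.log(len(data))

data = [0.1, 0.2, 0.15, 5.0, 5.1, 4.9, 9.8, 10.0, 10.2]   # three obvious groups
best_k = max((1, 2, 3, 4), key=lambda k: bic(data, k))
print(best_k)
```

Splitting beyond the true structure keeps raising the likelihood only marginally while the penalty grows, so the criterion settles on the natural number of groups, which is exactly how it regularizes the event-set size in the paper.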
Speech-adaptive layered G.729 coder for loss concealments of real-time voice over IP
B. Sat, B. Wah
Pub Date: 2005-07-06 | DOI: 10.1109/ICME.2005.1521637
In this paper, we propose a speech-adaptive layered-coding (LC) scheme for concealing losses of real-time CELP-coded speech transmitted over IP networks. Based on the ITU G.729 CS-ACELP codec operating at 8 kbps, we design a loss-robust speech-adaptive codec at the same bit rate. Our scheme employs LC with redundant packetization in order to conceal losses and adapt to dynamic loss conditions, characterized by the loss rate and the degree of burstiness, while maintaining an acceptable end-to-end delay. By protecting only the most important excitation parameters of each frame according to its speech type, our approach makes more efficient use of the bit budget. The scheme delivers good-quality speech with a level of protection similar to full replication under medium loss rates, provides speech quality similar to standard G.729 under very low loss rates, and outperforms both at low-to-medium loss rates.
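The redundant-packetization idea can be sketched independently of the codec: each packet piggybacks a coarse (base-layer) copy of the previous frame, so an isolated loss is concealed from the next packet. Frame payloads are just labels here; nothing of the G.729 parameter selection or the speech-type adaptation is modeled.

```python
def packetize(frames):
    """Packet i = (full frame i, redundant base-layer copy of frame i-1)."""
    return [(f, frames[i - 1] + "-base" if i else None)
            for i, f in enumerate(frames)]

def receive(packets, lost):
    """Rebuild the stream, concealing lost packets from piggybacked copies."""
    out = []
    for i, (full, _) in enumerate(packets):
        if i not in lost:
            out.append(full)
        elif i + 1 < len(packets) and i + 1 not in lost:
            out.append(packets[i + 1][1])       # coarse, but better than a gap
        else:
            out.append(None)                    # burst loss: frame truly dropped
    return out

pkts = packetize(["f0", "f1", "f2", "f3"])
print(receive(pkts, lost={1}))
```

An isolated loss degrades one frame to its base layer, while back-to-back losses still drop a frame, which is why the scheme adapts the amount of redundancy to the measured loss rate and degree of burstiness.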