2005 IEEE International Conference on Multimedia and Expo: Latest Publications

KPYR: An Efficient Indexing Method

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521704

T. Urruty, F. Belkouch, C. Djeraba

Motivated by the need for efficient indexing structures suited to real video database applications, we present a new indexing structure named Kpyr. Kpyr uses a clustering algorithm to partition the data space into subspaces and applies the Pyramid technique (S. Berchtold et al., 1998) to each of them. This reduces the search space touched by a query and improves performance. Experiments show that the approach performs well for both K-nearest-neighbor and window queries.
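
The Pyramid technique mentioned above maps a d-dimensional point to a single number that can be stored in a one-dimensional index. The following is a minimal sketch of that mapping as it is usually described, assuming the point is already normalized to the unit hypercube; the function name `pyramid_value` is illustrative, and how Kpyr normalizes each cluster's subspace is not stated in the abstract.

```python
from typing import Sequence


def pyramid_value(point: Sequence[float]) -> float:
    """Map a point in [0, 1]^d to a scalar pyramid value.

    The unit hypercube is split into 2*d pyramids that meet at the
    center (0.5, ..., 0.5). The integer part of the result is the
    pyramid number, the fractional part is the height inside it.
    """
    d = len(point)
    # Dimension in which the point deviates most from the center.
    j_max = max(range(d), key=lambda j: abs(0.5 - point[j]))
    # Pyramids 0..d-1 lie on the "low" side, d..2d-1 on the "high" side.
    pyramid = j_max if point[j_max] < 0.5 else j_max + d
    height = abs(0.5 - point[j_max])
    return pyramid + height


if __name__ == "__main__":
    print(pyramid_value([0.1, 0.5]))
    print(pyramid_value([0.5, 0.9]))
```

In a Kpyr-like design, one such mapping would be computed per cluster, so a query only needs to probe the clusters it overlaps.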

Citations: 10

A New Bit-Plane Entropy Coder for Scalable Image Coding

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521404

Rong Zhang, R. Yu, Qibin Sun, L. Wong

Compression ratio and computational complexity are two major factors in the success of an image coder. By exploring the Laplacian distribution of the wavelet coefficients, this paper proposes a new bit-plane entropy coder. Compared with the state-of-the-art JPEG2000 entropy coder (EBCOT), the proposed coder achieves 0.75% better lossless performance for a 5-level 5/3 wavelet decomposition at block size 64 × 64, and 2.56% better at block size 16 × 16. Experimental results also show average PSNR improvements of about 0.13 dB at 1 bpp and 0.25 dB at 2 bpp for lossy compression. The coding gain does not come from increased computational complexity; complexity is instead reduced by using a static arithmetic coder, which avoids a complicated adaptive procedure.
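
To make the term "bit plane" concrete, here is a small sketch of how integer coefficient magnitudes can be split into bit planes, most significant plane first. It illustrates only the decomposition step; the entropy coding itself is not shown, and the function name `bit_planes` is an assumption made for this example.

```python
from typing import List, Sequence


def bit_planes(coeffs: Sequence[int], num_planes: int) -> List[List[int]]:
    """Split coefficient magnitudes into bit planes, MSB plane first.

    Signs are ignored here; a real coder would transmit them separately.
    """
    magnitudes = [abs(c) for c in coeffs]
    return [
        [(m >> p) & 1 for m in magnitudes]
        for p in range(num_planes - 1, -1, -1)
    ]


if __name__ == "__main__":
    for plane in bit_planes([5, -3, 0, 6], num_planes=3):
        print(plane)
```

Sending the planes in this order is what makes the stream scalable: truncating it after any plane still gives a coarser approximation of every coefficient.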

Citations: 4

Comparison of shot boundary detectors

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521541

J. Nesvadba, Fabian Ernst, Jernej Perhavc, J. Benois-Pineau, L. Primaux

A video cut detector (CD), a member of the shot boundary detector (SBD) group, is an essential element of spatio-temporal audiovisual (AV) segmentation and of various video-processing technologies. Platform, processing and performance constraints have forced the development of various dedicated CDs. Future platforms allow the use of advanced CD algorithms with higher reliability. To enable an appropriate trade-off between reliability and required processing power, four CD algorithms were benchmarked on the basis of a generic, culture-diverse, multi-genre AV corpus. In terms of the complexity/performance trade-off, a field-difference-based CD proved to be optimal.
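
As a rough illustration of difference-based cut detection, the sketch below flags a cut wherever the mean absolute pixel difference between two consecutive pictures exceeds a threshold. This is a generic textbook form, not the benchmarked detector; the flat-list picture format, the function name `detect_cuts` and the threshold value are all assumptions.

```python
from typing import List, Sequence


def detect_cuts(pictures: Sequence[Sequence[int]], threshold: float) -> List[int]:
    """Return indices i where picture i differs strongly from picture i-1.

    Each picture is a flat sequence of luma samples (a frame or a field).
    """
    cuts = []
    for i in range(1, len(pictures)):
        prev, cur = pictures[i - 1], pictures[i]
        mean_abs_diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if mean_abs_diff > threshold:
            cuts.append(i)
    return cuts


if __name__ == "__main__":
    clip = [
        [10, 10, 10, 10],
        [11, 10, 9, 10],
        [200, 190, 210, 205],  # content changes abruptly here
        [201, 191, 209, 205],
    ]
    print(detect_cuts(clip, threshold=30.0))
```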

Citations: 21

DSP implementation of digital image stabilizer

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521546

Yu-Chun Peng, Meng-Ting Lu, Homer H. Chen

A digital image stabilization system compensates for the image movement caused by hand jiggle in an image sequence captured by a hand-held video camera. In this paper, a simplified stabilization algorithm based on our previous work is presented. The algorithm performs block-based motion estimation on 16 local 16 × 16 blocks and uses a median filter to estimate the global motion. It reduces complexity by confining the motion estimation to a small number of blocks of the image. This greatly facilitates the implementation of the algorithm on the BF561, a DSP processor from Analog Devices. Details of the DSP implementation are described.
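
The median step described above can be sketched in a few lines. The example assumes the local motion vectors have already been estimated and takes the median separately per component; whether the paper uses a component-wise median is not stated in the abstract, and `global_motion` is an illustrative name.

```python
from statistics import median
from typing import Sequence, Tuple


def global_motion(local_vectors: Sequence[Tuple[int, int]]) -> Tuple[float, float]:
    """Estimate global motion as the component-wise median of block vectors.

    The median discards outliers such as blocks covering a moving object.
    """
    xs = [dx for dx, _ in local_vectors]
    ys = [dy for _, dy in local_vectors]
    return (median(xs), median(ys))


if __name__ == "__main__":
    # Four blocks follow the camera shake, one follows a moving object.
    vectors = [(1, 2), (1, 2), (9, -7), (2, 2), (1, 3)]
    print(global_motion(vectors))
```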

Citations: 6

A computationally efficient 3D shape rejection algorithm

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521643

Yinpeng Chen, H. Sundaram

In this paper, we present an efficient 3D shape rejection algorithm for unlabeled 3D markers. The problem is important in domains such as rehabilitation and the performing arts. There are three key innovations in our approach: (a) a multi-resolution shape representation using Haar wavelets for unlabeled markers, (b) a multi-resolution shape metric, and (c) a shape rejection algorithm predicated on the simple idea that we do not need to compute the entire distance to conclude that two shapes are dissimilar. We tested the approach on a real-world pose classification problem with excellent results, achieving a classification accuracy of 98% with an order-of-magnitude improvement in computational complexity over a baseline shape matching algorithm.
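
The rejection idea in (c) can be illustrated with a partial-distance test: accumulate the squared distance term by term and stop as soon as it exceeds the threshold. This is a sketch of the general principle only, assuming the feature vectors are ordered from coarse to fine coefficients; the name `is_dissimilar` and the squared Euclidean metric are assumptions, not the paper's metric.

```python
from typing import Sequence, Tuple


def is_dissimilar(
    a: Sequence[float], b: Sequence[float], threshold: float
) -> Tuple[bool, int]:
    """Early-exit distance test.

    Returns (rejected, terms_used). Because every term is non-negative,
    the running sum can only grow, so exceeding the threshold early is
    enough to reject without computing the remaining terms.
    """
    total = 0.0
    used = 0
    for x, y in zip(a, b):
        total += (x - y) ** 2
        used += 1
        if total > threshold:
            return True, used
    return False, used


if __name__ == "__main__":
    print(is_dissimilar([10, 0, 0, 0], [0, 0, 0, 0], threshold=50.0))
    print(is_dissimilar([1, 1, 1, 1], [0, 0, 0, 0], threshold=50.0))
```

Placing the coarse coefficients first tends to trigger the exit sooner, since they usually carry most of the difference between two clearly distinct shapes.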

Citations: 7

Conditionally Positive Definite Kernels for SVM Based Image Recognition

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521373

S. Boughorbel, Jean-Philippe Tarel, N. Boujemaa

Kernel-based methods such as the support vector machine (SVM) have provided successful tools for solving many recognition problems. One of the reasons for this success is the use of kernels. Positive definiteness has to be checked for a kernel to be suitable for most of these methods. For SVM, for instance, the use of a positive definite kernel ensures that the optimization problem is convex and thus that the obtained solution is unique. An alternative class of kernels, called conditionally positive definite, has long been studied from a theoretical point of view but has drawn attention from the community only in the last decade. We propose a new kernel, named the log kernel, which seems particularly interesting for images. Moreover, we prove that this new kernel, like the power kernel, is conditionally positive definite. Finally, we show experimentally that using conditionally positive definite kernels allows us to outperform classical positive definite kernels.
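
The abstract does not give the kernel formulas. The sketch below uses forms commonly associated with these names, stated here as assumptions: a power kernel K(x, y) = -||x - y||^β and a log kernel K(x, y) = -log(1 + ||x - y||^β), with 0 < β ≤ 2. The function names and the default β are illustrative.

```python
import math
from typing import Sequence


def _distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def power_kernel(x: Sequence[float], y: Sequence[float], beta: float = 1.0) -> float:
    # Assumed form: K(x, y) = -||x - y||^beta
    return -(_distance(x, y) ** beta)


def log_kernel(x: Sequence[float], y: Sequence[float], beta: float = 1.0) -> float:
    # Assumed form: K(x, y) = -log(1 + ||x - y||^beta)
    return -math.log(1.0 + _distance(x, y) ** beta)


if __name__ == "__main__":
    print(power_kernel([0, 0], [3, 4]))
    print(log_kernel([0, 0], [3, 4]))
```

Both values are zero for identical inputs and decrease as the inputs move apart, so they behave as similarity scores even though neither kernel is positive definite in the classical sense.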

Citations: 83

An Online Video Composition System

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521467

Qiong Liu, Xiaojin Shi, Don Kimber, F. Zhao, Frank Raab

This paper presents an information-driven online video composition system. The composition work handled by the system includes dynamically setting multiple pan/tilt/zoom (PTZ) cameras to proper poses and selecting the best close-up view for passive viewers. The main idea of the composition system is to maximize the captured video information with a limited number of cameras. Unlike video composition based on heuristic rules, our video composition is formulated as a process of minimizing distortions between ideal signals (i.e., signals with infinite spatial-temporal resolution) and displayed signals. The formulation is consistent with many well-known empirical approaches widely used in previous systems and may provide analytical explanations for those approaches. Moreover, it provides a novel approach for studying video composition tasks systematically. The composition system allows each user to select a personal close-up view. It manages PTZ cameras and a video switcher based on both signal characteristics and users' view selections. Additionally, it can automate the video composition process based on past users' view selections when immediate selections are not available. We demonstrate the performance of this system with real meetings.

Citations: 9

A quarter pel full search block motion estimation architecture for H.264/AVC

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521448

C. A. Rahman, Wael Badawy

This paper presents a novel quarter-pel full-search block motion estimation architecture for an H.264/AVC encoder. The proposed architecture is capable of calculating, in parallel, all 41 motion vectors required by the various block sizes supported by H.264/AVC. The architecture has been prototyped in Verilog HDL, then simulated and synthesized for a Xilinx Virtex2 FPGA. The experimental results show that the architecture can process CIF frame sequences in real time with 5 reference frames and a search range of -3.75 to +4.00 at a clock speed of 120 MHz. The maximum speed of the architecture is around 150 MHz.
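
The figure of 41 follows from the seven H.264/AVC partition sizes of a 16 × 16 macroblock: 1 + 2 + 2 + 4 + 8 + 8 + 16. The sketch below checks that count and shows one common way to obtain the cost of every partition by summing the sixteen 4 × 4 SAD values. Reusing 4 × 4 SADs is an assumption about how such architectures are typically built, not a description of this one, and `partition_sads` is an illustrative name.

```python
from typing import List, Sequence

# The seven partition sizes (width, height) of a 16x16 macroblock.
PARTITION_SIZES = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def partition_sads(sad4: Sequence[Sequence[int]], w: int, h: int) -> List[int]:
    """Sum 4x4 SADs into SADs for all w x h partitions, in raster order.

    sad4 is a 4x4 grid holding the SAD of each 4x4 sub-block.
    """
    bw, bh = w // 4, h // 4  # partition size measured in 4x4 sub-blocks
    return [
        sum(sad4[rr][cc] for rr in range(r, r + bh) for cc in range(c, c + bw))
        for r in range(0, 4, bh)
        for c in range(0, 4, bw)
    ]


if __name__ == "__main__":
    ones = [[1] * 4 for _ in range(4)]
    total = sum(len(partition_sads(ones, w, h)) for w, h in PARTITION_SIZES)
    print(total)  # one motion vector per partition
```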

Citations: 24

An adaptive microphone array with local acoustic sensitivity

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521397

Jianfeng Chen, L. Shue, Hanwu Sun, K. Phua

In this paper, a microphone array with a 3-D focal zone is proposed. The array consists of one omni-directional and two uni-directional microphones. It is constructed so that a cross zone is formed: only sound within this zone is captured, and any interference from outside the zone is effectively cancelled. The proposed framework is flexible in defining the location and size of the closed volume in which the sound source of interest is located. Simulations have been carried out to demonstrate the 3-D spatial selectivity as well as the noise cancellation performance. The most important difference from previous work is that the super volumetric selectivity is achieved by strategically using only three microphones, so that the overall apparatus acts as a virtual wireless close-talking microphone whose position is constrained in both distance and direction.

Citations: 5

Dynamic language model adaptation using latent topical information and automatic transcripts

2005 IEEE International Conference on Multimedia and Expo

Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521369

Berlin Chen

This paper considers dynamic language model adaptation for Mandarin broadcast news recognition. Both contemporary newswire texts and in-domain automatic transcripts were exploited for language model adaptation. A topical mixture model was presented to dynamically explore long-span latent topical information for language model adaptation. Its underlying characteristics and different kinds of model structures were extensively investigated, and their performance was analyzed and verified by comparison with conventional MAP-based adaptation approaches, which are devoted to extracting short-span n-gram information. The fusion of global topical and local contextual information was investigated as well. The speech recognition experiments were conducted on broadcast news collected in Taiwan. Initial results showed very promising reductions in both perplexity and character error rate.

Citations: 3
