
Latest Publications from the 2005 IEEE International Conference on Multimedia and Expo

Mediated Meeting Interaction for Teleconferencing
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521701
Kazumasa Murai, Don Kimber, J. Foote, Qiong Liu, John Doherty
A common problem with teleconferences is awkward turn-taking, particularly 'collisions,' whereby multiple parties inadvertently speak over each other due to communication delays. We propose a model for teleconference discussions that includes the effects of delays, and describe tools that can improve the quality of those interactions. We describe an interface that gently provides latency awareness and gives advance notice of 'incoming speech' to help participants avoid collisions. This is possible when codec latencies are significant, or when a low-bandwidth side channel or out-of-band signaling is available with lower latency than the primary video channel. We report on results of simulations, and of experiments carried out with transpacific meetings, that demonstrate these tools can improve the quality of teleconference discussions.
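The effect of delay on turn-taking collisions can be illustrated with a toy simulation (a sketch under assumed exponential silence gaps, not the paper's actual model): two parties each wait a random gap before speaking, and a collision occurs when one party starts speaking before the other party's speech onset has propagated across the one-way delay.

```python
import random

def collision_probability(one_way_delay, gap_mean, trials=10000, seed=0):
    """Estimate how often two remote parties start speaking over each other.

    A collision occurs when party B starts speaking before A's speech
    onset has arrived at B (i.e. the two onsets fall within the one-way
    delay window). Silence gaps before each party speaks are drawn from
    an exponential distribution with mean `gap_mean` seconds.
    """
    rng = random.Random(seed)
    collisions = 0
    for _ in range(trials):
        start_a = rng.expovariate(1.0 / gap_mean)
        start_b = rng.expovariate(1.0 / gap_mean)
        # Each party only hears the other after the one-way delay.
        if abs(start_a - start_b) < one_way_delay:
            collisions += 1
    return collisions / trials

# Longer delays make collisions markedly more likely.
p_short = collision_probability(one_way_delay=0.1, gap_mean=1.0)
p_long = collision_probability(one_way_delay=0.5, gap_mean=1.0)
```

The simulation shows why latency awareness and early "incoming speech" notice help: shrinking the effective uncertainty window directly lowers the collision rate.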
Citations: 3
Real-Time and Distributed AV Content Analysis System for Consumer Electronics Networks
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521729
J. Nesvadba, P. Fonseca, A. Sinitsyn, F. D. Lange, Martijn Thijssen, P. Kaam, Hong Liu, Rien van Leeuwen, J. Lukkien, A. Korostelev, Jan Ypma, B. Kroon, H. Celik, A. Hanjalic, S. U. Naci, J. Benois-Pineau, P. D. With, Jungong Han
The ever-increasing complexity of generic multimedia-content-analysis (MCA) solutions, their demanding processing requirements, and the need to prototype and assess solutions quickly and cost-effectively motivated the development of the Cassandra framework. The combination of state-of-the-art network and grid-computing solutions with recently standardized interfaces facilitated the set-up of this framework, forming the basis for multiple cross-domain and cross-organizational collaborations. It enables distributed-computing scenario simulations, e.g. distributed content analysis (DCA) across consumer electronics (CE) in-home networks, as well as the rapid development and assessment of complex applications and system solutions based on multiple MCA algorithms. Furthermore, the framework's modular nature (logical MCA units are wrapped into so-called service units (SUs)) eases the split between system-architecture and algorithm-related work, and additionally facilitates the reusability, extensibility, and upgradeability of those SUs.
Citations: 17
Streaming layered encoded video using peers
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521585
Yanming Shen, Zhengye Liu, S. Panwar, K. Ross, Yao Wang
Peer-to-peer video streaming has emerged as an important means to transport stored video. The peers are less costly and more scalable than an infrastructure-based video streaming network which deploys a dedicated set of servers to store and distribute videos to clients. In this paper, we investigate streaming layered encoded video using peers. Each video is encoded into hierarchical layers which are stored on different peers. The system serves a client request by streaming multiple layers of the requested video from separate peers. The system provides unequal error protection for different layers by varying the number of copies stored for each layer according to its importance. We evaluate the performance of our proposed system with different copy number allocation schemes through extensive simulations. Finally, we compare the performance of layered coding with multiple description coding.
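The unequal-error-protection idea (store more copies of more important layers) can be sketched as a simple proportional allocation. The rule below is illustrative only, not one of the paper's evaluated copy-number allocation schemes.

```python
def allocate_copies(layer_importance, total_copies):
    """Distribute a storage budget of `total_copies` across video layers
    in proportion to each layer's importance (the base layer matters
    most, enhancement layers progressively less), guaranteeing at least
    one copy per layer. Hypothetical allocation rule for illustration.
    """
    total_importance = sum(layer_importance)
    # Start with one copy per layer, then distribute the remainder.
    copies = [1] * len(layer_importance)
    remaining = total_copies - len(layer_importance)
    quotas = [w / total_importance * remaining for w in layer_importance]
    copies = [c + int(q) for c, q in zip(copies, quotas)]
    # Hand leftover copies to layers with the largest fractional quotas.
    leftover = total_copies - sum(copies)
    order = sorted(range(len(quotas)),
                   key=lambda i: quotas[i] - int(quotas[i]), reverse=True)
    for i in order[:leftover]:
        copies[i] += 1
    return copies

# Base layer (importance 4) receives the most replicas among peers.
allocation = allocate_copies([4, 2, 1], total_copies=14)
print(allocation)  # [7, 4, 3]
```

A client can then reconstruct at least the base layer with high probability even when some peers are offline, which is the point of skewing replication toward the important layers.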
Citations: 47
Night Scene Live – A Multimedia Application for Mobile Revellers on the Basis of a Hybrid Network, Using DVB-H and IP Datacast
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521734
J. Baldzer, S. Thieme, Susanne CJ Boll, Hans-Jürgen Appelrath, Niels Rosenhager
The combination of the emerging digital video broadcasting-handheld (DVB-H) standard with cellular communication such as UMTS produces a hybrid network with enormous potential for mobile multimedia applications. In order to optimize the performance of hybrid networks, the characteristics of the different individual networks have to be considered. Our prototypical hybrid network infrastructure employs smart access management for optimal usage of both the broadcast and the point-to-point network. Our demonstrator, "Night Scene Live", a multimedia event portal, is an excellent example of an application exploiting the potential of future hybrid networks.
Citations: 13
A Player-Possession Acquisition System for Broadcast Soccer Video
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521475
Xinguo Yu, Tze Sen Hay, Xin Yan, Chng Eng Siong
A semi-automatic system is developed to acquire player possession for broadcast soccer video, with the objective of minimizing manual work. This research is important because acquiring player possession purely by hand is very time-consuming. For completeness, this system integrates a ball detection-and-tracking algorithm, a view classification algorithm, and a play/break analysis algorithm. First, it produces the ball locations, the play/break structure, and the view classes of frames. Then it finds the touching points based on ball locations and player detection. Next, it estimates the touching place in the field for each touching point based on the view class of the touching frame. Last, for each touching point it acquires the touching-player candidates based on the touching place and the roles of the players. The system provides graphical user interfaces to verify touching points and finalize the touching player for each touching point. Experimental results show that the proposed system obtains good results in touching-point detection and touching-player candidate inference, saving a great deal of time compared with the purely manual approach.
Citations: 20
Audio-visual affect recognition in activation-evaluation space
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521551
Zhihong Zeng, ZhenQiu Zhang, Brian Pianfetti, J. Tu, Thomas S. Huang
The ability of a computer to detect and appropriately respond to changes in a user's affective state has significant implications for human-computer interaction (HCI). To more accurately simulate the human ability to assess affect through multi-sensory data, automatic affect recognition should also make use of multimodal data. In this paper, we present our efforts toward audio-visual affect recognition. Based on psychological research, we have chosen affect categories based on an activation-evaluation space, which is robust in capturing significant aspects of emotion. We apply the Fisher boosting learning algorithm, which can build a strong classifier by combining a small set of weak classification functions. Our experimental results show that with 30 Fisher features, the testing error rates of our bimodal affect recognition are about 16% on the evaluation axis and 13% on the activation axis.
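The core boosting idea the abstract relies on, combining weak classification functions into one strong classifier via a weighted vote, can be sketched as follows. The stumps and alpha weights here are illustrative stand-ins, not the paper's Fisher-feature learners.

```python
def boosted_predict(weak_classifiers, alphas, x):
    """Combine weak classification functions into a strong classifier by
    a weighted vote, the core mechanism behind boosting schemes such as
    the Fisher boosting used in the paper (weights and weak learners
    here are illustrative, not the paper's).

    Each weak classifier maps a feature vector to +1 / -1; each alpha
    weight reflects how reliable that weak learner is.
    """
    score = sum(a * h(x) for h, a in zip(weak_classifiers, alphas))
    return 1 if score >= 0 else -1

# Three weak decision stumps on a 2-feature vector.
stumps = [
    lambda x: 1 if x[0] > 0.5 else -1,
    lambda x: 1 if x[1] > 0.5 else -1,
    lambda x: 1 if x[0] + x[1] > 1.0 else -1,
]
alphas = [0.8, 0.4, 0.6]

# Votes are +1, -1, +1 with weights 0.8, 0.4, 0.6 -> weighted sum 1.0 -> +1.
print(boosted_predict(stumps, alphas, (0.9, 0.2)))  # 1
```

In training (not shown), each alpha would be set from the weak learner's error rate, so accurate stumps dominate the vote.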
Citations: 29
Expressive avatars in MPEG-4
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521544
M. Mancini, Bjoern Hartmann, C. Pelachaud, A. Raouzaiou, K. Karpouzis
Man-machine interaction (MMI) systems that utilize multimodal information about users' current emotional state are presently at the forefront of interest in the computer vision and artificial intelligence communities. A lifelike avatar can enhance interactive applications. In this paper, we present the implementation of GretaEngine and of synthesized expressions, including intermediate ones, based on the MPEG-4 standard and Whissel's emotion representation.
Citations: 7
Efficient Hardware Search Engine for Associative Content Retrieval of Long Queries in Huge Multimedia Databases
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521601
Christophe Layer, H. Pfleiderer
Due to the enormous increase in stored digital content, search and retrieval functionality is necessary in multimedia systems. Though processor speed for standard PCs (personal computers) is experiencing almost exponential growth, the memory subsystem, handicapped by lower frequencies and a physical I/O (input/output) limitation, remains the bottleneck of common computer architectures. As a result, many applications such as database management systems remain so dependent on memory throughput that increases in CPU (central processing unit) speed are no longer helpful. Because average bandwidth is crucial for system performance, our research has focused especially on techniques for efficient storage and retrieval of multimedia data. This paper presents the realization of a hardware database search engine based on an associative access method for textual information retrieval. It reveals the internal architecture of the system and compares the results of our hardware prototype with the software solution.
Citations: 2
Compression transparent low-level description of audio signals
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521450
J. Lukasiak, Chris McElroy, E. Cheng
A new low-level audio descriptor that represents the psychoacoustic noise floor shape of an audio frame is proposed. The results presented indicate that the proposed descriptor is far more resilient to compression noise than any of the MPEG-7 low-level audio descriptors. In fact, across a wide range of files, the proposed scheme fails to uniquely identify only five frames in every ten thousand on average. In addition, the proposed descriptor maintains high resilience to compression noise even when decimated to use only one quarter of the values per frame to represent the noise floor. This characteristic indicates that the proposed descriptor provides a truly scalable mechanism for transparently describing the characteristics of an audio frame.
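The shape of such a per-frame descriptor, and the quarter-resolution decimation the abstract mentions, can be sketched roughly as below. The band splitting and the per-band minimum used as a "floor" estimate are deliberate simplifications, not the paper's psychoacoustic model.

```python
import math

def noise_floor_descriptor(frame, n_bands=32, keep_every=4):
    """Sketch of a noise-floor-shaped frame descriptor: compute the
    magnitude spectrum of an audio frame, take the per-band minimum as a
    crude noise-floor shape, then decimate so that only one value in
    `keep_every` is kept (the paper reports the descriptor stays robust
    even at quarter resolution). Simplified for illustration.
    """
    n = len(frame)
    # Magnitude spectrum via a naive DFT (stdlib only, O(n^2)).
    mags = []
    for k in range(n // 2):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
    # Crude floor shape: minimum magnitude within each band.
    band = max(1, len(mags) // n_bands)
    floor = [min(mags[i:i + band]) for i in range(0, len(mags), band)]
    return floor[::keep_every]

# A 256-sample sine frame: 128 bins -> 32 bands -> 8 values after decimation.
frame = [math.sin(2 * math.pi * 5 * t / 256) for t in range(256)]
desc = noise_floor_descriptor(frame)
print(len(desc))  # 8
```

Because the floor shape varies slowly across bands, heavy decimation loses little discriminative information, which is the intuition behind the abstract's scalability claim.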
Citations: 3
Scalable temporal interest points for abstraction and classification of video events
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521512
Seung-Hoon Han, In-So Kweon
The image sequence of a static scene includes similar or redundant information over time. Hence, motion-discontinuous instants can efficiently characterize a video shot or event. However, such instants (key frames) are identified differently according to changes in the velocity and acceleration of motion, and the scale of such changes may differ across sequences of the same event. In this paper, we present a scalable video abstraction in which the key frames are obtained from the maximum curvature of camera motion at each temporal scale. Scalability here means handling changes in the velocity and acceleration of motion. In the temporal neighborhood determined by the scale, the scene features (motion, color, and edge) can be used to index and classify the video events. Therefore, these key frames provide temporal interest points (TIPs) for the abstraction and classification of video events.
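The maximum-curvature criterion can be illustrated at a single temporal scale with discrete curvature on a 2D camera-motion trajectory (a simplification; the paper varies the temporal scale as well).

```python
def temporal_interest_points(trajectory):
    """Pick key-frame candidates as local maxima of the discrete
    curvature of a 2D camera-motion trajectory, a single-scale sketch of
    the paper's maximum-curvature criterion.
    """
    def curvature(p0, p1, p2):
        # kappa = |x'y'' - y'x''| / (x'^2 + y'^2)^(3/2), finite differences.
        dx, dy = (p2[0] - p0[0]) / 2.0, (p2[1] - p0[1]) / 2.0
        ddx = p2[0] - 2 * p1[0] + p0[0]
        ddy = p2[1] - 2 * p1[1] + p0[1]
        denom = (dx * dx + dy * dy) ** 1.5
        return abs(dx * ddy - dy * ddx) / denom if denom else 0.0

    curv = [0.0] + [curvature(trajectory[i - 1], trajectory[i], trajectory[i + 1])
                    for i in range(1, len(trajectory) - 1)] + [0.0]
    # Local maxima of curvature mark motion discontinuities (TIP frames).
    return [i for i in range(1, len(curv) - 1)
            if curv[i] > curv[i - 1] and curv[i] >= curv[i + 1]]

# Straight pan, sharp turn at frame 3, straight pan again: frame 3 is the TIP.
path = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
print(temporal_interest_points(path))  # [3]
```

Smoothing the trajectory before computing curvature (not shown) would correspond to choosing a coarser temporal scale, suppressing small motion fluctuations in favor of major direction changes.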
Citations: 9