Cortina: a system for large-scale, content-based web image retrieval
Till Quack, U. Mönich, L. Thiele, B. S. Manjunath. DOI: 10.1145/1027527.1027650

Recent advances in the processing and networking capabilities of computers have led to an accumulation of immense amounts of multimedia data such as images. One of the largest repositories for such data is the World Wide Web (WWW). We present Cortina, a large-scale image retrieval system for the WWW that handles over 3 million images to date. The system retrieves images based on visual features and collateral text. We show that a search process consisting of an initial query-by-keyword or query-by-image followed by relevance feedback on the visual appearance of the results is feasible for large-scale data sets, and that it is superior to the pure text retrieval commonly used in large-scale systems. Semantic relationships in the data are explored and exploited by data mining, and multiple feature spaces are included in the search process.
Generating 3D views of facial expressions from frontal face video based on topographic analysis
L. Yin, K. Weiss. DOI: 10.1145/1027527.1027611

In this paper, we report a newly developed 3D face modeling system that produces arbitrary expressions at a high level of detail using topographic analysis and a mesh instantiation process. Given a sequence of images of facial expressions at frontal views, we automatically generate 3D expressions at arbitrary views. Our face modeling system consists of two major components: facial surface representation using topographic analysis, and generic model individualization based on labeled surface features and surface curvatures. The realism of the generated individual model is demonstrated through 3D views of facial expressions in videos. This work targets the accurate modeling of faces and facial expressions for human-computer interaction and 3D face recognition.
Detecting image near-duplicate by stochastic attributed relational graph matching with learning
Dong-Qing Zhang, Shih-Fu Chang. DOI: 10.1145/1027527.1027730

Detecting Image Near-Duplicates (INDs) is an important problem in a variety of applications, such as copyright infringement detection and multimedia linking. Traditional image similarity models often fail to identify INDs because they cannot capture scene composition and semantics. We present a part-based image similarity measure derived from stochastic matching of Attributed Relational Graphs that represent the compositional parts and part relations of image scenes. Such a similarity model is fundamentally different from traditional approaches using low-level features or image alignment. Its advantage is the ability to accommodate spatial attributed relations and to support supervised and unsupervised learning from training data. Our experiments compare the presented model with several prior similarity models, such as color histograms and local edge descriptors; the presented model outperforms these prior approaches by a large margin.
Towards an integrated multimedia service hosting overlay
Dongyan Xu, Xuxian Jiang. DOI: 10.1145/1027527.1027545

With the proliferation of multimedia data sources on the Internet, we envision an increasing demand for value-added and function-rich multimedia services that transport, process, and analyze multimedia data on behalf of end users. More importantly, multimedia services are expected to be easily accessible and composable by users. In this paper, we propose MSODA, a service-oriented platform that hosts a wide spectrum of media services provided by different parties. From the user's point of view, MSODA is a shared "market" for media service access and composition. For a media service provider, MSODA creates a virtual dedicated environment for service deployment and management. Finally, the underlying MSODA middleware performs the key functions of service composition, configuration, and mapping for users. We discuss key challenges in the design of MSODA and present preliminary results towards its full realization.
MobShare: controlled and immediate sharing of mobile images
R. Sarvas, Mikko Viikari, Juha Pesonen, H. Nevanlinna. DOI: 10.1145/1027527.1027690

In this paper we describe the design and implementation of MobShare, a mobile phone picture sharing system that enables immediate, controlled, and organized sharing of mobile pictures, as well as the browsing, combining, and discussion of the shared pictures. The design combines research on photography, personal image management, mobile phone camera use, and mobile picture publishing with an interview study we conducted on mobile phone camera users. The system is based on a client-server architecture and uses current mobile phone and web technology. The implementation presents novel solutions for immediately sharing mobile images to an organized web album and for providing full control over whom the images are shared with. We also describe new ways of promoting discussion around shared images and of enabling the combination and comparison of personal and shared pictures. The system demonstrates that the designed solutions can be implemented with current technology, and it provides novel approaches to general issues in sharing digital images.
Automatic pan control system for broadcasting ball games based on audience's face direction
Shinji Daigo, S. Ozawa. DOI: 10.1145/1027527.1027634

We propose an automatic pan control system for broadcasting ball games that tracks the face direction of the audience. Presuming that the audience's faces are directed toward a notable play, we can shoot a broadcast video by panning the camera toward the direction the audience is facing. In our method, a court sensor, which detects the rough region where players are located, is used in addition to the face sensor to obtain higher accuracy. Based on these sensors, the broadcast video is generated off-line by cylindrical mosaicing. We conducted objective and subjective evaluation experiments, and the results show that our approach outperforms previously proposed methods in terms of tracking stability and ease of viewing. We conclude that our method is as effective as the optimal camerawork.
Disruption-tolerant content-aware video streaming
Tiecheng Liu, Srihari Nelakuditi. DOI: 10.1145/1027527.1027627

Communication between a pair of nodes in a network may be disrupted by link or node failures, resulting in zero effective bandwidth between them during the recovery period. It has been observed that such disruptions are not uncommon and may last from tens of seconds to minutes. Even an occasional disruption can drastically degrade the viewing experience of a participant in a video streaming session, particularly when a sequence of frames central to the story is lost. The conventional approach of prefetching video frames and patching lost ones with retransmissions is not always viable when disruptions are localized and experienced by only a few among many receivers. Error-spreading approaches that distribute the losses across the video work well only when the disruptions are quite short. As a better alternative, we propose a disruption-tolerant, content-aware video streaming approach that combines content summarization and error spreading to enhance the viewer's experience even when disruptions are long. We introduce the notion of "substitutable content summary frames" and provide a method to select these frames, as well as their transmission order, to mitigate the impact of a disruption. In the event of a disruption, the already received summary frames are played by the client, and near-normal playback resumes after the disruption. We evaluate our approach and demonstrate that it provides an acceptable viewing experience with minimal startup latency and client buffering.
Possibilities and limitations of immersive free-hand expression: a case study with professional artists
Wille Mäkelä, M. Reunanen, T. Takala. DOI: 10.1145/1027527.1027649

We have studied the usability and artistic potential of an immersive 3D painting system in its early state. The system allows one to draw lines, meshes, and particle clouds using a one-handed wand in a virtual room with a stereoscopic display. In its more mature state, the software will allow two-handed interaction with new interaction devices. Ten professional artists each participated in a two-day test, performing both given tasks and free artistic sketching. Their experiences were collected through observation and interviews. At this stage, we found that common technical limitations of virtual environments, such as latency, tracking inaccuracy, and the clumsiness of the hardware devices, may considerably hinder handicraft work. On the other hand, every single participant felt that immersion offers new potential for artistic expression and was definitely willing to continue in the second phase of the test later this year.
Picture quality improvement in MPEG-4 video coding using simple adaptive filter
Kee-Koo Kwon, Sung-Ho Im, Dong-Sun Lim. DOI: 10.1145/1027527.1027592

In this paper, we propose a novel post-filtering algorithm with low computational complexity that improves the visual quality of decoded images using block boundary classification and a simple adaptive filter (SAF). First, each side of a block boundary is classified as a smooth or complex sub-region. For smooth-smooth sub-regions, the existence of blocking artifacts is determined using a blocky-strength measure. Simple adaptive filtering is then applied at each block boundary. The proposed method filters adaptively: a nonlinear 1-D 8-tap filter is applied to smooth-smooth sub-regions with blocking artifacts; for smooth-complex or complex-smooth sub-regions, a nonlinear 1-D variant filter is applied to the block boundary pixels to reduce blocking and ringing artifacts; and for complex-complex sub-regions, a nonlinear 1-D 2-tap filter is applied only to the two block boundary pixels so as to preserve image details. Experimental results show that the proposed algorithm produces better results than conventional algorithms from both subjective and objective viewpoints.
Networked multimedia event exploration
Preetha Appan, H. Sundaram. DOI: 10.1145/1027527.1027536

This paper describes a novel, interactive multimodal framework that enables a network of friends to effectively visualize and browse a shared image collection. The framework is particularly useful for geographically separated friends who want to share experiences. Our solution involves three components: (a) an event model, (b) three new spatio-temporal event exploration schemes, and (c) a novel technique for summarizing user interaction. We develop a simple multimedia event model that additionally incorporates the idea of user viewpoints, along with new dissimilarity measures between events that incorporate user context. We develop three task-driven event exploration environments: (a) spatio-temporal evolution, (b) event cones, and (c) viewpoint-centric interaction. An original contribution of this paper is summarizing the user interaction within an interactive framework. We conjecture that an interactive summary helps users recall the original content better than a static image-based summary. Our user studies indicate that the exploratory environment performs very well.