{"title":"Efficient and Language Independent News Story Segmentation for Telecast News Videos","authors":"Anubha Jindal, Aditya Tiwari, Hiranmay Ghosh","doi":"10.1109/ISM.2011.81","DOIUrl":"https://doi.org/10.1109/ISM.2011.81","url":null,"abstract":"A TV news program comprises a continuous video stream containing a number of news stories, interspersed with commercials and headlines. This paper presents a method to detect story boundaries and to separate the stories from the other components and from each other. The method is based on the movement of ticker-text bands and the repetition of ticker texts during different parts of a news program. It does not use any language-processing tool and is independent of the language of the telecast. It uses a few simple features to distinguish news from advertisements and can be used for large-scale news indexing. We present test results on channels telecasting in English and a few Indian languages.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"480 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124409416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Chaotic Maps for Encrypting Image and Video Content","authors":"A. Pande, P. Mohapatra, Joseph Zambreno","doi":"10.1109/ISM.2011.35","DOIUrl":"https://doi.org/10.1109/ISM.2011.35","url":null,"abstract":"Arithmetic Coding (AC) is widely used for the entropy coding of text and multimedia data. It involves recursive partitioning of the range [0,1) in accordance with the relative probabilities of occurrence of the input symbols. In this paper, we present a data (image or video) encryption scheme based on arithmetic coding, which we refer to as Chaotic Arithmetic Coding (CAC). In CAC, a large number of chaotic maps can be used to perform coding, each achieving Shannon-optimal compression performance. The exact choice of map is governed by a key. CAC has the effect of scrambling the intervals without changing the width of the interval in which the codeword must lie, thereby allowing encryption without sacrificing any coding efficiency. We next describe Binary CAC (BCAC) with some simple Security Enhancement (SE) modes which strengthen the scheme against known cryptanalytic attacks on AC-based encryption techniques. These modes, namely Plaintext Modulation (PM), Pair-Wise Independent Keys (PWIK), and Key and ciphertext Mixing (MIX), incur insignificant computational overhead, while the BCAC decoder has lower hardware requirements than the BAC coder itself, making BCAC with SE modes an excellent choice for deployment in secure embedded multimedia systems. A bit sensitivity analysis for key and plaintext is presented along with experimental tests of compression performance.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128525429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
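The interval-scrambling idea in the CAC abstract above can be sketched in a few lines: at each coding step a key bit decides the *order* of the two sub-intervals, which moves the codeword without changing the final interval width, so compression is untouched. This is a minimal illustration of the principle; the function name, the key format, and the probabilities are ours, not the paper's API.

```python
def encode(bits, p0, key):
    """Binary arithmetic encoding with key-driven sub-interval swapping.

    bits -- list of 0/1 symbols to encode
    p0   -- probability of symbol 0
    key  -- list of 0/1 values; 1 swaps the sub-interval order at that step
    Returns (codeword, final_interval_width).
    """
    low, width = 0.0, 1.0
    for b, k in zip(bits, key):
        w0 = width * p0                     # width reserved for symbol 0
        if k == 0:                          # normal order: [0-part][1-part]
            lo0, lo1 = low, low + w0
        else:                               # swapped order: [1-part][0-part]
            lo1, lo0 = low, low + (width - w0)
        low, width = (lo0, w0) if b == 0 else (lo1, width - w0)
    return low + width / 2, width           # any point inside the interval works

# Same plaintext, two different keys: identical width (same code length),
# different codeword position.
bits = [0, 1, 1, 0]
m_plain, w_plain = encode(bits, 0.6, [0, 0, 0, 0])
m_key, w_key = encode(bits, 0.6, [1, 0, 1, 1])
```

The width update sequence (`w0` or `width - w0`) is key-independent, which is exactly why encryption costs no coding efficiency.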
{"title":"A Novel Hierarchical Model-Based Frame Rate Up-Conversion via Spatio-temporal Conditional Random Fields","authors":"M. Shafiee, Z. Azimifar, A. Wong, P. Fieguth","doi":"10.1109/ISM.2011.44","DOIUrl":"https://doi.org/10.1109/ISM.2011.44","url":null,"abstract":"In this paper, a hierarchical model-based approach to frame rate up-conversion is presented. Given a sequence of consecutive video frames, a Spatio-Temporal Conditional Random Field (ST-CRF) is trained to capture both the motion and shape characteristics of objects within consecutive frames. A hierarchical tree is then constructed via hierarchical segmentation that sub-divides frames into regions based on color intensity and regional velocity. A hierarchical sampling approach is then introduced to construct new intermediate frames between adjacent video frames, where estimated intermediate frames are constructed at each level of the hierarchical tree such that the probability of the ST-CRF is maximized. Preliminary results using videos with different motion characteristics show that the proposed approach has potential for producing intermediate frames with high visual quality.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130745983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exemplar-based Age Progression Prediction in Children Faces","authors":"C. Shen, Wan-Hua Lu, S. Shih, H. Liao","doi":"10.1109/ISM.2011.28","DOIUrl":"https://doi.org/10.1109/ISM.2011.28","url":null,"abstract":"This work aims to develop a system for predicting age progression in children's faces. Such prediction is critical for assisting the search for missing children. An integral module comprising feature extraction, distance measurement, and face synthesis is devised in this paper to predict faces at different ages. In the proposed method, a curvature-weighted plus bending-energy distance is employed for selecting similar facial components from an aging database. The growth curve of each facial component is used to predict the shape, size, and location of that component at a different age. The thin-plate spline method is employed to synthesize a 3-D face model from the predicted components by minimizing the bending energy. Experiments are conducted to test the proposed method with various subjects, and the results show that the proposed method is very promising.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127850569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
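The growth-curve step in the abstract above amounts to reading a component's size off a per-component age curve. A minimal linear-interpolation sketch (the curve samples and the helper name are invented for illustration; the paper does not specify this exact form):

```python
def predict_size(growth_curve, age):
    """Linearly interpolate a facial component's size at the given age.

    growth_curve -- list of (age, size) samples, any order
    """
    pts = sorted(growth_curve)
    for (a0, s0), (a1, s1) in zip(pts, pts[1:]):
        if a0 <= age <= a1:
            t = (age - a0) / (a1 - a0)      # position within the segment
            return s0 + t * (s1 - s0)
    raise ValueError("age outside the curve's range")

# Toy growth curve for one component: (age in years, width in cm).
nose_curve = [(4, 2.0), (8, 2.6), (12, 3.0)]
```

The same lookup, done per component, yields the predicted shape, size, and location that the synthesis stage then fuses into a face model.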
{"title":"Blood Cell Image Classification Based on Hierarchical SVM","authors":"W. Tai, Rouh-Mei Hu, H. Hsiao, Rong-Ming Chen, J. Tsai","doi":"10.1109/ISM.2011.29","DOIUrl":"https://doi.org/10.1109/ISM.2011.29","url":null,"abstract":"The problem of identifying and counting blood cells within the blood smear is of both theoretical and practical interest. The differential counting of blood cells provides invaluable information to pathologists for the diagnosis and treatment of many diseases. In this paper we propose an efficient hierarchical blood cell image identification and classification method based on a multi-class support vector machine. In this automated process, segmentation and classification of blood cells are the most important stages. We segment the stained blood cells in digital microscopic images and extract geometric features for each segment to identify and classify the different types of blood cells. The experimental results are compared with the manual results obtained by a pathologist and demonstrate the effectiveness of the proposed method.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126447193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
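The "hierarchical" part of the method above is a decision cascade: a first-level classifier splits coarse cell classes, and lower levels refine them. The sketch below shows only that control flow; the paper trains an SVM at each node, whereas here each node is a stand-in threshold rule on made-up geometric feature values (area, circularity), purely to illustrate the structure.

```python
def classify(area, circularity):
    """Two-level hierarchical decision over geometric features.

    All threshold values are illustrative, not from the paper.
    """
    # Level 1: coarse split -- white blood cells are markedly larger
    # than red blood cells in a stained smear.
    if area < 60.0:
        return "red blood cell"
    # Level 2: refine the white-cell class by shape regularity.
    if circularity > 0.9:
        return "lymphocyte"
    return "neutrophil"

# (area, circularity) per segmented cell, toy values.
cells = [(45.0, 0.95), (80.0, 0.95), (120.0, 0.7)]
labels = [classify(a, c) for a, c in cells]
```

Replacing each `if` with a trained binary SVM gives the hierarchical-SVM variant: errors at level 1 are cheap to avoid because the coarse classes are well separated, and each level-2 classifier only has to discriminate within one branch.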
{"title":"Automatic Lecture Video Indexing Using Video OCR Technology","authors":"Haojin Yang, Maria Siebert, Patrick Lühne, Harald Sack, C. Meinel","doi":"10.1109/ISM.2011.26","DOIUrl":"https://doi.org/10.1109/ISM.2011.26","url":null,"abstract":"In recent years, digital lecture libraries and lecture video portals have become more and more popular. However, finding efficient methods for indexing multimedia remains a challenging task. Since the text displayed in a lecture video is closely related to the lecture content, it provides a valuable source for indexing and retrieving lecture contents. In this paper, we present an approach for automatic lecture video indexing based on video OCR technology. We have developed a novel video segmenter for automated slide video structure analysis and a weighted DCT (discrete cosine transform) based text detector. A dynamic image contrast/brightness adaptation serves to enhance the text image quality and make it processable by common OCR software. Time-based text occurrence information as well as the analyzed text content are further used for indexing. We demonstrate the accuracy of the proposed approach by evaluation.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122960903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
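The DCT-based text detection mentioned above exploits the fact that text regions carry strong high-frequency energy. A minimal blockwise sketch: compute the 2-D DCT of an 8x8 block and score it by a weighted sum of its AC coefficients. The weighting by frequency index `u + v` is an illustrative choice, not the paper's scheme, and the naive O(N^4) DCT is for clarity only.

```python
import math

N = 8  # standard 8x8 block size

def dct2(block):
    """Naive 2-D DCT-II of an NxN block (list of lists of floats)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

def text_score(block):
    """Weighted AC energy: high for text-like high-contrast blocks."""
    coeffs = dct2(block)
    return sum((u + v) * abs(coeffs[u][v])
               for u in range(N) for v in range(N) if u + v > 0)

flat = [[128.0] * N for _ in range(N)]                  # uniform background
stripes = [[255.0 if y % 2 else 0.0 for y in range(N)]  # text-like strokes
           for _ in range(N)]
```

A flat block concentrates all its energy in the DC term and scores near zero, while stroke-like vertical transitions light up the high-frequency coefficients.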
{"title":"Searching for Sub-images Using Sequence Alignment","authors":"T. Homoľa, Vlastislav Dohnal, P. Zezula","doi":"10.1109/ISM.2011.19","DOIUrl":"https://doi.org/10.1109/ISM.2011.19","url":null,"abstract":"The availability of various photo archives and photo-sharing systems has made similarity search much more important, because photos are usually not conveniently tagged and therefore need to be searched by their content. Moreover, it is important not only to compare images with a query holistically but also to locate images that contain the query as a part. The query can be a picture of a person, a building, or an abstract object, and the task is to retrieve images of the query object from a different perspective or images capturing a global scene containing the query object. This retrieval task is called sub-image search. In this paper, we propose an algorithm for retrieving database images by their similarity to and containment of a query. Its novelty lies in the application of a sequence alignment algorithm, which is commonly used in text retrieval. This forms an orthogonal solution to currently used approaches based on inverted files. The proposed algorithm is evaluated on a real-life data set of photographs in which images of logos are searched. It was compared to a state-of-the-art method, and an improvement of 20% in mean average precision was obtained.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123028979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
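The sub-image idea above can be sketched with classic Smith-Waterman local alignment over symbol sequences: if each image is serialized into a sequence of quantized feature "words", a query contained somewhere inside a database image shows up as a high-scoring local alignment. The scoring values and the one-character-per-word encoding below are generic alignment defaults for illustration, not the paper's parameters.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]   # DP matrix, clamped at 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            h[i][j] = max(0, diag, h[i - 1][j] + gap, h[i][j - 1] + gap)
            best = max(best, h[i][j])
    return best

query = "CADB"           # serialized query image (4 feature words)
db_hit = "XXCADBYY"      # database image containing the query region
db_miss = "QQQQQQQQ"     # unrelated database image

hit_score = smith_waterman(query, db_hit)
miss_score = smith_waterman(query, db_miss)
```

Because the alignment is local, unrelated surroundings (`XX…YY`) cost nothing, which is exactly the containment behavior that holistic comparison misses.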
{"title":"Sound Zone Control in an Interactive Table System Environment","authors":"Ryuji Yamaguchi, S. Sugihara, M. Hirakawa","doi":"10.1109/ISM.2011.40","DOIUrl":"https://doi.org/10.1109/ISM.2011.40","url":null,"abstract":"Tabletop interaction systems have been explored actively owing to their advantage of allowing participants to collaborate with each other, in many cases through multi-touch gestures over a single, shared display. However, less research has been done on localized auditory feedback. The authors have been investigating an interactive table system that is capable of presenting visual and auditory feedback as well as accepting gestural input. In particular, for auditory feedback, the system can position multiple sounds simultaneously by controlling the loudness of the 16 speakers mounted in the table. In this paper, we describe the design and experimental analysis of sound zone control, aiming to enhance the presence of sounds and the interaction among participants in the interactive table environment. User studies indicate that the sound zone can be broadened when a source signal is accompanied by delayed copies of itself, while the time intervals for exchanging sound positions over multiple speakers are not of primary importance in perception.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122437455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OpenTrack - Automated Camera Control for Lecture Recordings","authors":"Benjamin Wulff, Rüdiger Rolf","doi":"10.1109/ISM.2011.97","DOIUrl":"https://doi.org/10.1109/ISM.2011.97","url":null,"abstract":"In this paper we present the current state of our research project, which aims to develop an open-source framework for real-time scene analysis and automatic camera control in lecture recording scenarios. The system is designed to run in combination with the Opencast Matterhorn lecture capture system as well as stand-alone. A GPU-based scene segmentation technique using motion cues and background modeling has been implemented using OpenCL. Moving objects are tracked by their centroids and bounding boxes, and a dynamic appearance model is used to give persons a relative identity. Development has started on a scriptable virtual camera operator module that will drive PTZ cameras. Other applications of the scene analysis are possible.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122488011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
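The background-modeling step named in the OpenTrack abstract is commonly done as an exponential running average: keep a per-pixel background estimate and flag pixels that deviate from it as moving foreground. A minimal single-channel sketch (the project runs this on the GPU with OpenCL; the learning rate and threshold here are illustrative values, not the project's):

```python
def update_background(bg, frame, alpha=0.1):
    """Exponential running average: bg <- (1 - alpha) * bg + alpha * frame."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(bg, frame)]

def foreground_mask(bg, frame, thresh=30.0):
    """Pixels differing from the background by more than thresh are moving."""
    return [abs(f - b) > thresh for b, f in zip(bg, frame)]

# Toy 4-pixel grayscale "frames": an object enters pixels 1-2.
bg = [100.0, 100.0, 100.0, 100.0]
frame = [100.0, 240.0, 235.0, 100.0]
mask = foreground_mask(bg, frame)
bg = update_background(bg, frame)
```

Connected foreground pixels become the tracked blobs whose centroids and bounding boxes drive the virtual camera operator.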
{"title":"A Novel CBCD Approach Using MPEG-7 Motion Activity Descriptors","authors":"R. Roopalakshmi, G. R. M. Reddy","doi":"10.1109/ISM.2011.36","DOIUrl":"https://doi.org/10.1109/ISM.2011.36","url":null,"abstract":"Motion features contribute significant information about video content. This paper presents a novel CBCD (Content-Based Copy Detection) approach that incorporates several motion activity features. First, we extract both temporal and spatial motion features to describe the overall activity of a video sequence. Second, we combine these features in a feasible manner to generate robust video fingerprints. Third, clustering-based pruned search is utilized for similarity matching instead of a direct search of video fingerprints. The proposed system is tested on the TRECVID-2007 data set, and the results demonstrate its effectiveness against several transformations such as random noise, fast forward, pattern insertion, cropping, and picture-in-picture.","PeriodicalId":339410,"journal":{"name":"2011 IEEE International Symposium on Multimedia","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132816368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
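The clustering-based pruned search in the third step above can be sketched as follows: fingerprints are grouped around centroids offline, and a query is compared only against the members of the cluster whose centroid is nearest, instead of scanning the whole database. The 1-D toy fingerprints and centroid values below are ours for illustration; real fingerprints are high-dimensional vectors.

```python
def nearest(query, candidates):
    """Return the candidate closest to the query (absolute distance)."""
    return min(candidates, key=lambda c: abs(c - query))

# Offline stage: fingerprints already partitioned by nearest centroid.
clusters = {
    0.0: [0.1, 0.2],
    5.0: [4.8, 5.1, 5.3],
    10.0: [9.7, 10.2],
}

def pruned_search(query):
    centroid = nearest(query, clusters)        # step 1: pick closest cluster
    return nearest(query, clusters[centroid])  # step 2: search only inside it
```

With k roughly balanced clusters this replaces one scan of n fingerprints by roughly k + n/k comparisons, which is the point of pruning the search space.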