
Latest Publications: 2005 IEEE International Conference on Multimedia and Expo

Gender identification using frontal facial images
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521613
Amith K Jain, Jeffrey R. Huang, S. Fang
Computer vision and pattern recognition systems play an important role in our lives through automated face detection, face and gesture recognition, and estimation of gender and age. This paper addresses the problem of gender classification using frontal facial images. We have developed gender classifiers whose performance is superior to that of existing ones. We experiment on 500 images (250 female and 250 male) randomly drawn from the FERET facial database. Independent component analysis (ICA) is used to represent each image as a feature vector in a low-dimensional subspace, and different classifiers are studied in this space. Our experimental results show that our approach outperforms existing gender classifiers; we achieve 96% accuracy using a support vector machine (SVM) in ICA space.
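The pipeline the abstract describes, an ICA projection followed by an SVM, can be sketched with scikit-learn on synthetic stand-in data. The array sizes mirror the 500-image setup, but the data, the component count, and the kernel choice here are illustrative assumptions, not the paper's settings.

```python
# Sketch: project "images" into a low-dimensional ICA subspace, then classify with an SVM.
# The 500-image FERET setup is replaced by random vectors; everything here is illustrative.
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d = 500, 64                                   # 500 "images", 64 pixels each (toy size)
y = np.repeat([0, 1], n // 2)                    # 250 per class, as in the paper's setup
X = rng.normal(size=(n, d)) + y[:, None] * 0.8   # class-dependent shift added to noise

ica = FastICA(n_components=10, random_state=0, max_iter=1000)
Z = ica.fit_transform(X)                         # feature vectors in the ICA subspace

Z_tr, Z_te, y_tr, y_te = train_test_split(Z, y, test_size=0.2,
                                          random_state=0, stratify=y)
clf = SVC(kernel="rbf").fit(Z_tr, y_tr)
acc = clf.score(Z_te, y_te)
print(f"test accuracy: {acc:.2f}")
```

On this easily separable toy data the accuracy is near-perfect; the paper's 96% figure, of course, refers to real FERET images.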
Citations: 79
What happens in films?
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521357
A. Salway, Andrew Vassiliou, K. Ahmad
This paper aims to contribute to the analysis and description of semantic video content by investigating what actions are important in films. We apply a corpus analysis method to identify frequently occurring phrases in texts that describe films: screenplays and audio description. Frequent words, and statistically significant collocations of these words, are identified in the screenplays of 75 films and in the audio description of 45 films. Phrases such as 'looks at', 'turns to' and 'smiles at', and various collocations of 'door', were found to be common. We argue that these phrases occur frequently because they describe actions that are important story-telling elements of filmed narrative. We discuss how this knowledge helps the development of systems that structure semantic video content.
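The corpus step can be illustrated with a toy bigram count; the sentences below are invented stand-ins for screenplay and audio-description text, and real corpus work would additionally apply a statistical significance test to the collocations.

```python
# Toy version of the corpus step: count word bigrams in descriptive text and rank
# them by frequency, the way phrases like "looks at" surface. Documents are
# concatenated for simplicity, so a few bigrams cross document boundaries.
from collections import Counter

docs = [
    "he looks at her and smiles at the door",
    "she turns to the door and looks at him",
    "he looks at the open door",
]
tokens = " ".join(docs).split()
bigrams = Counter(zip(tokens, tokens[1:]))
top = bigrams.most_common(3)
print(top)   # ('looks', 'at') is the most frequent bigram, appearing 3 times
```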
Citations: 23
Feature Selection and Stacking for Robust Discrimination of Speech, Monophonic Singing, and Polyphonic Music
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521554
Björn Schuller, Brüning J. B. Schmitt, D. Arsic, S. Reiter, M. Lang, G. Rigoll
In this work we strive to find an optimal set of acoustic features for discriminating speech, monophonic singing, and polyphonic music, in order to robustly segment acoustic media streams for annotation and interaction purposes. We also introduce ensemble-based classification approaches for this task. From a basis of 276 attributes, we select the most efficient set by SVM-SFFS. In addition, the relevance of single features is assessed by calculating the information gain ratio. As a basis of comparison, we reduce dimensionality by PCA. We present an extensive analysis of different classifiers for the task, among them kernel machines, decision trees, and Bayesian classifiers. Moreover, we improve single-classifier performance by bagging and boosting, and finally combine the strengths of the classifiers by StackingC. The database consists of 2,114 samples of speech and singing from 58 persons; 1,000 music clips were taken from the MTV Europe Top 20, 1980-2000. The outstanding discrimination results of a working real-time implementation underline the practicality of the proposed ideas.
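The ensemble side of the approach can be sketched with scikit-learn's generic StackingClassifier standing in for the paper's StackingC. The data, the base learners' settings, and the sizes below are synthetic assumptions, not the 2,114-sample speech/singing corpus.

```python
# Sketch of stacking: combine a kernel machine, a decision tree, and a naive
# Bayes model via a meta-learner, evaluated by cross-validation on synthetic
# 3-class data (stand-in for speech / monophonic singing / polyphonic music).
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)
stack = StackingClassifier(
    estimators=[("svm", SVC(probability=True)),
                ("tree", DecisionTreeClassifier(random_state=0)),
                ("nb", GaussianNB())],
    final_estimator=LogisticRegression(max_iter=1000),
)
score = cross_val_score(stack, X, y, cv=3).mean()
print(f"mean CV accuracy: {score:.2f}")
```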
Citations: 21
Video quality classification based home video segmentation
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521399
Si Wu, Yu-Fei Ma, HongJiang Zhang
Home videos often contain abnormal camera motions, such as shaking and irregular movement, which degrade visual quality. Removing bad-quality segments and automatically stabilizing shaky ones are necessary steps for home video archiving. In this paper, we propose a novel segmentation algorithm for home video based on video quality classification. According to three important properties of motion, namely speed, direction, and acceleration, the effects caused by camera motion are classified into four categories, blurred, shaky, inconsistent, and stable, using support vector machines (SVMs). Based on this classification, a multi-scale sliding window is employed to parse the video sequence into segments along the time axis, and each segment is labeled with one of the camera-motion effects. The effectiveness of the proposed approach has been validated by extensive experiments.
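The final parsing step, turning a sequence of per-window SVM labels into labeled time segments, can be sketched in plain Python; the label sequence below is invented.

```python
# Merge consecutive identical camera-motion labels into (start, end, label)
# segments along the time axis. One label per frame window, for illustration.
from itertools import groupby

labels = ["stable", "stable", "shaky", "shaky", "shaky", "blurred", "stable"]

segments = []
idx = 0
for label, run in groupby(labels):
    n = len(list(run))
    segments.append((idx, idx + n - 1, label))
    idx += n
print(segments)
# [(0, 1, 'stable'), (2, 4, 'shaky'), (5, 5, 'blurred'), (6, 6, 'stable')]
```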
Citations: 21
Comparing Feature Sets for Acted and Spontaneous Speech in View of Automatic Emotion Recognition
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521463
Thurid Vogt, E. André
We present a data-mining experiment on feature selection for automatic emotion recognition. Starting from more than 1000 features derived from pitch, energy and MFCC time series, the features most relevant to the data are selected from this set by removing correlated features. The features selected for acted and for realistic emotions are analyzed and show significant differences. All features are computed automatically, and we also contrast automatically derived with manually derived units of analysis. A higher degree of automation did not prove to be a disadvantage in terms of recognition accuracy.
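The correlation-removal step can be sketched in NumPy: greedily keep a feature only if its absolute Pearson correlation with every already-kept feature stays below a threshold. The 0.9 threshold and the synthetic matrix are assumptions for illustration, not the paper's values.

```python
# Greedy correlation-based feature pruning on a toy matrix where feature 1
# is (almost) a scaled copy of feature 0 and should therefore be dropped.
import numpy as np

def prune_correlated(X, threshold=0.9):
    """Return indices of features to keep, scanning left to right."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

rng = np.random.default_rng(1)
base = rng.normal(size=(200, 3))
X = np.column_stack([base[:, 0],
                     base[:, 0] * 2 + 0.01 * rng.normal(size=200),  # near-duplicate
                     base[:, 1],
                     base[:, 2]])
kept = prune_correlated(X)
print(kept)   # feature 1 duplicates feature 0 and is dropped
```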
Citations: 257
Automatic Annotation of Location Information for WWW Images
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521537
Zhigang Hua, Chuang Wang, Xing Xie, Hanqing Lu, Wei-Ying Ma
Currently, a crucial challenge is how to manage the large number of images on the Web. Exploiting the real synergy between an image and its location, we propose an automatic solution for annotating WWW images with contextual location information. We construct an image importance model to identify the dominant images in a page, together with their surrounding text. For each such image, we develop an effective algorithm to compute its location from that contextual text. We applied our approach to 1,000 pages from various websites. The experiments demonstrate that more than 30% of WWW images are associated with geographic location information, and that our solution achieves satisfactory results. Finally, we present some potential applications that make use of image location information.
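A minimal sketch of extracting a location from an image's surrounding text, using a toy gazetteer and simple frequency voting; the paper's actual algorithm is richer (it builds on the image importance model), and the gazetteer, text, and function names here are invented.

```python
# Match surrounding-text tokens against a tiny gazetteer and return the most
# frequent place name, or None when no place name occurs.
import re
from collections import Counter

GAZETTEER = {"paris", "london", "tokyo"}

def annotate_location(context_text):
    tokens = re.findall(r"[a-z]+", context_text.lower())
    hits = Counter(t for t in tokens if t in GAZETTEER)
    return hits.most_common(1)[0][0] if hits else None

text = "A photo taken near the Eiffel Tower in Paris. Paris at dusk, unlike London."
print(annotate_location(text))   # "paris" (2 mentions beat London's 1)
```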
Citations: 3
Optimized wireless video transmission using classification
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521591
R. Wong, M. Schaar, D. Turaga
Cross-protocol-layer optimizations have recently been proposed for improving the performance of real-time video transmission over 802.11 WLANs. However, performing such cross-layer optimizations is difficult, since the video data and channel characteristics are time-varying, and analytically deriving the relationships between quality and channel characteristics under delay and power constraints is hard. Furthermore, these relationships are often non-linear and non-deterministic (only worst-case or average-case values can be determined), so one often faces complex Lagrangian or multi-objective optimization problems. In this paper, we propose a novel framework for solving cross MAC-application layer optimization problems. More specifically, we employ classification techniques to find an optimized cross-layer strategy for wireless multimedia transmission. Our solution uses both content- and channel-related features to select a joint application-MAC strategy from the different strategies available at the various layers. Preliminary results indicate that the proposed classification-based cross-layer techniques yield considerable improvements over ad-hoc solutions. The improvements are especially important at high packet-loss rates (5% and higher), where deploying a judicious mixture of strategies at the various layers becomes essential.
Citations: 1
A spatial-temporal de-interlacing algorithm
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521407
T. Chong, O. Au, Tai-Wai Chan, Wing-San Chau
In this paper, we propose a spatial-temporal de-interlacing algorithm for converting interlaced video to progressive video. The algorithm estimates the motion trajectory across three consecutive fields and interpolates the missing field along that trajectory. In the motion estimator, unidirectional and bidirectional motion estimation are combined by a multiple-objective minimization technique: unidirectional motion estimation estimates the trajectory by comparing blocks from opposite-parity fields, while bidirectional motion estimation compares blocks from same-parity fields. By combining the two estimates, the motion trajectory can be accurately predicted. In addition, a quality analyzer is proposed to evaluate the visual quality of the reconstructed frame and choose the appropriate interpolation scheme, so as to maximize de-interlacing performance. Simulation results show that the proposed algorithm outperforms existing de-interlacing algorithms.
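For contrast with the paper's motion-compensated interpolation, the simplest purely spatial alternative, line averaging within a single field, can be sketched in NumPy. This is a well-known baseline, not the proposed algorithm.

```python
# Spatial (intra-field) de-interlacing: reconstruct the missing odd lines of a
# frame by averaging the even lines above and below them.
import numpy as np

def line_average_deinterlace(field):
    """field: (h, w) array holding the even lines; returns a (2h, w) frame."""
    h, w = field.shape
    frame = np.empty((2 * h, w), dtype=float)
    frame[0::2] = field                           # keep the recorded lines
    frame[1:-1:2] = (field[:-1] + field[1:]) / 2  # average vertical neighbours
    frame[-1] = field[-1]                         # bottom line: repeat last
    return frame

field = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]])
frame = line_average_deinterlace(field)
print(frame)   # missing lines come out as 1.0 and 3.0
```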
Citations: 7
Evaluation of the Interleaved Source Coding (ISC) Under Packet Correlation
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521711
Jin Young Lee, H. Radha
Network impairments such as delay and packet losses severely affect the presentation quality of many predictive video sources. Prior research has sought to develop packet-loss-resilient coding methods that overcome such impairments for real-time streaming applications. Interleaved source coding (ISC) is one such error-resilient coding method, based on an optimal interleaving of predictively coded video frames transmitted over a single erasure channel. ISC employs a Markov decision process (MDP) and a corresponding dynamic programming algorithm to identify the optimal interleaving pattern for a given channel model and transmission sequence. ISC has been shown to significantly improve the overall quality of a predictively coded video stream over a lossy channel without complex modifications to standard video coders. In this paper, ISC is evaluated over channels with memory. In particular, we analyze the impact of the packet correlation of the popular Gilbert model on ISC-based packet video over a wide range of packet-loss probabilities. Simulations show that ISC's advantage over the traditional method grows as either the loss rate increases or the packet correlation decreases.
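The two-state Gilbert channel with memory that the evaluation relies on can be simulated in a few lines; the transition and loss probabilities below are assumed example values, not the paper's.

```python
# Two-state Gilbert loss model: a good state (no losses) and a bad state in
# which packets are lost with high probability, producing correlated losses.
import random

def gilbert_losses(n, p_gb=0.05, p_bg=0.4, p_loss_bad=0.8, seed=0):
    """Return a list of n booleans, True = packet lost."""
    rng = random.Random(seed)
    bad = False
    losses = []
    for _ in range(n):
        # stay bad with prob 1 - p_bg; enter bad from good with prob p_gb
        bad = rng.random() < ((1 - p_bg) if bad else p_gb)
        losses.append(bad and rng.random() < p_loss_bad)
    return losses

losses = gilbert_losses(10_000)
rate = sum(losses) / len(losses)
print(f"overall loss rate: {rate:.3f}")
```

With these parameters the stationary bad-state probability is p_gb / (p_gb + p_bg), about 0.11, so the overall loss rate comes out near 0.09.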
Citations: 2
Joint Image Halftoning and Watermarking in High-Resolution Digital Form
Pub Date : 2005-07-06 DOI: 10.1109/ICME.2005.1521478
Chao-Yong Hsu, Chun-Shien Lu
Existing halftone image watermarking methods embed a watermark bit in a single halftone dot, corresponding to one pixel, to generate a stego halftone image. This one-to-one mapping, however, is inconsistent with the one-to-many strategy used by current high-resolution devices such as computer printers and screens, where one pixel is first expanded into many dots before a halftoning process generates the halftone image. Furthermore, electronic paper or smart paper that produces high-resolution digital files cannot be protected by traditional halftone watermarking methods. In view of these facts, we present a high-resolution halftone watermarking scheme that addresses these problems. Its characteristics are: (i) a high-resolution halftoning process employing a one-to-many mapping strategy; (ii) a many-to-one inverse halftoning process that generates gray-scale images of good quality; and (iii) halftone image watermarking that can be conducted directly on gray-scale rather than halftone images, achieving better robustness.
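As an illustration of what a halftoning step does, here is standard Floyd-Steinberg error diffusion in the traditional one-dot-per-pixel setting, i.e. the one-to-one case the paper moves beyond; it is not the paper's one-to-many scheme.

```python
# Floyd-Steinberg error diffusion: quantize each pixel to 0 or 1 and spread the
# quantization error to the right and lower neighbours so that local averages
# of the dot pattern track the original gray levels.
import numpy as np

def floyd_steinberg(gray):
    """gray: 2-D float array in [0, 1]; returns a 0/1 halftone of the same shape."""
    img = gray.astype(float).copy()
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 1.0 if old >= 0.5 else 0.0
            out[y, x] = new
            err = old - new
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h and x > 0:
                img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:
                img[y + 1, x] += err * 5 / 16
            if y + 1 < h and x + 1 < w:
                img[y + 1, x + 1] += err * 1 / 16
    return out

gray = np.full((16, 16), 0.25)   # uniform mid-dark gray patch
dots = floyd_steinberg(gray)
print(dots.mean())               # fraction of white dots, close to 0.25
```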
Citations: 4