
2012 IEEE International Conference on Multimedia and Expo — Latest Publications

System Design of Perceptual Quality-Regulable H.264 Video Encoder
Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.180
Guan-Lin Wu, Yu-Jie Fu, Shao-Yi Chien
In this work, a perceptual quality-regulable H.264 video encoder system has been developed. Exploiting the relationship between the reconstructed macroblock and its best predicted macroblock from mode decision, a novel quantization parameter prediction method is built and used to regulate video quality toward a target perceptual quality. An automatic quality refinement scheme is also developed to make better use of the bit budget. Moreover, with the aid of salient object detection, we further improve quality in regions where viewers are likely to focus. The proposed algorithm achieves better bit allocation for the video coding system by changing quantization parameters at the macroblock level. Compared to the JM reference software with macroblock-layer rate control, the proposed algorithm achieves better and more stable quality, with a higher average SSIM index and smaller SSIM variation.
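The SSIM index used for evaluation above is the standard structural similarity measure. A minimal single-window sketch (the constants K1, K2 and dynamic range L follow the common defaults for 8-bit images; this is illustrative, not the paper's implementation):

```python
import numpy as np

def ssim(x, y, L=255, K1=0.01, K2=0.03):
    """Single-window SSIM between two same-sized grayscale patches."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2  # stabilizing constants
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)) / \
           ((mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2))
```

A full-image SSIM averages this quantity over sliding windows; identical patches score exactly 1.0.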
Citations: 3
Perception of Temporal Pumping Artifact in Video Coding with the Hierarchical Prediction Structure
Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.149
Shuai Wan, Yanchao Gong, Fuzheng Yang
The use of the hierarchical prediction structure in video coding introduces a special type of temporal noise, the temporal pumping artifact. This artifact presents itself as severe quality fluctuations among adjacent pictures and is quite annoying due to the pumping or stumbling effect it produces in perception. In this paper, the fundamental cause of the perceived temporal pumping artifact is analyzed. The key factors influencing its perception are evaluated through subjective experiments, in terms of the amplitude, frequency, and phase of the quality fluctuations. The detailed analysis suggests how the temporal pumping artifact can be alleviated or even eliminated by adjusting coding parameters.
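The fluctuation tracks the temporal-layer pattern of the hierarchy: in a dyadic structure, pictures at higher temporal layers are commonly quantized more coarsely, so quality rises and falls periodically. A small sketch of how each picture's temporal layer in a dyadic GOP can be derived (the dyadic layout is the usual convention, not a detail taken from the paper):

```python
def temporal_layer(poc, gop_size=8):
    """Temporal layer of a picture in a dyadic hierarchical GOP.

    Pictures at POC multiples of gop_size are key pictures (layer 0);
    each halving of the prediction distance adds one layer.
    """
    if poc % gop_size == 0:
        return 0
    layer, step = 1, gop_size // 2
    while poc % step != 0:
        step //= 2
        layer += 1
    return layer
```

With a per-layer QP offset (e.g. base QP + layer), adjacent pictures alternate between fine and coarse quantization, which is exactly the periodic quality fluctuation the paper studies.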
Citations: 12
A Synaesthetic Approach for Image Slideshow Generation
Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.75
Y. Xiang, M. Kankanhalli
In this paper, we present a novel automatic image slideshow system that explores a new medium between images and music. It can be regarded as a new criterion for image selection and slideshow composition. Based on the idea of "hearing colors, seeing sounds" from the art of music visualization, equal importance is assigned to image features and audio properties for better synchronization. We minimize the aesthetic energy distance between visual and audio features. Given a set of images, a subset is selected by correlating image features with the input audio properties. The selected images are then synchronized with the music subclips by their audio-visual distance. An inductive image-display approach is also introduced for common display devices.
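The selection step pairs images with music subclips by minimizing a feature-space distance. As a toy stand-in for the paper's aesthetic-energy matching (the greedy rule and the Euclidean metric here are assumptions for illustration, not the paper's method):

```python
import numpy as np

def select_images(image_feats, audio_feats):
    """For each music-subclip feature vector, greedily pick the unused image
    whose feature vector is closest in Euclidean distance."""
    used, picks = set(), []
    for a in audio_feats:
        best, best_d = None, float("inf")
        for i, f in enumerate(image_feats):
            if i in used:
                continue
            d = float(np.linalg.norm(np.asarray(f) - np.asarray(a)))
            if d < best_d:
                best, best_d = i, d
        used.add(best)
        picks.append(best)
    return picks
```

Each pick is then displayed for the duration of its matched subclip, so the slideshow tempo follows the music.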
Citations: 1
Perceived Picture Quality of Frame-Compatible 3DTV Video Formats
Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.42
F. Speranza, R. Renaud, A. Vincent, W. J. Tam
In the stereoscopic high-definition (HD) frame-compatible formats, the separate left and right views are reduced in resolution and packed to fit within the same video frame as a conventional two-dimensional high-definition signal. Since they do not require additional transmission bandwidth and entail limited changes to the existing broadcasting infrastructure, these formats have been suggested for 3DTV. However, the convenience of frame-compatible formats comes at the expense of lower picture quality of the 3D signal. In this study, we evaluated the loss in picture quality of two frame-compatible formats: 1080i Side-by-Side and 720p Top/Bottom, using a subjective assessment experiment.
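Frame-compatible packing halves the resolution of each view so both fit into one HD frame. A minimal sketch of the two packings evaluated (real systems low-pass filter before decimation to limit aliasing; this sketch simply drops samples):

```python
import numpy as np

def pack_side_by_side(left, right):
    """Side-by-side packing: drop every other column of each view,
    then place the halves next to each other."""
    return np.concatenate([left[:, ::2], right[:, ::2]], axis=1)

def pack_top_bottom(left, right):
    """Top/bottom packing: drop every other row of each view,
    then stack the halves vertically."""
    return np.concatenate([left[::2, :], right[::2, :]], axis=0)
```

Either way the packed frame has the same pixel count as one original view, which is why no extra transmission bandwidth is needed — and why each eye sees only half the original resolution.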
Citations: 6
Parallelization Design of Irregular Algorithms of Video Processing on GPUs
Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.147
Huayou Su, Jun Chai, M. Wen, Ju Ren, Chunyuan Zhang
In this paper, we present parallelization design considerations for irregular algorithms of video processing on GPUs. Rich parallelism can be exploited by scheduling the processing order or by trading off performance against parallelism for irregular algorithms (such as CAVLC and the deblocking filter). We implement a component-oriented CAVLC encoder and a direction-oriented deblocking filter on GPUs. The experimental results show that, compared with a CPU implementation, the optimized parallel methods achieve speedup ratios of 63 and 44 for the deblocking filter and CAVLC, respectively. This shows that rich parallelism is one of the most important factors in achieving high performance for irregular algorithms on GPUs. In addition, for some irregular kernels, the number of streaming multiprocessors (SMs) on the GPU appears to matter more to performance than raw computational capability.
Citations: 2
Self-Learning of Edge-Preserving Single Image Super-Resolution via Contourlet Transform
Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.169
Min-Chun Yang, De-An Huang, Chih-Yun Tsai, Y. Wang
We present a self-learning approach for single image super-resolution (SR), with the ability to preserve high-frequency components such as edges in the resulting high-resolution (HR) images. Given a low-resolution (LR) input image, we construct its image pyramid and produce a super-pixel dataset. By extracting context information from the super-pixels, we propose to deploy a context-specific contourlet transform on them in order to model the relationship (via support vector regression) between the input patches and their associated directional high-frequency responses. These learned models are applied to predict the SR output with satisfactory quality. Unlike prior learning-based SR methods, our approach advances a self-learning technique and does not require self-similarity of image patches within or across image scales. More importantly, we do not need to collect training LR/HR image data in advance and require only a single LR input image. Empirical results verify the effectiveness of our approach, which quantitatively and qualitatively outperforms existing interpolation- or learning-based SR methods.
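The self-learning pipeline starts by building an image pyramid from the single LR input, so that cross-scale patch pairs can serve as training data. A minimal sketch of such a pyramid via 2x2 box-average downsampling (the abstract does not specify the paper's exact filter, so the box filter here is an assumption):

```python
import numpy as np

def image_pyramid(img, levels=3):
    """Build a coarse-to-fine pyramid: each level is the previous one
    downsampled by 2 after a simple 2x2 box average (illustrative filter)."""
    pyr = [img.astype(np.float64)]
    for _ in range(levels - 1):
        prev = pyr[-1]
        h, w = prev.shape[0] // 2 * 2, prev.shape[1] // 2 * 2  # crop to even size
        p = prev[:h, :w]
        down = (p[0::2, 0::2] + p[1::2, 0::2] +
                p[0::2, 1::2] + p[1::2, 1::2]) / 4.0
        pyr.append(down)
    return pyr
```

Regressors trained on (coarse, fine) patch pairs from such a pyramid can then be applied one scale above the input to predict the SR output.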
Citations: 6
Noisy Tag Alignment with Image Regions
Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.143
Yang Liu, Jing Liu, Zechao Li, Hanqing Lu
With the spread of Web 2.0, large-scale user-contributed images with tags are readily available on social websites. Aligning these social tags with image regions without additional human intervention is a challenging task, but a valuable one, since the alignment provides more detailed image semantics and improves the accuracy of image retrieval. To this end, we propose a large-margin discriminative model for automatically locating unaligned, and possibly noisy, image-level tags in their corresponding regions; the model is optimized using the concave-convex procedure (CCCP). In the model, each image is treated as a bag of segmented regions associated with a set of candidate labeling vectors, where each labeling vector encodes a possible label arrangement for the regions of an image. To keep the set of admissible labels tractable, we adopt an effective strategy based on the consistency between visual similarity and semantic correlation to generate a more compact set of labeling vectors. Extensive experiments on the MSRC and SAIAPR TC-12 databases demonstrate the encouraging performance of our method compared with baseline methods.
Citations: 1
Motion Vectors Merging: Low Complexity Prediction Unit Decision Heuristic for the Inter-prediction of HEVC Encoders
Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.37
F. Sampaio, S. Bampi, M. Grellert, L. Agostini, J. Mattos
This paper presents the Motion Vectors Merging (MVM) heuristic, a method to reduce HEVC inter-prediction complexity by targeting the PU partition size decision. In the HM test model of the emerging HEVC standard, computational complexity is mostly concentrated in the inter-frame prediction step (up to 96% of the total encoder execution time under common test conditions). The goal of this work is to avoid several Motion Estimation (ME) calls during the PU inter-prediction decision in order to reduce the execution time of the overall encoding process. The MVM algorithm is based on merging NxN PU partitions to compose larger ones. After the best PU partition is decided, ME is called to produce the best possible rate-distortion results for the selected partitions. The proposed method was implemented in HM test model version 3.4 and provides an execution time reduction of up to 34% with insignificant rate-distortion losses (a 0.08 dB drop and a 1.9% bitrate increase in the worst case). Moreover, no related work in the literature proposes PU-level decision optimizations. Compared with works that target CU-level fast decision methods, MVM is competitive, achieving results as good as those works.
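The core idea — merging NxN partitions whose motion vectors agree into one larger PU, so ME need not be re-run for every partitioning — can be caricatured as follows (the merging criterion and the averaged merged vector are assumptions for illustration, not the paper's exact rule):

```python
def merge_pu(mvs, threshold=0):
    """Toy PU decision: given the four NxN motion vectors of a 2Nx2N block,
    code it as a single 2Nx2N PU if the vectors (near-)agree, else keep NxN.

    `threshold` bounds the allowed per-component difference (hypothetical)."""
    x0, y0 = mvs[0]
    if all(abs(x - x0) <= threshold and abs(y - y0) <= threshold
           for x, y in mvs):
        merged = (sum(x for x, _ in mvs) / 4.0, sum(y for _, y in mvs) / 4.0)
        return ("2Nx2N", merged)
    return ("NxN", mvs)
```

Deciding the partition from already-computed small-block vectors is what lets the encoder skip the extra ME calls that an exhaustive PU search would require.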
Citations: 46
Expert Talk for Time Machine Session: Dynamic Time Warping New Youth
Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.108
X. Anguera
This time machine expert talk describes the recent comeback of acoustic pattern matching algorithms, such as DTW. These are particularly suited for applications where little (or no) transcribed training data is available.
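DTW aligns two sequences by warping the time axis to minimize accumulated local cost, which is why it needs no transcribed training data. A textbook sketch of the distance computation:

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences,
    filled in by classic O(len(a) * len(b)) dynamic programming."""
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, or match
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

Because the warping path may repeat elements, a sequence and a slowed-down copy of it are at distance zero — the robustness to tempo variation that pattern-matching applications exploit.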
Citations: 0
Full Spherical High Dynamic Range Imaging from the Sky
Pub Date : 2012-07-09 DOI: 10.1109/ICME.2012.120
Fumio Okura, M. Kanbara, N. Yokoya
This paper describes a method for acquiring full spherical high dynamic range (HDR) images with no missing areas by using two omnidirectional cameras mounted on the top and bottom of an unmanned airship. The full spherical HDR images are generated by combining multiple omnidirectional images captured with different shutter speeds. The generated images are intended for use in telepresence, augmented telepresence, and image-based lighting.
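Merging differently exposed captures into one HDR radiance map is commonly done by weighted averaging of per-exposure radiance estimates. A sketch in the Debevec–Malik spirit, assuming a linear camera response (the hat weight and normalization are common textbook choices, not the paper's specifics):

```python
import numpy as np

def fuse_exposures(images, exposure_times):
    """Merge 8-bit linear exposures into a radiance map.

    Each pixel's radiance estimate is (value / exposure_time), averaged
    across exposures with a hat weight that discounts under- and
    over-exposed samples. Illustrative, not the paper's exact pipeline."""
    num = np.zeros(images[0].shape, dtype=np.float64)
    den = np.zeros(images[0].shape, dtype=np.float64)
    for img, t in zip(images, exposure_times):
        z = img.astype(np.float64) / 255.0
        w = 1.0 - np.abs(2.0 * z - 1.0)  # peaks at mid-gray, 0 at extremes
        num += w * (z / t)
        den += w
    return num / np.maximum(den, 1e-8)
```

With a consistent scene, a pixel seen at value 0.2 in a 1x exposure and 0.4 in a 2x exposure both imply the same radiance, and the fusion recovers it.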
Citations: 6