
Latest publications: 2013 Visual Communications and Image Processing (VCIP)

Objective quality assessment for image retargeting based on perceptual distortion and information loss
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706443
Chih-Chung Hsu, Chia-Wen Lin, Yuming Fang, Weisi Lin
Image retargeting techniques aim to obtain retargeted images with different sizes or aspect ratios for various display screens. Various content-aware image retargeting algorithms have been proposed recently. However, there is still no accurate objective metric for visual quality assessment of retargeted images. In this paper, we propose a novel objective metric for assessing visual quality of retargeted images based on perceptual geometric distortion and information loss. The proposed metric measures the geometric distortion of retargeted images by SIFT flow variation. Furthermore, a visual saliency map is derived to characterize human perception of the geometric distortion. On the other hand, the information loss in a retargeted image, which is calculated based on the saliency map, is integrated into the proposed metric. A user study is conducted to evaluate the performance of the proposed metric. Experimental results show the consistency between the objective assessments from the proposed metric and subjective assessments.
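The two ingredients of the metric can be sketched numerically. The sketch below is a hypothetical illustration, not the authors' exact formulation: the function name, the blending weight `alpha`, and the input forms (a dense displacement field standing in for SIFT flow, a saliency map, and a binary mask of retained content) are all assumptions.

```python
import numpy as np

def retargeting_quality(flow, saliency, kept_mask, alpha=0.5):
    """Lower is better: blends saliency-weighted flow variation
    (perceptual geometric distortion) with the fraction of saliency
    mass removed by retargeting (information loss)."""
    # Local variation of the displacement field (a stand-in for SIFT flow).
    gy = np.gradient(flow[..., 0])
    gx = np.gradient(flow[..., 1])
    variation = np.hypot(gy[0], gy[1]) + np.hypot(gx[0], gx[1])
    distortion = float((saliency * variation).sum() / saliency.sum())
    # Saliency that falls outside the retained region is counted as lost.
    info_loss = float((saliency * (1 - kept_mask)).sum() / saliency.sum())
    return alpha * distortion + (1 - alpha) * info_loss
```

A smooth flow field contributes no distortion, so the score then reduces to the saliency-weighted information loss alone.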
Citations: 9
A multi-label classification approach for Facial Expression Recognition
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706330
Kaili Zhao, Honggang Zhang, Mingzhi Dong, Jun Guo, Yonggang Qi, Yi-Zhe Song
Facial Expression Recognition (FER) techniques have already been adopted in numerous multimedia systems. Most previous research assumes that each facial picture is linked to exactly one of a set of predefined affective labels. In practical applications, however, few expressions correspond exactly to a single predefined affective state. To depict facial expressions more accurately, this paper therefore proposes a multi-label classification approach for FER, in which each facial expression may be labeled with one or multiple affective states. By modeling the relationship between labels via a Group Lasso regularization term, a maximum-margin multi-label classifier is presented, and the convex optimization formulation guarantees a globally optimal solution. To evaluate the classifier, the JAFFE dataset is extended into a multi-label facial expression dataset by thresholding the continuous labels marked in the original dataset. The labeling results show that multiple labels yield a far more accurate description of facial expressions, and the classification results verify the superior performance of the algorithm.
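The dataset-extension step (turning JAFFE's continuous affective ratings into binary label sets by thresholding) is straightforward to sketch. The threshold value and the argmax fallback below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def to_multilabel(ratings, threshold=3.0):
    """Turn per-image continuous affective ratings (n_images x n_emotions)
    into a binary multi-label matrix by thresholding."""
    ratings = np.asarray(ratings, dtype=float)
    labels = (ratings >= threshold).astype(int)
    # Fallback: if no rating clears the threshold, keep the dominant emotion
    # so that every image stays labeled.
    empty = labels.sum(axis=1) == 0
    labels[empty, ratings[empty].argmax(axis=1)] = 1
    return labels
```

Rows whose ratings all fall below the threshold keep at least their dominant emotion, so the multi-label matrix never contains an unlabeled image.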
Citations: 1
A novel image tag saliency ranking algorithm based on sparse representation
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706420
Caixia Wang, Zehai Song, Songhe Feng, Congyan Lang, Shuicheng Yan
With the explosive growth of web image data, ranking image tags so that images can be retrieved accurately from massive collections has become an active research topic. Existing ranking approaches, however, leave considerable room for improvement. This paper proposes a new image tag saliency ranking algorithm based on sparse representation. We first propagate labels from image level to region level via multi-instance learning driven by sparse representation: the target instance from a positive bag is reconstructed as a sparse linear combination of all instances in the training set, and instances with nonzero reconstruction coefficients are considered similar to the target instance. A visual attention model is then used for tag saliency analysis. Compared with existing approaches, the proposed method achieves better performance.
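The sparse-reconstruction step at the heart of the propagation can be sketched with a plain ISTA solver for the l1-regularized least-squares problem. This is a generic solver standing in for whatever optimizer the authors used, and `lam` is an illustrative value.

```python
import numpy as np

def ista_sparse_code(D, x, lam=0.1, iters=200):
    """Solve min_a 0.5*||x - D a||^2 + lam*||a||_1 by ISTA.
    Columns of D are training-set instances; the nonzero entries of the
    returned coefficient vector mark the instances treated as similar
    to the target x."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(iters):
        grad = D.T @ (D @ a - x)
        z = a - grad / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a
```

With an overcomplete D, the l1 penalty drives most coefficients to exactly zero, which is what makes the "nonzero coefficient = similar instance" rule in the abstract meaningful.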
Citations: 0
A novel depth propagation algorithm with color guided motion estimation
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706419
Haoqian Wang, Yushi Tian, Yongbing Zhang
Depth propagation is an effective and efficient way to produce depth maps for a video sequence. In most existing depth propagation schemes, motion estimation is performed based only on the estimated depth maps, without considering color information. This paper presents a novel key-frame depth propagation algorithm combining bilateral filtering and motion estimation. A color-guided motion estimation process is proposed that takes both color and depth information into account when estimating motion vectors. In addition, a bidirectional propagation strategy is adopted to reduce the accumulation of depth errors. Experimental results show that the proposed algorithm outperforms most existing techniques in obtaining high-quality depth maps, leading to better synthesized stereoscopic video.
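The color-guided matching cost can be sketched as a weighted sum of color and depth block differences; the weight `beta` and the SAD form are illustrative assumptions rather than the paper's exact cost.

```python
import numpy as np

def matching_cost(color_ref, color_tgt, depth_ref, depth_tgt, beta=0.7):
    """Block-matching cost mixing color SAD and depth SAD, the key idea
    behind color-guided motion estimation: candidate motion vectors are
    scored on both channels instead of depth alone."""
    sad_color = np.abs(color_ref.astype(float) - color_tgt.astype(float)).sum()
    sad_depth = np.abs(depth_ref.astype(float) - depth_tgt.astype(float)).sum()
    return beta * sad_color + (1 - beta) * sad_depth
```

A motion search would evaluate this cost for every candidate displacement and keep the minimizer; weighting color highly keeps the motion field anchored to the reliable texture information even where the propagated depth is noisy.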
Citations: 4
Object co-detection via low-rank and sparse representation dictionary learning
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706361
Yurui Xie, Chao Huang, Tiecheng Song, Jinxiu Ma, J. Jing
In this paper, we propose an algorithm for detecting individual objects in multiple images in a weakly supervised manner. Specifically, we treat object co-detection as a joint dictionary learning and object localization problem, and propose a novel low-rank and sparse representation dictionary learning algorithm. It aims to learn a compact, discriminative dictionary associated with a specific object category. Unlike previous dictionary learning methods, the sparsity imposed on the representation coefficients, the rank minimization of the learned dictionary, the data reconstruction error, and the low-rank constraint on the sample data are all incorporated into a unified objective function. We then optimize all constraint terms simultaneously via an extended augmented Lagrange multiplier (ALM) method. Experimental results demonstrate that the algorithm compares favorably with single-object detection methods.
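Two proximal operators do the heavy lifting inside ALM iterations for objectives like this one: soft thresholding for the sparsity terms and singular value thresholding (SVT) for the low-rank terms. A minimal sketch of both (generic building blocks, not the authors' full solver):

```python
import numpy as np

def soft_threshold(X, tau):
    """Proximal operator of the l1 norm: shrinks every entry toward zero."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm,
    the step that enforces a low-rank constraint inside ALM iterations."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(soft_threshold(s, tau)) @ Vt
```

An ALM loop alternates these updates with a multiplier step; shrinking the singular values rather than the entries is what distinguishes the low-rank term from the sparsity term.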
Citations: 2
Efficient active contour model based on Vese-Chan model and split Bregman method
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706370
Yunyun Yang, Yi Zhao
In this paper we propose an efficient multi-phase segmentation method for color images based on the piecewise-constant multi-phase Vese-Chan model and the split Bregman method. The proposed model is first presented in a four-phase level set formulation and then extended to a multi-phase formulation. The four-phase and multi-phase energy functionals are defined, and the corresponding minimization problems of the proposed active contour model are presented. The split Bregman method is applied to minimize the multi-phase energy functional efficiently. The proposed model has been applied to synthetic and real color images with promising results, and its advantages are demonstrated by numerical results.
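Each split Bregman iteration decouples an auxiliary gradient variable from the level-set functions and updates it in closed form by shrinkage. A minimal sketch of that shrinkage step, assuming the auxiliary vector field stores its components along the last axis (an assumption about layout, not the paper's notation):

```python
import numpy as np

def shrink(d, lam):
    """Isotropic shrinkage: argmin_z lam*|z| + 0.5*||z - d||^2 applied
    pointwise to a vector field, the closed-form update used in each
    split Bregman iteration."""
    mag = np.sqrt((d ** 2).sum(axis=-1, keepdims=True))
    # Scale each vector toward zero by lam, never past zero.
    scale = np.maximum(mag - lam, 0.0) / np.maximum(mag, 1e-12)
    return d * scale
```

Because this update is pointwise and closed-form, the expensive part of each iteration reduces to a Laplacian solve, which is where the method's efficiency comes from.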
Citations: 0
Complexity model based load-balancing algorithm for parallel tools of HEVC
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706451
Y. Ahn, Tae-Jin Hwang, D. Sim, W. Han
A load-balancing algorithm supporting the parallel tools of the HEVC encoder is proposed in this paper. Standardization of HEVC version 1 has been finalized, and its rate-distortion performance is known to be roughly twice that of H.264/AVC, previously the most efficient video coder. However, the computational complexity of the HEVC encoding process, which stems from variable block sizes based on a hierarchical structure and a recursive encoding structure, must be addressed as a prerequisite for commercialization. This paper first presents the basic performance of the slice- and tile-level parallel tools adopted in HEVC, and then proposes a load-balancing algorithm based on a complexity model for slices and tiles. For the four-slice and four-tile cases, the average time-saving gains are 12.05% and 3.81% over simple slice- and tile-level parallelization, respectively.
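The allocation idea can be sketched as a greedy contiguous partition of per-CTU complexity estimates into equal-cost slices. This is a simple stand-in under assumed inputs, not the paper's complexity model.

```python
def balanced_slices(ctu_cost, n_slices):
    """Split a raster-ordered list of per-CTU complexity estimates into
    n_slices contiguous groups of roughly equal total cost, so each
    parallel worker gets a comparable encoding load."""
    target = sum(ctu_cost) / n_slices
    slices, current, acc = [], [], 0.0
    for i, cost in enumerate(ctu_cost):
        current.append(i)
        acc += cost
        remaining = len(ctu_cost) - i - 1
        # Close the slice once the target is met, keeping enough CTUs
        # behind so every remaining slice stays non-empty.
        if (acc >= target and len(slices) < n_slices - 1
                and remaining >= n_slices - 1 - len(slices)):
            slices.append(current)
            current, acc = [], 0.0
    slices.append(current)
    return slices
```

Uniform costs reduce this to an even split; skewed costs shift the slice boundaries toward the cheap CTUs, which is the behavior a complexity-model-driven allocator is after.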
Citations: 26
Analysis and approximation of SAO estimation for CTU-level HEVC encoder
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706414
G. Praveen, Ramakrishna Adireddy
In the HEVC standardization process and the HM test-model implementation, the SAO operation is executed only at frame level. For low latency, better memory-bandwidth efficiency, and better cache performance, however, most applications need the SAO filter implemented at CTU level alongside the other encoding modules. Likewise, in any ASIC developed for HEVC, all modules are expected to execute at CTU/CU level for better pipeline performance. In this paper, we present two methods for carrying out SAO offset estimation at CTU level. Both are well suited to realization in pipelined architectures, in software as well as hardware. Our experimental results demonstrate that the two methods produce video quality and bit rates similar to frame-level SAO while improving memory bandwidth and cache efficiency.
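For the band-offset part of SAO, the statistic being estimated is simple: per intensity band, the clipped, rounded mean of original-minus-reconstructed samples. A sketch over one block of 8-bit samples (the 32 bands and the [-7, 7] offset range follow HEVC's band-offset design; the function shape itself is an assumption, not either of the paper's two methods):

```python
import numpy as np

def band_offsets(orig, recon, n_bands=32, max_offset=7):
    """Per-band SAO-style offset estimate for one block of 8-bit samples:
    each reconstructed sample falls in an intensity band, and the band's
    offset is the clipped, rounded mean of (original - reconstructed)."""
    band = (recon.astype(int) * n_bands) // 256
    offsets = np.zeros(n_bands)
    for b in range(n_bands):
        mask = band == b
        if mask.any():
            mean_err = (orig[mask].astype(float) - recon[mask]).mean()
            offsets[b] = np.clip(np.rint(mean_err), -max_offset, max_offset)
    return offsets
```

Estimating this per CTU instead of per frame only changes which samples feed each band's mean, which is why CTU-level approximations can track the frame-level result closely.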
Citations: 7
Enhancing coded video quality with perceptual foveation driven bit allocation strategy
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706373
Junyong You, X. Tai
Contrast sensitivity plays an important role in the visual perception of external stimuli such as video, and it has been taken into account in the development of advanced video coding algorithms. This paper proposes a perceptual foveation model based on accurate prediction of video fixations and modeling of the contrast sensitivity function (CSF). Building on it, an adaptive bit allocation strategy for H.264/AVC video compression is proposed that accounts for the visible frequency threshold of the human visual system (HVS). A subjective video quality assessment, together with objective quality metrics, demonstrates that the proposed perceptual foveation driven bit allocation strategy significantly improves the perceived quality of coded video compared with the standard coding scheme and another visual-attention-guided coding approach.
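The bit allocation side can be sketched as a per-block QP offset that grows with eccentricity from the predicted fixation, mirroring the falling contrast-sensitivity threshold away from the fovea. The linear mapping below is an illustrative stand-in for the paper's CSF-derived thresholds, and all names are assumptions.

```python
import numpy as np

def qp_offsets(h, w, fix_y, fix_x, base_qp=30, max_delta=8):
    """Foveation-driven bit allocation sketch: pixels farther (in image
    distance) from the predicted fixation get a larger QP, i.e. fewer
    bits, since distortion there is less visible."""
    ys, xs = np.mgrid[0:h, 0:w]
    ecc = np.hypot(ys - fix_y, xs - fix_x)       # eccentricity proxy
    delta = np.rint(max_delta * ecc / ecc.max()).astype(int)
    return base_qp + delta
```

In a real encoder the eccentricity would be converted to visual degrees and passed through the CSF to get a visibility threshold per block; the monotone "farther means coarser" shape is the part this sketch preserves.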
Citations: 2
Seeing actions through scene context
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706382
Hongbo Zhang, Songzhi Su, Shaozi Li, Duansheng Chen, Bineng Zhong, R. Ji
Human actions are rarely recognized in isolation; the surrounding scene hints at what they are. In this paper, we investigate boosting action recognition performance by exploiting the associated scene context. To this end, we model the scene as a mid-level "hidden layer" that bridges action descriptors and action categories. This is achieved via a scene topic model, in which hybrid visual descriptors, including spatiotemporal action features and scene descriptors, are first extracted from the video sequence. We then learn a joint probability distribution between scene and action with a Naive-Bayes Nearest-Neighbor algorithm, which is used to jointly infer action categories online in combination with off-the-shelf action recognition algorithms. We demonstrate the merits of the approach against the state of the art on several action recognition benchmarks.
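The Naive-Bayes Nearest-Neighbor decision rule itself is compact: sum, over the query's local descriptors, the squared distance to each class's nearest stored descriptor, and pick the class minimizing the total. A sketch with assumed array shapes (descriptors as rows), standing in for the paper's scene-and-action fusion:

```python
import numpy as np

def nbnn_classify(descriptors, class_feats):
    """NBNN decision: for each class, sum over the query's local
    descriptors the squared distance to that class's nearest stored
    descriptor; return the class with the smallest total."""
    best, best_cost = None, np.inf
    for label, feats in class_feats.items():
        # Pairwise squared distances: (n_query, n_stored).
        d2 = ((descriptors[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
        cost = d2.min(axis=1).sum()
        if cost < best_cost:
            best, best_cost = label, cost
    return best
```

No descriptor quantization is involved, which is the usual argument for NBNN over bag-of-words pipelines: each local feature votes with its raw nearest-neighbor distance.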
Citations: 0