
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP): Latest Papers

A night-time outdoor data set for low-light enhancement
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301758
Yudong Zhou, Ronggang Wang, Yangshen Zhao
Low-light enhancement has been a hot topic in recent years, and many deep neural network (DNN)-based methods have achieved remarkable performance. However, the rapid development of DNNs also creates an urgent need for high-quality training sets, especially supervised night-time data sets. In this paper, we establish a night-time outdoor data set (NOD) that contains 1214 groups of images. We also generate an appropriate, high-quality reference image for each group using a multi-exposure fusion strategy, which not only focuses on dark areas but also provides details for over-exposed areas in low-light images. Furthermore, a simple but efficient network is presented as the baseline for NOD. Experimental results on NOD and other data sets show the generalizability and effectiveness of the proposed data set and baseline model.
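The abstract does not say which fusion rule produces the reference images. As a rough illustration only, the sketch below fuses two exposures with a per-pixel weighted average whose weights favour mid-gray values (a common "well-exposedness" heuristic). The function names, the `sigma` value, and the toy pixel values are all assumptions, not details from the paper.

```python
import math

def well_exposedness(value, sigma=0.2):
    """Gaussian weight that peaks at mid-gray (0.5) for intensities in [0, 1]."""
    return math.exp(-((value - 0.5) ** 2) / (2.0 * sigma ** 2))

def fuse_exposures(exposures, sigma=0.2, eps=1e-12):
    """Fuse same-sized grayscale images (lists of rows) by per-pixel weighted averaging."""
    height, width = len(exposures[0]), len(exposures[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [img[y][x] for img in exposures]
            # eps keeps the normalisation safe if every weight underflows to zero.
            weights = [well_exposedness(v, sigma) + eps for v in values]
            fused[y][x] = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return fused

# Hypothetical 1x2 scene: pixel 0 is a dark region, pixel 1 is a street lamp.
short_exposure = [[0.02, 0.55]]  # lamp well exposed, shadows crushed
long_exposure = [[0.45, 1.00]]   # shadows visible, lamp blown out
reference = fuse_exposures([short_exposure, long_exposure])
```

In this toy case the fused dark pixel follows the long exposure and the fused lamp pixel follows the short exposure, which is the behaviour the abstract describes for dark and over-exposed areas.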
Citations: 0
Fast Recolor Prediction Scheme in Point Cloud Attribute Compression
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301768
Chuang Ma, Ge Li, Qi Zhang, Yiting Shao, Jing Wang, Shan Liu
Due to the emerging requirements of point cloud applications, efficient point cloud compression methods are in high demand for compact point cloud representation under limited transmission bandwidth. The compression standard G-PCC (Geometry-based Point Cloud Compression) is led by MPEG (Moving Picture Experts Group) in response to industrial requirements. A KNN (K-Nearest Neighbors) search-based prediction method is adopted for point cloud attribute compression in the current G-PCC; it exploits only the Euclidean distance-based geometric relationship, without full consideration of the underlying geometric distribution. In this paper, we propose a novel prediction scheme based on a fast recolor technique for lossless and near-lossless attribute compression. Our method has been implemented on top of the latest version of the G-PCC reference software. Experimental results show that our method can take advantage of the correlation between the attributes of neighbors, which leads to better rate-distortion (R-D) performance than the G-PCC anchor on point cloud datasets, with a negligible increase in encoding and decoding time under the common test conditions.
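To make the baseline concrete, here is a minimal sketch of distance-based KNN attribute prediction of the kind the abstract attributes to G-PCC: the attribute of a point is predicted from its K nearest already-coded neighbors, and only the residual is coded. The inverse squared-distance weighting, the function names, and the sample points are illustrative assumptions; the real reference software uses integer arithmetic and a level-of-detail structure that are not modelled here.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def knn_predict(target_xyz, coded_points, k=3):
    """Predict an attribute as the inverse-distance-weighted mean of the k nearest coded points.

    coded_points is a list of ((x, y, z), attribute) pairs that are already decoded.
    """
    nearest = sorted(coded_points, key=lambda item: squared_distance(target_xyz, item[0]))[:k]
    weighted_sum, weight_total = 0.0, 0.0
    for xyz, attribute in nearest:
        d2 = squared_distance(target_xyz, xyz)
        if d2 == 0:
            # A co-located coded point: copy its attribute directly.
            return float(attribute)
        weight = 1.0 / d2
        weighted_sum += weight * attribute
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical coded neighborhood and a point whose true attribute is 115.
coded = [((0, 0, 0), 100), ((2, 0, 0), 120), ((0, 4, 0), 200), ((10, 10, 10), 0)]
predicted = knn_predict((1, 0, 0), coded)
residual = 115 - predicted  # only this residual would be entropy coded
```

The weights depend on distance alone, so a neighbor on a differently colored surface still contributes if it is geometrically close. This is the limitation the abstract points to when it says the geometric distribution is not fully considered.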
Citations: 2
Versatile Video Coding – Algorithms and Specification
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301820
M. Wien, B. Bross
The tutorial provides an overview of the latest emerging video coding standard, VVC (Versatile Video Coding), to be jointly published by ITU-T and ISO/IEC. It has been developed by the Joint Video Experts Team (JVET), consisting of ITU-T Study Group 16 Question 6 (known as VCEG) and ISO/IEC JTC 1/SC 29/WG 11 (known as MPEG). VVC has been designed to achieve significantly improved compression capability compared to previous standards such as HEVC, and at the same time to be highly versatile for effective use in a broadened range of applications. Key application areas for VVC include ultra-high-definition video (e.g., 4K or 8K resolution), video with a high dynamic range and wide colour gamut (e.g., with transfer characteristics specified in Rec. ITU-R BT.2100), and video for immersive media applications such as 360° omnidirectional video, in addition to the applications that have commonly been addressed by prior video coding standards. Important design criteria for VVC have been low computational complexity on the decoder side and friendliness for parallelization on various algorithmic levels. VVC is planned to be finalized by July 2020 and is expected to enter the market very soon. The tutorial details the video layer coding tools specified in VVC and develops the concepts behind the selected design choices. While many tools or variants thereof have been available before, the VVC design reveals many improvements compared to previous standards, which result in compression gain and implementation friendliness. Furthermore, new tools such as the Adaptive Loop Filter and Matrix-based Intra Prediction have been adopted, which contribute significantly to the overall performance. The high-level syntax of VVC has been redesigned compared to previous standards such as HEVC, in order to enable dynamic sub-picture access as well as major scalability features already in version 1 of the specification.
Citations: 13
Deep Blind Video Quality Assessment for User Generated Videos
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301757
Jiapeng Tang, Yu Dong, Rong Xie, Xiao Gu, Li Song, Lin Li, Bing Zhou
As the short video industry grows, quality assessment of user-generated videos has become a hot issue. Existing no-reference video quality assessment methods are not suitable for this type of application scenario since they are aimed at synthetic videos. In this paper, we propose a novel deep blind quality assessment model for user-generated videos that accounts for content variety and the temporal memory effect. Content-aware features of frames are extracted through a deep neural network, and a patch-based method is adopted to obtain frame quality scores. Moreover, we propose a temporal memory-based pooling model that considers the temporal memory effect to predict video quality. Experimental results on the KoNViD-1k and LIVE-VQC databases demonstrate that our proposed method outperforms other state-of-the-art ones, and the comparative analysis proves the efficiency of our temporal pooling model.
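The abstract does not define its pooling model, so the following is only a generic sketch of memory-aware temporal pooling, not the authors' method. It assumes that viewers remember recent poor quality: each frame score is blended with the worst score in a short window of preceding frames before averaging. The `window` and `gamma` parameters and the sample scores are hypothetical.

```python
def memory_pooling(frame_scores, window=3, gamma=0.5):
    """Pool per-frame quality scores while letting recent bad frames linger in memory."""
    pooled = []
    for t, score in enumerate(frame_scores):
        past = frame_scores[max(0, t - window):t]
        # Memory term: the worst recent score, or the current score for the first frame.
        memory = min(past) if past else score
        pooled.append(gamma * memory + (1.0 - gamma) * score)
    return sum(pooled) / len(pooled)

# Hypothetical per-frame scores with a single bad frame in the middle.
scores = [80, 80, 20, 80, 80]
video_score = memory_pooling(scores)
plain_mean = sum(scores) / len(scores)
```

With these numbers the plain mean is 68 while the memory-aware score is 62: the single bad frame also lowers the contribution of the frames that follow it.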
Citations: 1
A Progressive Fast CU Split Decision Scheme for AVS3
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301772
Yuyuan Chen, Songlin Sun, Jiaqi Zhang, Shanshe Wang
AVS3 is the newest video coding standard developed by the AVS (Audio Video coding Standard) group. AVS3 adopts a QTBT (quad-tree plus binary-tree) and EQT (extended quad-tree) block partition scheme, which makes the split process more flexible. The CU split structure is determined by a brute-force rate-distortion optimization (RDO) search: after the whole RDO search, the CU partition with the minimum RD cost is selected. The flexible block partition and thorough RDO search bring promising coding gain but make the encoder extremely complex. To reduce the computational complexity of the CU split decision process in AVS3, this paper proposes a fast split decision algorithm based on spatial information. In the proposed algorithm, a predicted value of split complexity is first calculated from the information of spatially neighboring blocks. The predicted value is then used to decide whether to split the current CU. Experimental results show that the proposed algorithm saves 31.03% of encoding time on average, with an average BD-BR loss of 0.54%, under the Random Access (RA) configuration. The proposed algorithm can greatly reduce the computational complexity of the CU split decision process with negligible performance loss.
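The two steps described above (predict a split complexity from spatial neighbors, then decide whether to split) can be sketched as follows. The abstract gives neither the predictor nor the threshold, so the average-depth predictor, the `margin` parameter, and both function names are hypothetical stand-ins rather than the paper's formulas.

```python
def predict_split_complexity(neighbor_depths):
    """Average partition depth of the available spatial neighbors (None marks an unavailable one)."""
    available = [d for d in neighbor_depths if d is not None]
    if not available:
        return None
    return sum(available) / len(available)

def should_try_split(current_depth, neighbor_depths, margin=0.5):
    """Return True if the encoder should still run the RDO search for further splits."""
    predicted = predict_split_complexity(neighbor_depths)
    if predicted is None:
        # No spatial information (e.g. at a picture corner): fall back to the full search.
        return True
    return current_depth < predicted + margin

# Neighbors were coded with shallow partitions, so deeper splits are skipped.
skip_case = should_try_split(current_depth=2, neighbor_depths=[1, 1, None])
# Neighbors were split deeply, so the current CU is still worth splitting.
search_case = should_try_split(current_depth=0, neighbor_depths=[2, 3, 2])
```

The time saving comes from the skipped RDO evaluations, and the small BD-BR loss reported in the abstract corresponds to the cases where such an early decision turns out to be wrong.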
Citations: 0
Towards Quantized DCT Coefficients Restoration for Compressed Images
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301794
Tong Ouyang, Zhenzhong Chen, Shan Liu
Images and videos suffer from undesirable visual artifacts at high compression ratios, which are caused by the use of the discrete cosine transform and scalar quantization in compression. To restore the quantized coefficients by producing the quantization error, we propose a coefficient restoration convolutional neural network in the frequency domain (FD-CRNet). Taking advantage of residual learning, the proposed FD-CRNet efficiently exploits the related distribution of different frequency components. The squeeze-and-excitation block (SE block) is adopted to reduce the computational complexity and improve restoration performance. Experimental results show that the quantized coefficients are effectively recovered close to the lossless coefficients, outperforming existing coefficient restoration methods. In addition, FD-CRNet significantly improves the performance of spatial-domain methods, yielding more authentic details and sharper edges when removing compression artifacts.
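To show what "quantization error" means here, the sketch below applies an orthonormal 1D DCT-II to one row of pixels, quantizes the coefficients with a uniform step, and measures the per-coefficient error that a restoration network would have to estimate. The pixel row and step size are made-up values, and a real codec uses a 2D transform with a quantization table.

```python
import math

def dct_1d(signal):
    """Orthonormal DCT-II of a 1D signal."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(signal[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        coeffs.append(scale * total)
    return coeffs

def quantize(coeffs, step):
    """Uniform scalar quantization: map each coefficient to an integer level."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    """Reconstruct coefficients from integer levels."""
    return [level * step for level in levels]

row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical pixel row
step = 16                                # hypothetical quantization step
coeffs = dct_1d(row)
restored = dequantize(quantize(coeffs, step), step)
errors = [c - r for c, r in zip(coeffs, restored)]  # what a restoration model must estimate
```

Each error lies within half a quantization step of zero, so the target of the restoration is bounded and known in range. This is one reason a residual formulation in the frequency domain is a natural fit.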
Citations: 2
Extending CCSDS 123.0-B-1 for Lossless 4D Image Compression
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301765
Panpan Zhang, Xiuheng Wang, Tiande Gao, Zhenfu Feng, Jie Chen
A 4-dimensional (4D) image can be viewed as a stack of volumetric images over channels of observation depth or temporal frames. Such data contain rich information but, because of their large volume, place high demands on storage and transmission resources. In this paper, we present a lossless 4D image compression algorithm that extends the CCSDS 123.0-B-1 standard. Instead of compressing the volumetric image at each channel of a 4D image separately, the proposed algorithm efficiently exploits redundancy across the fourth dimension of the data. Experiments conducted on two types of 4D images demonstrate the effectiveness of the proposed lossless compression method.
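The benefit of exploiting the fourth dimension can be illustrated with a toy experiment: predict each sample from the co-located sample in the previous volume and compare the empirical entropy of the residuals with that of the raw samples. This is a deliberately simple stand-in, not the paper's predictor; CCSDS 123.0-B-1 itself uses an adaptive linear predictor, and the helper names and sample data below are assumptions.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Empirical zeroth-order entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def temporal_residuals(volumes):
    """Keep the first volume as-is; store later volumes as differences from the previous one."""
    residuals = list(volumes[0])
    for prev, cur in zip(volumes, volumes[1:]):
        residuals.extend(c - p for p, c in zip(prev, cur))
    return residuals

def reconstruct(residuals, volume_size):
    """Invert temporal_residuals exactly, which is what makes the scheme lossless."""
    volumes = [residuals[:volume_size]]
    for start in range(volume_size, len(residuals), volume_size):
        diff = residuals[start:start + volume_size]
        volumes.append([p + d for p, d in zip(volumes[-1], diff)])
    return volumes

# Hypothetical flattened volumes that change slowly along the fourth dimension.
volumes = [[10, 20, 30, 40], [11, 21, 31, 41], [12, 22, 32, 42]]
raw = [sample for volume in volumes for sample in volume]
residuals = temporal_residuals(volumes)
raw_entropy = entropy_bits(raw)
residual_entropy = entropy_bits(residuals)
```

Because neighboring volumes are strongly correlated, the residuals concentrate on a few values and cost fewer bits per sample, while `reconstruct` still recovers the input exactly.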
Citations: 0
Optical Flow Estimation Between Images of Different Resolutions via Variational Method
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301771
Rui Zhao, Ruiqin Xiong, Shuyuan Zhu, B. Zeng, Tiejun Huang, Wen Gao
Traditional optical flow estimation methods mostly focus on images of the same resolution. However, some situations require optical flow between images of different resolutions, where the traditional approaches suffer from unequal levels of spectrum aliasing. In this paper, we propose a method for estimating the flow field between a clear image and a highly undersampled one. The proposed method simultaneously describes the motion and the integral relationship between the images via an integral-form image, under the assumptions of brightness and gradient consistency as well as motion smoothness. We also briefly derive the numerical solution, through which the equations can be solved easily via linearization. Experimental results on the Middlebury and MPI-Sintel datasets demonstrate that our proposed method outperforms traditional methods that first resize images of different resolutions to the same size, offering more accurate results.
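The "integral relationship" can be pictured as follows: each pixel of the undersampled image integrates a block of high-resolution pixels, so a brightness-consistency term should compare the low-resolution image with the warped and then integrated high-resolution image. The sketch below demonstrates that idea with box averaging and a brute-force search over integer horizontal shifts. The brute-force search replaces the paper's variational solver, the gradient and smoothness terms are omitted, and the images and function names are assumptions.

```python
def box_downsample(image, factor):
    """Average each factor x factor block: one low-res pixel integrates several high-res pixels."""
    rows, cols = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(image[y * factor + dy][x * factor + dx]
                for dy in range(factor) for dx in range(factor)) / factor ** 2
            for x in range(cols)
        ]
        for y in range(rows)
    ]

def shift_horizontal(image, dx):
    """Translate the image dx pixels to the right, replicating edge pixels."""
    width = len(image[0])
    return [[row[min(max(x - dx, 0), width - 1)] for x in range(width)] for row in image]

def data_cost(high_res, low_res, dx, factor):
    """Squared brightness error after warping the high-res image and integrating it."""
    predicted = box_downsample(shift_horizontal(high_res, dx), factor)
    return sum((p - q) ** 2
               for p_row, q_row in zip(predicted, low_res)
               for p, q in zip(p_row, q_row))

# Hypothetical pair: the second frame is the first one moved 2 pixels right,
# then observed at half resolution.
high_res = [[x * x for x in range(8)] for _ in range(2)]
low_res = box_downsample(shift_horizontal(high_res, 2), 2)
best_dx = min(range(-3, 4), key=lambda dx: data_cost(high_res, low_res, dx, 2))
```

The search recovers the true shift without ever upsampling the low-resolution frame, which is the kind of preprocessing step the abstract reports as less accurate.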
Citations: 0
Fully Neural Network Mode Based Intra Prediction of Variable Block Size
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301842
Heming Sun, Lu Yu, J. Katto
Intra prediction is an essential component of image coding. This paper presents an intra prediction framework based entirely on neural network modes (NMs). Each NM can be regarded as a regression from the neighboring reference blocks to the current coding block. (1) For variable block sizes, we utilize different network structures: fully connected networks are used for small 4×4 and 8×8 blocks, while convolutional neural networks are used for large 16×16 and 32×32 blocks. (2) For each prediction mode, we develop a specific pre-trained network to boost the regression accuracy. When integrated into the HEVC test model, the framework saves 3.55%, 3.03% and 3.27% BD-rate for the Y, U and V components, respectively, compared with the anchor. As far as we know, this is the first work to explore a fully NM-based framework for intra prediction, and we reach a better coding gain with lower complexity than previous work.
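The idea of a mode as "a regression from reference samples to the current block" can be illustrated with a single fully connected layer for a 4×4 block. This is only a sketch: the paper's networks are trained and their layer counts and reference layout are not given in the abstract, so the weights below are hand-set to reproduce a DC-style average, showing that a classic mode is one point in the space such a layer can represent.

```python
def fc_layer(inputs, weights, biases):
    """One fully connected layer without activation: output[j] = sum_i w[j][i] * x[i] + b[j]."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]

def predict_block(above, left, weights, biases, size=4):
    """Regress a size x size block from the reference samples above and to the left."""
    flat = fc_layer(list(above) + list(left), weights, biases)
    return [flat[r * size:(r + 1) * size] for r in range(size)]

size = 4
n_refs = 2 * size
# Hand-set (untrained) weights: every output is the mean of all reference samples.
dc_weights = [[1.0 / n_refs] * n_refs for _ in range(size * size)]
dc_biases = [0.0] * (size * size)

above = [100, 102, 104, 106]  # hypothetical reconstructed row above the block
left = [98, 96, 94, 92]       # hypothetical reconstructed column left of the block
block = predict_block(above, left, dc_weights, dc_biases)
```

A trained network would replace `dc_weights` with learned values, and one network per mode lets the encoder choose among several learned regressions, much as it chooses among angular modes.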
Citations: 3
Bidirectional Consistency Constrained Template Update Learning for Siamese Trackers
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301826
Kexin Chen, Xue Zhou, Chao Liang, Jianxiao Zou
This paper presents an online template update method with a bidirectional consistency constraint for Siamese trackers. Because Siamese trackers continuously apply a cross-correlation mechanism between the template and the search region, their performance relies heavily on the fidelity of the template. Therefore, besides the standard linear update, learned template update methods have attracted attention. Inspired by this, we adopt a learning-to-update model called UpdateNet as our baseline. Unlike UpdateNet, we further propose a novel bidirectional consistency loss as a constraint so that the template update is learned more smoothly and stably. Our method considers both forward and backward information for each intermediate frame, thus introducing a multi-stage bidirectional simulated tracking training mechanism. We apply our model to a Siamese tracker, SiamRPN, and demonstrate the effectiveness and robustness of the proposed method compared with the baseline UpdateNet on the Large-scale Single Object Tracking (LaSOT) dataset.
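For context, the "standard linear update" that the abstract contrasts with learned updates is commonly implemented in the tracking literature as an exponential moving average of template features. The sketch below assumes that reading; the function name `linear_template_update`, the flat-list feature representation, and the default rate are hypothetical and not taken from the paper. It does not implement UpdateNet or the proposed bidirectional consistency loss.

```python
from typing import List


def linear_template_update(
    accumulated: List[float],
    current: List[float],
    rate: float = 0.01,
) -> List[float]:
    """Blend the accumulated template with the template from the current frame.

    Computes (1 - rate) * accumulated + rate * current element-wise.
    The default rate is an illustrative assumption.
    """
    if len(accumulated) != len(current):
        raise ValueError("templates must have the same length")
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must lie in [0, 1]")
    return [(1.0 - rate) * a + rate * c for a, c in zip(accumulated, current)]


if __name__ == "__main__":
    template = [0.0, 1.0]
    new_frame_template = [1.0, 1.0]
    print(linear_template_update(template, new_frame_template, rate=0.5))
```

A fixed rate applies the same blend to every frame and every feature dimension, which is the limitation that learned update schemes such as UpdateNet aim to address.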
Citations: 0