
Latest publications from the 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)

Lightweight Color Image Demosaicking with Multi-Core Feature Extraction
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301841
Yufei Tan, Kan Chang, Hengxin Li, Zhenhua Tang, Tuanfa Qin
Convolutional neural network (CNN)-based color image demosaicking methods have achieved great success recently. However, in many applications where computational resources are highly limited, it is not practical to deploy large-scale networks. This paper proposes a lightweight CNN for color image demosaicking. Firstly, to effectively extract shallow features, a multi-core feature extraction module, which takes the Bayer sampling positions into consideration, is proposed. Secondly, by taking advantage of inter-channel correlation, an attention-aware fusion module is presented to efficiently reconstruct the full color image. Moreover, a feature enhancement module, which contains several cascading attention-aware enhancement blocks, is designed to further refine the initial reconstructed image. To demonstrate the effectiveness of the proposed network, it is compared against several state-of-the-art demosaicking methods. Experimental results show that with the smallest number of parameters, the proposed network outperforms the other compared methods in terms of both objective and subjective quality.
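The abstract does not include code; as a rough illustration of what a Bayer-position-aware, multi-kernel shallow feature extractor could look like, the PyTorch sketch below packs the RGGB mosaic into four sub-sampled planes and runs parallel convolution branches with different kernel sizes before fusing them. All module names, channel counts, and kernel sizes are assumptions, not the authors' design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiCoreFeatureExtraction(nn.Module):
    """Hypothetical sketch: Bayer-position-aware shallow feature extraction."""
    def __init__(self, feat_channels=32):
        super().__init__()
        # Parallel "cores" with different receptive fields over the packed mosaic.
        self.branch3 = nn.Conv2d(4, feat_channels, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(4, feat_channels, kernel_size=5, padding=2)
        self.branch7 = nn.Conv2d(4, feat_channels, kernel_size=7, padding=3)
        self.fuse = nn.Conv2d(3 * feat_channels, feat_channels, kernel_size=1)

    def forward(self, bayer):                  # bayer: (N, 1, H, W) raw mosaic
        packed = F.pixel_unshuffle(bayer, 2)   # (N, 4, H/2, W/2): R, G1, G2, B planes
        feats = torch.cat([self.branch3(packed),
                           self.branch5(packed),
                           self.branch7(packed)], dim=1)
        return F.relu(self.fuse(feats))

x = torch.randn(1, 1, 64, 64)                      # toy Bayer input
print(MultiCoreFeatureExtraction()(x).shape)       # torch.Size([1, 32, 32, 32])
```

Packing the mosaic first keeps samples of the same Bayer position aligned in one channel, which is one common way of making a network aware of sampling positions.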
Citations: 2
A Unified Single Image De-raining Model via Region Adaptive Coupled Network
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301865
Q. Wu, Li Chen, K. Ngan, Hongliang Li, Fanman Meng, Linfeng Xu
Single image de-raining is quite challenging due to the diversity of rain types and the inhomogeneous distribution of rainwater. By means of dedicated models and constraints, existing methods perform well for specific rain types; however, their generalization capability is highly limited. In this paper, we propose a unified de-raining model that selectively fuses the clean background of the input rain image with the well-restored regions occluded by various types of rain. This is achieved by our region adaptive coupled network (RACN), whose two branches integrate each other's features in different layers to jointly generate a spatially-variant weight map and the restored image, respectively. On the one hand, the weight branch leads the restoration branch to focus on the regions that contribute most to de-raining. On the other hand, the restoration branch guides the weight branch to avoid regions at risk of over- or under-filtering. Extensive experiments show that our method outperforms many state-of-the-art de-raining algorithms on diverse rain types, including rain streaks, raindrops, and rain-mist.
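As a minimal sketch of the fusion idea (a spatially-variant weight map deciding, per pixel, whether to keep the input background or the restored content), the toy PyTorch module below couples a restoration branch and a weight branch over shared features. The layer layout and channel widths are assumptions; the actual RACN is considerably deeper.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True))

class RegionAdaptiveFusion(nn.Module):
    """Toy two-branch sketch: one branch restores, one predicts a per-pixel weight."""
    def __init__(self, ch=32):
        super().__init__()
        self.shared = conv_block(3, ch)
        self.restore = nn.Sequential(conv_block(ch, ch), nn.Conv2d(ch, 3, 3, padding=1))
        self.weight = nn.Sequential(conv_block(ch, ch), nn.Conv2d(ch, 1, 3, padding=1),
                                    nn.Sigmoid())

    def forward(self, rainy):
        f = self.shared(rainy)
        restored = self.restore(f)          # candidate rain-free content
        w = self.weight(f)                  # spatially-variant fusion weight in [0, 1]
        # Keep the clean background where w is small, use restored content where w is large.
        return w * restored + (1.0 - w) * rainy

out = RegionAdaptiveFusion()(torch.randn(1, 3, 64, 64))
print(out.shape)    # torch.Size([1, 3, 64, 64])
```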
Citations: 6
Spatiotemporal Guided Self-Supervised Depth Completion from LiDAR and Monocular Camera
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301857
Z. Chen, Hantao Wang, Lijun Wu, Yanlin Zhou, Dapeng Oliver Wu
Depth completion aims to estimate dense depth maps from sparse depth measurements. It has become increasingly important in autonomous driving and has thus drawn wide attention. In this paper, we introduce photometric losses in both the spatial and temporal domains to jointly guide self-supervised depth completion. The method performs accurate end-to-end depth completion using LiDAR and a monocular camera. In particular, we fully utilize the consistency between temporally adjacent frames and the stereo views to improve the accuracy of depth completion in the model training phase. We design a self-supervised framework to eliminate the negative effects of moving objects and regions with smooth gradients. Experiments are conducted on KITTI. The results indicate that our self-supervised method attains competitive performance.
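For readers unfamiliar with photometric supervision, the sketch below shows a minimal temporal photometric term: a source frame is warped into the target view and compared with the target frame. The warping grid is assumed to be precomputed from the predicted depth and relative camera pose (that part is not shown), and the function name is hypothetical.

```python
import torch
import torch.nn.functional as F

def photometric_loss(target, source, grid):
    """Toy photometric consistency term: the source frame is warped into the target
    view with a precomputed sampling grid (assumed to come from predicted depth and
    relative camera pose, not shown here) and compared with the target frame."""
    warped = F.grid_sample(source, grid, align_corners=True)
    return (target - warped).abs().mean()    # plain L1; an SSIM term is often mixed in as well

# Toy usage with an identity warp (i.e. no camera motion).
n, c, h, w = 1, 3, 32, 32
ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
identity_grid = torch.stack([xs, ys], dim=-1).unsqueeze(0)   # (1, H, W, 2) in (x, y) order
print(photometric_loss(torch.rand(n, c, h, w), torch.rand(n, c, h, w), identity_grid))
```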
Citations: 5
DEN: Disentanglement and Enhancement Networks for Low Illumination Images
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301830
Nelson Chong Ngee Bow, Vu-Hoang Tran, Punchok Kerdsiri, Y. P. Loh, Ching-Chun Huang
Though learning-based low-light enhancement methods have achieved significant success, existing methods remain prone to noise and unnatural appearance. These problems may stem from a lack of structural awareness and the confusion between noise and texture. We therefore present a low-light image enhancement method that consists of an image disentanglement network and an illumination boosting network. The disentanglement network first decomposes the input image into image details and image illumination. The extracted illumination part then goes through a multi-branch enhancement network designed to improve the dynamic range of the image. The multi-branch network extracts multi-level image features and enhances them via numerous subnets. These enhanced features are then fused to generate the enhanced illumination part. Finally, the denoised image details and the enhanced illumination are entangled to produce the normal-light image. Experimental results show that our method produces visually pleasing images on many public datasets.
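A toy decompose-enhance-recombine pipeline in PyTorch is sketched below to make the data flow concrete: the input is split into a detail/reflectance part and an illumination map, the illumination map is boosted by two branches with different receptive fields, and the parts are recombined. This is only an assumption-laden miniature, not the DEN architecture.

```python
import torch
import torch.nn as nn

class TinyDEN(nn.Module):
    """Miniature decompose -> enhance -> recombine sketch (not the actual DEN)."""
    def __init__(self, ch=16):
        super().__init__()
        # Disentangle the input into 3 detail/reflectance channels + 1 illumination channel.
        self.decompose = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
                                       nn.Conv2d(ch, 4, 3, padding=1))
        # Two illumination-boosting branches with different receptive fields, fused by 1x1 conv.
        self.branch_a = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU())
        self.branch_b = nn.Sequential(nn.Conv2d(1, ch, 5, padding=2), nn.ReLU())
        self.fuse = nn.Conv2d(2 * ch, 1, 1)

    def forward(self, low):
        d = self.decompose(low)
        detail = torch.sigmoid(d[:, :3])       # reflectance-like detail part
        illum = torch.sigmoid(d[:, 3:4])       # illumination map
        boosted = torch.sigmoid(self.fuse(torch.cat([self.branch_a(illum),
                                                     self.branch_b(illum)], dim=1)))
        return detail * boosted                # Retinex-style recombination

print(TinyDEN()(torch.rand(1, 3, 64, 64)).shape)   # torch.Size([1, 3, 64, 64])
```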
Citations: 0
HDR Image Compression with Convolutional Autoencoder
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301853
Fei Han, Jin Wang, Ruiqin Xiong, Qing Zhu, Baocai Yin
As one of the next-generation multimedia technologies, high dynamic range (HDR) imaging has been widely applied. Due to its wider color range, an HDR image imposes a greater compression and storage burden than a traditional LDR image. To solve this problem, this paper proposes a two-layer HDR image compression framework based on convolutional neural networks. The framework is composed of a base layer, which provides backward compatibility with standard JPEG, and an extension layer based on a convolutional variational autoencoder and a post-processing module. The autoencoder mainly includes a nonlinear transform encoder, a binarized quantizer, and a nonlinear transform decoder. Compared with traditional codecs, the proposed CNN autoencoder is more flexible and retains more image semantic information, which improves the quality of the decoded HDR image. Moreover, to reduce the compression artifacts and noise of the reconstructed HDR image, a post-processing method based on group convolutional neural networks is designed. Experimental results show that our method outperforms JPEG XT profiles A, B, and C as well as other methods in terms of the HDR-VDP-2 evaluation metric. Meanwhile, our scheme also provides backward compatibility with standard JPEG.
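The extension layer's binarized quantizer is typically the non-obvious part, since a hard sign function has zero gradient almost everywhere. The sketch below shows one common workaround, a straight-through estimator, wrapped around a tiny convolutional autoencoder; the layer sizes and the use of tanh before binarization are assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through gradient, a common way to make a
    hard quantizer trainable (the paper's exact quantizer may differ)."""
    @staticmethod
    def forward(ctx, x):
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output              # pass gradients through unchanged

class TinyExtensionLayerCodec(nn.Module):
    def __init__(self, ch=32, code_ch=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, ch, 4, stride=2, padding=1), nn.ReLU(),
                                     nn.Conv2d(ch, code_ch, 4, stride=2, padding=1), nn.Tanh())
        self.decoder = nn.Sequential(nn.ConvTranspose2d(code_ch, ch, 4, stride=2, padding=1), nn.ReLU(),
                                     nn.ConvTranspose2d(ch, 3, 4, stride=2, padding=1))

    def forward(self, residual):
        code = BinarizeSTE.apply(self.encoder(residual))   # binary bitstream surrogate
        return self.decoder(code), code

recon, code = TinyExtensionLayerCodec()(torch.randn(1, 3, 64, 64))
print(recon.shape, code.unique())   # reconstruction plus the {-1, +1} code values
```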
Citations: 2
GRNet: Deep Convolutional Neural Networks based on Graph Reasoning for Semantic Segmentation
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301851
Yang Wu, A. Jiang, Yibin Tang, H. Kwan
In this paper, we develop a novel deep-network architecture for semantic segmentation. In contrast to previous work that widely uses dilated convolutions, we employ the original ResNet as the backbone and introduce a multi-scale feature fusion module (MFFM) to extract long-range contextual information and upsample feature maps. Then, a graph reasoning module (GRM) based on a graph convolutional network (GCN) is developed to aggregate semantic information. Our graph reasoning network (GRNet) extracts global contexts of input features by modeling graph reasoning in a single framework. Experimental results demonstrate that our approach provides substantial benefits over a strong baseline and achieves superior segmentation performance on two benchmark datasets.
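To make the graph reasoning step concrete, the sketch below follows the common global-reasoning pattern: softly assign pixels to a small set of graph nodes, apply one graph convolution over a data-driven adjacency, and project the result back onto the feature map. The node count, projection layers, and adjacency construction are assumptions about how a GRM of this kind might be built, not GRNet's published design.

```python
import torch
import torch.nn as nn

class GraphReasoningModule(nn.Module):
    """Toy global-reasoning sketch: project pixels to a few graph nodes, apply a
    graph convolution, and project back (details of GRNet's GRM are assumptions)."""
    def __init__(self, in_ch=64, node_ch=32, num_nodes=16):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, num_nodes, 1)     # soft pixel-to-node assignment
        self.reduce = nn.Conv2d(in_ch, node_ch, 1)
        self.gcn = nn.Linear(node_ch, node_ch)         # node feature transform (W)
        self.expand = nn.Conv2d(node_ch, in_ch, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        assign = torch.softmax(self.proj(x).flatten(2), dim=-1)       # (n, K, HW)
        nodes = assign @ self.reduce(x).flatten(2).transpose(1, 2)    # (n, K, node_ch)
        adj = torch.softmax(nodes @ nodes.transpose(1, 2), dim=-1)    # data-driven adjacency
        nodes = torch.relu(self.gcn(adj @ nodes))                     # A X W
        out = (assign.transpose(1, 2) @ nodes).transpose(1, 2).reshape(n, -1, h, w)
        return x + self.expand(out)                                   # residual fusion

print(GraphReasoningModule()(torch.randn(1, 64, 16, 16)).shape)   # torch.Size([1, 64, 16, 16])
```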
Citations: 2
Random-access-aware Light Field Video Coding using Tree Pruning Method
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301800
T. N. Huu, V. V. Duong, B. Jeon
The increasing prevalence of VR/AR, as well as the expected availability of Light Field (LF) displays, calls for more practical methods to transmit LF images and video for services. In that respect, LF video coding should consider not only compression efficiency but also view random-access capability (especially in multi-view-based systems). A multi-view coding system heavily exploits view dependencies arising from both inter-view and temporal correlation. While such a system greatly improves compression efficiency, its view random-access capability can be much reduced due to the so-called "chain of dependencies." In this paper, we first model the chain of dependencies as a tree, and then use a cost function to assign an importance value to each tree node. Travelling from top to bottom, nodes of lesser importance are cut off, forming a pruned tree that reduces random-access complexity. Our tree pruning method is shown to reduce random-access complexity by about 40% at the cost of minor compression loss compared with state-of-the-art methods. Furthermore, our method is expected to be very lightweight in its realization and effective in a practical LF video coding system.
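The pruning step itself is easy to picture in plain Python: given a dependency tree whose nodes carry importance values from some cost function, walk it top-down and cut off low-importance subtrees (removing a node removes the whole prediction chain hanging from it). The node fields, cost values, and threshold below are illustrative assumptions, not the paper's actual cost function.

```python
from dataclasses import dataclass, field

@dataclass
class ViewNode:
    """One view/frame in the prediction-dependency tree (toy illustration only)."""
    name: str
    cost: float                       # assumed importance from some cost function
    children: list = field(default_factory=list)

def prune(node, threshold):
    """Walk top-down; cut off any child whose importance is below the threshold.
    Removing a node also removes the dependency chain rooted at it."""
    node.children = [c for c in node.children if c.cost >= threshold]
    for c in node.children:
        prune(c, threshold)
    return node

# Toy dependency tree: an intra-coded anchor view with two predicted chains.
root = ViewNode("anchor", 1.0, [
    ViewNode("view_A", 0.8, [ViewNode("view_A1", 0.3)]),
    ViewNode("view_B", 0.2, [ViewNode("view_B1", 0.9)]),
])
prune(root, threshold=0.5)
print([c.name for c in root.children])   # ['view_A'] -- the low-importance chain is gone
```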
Citations: 7
On 2D-3D Image Feature Detections for Image-To-Geometry Registration in Virtual Dental Model
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301774
Hui-jun Tang, R. T. Hsung, W. Y. Lam, Leo Y. Y. Cheng, E. Pow
3D digital smile design (DSD) has gained great interest in dentistry because it enables esthetic design of teeth and gums. However, the color texture of teeth and gums is often lost or distorted in the digitization process. Recently, the image-to-geometry registration shade mapping (IGRSM) method was proposed for registering color texture from 2D photographs to a 3D mesh model. It allows better control of illumination and color calibration for automatic tooth shade matching. In this paper, we investigate automated techniques to find the correspondences between the 3D tooth model and color intraoral photographs in order to accurately perform the IGRSM. We propose to use the tooth cusp tips as the correspondence points for the IGR because they can be reliably detected in both 2D photographs and 3D surface scans. A modified gradient descent method with directional priority (GDDP) and region growing are developed to find the 3D correspondence points. For the 2D image, the tooth-tip contour lines are extracted based on luminosity and chromaticity, and the contour peaks are then detected as the correspondence points. The experimental results show that the proposed method achieves excellent accuracy in detecting the correspondence points between 2D photographs and the 3D tooth model. The average registration error is less than 15 pixels for a 4752×3168 intraoral image.
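As a small illustration of the 2D side of the pipeline, the snippet below treats a tooth boundary as a 1D height profile along the dental arch and takes its prominent local maxima as candidate cusp-tip points. The profile here is synthetic and the peak-picking parameters are guesses; the paper's luminosity/chromaticity contour extraction is not reproduced.

```python
import numpy as np
from scipy.signal import find_peaks

# Toy illustration: sample the tooth boundary as a 1D height profile along the arch
# and take its prominent local maxima as candidate cusp-tip correspondence points.
arch_position = np.linspace(0, 4 * np.pi, 400)
height_profile = np.sin(arch_position) + 0.05 * np.random.randn(400)   # fake contour

peaks, _ = find_peaks(height_profile, prominence=0.5, distance=50)
print("candidate cusp tips at samples:", peaks)
```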
Citations: 1
A semantic labeling framework for ALS point clouds based on discretization and CNN
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301759
Xingtao Wang, Xiaopeng Fan, Debin Zhao
The airborne laser scanning (ALS) point cloud has drawn increasing attention thanks to its capability to quickly acquire large-scale, high-precision ground information. Due to the complexity of observed scenes and the irregularity of point distributions, the semantic labeling of ALS point clouds is extremely challenging. In this paper, we introduce an efficient discretization-based framework that exploits the geometric characteristics of ALS point clouds, and propose an intra-class weighted cross-entropy loss function to address the problem of data imbalance. We evaluate our framework on the ISPRS (International Society for Photogrammetry and Remote Sensing) 3D Semantic Labeling dataset. The experimental results show that the proposed method achieves new state-of-the-art performance in terms of overall accuracy (85.3%) and average F1 score (74.1%).
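One plausible reading of a class-imbalance-aware loss is a cross entropy whose per-class weights grow as class frequency shrinks. The sketch below implements that idea in PyTorch with batch-level frequencies; the exact intra-class weighting scheme in the paper may differ, so treat this as an assumption-labeled illustration.

```python
import torch
import torch.nn.functional as F

def weighted_cross_entropy(logits, labels, num_classes, eps=1.0):
    """Toy class-balanced cross entropy: weight each class inversely to its frequency
    in the current batch. The paper's intra-class weighting scheme may differ."""
    counts = torch.bincount(labels.view(-1), minlength=num_classes).float()
    weights = (counts.sum() + eps) / (counts + eps)    # rare classes get large weights
    weights = weights / weights.sum() * num_classes    # normalize so weights average ~1
    return F.cross_entropy(logits, labels, weight=weights)

logits = torch.randn(8, 9)              # 8 points, 9 semantic classes
labels = torch.randint(0, 9, (8,))
print(weighted_cross_entropy(logits, labels, num_classes=9))
```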
Citations: 0
Orthogonal Features Fusion Network for Anomaly Detection
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301755
Teli Ma, Yizhi Wang, Jinxin Shao, Baochang Zhang, D. Doermann
Generative models have been used successfully for anomaly detection; however, they require a large number of parameters and high computational overhead, especially when spatial and temporal networks are trained in the same framework. In this paper, we introduce a novel network architecture, the Orthogonal Features Fusion Network (OFF-Net), to solve the anomaly detection problem. We show that the convolutional feature maps used for generating future frames are orthogonal to each other, which can improve the representation capacity of generative models and strengthen temporal connections between adjacent images. We introduce a simple but effective module that is easily mounted on convolutional neural networks (CNNs) with negligible additional parameters and can replace the widely used optical flow network while significantly improving anomaly detection performance. Extensive experimental results demonstrate the effectiveness of OFF-Net: we outperform the state-of-the-art model by 1.7% in terms of AUC. We also save around 85M parameters compared with prevailing prior art that uses an optical flow network, without compromising performance.
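How one might encourage feature-map orthogonality is sketched below: channel features are flattened and unit-normalized, and the off-diagonal entries of their Gram (cosine similarity) matrix are penalized. Whether OFF-Net enforces orthogonality with a loss like this is an assumption; the snippet only illustrates the notion of mutually orthogonal feature maps.

```python
import torch

def orthogonality_penalty(feature_maps):
    """Toy regularizer pushing channel-wise feature maps toward mutual orthogonality:
    penalize off-diagonal entries of the normalized Gram matrix. This is an assumption
    about how 'orthogonal features' could be enforced; OFF-Net's exact loss is not shown."""
    n, c, h, w = feature_maps.shape
    flat = feature_maps.view(n, c, h * w)
    flat = torch.nn.functional.normalize(flat, dim=-1)   # unit-norm each channel
    gram = flat @ flat.transpose(1, 2)                   # (n, c, c) cosine similarities
    off_diag = gram - torch.eye(c, device=gram.device)
    return (off_diag ** 2).mean()

print(orthogonality_penalty(torch.randn(2, 16, 8, 8)))
```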
Citations: 0