Real-Time Implementation of Scalable HEVC Encoder
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9191135
Jaakko Laitinen, Ari Lemmetti, Jarno Vanne
This paper presents the first known open-source Scalable HEVC (SHVC) encoder for real-time applications. Our proposal is built on top of the Kvazaar HEVC encoder by extending its functionality with spatial and signal-to-noise ratio (SNR) scalable coding schemes. These two scalability schemes have been optimized for real-time coding by means of three parallelization techniques: 1) wavefront parallel processing (WPP); 2) overlapped wavefront (OWF); and 3) AVX2-optimized upsampling. On an 8-core Xeon W-2145 processor, the proposed spatially scalable Kvazaar can encode two-layer 1080p video above 50 fps with scaling ratios of 1.5 and 2. The respective coding gains are 18.4% and 9.9% over Kvazaar simulcast coding at similar speed. Correspondingly, the coding speed of SNR scalable Kvazaar exceeds 30 fps with two-layer 1080p video. On average, it obtains a 1.20× speedup and 17.0% better coding efficiency over the simulcast case. These results justify the benefits of the proposed scalability schemes in real-time SHVC coding.
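A minimal, hypothetical sketch (not Kvazaar's actual implementation) of the wavefront parallel processing (WPP) dependency that makes CTU rows schedulable in parallel: CTU (r, c) may start once its left neighbour (r, c-1) and the top-right neighbour (r-1, c+1) in the row above are finished.

```python
def wpp_schedule(rows: int, cols: int) -> list[list[int]]:
    """Return the earliest parallel time step at which each CTU can be coded."""
    step = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            left = step[r][c - 1] if c > 0 else -1
            top_right = step[r - 1][min(c + 1, cols - 1)] if r > 0 else -1
            step[r][c] = max(left, top_right) + 1
    return step

if __name__ == "__main__":
    # A 1080p frame with 64x64 CTUs has 17 rows x 30 columns; each row becomes
    # available with a two-CTU stagger, so up to 17 rows can run concurrently.
    for row in wpp_schedule(17, 30)[:3]:
        print(row[:8])
```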
{"title":"Real-Time Implementation Of Scalable Hevc Encoder","authors":"Jaakko Laitinen, Ari Lemmetti, Jarno Vanne","doi":"10.1109/ICIP40778.2020.9191135","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191135","url":null,"abstract":"This paper presents the first known open-source Scalable HEVC (SHVC) encoder for real-time applications. Our proposal is built on top of Kvazaar HEVC encoder by extending its functionality with spatial and signal-to-noise ratio (SNR) scalable coding schemes. These two scalability schemes have been optimized for real-time coding by means of three parallelization techniques: 1) wavefront parallel processing (WPP); 2) overlapped wavefront (OWF); and 3) AVX2-optimized upsampling. On an 8-core Xeon W-2145 processor, the proposed spatially scalable Kvazaar can encode twolayer 1080p video above 50 fps with scaling ratios of l.5 and 2. The respective coding gain s are 18.4% and 9.9% over Kvazaar simulcast coding at similar speed. Correspondingly, the coding speed of SNR scalable Kvazaar exceeds 30 fps with two-layer 1080p video. On average, it obtain s1.20 times speedup and 17.0% better coding efficiency over the simulcast case. These results justify the benefits of the proposed scalability schemes in real-time SHVC coding.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126323166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Progressive Point To Set Metric Learning For Semi-Supervised Few-Shot Classification
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9191261
Pengfei Zhu, Mingqi Gu, Wenbin Li, Changqing Zhang, Q. Hu
Few-shot learning aims to learn models that can generalize to unseen tasks from very few annotated samples of available tasks. The performance of few-shot learning is greatly affected by the number of samples per class, and massive unlabeled data can help to boost the performance of few-shot learning models. In this paper, we propose a novel progressive point-to-set metric learning (PPSML) model for semi-supervised few-shot classification. The distance metric from an image in the query set to a class of the support set is defined by a point-to-set distance. A self-training strategy is designed to select samples locally or globally with high confidence and use them to progressively update the point-to-set distance. Experiments on benchmark datasets show that our proposed PPSML significantly improves the accuracy of few-shot classification and outperforms state-of-the-art semi-supervised few-shot learning methods.
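A minimal sketch of a point-to-set distance for few-shot classification, here taken as the minimum Euclidean distance from a query embedding to each class's support set; the learned, progressively refined metric of PPSML itself is not reproduced.

```python
import numpy as np

def point_to_set_distance(query: np.ndarray, support_set: np.ndarray) -> float:
    """query: (d,) embedding; support_set: (k, d) embeddings of one class."""
    return float(np.min(np.linalg.norm(support_set - query, axis=1)))

def classify(query, support_sets):
    """support_sets: dict mapping class label -> (k, d) support embeddings."""
    return min(support_sets, key=lambda c: point_to_set_distance(query, support_sets[c]))

rng = np.random.default_rng(0)
supports = {c: rng.normal(loc=c, size=(5, 64)) for c in range(3)}
print(classify(rng.normal(loc=1, size=64), supports))  # nearest class by set distance
```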
{"title":"Progressive Point To Set Metric Learning For Semi-Supervised Few-Shot Classification","authors":"Pengfei Zhu, Mingqi Gu, Wenbin Li, Changqing Zhang, Q. Hu","doi":"10.1109/ICIP40778.2020.9191261","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191261","url":null,"abstract":"Few-shot learning aims to learn models that can generalize to unseen tasks from very few annotated samples of available tasks. The performance of few-shot learning is greatly affected by the number of samples per class. The massive unlabeled data can help to boost the performance of few shot learning models. In this paper, we propose a novel progressive point to set metric learning (PPSML) model for semisupervised few-shot classification. The distance metric is defined for an image of the query set to a class of the support set by point to set distance. A self-training strategy is designed to select the samples locally or globally with high confidence and use these samples to progressively update the point to set distance. Experiments on benchmark datasets show that our proposed PPSML significantly improves the accuracy of few shot classification and outperforms the state-of-the-art semisupervised few-shot learning methods.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121593028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On Extended Transform Partitions For The Next Generation Video CODEC
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9191142
Sarah Parker, Yue Chen, Urvang Joshi, Elliott Karpilovsky, D. Mukherjee
AV1, AOMedia’s royalty-free codec, has enjoyed a great amount of success since its 2018 release. It currently achieves 31% BD-rate gains over VP9 and is on its way to becoming YouTube’s default codec. Although the industry is currently focused on the implementation and optimization of AV1, AOMedia Research continues to develop new coding tools that deliver higher coding gains within acceptable complexity bounds. Here, we focus on improving transform coding. While AV1 has made great strides in transform coding over VP9, the residue signal still consumes a large portion of the bitstream. In this paper, we describe a more flexible transform partitioning scheme, which allows the next-generation codec to more efficiently target areas of the residue signal with high energy, leading to better residue compression.
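A hypothetical sketch of energy-driven transform-block splitting (the actual extended partition set proposed for the next-generation codec is not shown here): a residual block is recursively quad-split while one of its sub-blocks concentrates enough energy to justify a smaller transform.

```python
import numpy as np

def split_transform_blocks(residual, size, x=0, y=0, min_size=4, energy_ratio=0.5):
    """Return a list of (x, y, size) transform blocks covering the residual."""
    block = residual[y:y + size, x:x + size]
    total = np.sum(block ** 2) + 1e-9
    if size <= min_size:
        return [(x, y, size)]
    half = size // 2
    # Split if any quadrant holds a disproportionate share of the energy.
    quads = [block[:half, :half], block[:half, half:], block[half:, :half], block[half:, half:]]
    if max(np.sum(q ** 2) for q in quads) / total < energy_ratio:
        return [(x, y, size)]
    out = []
    for dy in (0, half):
        for dx in (0, half):
            out += split_transform_blocks(residual, half, x + dx, y + dy, min_size, energy_ratio)
    return out

residual = np.zeros((16, 16))
residual[:4, :4] = 5.0  # energy concentrated in the top-left corner
print(split_transform_blocks(residual, 16))  # small blocks there, large blocks elsewhere
```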
{"title":"On Extended Transform Partitions For The Next Generation Video CODEC","authors":"Sarah Parker, Yue Chen, Urvang Joshi, Elliott Karpilovsky, D. Mukherjee","doi":"10.1109/ICIP40778.2020.9191142","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191142","url":null,"abstract":"AV1, AOMedia’s royalty free codec, has enjoyed a great amount of success since its 2018 release. It currently achieves 31% BDRATE gains over VP9, and is on its way to becoming YouTube’s default codec. Although the industry is currently focused on the implementation and optimization of AV1, AOMedia Research continues to develop new coding tools that deliver higher coding gains within acceptable complexity bounds. Here, we focus on improving transform coding. While AV1 has made great strides in transform coding over VP9, the residue signal still consumes a large portion of the bitstream. In this paper, we describe a more flexible transform partitioning scheme, which will allow the next generation codec to more efficiently target areas in the residue signal with high energy, leading to better residue compression.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132132419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cascaded Mixed-Precision Networks
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9190760
Xue Geng, Jie Lin, Shaohua Li
There is a vast literature on neural network compression, either by quantizing network variables to low-precision numbers or by pruning redundant connections from the network architecture. However, these techniques suffer performance degradation when the compression ratio is pushed to an extreme. In this paper, we propose Cascaded Mixed-precision Networks (CMNs), which are compact yet efficient neural networks that do not incur a performance drop. A CMN is designed as a cascade framework that concatenates a group of neural networks with sequentially increasing bitwidth. The execution flow of a CMN is conditional on the difficulty of input samples: easy examples are correctly classified by extremely low-bitwidth networks, and hard examples are handled by high-bitwidth networks, so the average compute is reduced. In addition, weight pruning is incorporated into the cascaded framework and jointly optimized with the mixed-precision quantization. To validate this method, we implemented a 2-stage CMN consisting of a binary neural network and a multi-bit (e.g., 8-bit) neural network. Empirical results on CIFAR-100 and ImageNet demonstrate that CMN performs better than state-of-the-art methods in terms of accuracy and compute.
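A minimal sketch of the cascaded execution idea, with hypothetical stand-in models for the binary and 8-bit stages (the paper's quantization and pruning are omitted): easy samples exit at the low-bitwidth stage, while low-confidence samples fall through to the high-bitwidth stage.

```python
import torch
import torch.nn.functional as F

def cascade_predict(x, low_bit_model, high_bit_model, threshold=0.9):
    """x: (N, ...) batch. Returns per-sample class predictions."""
    with torch.no_grad():
        logits_low = low_bit_model(x)
        conf, pred = F.softmax(logits_low, dim=1).max(dim=1)
        hard = conf < threshold                      # samples the cheap stage is unsure about
        if hard.any():
            pred[hard] = high_bit_model(x[hard]).argmax(dim=1)
    return pred

# Hypothetical stand-ins: any pair of classifiers with matching in/out shapes works here.
low = torch.nn.Linear(32, 10)
high = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(), torch.nn.Linear(64, 10))
print(cascade_predict(torch.randn(8, 32), low, high))
```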
{"title":"Cascaded Mixed-Precision Networks","authors":"Xue Geng, Jie Lin, Shaohua Li","doi":"10.1109/ICIP40778.2020.9190760","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190760","url":null,"abstract":"There has been a vast literature on Neural Network Compression, either by quantizing network variables to low precision numbers or pruning redundant connections from the network architecture. However, these techniques experience performance degradation when the compression ratio is increased to an extreme extent. In this paper, we propose Cascaded Mixed-precision Networks (CMNs), which are compact yet efficient neural networks without incurring performance drop. CMN is designed as a cascade framework by concatenating a group of neural networks with sequentially increased bitwidth. The execution flow of CMN is conditional on the difficulty of input samples, i.e., easy examples will be correctly classified by going through extremely low-bitwidth networks, and hard examples will be handled by high-bitwidth networks, so that the average compute is reduced. In addition, weight pruning is incorporated into the cascaded framework and jointly optimized with the mixed-precision quantization. To validate this method, we implemented a 2-stage CMN consisting of a binary neural network and a multi-bit (e.g. 8 bits) neural network. Empirical results on CIFAR-100 and ImageNet demonstrate that CMN performs better than state-of-the-art methods, in terms of accuracy and compute.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130206511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Enhanced Deep Learning Architecture for Classification of Tuberculosis Types From CT Lung Images
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9190815
Xiaohong W. Gao, R. Comley, Maleika Heenaye-Mamode Khan
In this work, an enhanced ResNet deep learning network, depth-ResNet, has been developed to classify the five types of tuberculosis (TB) in lung CT images. Depth-ResNet takes 3D CT images as a whole and processes the volumetric blocks along the depth direction. It builds on the ResNet-50 model to obtain 2D features on each frame and injects depth information at each processing block. As a result, the average classification accuracy is 71.60% for depth-ResNet and 68.59% for ResNet. The datasets are collected from the ImageCLEF 2018 competition, with 1,008 training samples in total, where the top reported accuracy was 42.27%.
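A minimal sketch of the "2D features per CT slice, aggregated along depth" idea; a small stand-in CNN replaces the ResNet-50 backbone, and the paper's per-block depth injection is reduced here to a simple average over slices.

```python
import torch
import torch.nn as nn

class SliceWiseClassifier(nn.Module):
    def __init__(self, num_classes=5):
        super().__init__()
        self.backbone = nn.Sequential(           # stand-in for a 2D ResNet-50
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(16, num_classes)

    def forward(self, volume):                    # volume: (N, D, H, W) CT scans
        n, d, h, w = volume.shape
        slices = volume.reshape(n * d, 1, h, w)   # treat each slice as a 2D image
        feats = self.backbone(slices).reshape(n, d, -1)
        return self.head(feats.mean(dim=1))       # aggregate features along depth

model = SliceWiseClassifier()
print(model(torch.randn(2, 64, 128, 128)).shape)  # -> torch.Size([2, 5])
```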
{"title":"An Enhanced Deep Learning Architecture for Classification of Tuberculosis Types From CT Lung Images","authors":"Xiaohong W. Gao, R. Comley, Maleika Heenaye-Mamode Khan","doi":"10.1109/ICIP40778.2020.9190815","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190815","url":null,"abstract":"In this work, an enhanced ResNet deep learning network, depth-ResNet, has been developed to classify the five types of Tuberculosis (TB) lung CT images. Depth-ResNet takes 3D CT images as a whole and processes the volumatic blocks along depth directions. It builds on the ResNet-50 model to obtain 2D features on each frame and injects depth information at each process block. As a result, the averaged accuracy for classification is 71.60% for depth-ResNet and 68.59% for ResNet. The datasets are collected from the ImageCLEF 2018 competition with 1008 training data in total, where the top reported accuracy was 42.27%.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133958425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gradient Deconfliction-Based Training For Multi-Exit Architectures
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9190812
Xinglu Wang, Yingming Li
Multi-exit architectures, in which a sequence of intermediate classifiers is introduced at different depths of the feature layers, perform adaptive computation by early-exiting “easy” samples to speed up inference. In this paper, we propose a new gradient deconfliction-based training technique for multi-exit architectures. In particular, the conflict between the gradients back-propagated from different classifiers is removed by projecting the gradient from one classifier onto the normal plane of the gradient from the other classifier. Experiments on CIFAR-100 and ImageNet show that the gradient deconfliction-based training strategy significantly improves the performance of state-of-the-art multi-exit neural networks. Moreover, the method requires no architectural modifications, can be effectively combined with other previously proposed training techniques, and further boosts their performance.
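A minimal sketch of the projection step described above: when the gradients from two exits conflict (negative dot product), one gradient is projected onto the normal plane of the other, removing the conflicting component.

```python
import numpy as np

def deconflict(g1: np.ndarray, g2: np.ndarray) -> np.ndarray:
    """Project g1 onto the normal plane of g2 if the two gradients conflict."""
    dot = np.dot(g1, g2)
    if dot < 0:
        g1 = g1 - dot / (np.dot(g2, g2) + 1e-12) * g2
    return g1

g_exit2 = np.array([1.0, 1.0])
print(deconflict(np.array([1.0, -1.0]), g_exit2))   # no conflict (dot = 0): unchanged
print(deconflict(np.array([-1.0, 0.0]), g_exit2))   # conflict removed -> [-0.5, 0.5]
```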
{"title":"Gradient Deconfliction-Based Training For Multi-Exit Architectures","authors":"Xinglu Wang, Yingming Li","doi":"10.1109/ICIP40778.2020.9190812","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190812","url":null,"abstract":"Muiti-exit architectures, in which a sequence of intermediate classifiers are introduced at different depths of the feature layers, perform adaptive computation by early exiting “easy” samples to speed up the inference. In this paper, we propose a new gradient deconfliction-based training technique for multi-exit architectures. In particular, the conflicting between the gradients back-propagated from different classifiers is removed by projecting the gradient from one classifier onto the normal plane of the gradient from the other classifier. Experiments on CFAR-100 and ImageNet show that the gradient deconfliction-based training strategy significantly improves the performance of the state-of-the-art multi-exit neural networks. Moreover, this method does not require within architecture modifications and can be effectively combined with other previously-proposed training techniques and further boosts the performance.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134336483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Edge-Guided Image Downscaling With Adaptive Filtering
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9190972
Dubok Park
In this paper, we propose a novel framework for image downscaling with edge-guided interpolation and adaptive filtering. First, we extract a second-derivative edge-guidance map from the input image. Then, inter-pixels are interpolated via the edge-guidance map and their fractional distance from the input pixels. Finally, adaptive filtering is applied to the expanded pixels to alleviate artifacts while preserving the details and content of the input image. Experimental results validate that the proposed framework achieves content-preserving results while reducing artifacts.
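A rough sketch of the first two steps as described, under assumed details (the exact weighting and the adaptive filter are not specified here): a second-derivative (Laplacian) edge-guidance map is computed, and an inter-pixel between two horizontal neighbours is biased toward the neighbour with the weaker edge response.

```python
import numpy as np

def laplacian_map(img: np.ndarray) -> np.ndarray:
    """Second-derivative (Laplacian magnitude) edge-guidance map."""
    padded = np.pad(img.astype(float), 1, mode="edge")
    return np.abs(4 * padded[1:-1, 1:-1] - padded[:-2, 1:-1] - padded[2:, 1:-1]
                  - padded[1:-1, :-2] - padded[1:-1, 2:])

def interpolate_row_midpoints(img: np.ndarray) -> np.ndarray:
    """Inter-pixels halfway between horizontal neighbours, guided by edge strength."""
    g = laplacian_map(img)
    left, right = img[:, :-1].astype(float), img[:, 1:].astype(float)
    w = (g[:, 1:] + 1e-6) / (g[:, :-1] + g[:, 1:] + 2e-6)  # weight toward the flatter side
    return w * left + (1 - w) * right

img = np.array([[10, 10, 200, 200]], dtype=float)
print(interpolate_row_midpoints(img))  # midpoints avoid blurring across the 10->200 edge
```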
{"title":"Edge-Guided Image Downscaling With Adaptive Filtering","authors":"Dubok Park","doi":"10.1109/ICIP40778.2020.9190972","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190972","url":null,"abstract":"In this paper, we propose a novel framework for image downscaling with edge-guided interpolation and adaptive filtering. First, we extract the second derivative edge-guidance map from an input image. Then, inter-pixels are interpolated via edge-guidance map and portion of distance from the input pixels. Finally, adaptive filtering is applied to the expanded pixels for alleviating artifacts while preserving details and contents of input image. Experimental results validate the proposed framework can achieve content-preserving results while reducing artifacts.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"272 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131480960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust Intensity Image Reconstruction Based On Event Cameras
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9190830
Meng Jiang, Zhou Liu, Bishan Wang, Lei Yu, Wen Yang
The event camera is a novel sensor that records brightness changes in the form of asynchronous events with high temporal resolution, while simultaneously outputting intensity images at a lower frame rate. The recorded events are noisy, and the captured intensity images often suffer from motion blur and noise. Reconstructing high-quality images is therefore of great significance for applying event cameras in computer vision. However, existing reconstruction methods only address the motion blur issue without considering the influence of noise. In this paper, we propose a variational model that uses a spatial-smoothness constraint as regularization to recover clean image frames, at any frame rate, from blurry and noisy camera images and events. We present experimental results on a synthetic dataset as well as a real high-speed, high-dynamic-range dataset to demonstrate that the proposed algorithm is superior to other reconstruction algorithms.
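A toy sketch of the variational structure only: an image is recovered by minimising a data-fidelity term plus a spatial-smoothness regulariser with gradient descent. The paper's event-based data term and blur model are not reproduced; the sketch illustrates how the smoothness constraint enters the optimisation.

```python
import numpy as np

def reconstruct(observed: np.ndarray, lam=0.5, steps=200, lr=0.1) -> np.ndarray:
    """Minimise ||I - observed||^2 + lam * ||grad I||^2 by gradient descent."""
    img = observed.astype(float).copy()
    for _ in range(steps):
        grad = 2 * (img - observed)                       # gradient of the data term
        padded = np.pad(img, 1, mode="edge")              # gradient of the smoothness term
        laplace = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                   + padded[1:-1, :-2] + padded[1:-1, 2:] - 4 * img)
        grad -= 2 * lam * laplace
        img -= lr * grad
    return img

noisy = np.ones((32, 32)) + np.random.default_rng(0).normal(0, 0.3, (32, 32))
print(float(np.std(reconstruct(noisy))), float(np.std(noisy)))  # smoothed vs. noisy
```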
{"title":"Robust Intensity Image Reconstruciton Based On Event Cameras","authors":"Meng Jiang, Zhou Liu, Bishan Wang, Lei Yu, Wen Yang","doi":"10.1109/ICIP40778.2020.9190830","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190830","url":null,"abstract":"The event camera is a novel sensor that records brightness change in the form of asynchronous events with high temporal resolution, and simultaneously outputs intensity images with a lower frame rate. Events recorded by sensors have a lot of noise and the intensity images captured often suffer from motion blur and noise effects. Therefore, to reconstruct high quality images is of great significance for the application of event camera in computer vision. However, the existing reconstruction methods only addressed the motion blur issue without considering the influence of noise. In this paper, we propose a variational model by using spatial smooth constraint regularization to recover clean image frames from blurry and noisy camera images and events at any frame rate. We present experimental results on synthetic dataset as well as real dataset with high speed and high dynamic range to demonstrate that the proposed algorithm is superior to the other reconstruction algorithms.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"21 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131084830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Bayesian View of Frame Interpolation and a Comparison with Existing Motion Picture Effects Tools
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9191152
A. Kokaram, Davinder Singh, Simon Robinson
Frame interpolation is the process of synthesising a new frame in between existing frames of an image sequence. It has emerged as a key module in motion picture effects. Previous work relies either on two-frame interpolation based entirely on optic flow or, more recently, on DNNs. This paper presents a new algorithm based on multiframe motion interpolation, motivated in a Bayesian sense. We also present the first comparison using industrial toolkits used in the post-production industry today. We find that the latest convolutional neural network approaches do not significantly outperform explicit motion-based techniques.
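A minimal sketch of plain motion-compensated two-frame interpolation (not the paper's Bayesian multiframe estimator): given dense flow from frame0 to frame1, the in-between frame at t=0.5 is built by warping both frames halfway along the flow and blending them.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def interpolate_midframe(frame0, frame1, flow, t=0.5):
    """frame0, frame1: (H, W); flow: (H, W, 2) with (dy, dx) from frame0 to frame1."""
    h, w = frame0.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    # Backward-warp each frame toward the intermediate time instant.
    from0 = map_coordinates(frame0, [yy - t * flow[..., 0], xx - t * flow[..., 1]],
                            order=1, mode="nearest")
    from1 = map_coordinates(frame1, [yy + (1 - t) * flow[..., 0], xx + (1 - t) * flow[..., 1]],
                            order=1, mode="nearest")
    return (1 - t) * from0 + t * from1

frame0 = np.zeros((8, 8)); frame0[2, 2] = 1.0
frame1 = np.zeros((8, 8)); frame1[2, 4] = 1.0               # object moved 2 px right
flow = np.zeros((8, 8, 2)); flow[..., 1] = 2.0              # constant horizontal flow
print(np.argwhere(interpolate_midframe(frame0, frame1, flow) > 0.5))  # -> [[2, 3]]
```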
{"title":"A Bayesian View of Frame Interpolation and a Comparison with Existing Motion Picture Effects Tools","authors":"A. Kokaram, Davinder Singh, Simon Robinson","doi":"10.1109/ICIP40778.2020.9191152","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191152","url":null,"abstract":"Frame interpolation is the process of synthesising a new frame in-between existing frames in an image sequence. It has emerged as a key module in motion picture effects. Previous work either relies on two frame interpolation based entirely on optic flow, or recently DNNs. This paper presents a new algorithm based on multiframe motion interpolation motivated in a Bayesian sense. We also present the first comparison using industrial toolkits used in the post production industry today. We find that the latest Convolutional Neural Network approaches do not significantly outperform explicit motion based techniques.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133512843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
X-NET For Single Image Raindrop Removal
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9191073
Jiamin Lin, Longquan Dai
Photos taken on rainy days are often degraded by raindrops adhering to the camera lens. Removing raindrops from images is a tough task: its difficulties lie in restoring high-frequency information from corrupted images while keeping the colors of the restored images consistent with human perception. To solve these problems, we propose an end-to-end convolutional neural network consisting of X-Net and RAD-Net (Raindrop Automatic Detection Net). X-Net takes advantage of Long Skip Connections and Cross Branch Connections to generate raindrop-free images with sufficient detail. RAD-Net assists X-Net in producing better results by yielding raindrop locations. Extensive experiments show that our approach outperforms state-of-the-art methods quantitatively and qualitatively.
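A highly simplified stand-in (not the actual X-Net) illustrating the long-skip idea: an encoder-decoder whose input is carried over a long skip connection to the output, so the network only has to predict a raindrop correction; the paper's cross-branch connections and RAD-Net guidance are not reproduced.

```python
import torch
import torch.nn as nn

class TinyDerainNet(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
                                     nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
                                     nn.Conv2d(ch, 3, 3, padding=1))

    def forward(self, x):
        residual = self.decoder(self.encoder(x))
        return x + residual            # long skip connection: input + predicted correction

net = TinyDerainNet()
print(net(torch.randn(1, 3, 64, 64)).shape)   # -> torch.Size([1, 3, 64, 64])
```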
{"title":"X-NET For Single Image Raindrop Removal","authors":"Jiamin Lin, Longquan Dai","doi":"10.1109/ICIP40778.2020.9191073","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191073","url":null,"abstract":"Photos taken on rainy days are likely degraded by raindrops adhered to camera lenses. Removing raindrops from images is a tough task. Its difficulties lie in restoring high frequency information from corrupted images while keeping the color of restored images consistent with human perception. To solve these problems, we propose an end-to-end convolutional neural network consisting of X-Net and RAD-Net (Raindrop Automatic Detection Net). X-Net takes advantage of Long Skip Connections and Cross Branch Connections to generate raindrop-free image with enough details. RAD-Net assists X-Net to produce better results by yielding raindrop location. Extensive experiments show our approach outperforms state-of-the-art methods quantitatively and qualitatively.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132153618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}