Real-Time Implementation of Scalable HEVC Encoder
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9191135
Jaakko Laitinen, Ari Lemmetti, Jarno Vanne
This paper presents the first known open-source Scalable HEVC (SHVC) encoder for real-time applications. Our proposal is built on top of the Kvazaar HEVC encoder by extending its functionality with spatial and signal-to-noise ratio (SNR) scalable coding schemes. These two scalability schemes have been optimized for real-time coding by means of three parallelization techniques: 1) wavefront parallel processing (WPP); 2) overlapped wavefront (OWF); and 3) AVX2-optimized upsampling. On an 8-core Xeon W-2145 processor, the proposed spatially scalable Kvazaar can encode two-layer 1080p video above 50 fps with scaling ratios of 1.5 and 2. The respective coding gains are 18.4% and 9.9% over Kvazaar simulcast coding at similar speed. Correspondingly, the coding speed of SNR scalable Kvazaar exceeds 30 fps with two-layer 1080p video. On average, it obtains a 1.20× speedup and 17.0% better coding efficiency over the simulcast case. These results justify the benefits of the proposed scalability schemes in real-time SHVC coding.
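A minimal, hypothetical sketch (not Kvazaar's actual implementation) of the wavefront parallel processing (WPP) dependency that makes CTU rows schedulable in parallel: CTU (r, c) may start once its left neighbour (r, c-1) and the top-right neighbour (r-1, c+1) in the row above are finished.

```python
def wpp_schedule(rows: int, cols: int) -> list[list[int]]:
    """Return the earliest parallel time step at which each CTU can be coded."""
    step = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            left = step[r][c - 1] if c > 0 else -1
            top_right = step[r - 1][min(c + 1, cols - 1)] if r > 0 else -1
            step[r][c] = max(left, top_right) + 1
    return step

if __name__ == "__main__":
    # A 1080p frame with 64x64 CTUs has 17 rows x 30 columns; each row becomes
    # available with a two-CTU stagger, so up to 17 rows can run concurrently.
    for row in wpp_schedule(17, 30)[:3]:
        print(row[:8])
```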
{"title":"Real-Time Implementation Of Scalable Hevc Encoder","authors":"Jaakko Laitinen, Ari Lemmetti, Jarno Vanne","doi":"10.1109/ICIP40778.2020.9191135","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191135","url":null,"abstract":"This paper presents the first known open-source Scalable HEVC (SHVC) encoder for real-time applications. Our proposal is built on top of Kvazaar HEVC encoder by extending its functionality with spatial and signal-to-noise ratio (SNR) scalable coding schemes. These two scalability schemes have been optimized for real-time coding by means of three parallelization techniques: 1) wavefront parallel processing (WPP); 2) overlapped wavefront (OWF); and 3) AVX2-optimized upsampling. On an 8-core Xeon W-2145 processor, the proposed spatially scalable Kvazaar can encode twolayer 1080p video above 50 fps with scaling ratios of l.5 and 2. The respective coding gain s are 18.4% and 9.9% over Kvazaar simulcast coding at similar speed. Correspondingly, the coding speed of SNR scalable Kvazaar exceeds 30 fps with two-layer 1080p video. On average, it obtain s1.20 times speedup and 17.0% better coding efficiency over the simulcast case. These results justify the benefits of the proposed scalability schemes in real-time SHVC coding.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126323166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Progressive Point To Set Metric Learning For Semi-Supervised Few-Shot Classification
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9191261
Pengfei Zhu, Mingqi Gu, Wenbin Li, Changqing Zhang, Q. Hu
Few-shot learning aims to learn models that can generalize to unseen tasks from very few annotated samples of available tasks. The performance of few-shot learning is greatly affected by the number of samples per class, and massive unlabeled data can help to boost the performance of few-shot learning models. In this paper, we propose a novel progressive point-to-set metric learning (PPSML) model for semi-supervised few-shot classification. The distance metric from an image in the query set to a class of the support set is defined by a point-to-set distance. A self-training strategy is designed to select samples locally or globally with high confidence and use them to progressively update the point-to-set distance. Experiments on benchmark datasets show that our proposed PPSML significantly improves the accuracy of few-shot classification and outperforms state-of-the-art semi-supervised few-shot learning methods.
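A minimal sketch of a point-to-set distance for few-shot classification, here taken as the minimum Euclidean distance from a query embedding to each class's support set; the learned, progressively refined metric of PPSML itself is not reproduced.

```python
import numpy as np

def point_to_set_distance(query: np.ndarray, support_set: np.ndarray) -> float:
    """query: (d,) embedding; support_set: (k, d) embeddings of one class."""
    return float(np.min(np.linalg.norm(support_set - query, axis=1)))

def classify(query, support_sets):
    """support_sets: dict mapping class label -> (k, d) support embeddings."""
    return min(support_sets, key=lambda c: point_to_set_distance(query, support_sets[c]))

rng = np.random.default_rng(0)
supports = {c: rng.normal(loc=c, size=(5, 64)) for c in range(3)}
print(classify(rng.normal(loc=1, size=64), supports))  # nearest class by set distance
```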
{"title":"Progressive Point To Set Metric Learning For Semi-Supervised Few-Shot Classification","authors":"Pengfei Zhu, Mingqi Gu, Wenbin Li, Changqing Zhang, Q. Hu","doi":"10.1109/ICIP40778.2020.9191261","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191261","url":null,"abstract":"Few-shot learning aims to learn models that can generalize to unseen tasks from very few annotated samples of available tasks. The performance of few-shot learning is greatly affected by the number of samples per class. The massive unlabeled data can help to boost the performance of few shot learning models. In this paper, we propose a novel progressive point to set metric learning (PPSML) model for semisupervised few-shot classification. The distance metric is defined for an image of the query set to a class of the support set by point to set distance. A self-training strategy is designed to select the samples locally or globally with high confidence and use these samples to progressively update the point to set distance. Experiments on benchmark datasets show that our proposed PPSML significantly improves the accuracy of few shot classification and outperforms the state-of-the-art semisupervised few-shot learning methods.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121593028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On Extended Transform Partitions For The Next Generation Video CODEC
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9191142
Sarah Parker, Yue Chen, Urvang Joshi, Elliott Karpilovsky, D. Mukherjee
AV1, AOMedia’s royalty-free codec, has enjoyed a great amount of success since its 2018 release. It currently achieves 31% BD-rate gains over VP9 and is on its way to becoming YouTube’s default codec. Although the industry is currently focused on the implementation and optimization of AV1, AOMedia Research continues to develop new coding tools that deliver higher coding gains within acceptable complexity bounds. Here, we focus on improving transform coding. While AV1 has made great strides in transform coding over VP9, the residue signal still consumes a large portion of the bitstream. In this paper, we describe a more flexible transform partitioning scheme, which allows the next-generation codec to more efficiently target areas of the residue signal with high energy, leading to better residue compression.
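A hypothetical sketch of energy-driven transform-block splitting (the actual extended partition set proposed for the next-generation codec is not shown here): a residual block is recursively quad-split while one of its sub-blocks concentrates enough energy to justify a smaller transform.

```python
import numpy as np

def split_transform_blocks(residual, size, x=0, y=0, min_size=4, energy_ratio=0.5):
    """Return a list of (x, y, size) transform blocks covering the residual."""
    block = residual[y:y + size, x:x + size]
    total = np.sum(block ** 2) + 1e-9
    if size <= min_size:
        return [(x, y, size)]
    half = size // 2
    # Split if any quadrant holds a disproportionate share of the energy.
    quads = [block[:half, :half], block[:half, half:], block[half:, :half], block[half:, half:]]
    if max(np.sum(q ** 2) for q in quads) / total < energy_ratio:
        return [(x, y, size)]
    out = []
    for dy in (0, half):
        for dx in (0, half):
            out += split_transform_blocks(residual, half, x + dx, y + dy, min_size, energy_ratio)
    return out

residual = np.zeros((16, 16))
residual[:4, :4] = 5.0  # energy concentrated in the top-left corner
print(split_transform_blocks(residual, 16))  # small blocks there, large blocks elsewhere
```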
{"title":"On Extended Transform Partitions For The Next Generation Video CODEC","authors":"Sarah Parker, Yue Chen, Urvang Joshi, Elliott Karpilovsky, D. Mukherjee","doi":"10.1109/ICIP40778.2020.9191142","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191142","url":null,"abstract":"AV1, AOMedia’s royalty free codec, has enjoyed a great amount of success since its 2018 release. It currently achieves 31% BDRATE gains over VP9, and is on its way to becoming YouTube’s default codec. Although the industry is currently focused on the implementation and optimization of AV1, AOMedia Research continues to develop new coding tools that deliver higher coding gains within acceptable complexity bounds. Here, we focus on improving transform coding. While AV1 has made great strides in transform coding over VP9, the residue signal still consumes a large portion of the bitstream. In this paper, we describe a more flexible transform partitioning scheme, which will allow the next generation codec to more efficiently target areas in the residue signal with high energy, leading to better residue compression.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132132419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cascaded Mixed-Precision Networks
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9190760
Xue Geng, Jie Lin, Shaohua Li
There is a vast literature on neural network compression, either by quantizing network variables to low-precision numbers or by pruning redundant connections from the network architecture. However, these techniques suffer performance degradation when the compression ratio is pushed to an extreme. In this paper, we propose Cascaded Mixed-precision Networks (CMNs), which are compact yet efficient neural networks that do not incur a performance drop. A CMN is designed as a cascade framework that concatenates a group of neural networks with sequentially increasing bitwidth. The execution flow of a CMN is conditional on the difficulty of input samples: easy examples are correctly classified by extremely low-bitwidth networks, and hard examples are handled by high-bitwidth networks, so the average compute is reduced. In addition, weight pruning is incorporated into the cascaded framework and jointly optimized with the mixed-precision quantization. To validate this method, we implemented a 2-stage CMN consisting of a binary neural network and a multi-bit (e.g., 8-bit) neural network. Empirical results on CIFAR-100 and ImageNet demonstrate that CMN performs better than state-of-the-art methods in terms of accuracy and compute.
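A minimal sketch of the cascaded execution idea, with hypothetical stand-in models for the binary and 8-bit stages (the paper's quantization and pruning are omitted): easy samples exit at the low-bitwidth stage, while low-confidence samples fall through to the high-bitwidth stage.

```python
import torch
import torch.nn.functional as F

def cascade_predict(x, low_bit_model, high_bit_model, threshold=0.9):
    """x: (N, ...) batch. Returns per-sample class predictions."""
    with torch.no_grad():
        logits_low = low_bit_model(x)
        conf, pred = F.softmax(logits_low, dim=1).max(dim=1)
        hard = conf < threshold                      # samples the cheap stage is unsure about
        if hard.any():
            pred[hard] = high_bit_model(x[hard]).argmax(dim=1)
    return pred

# Hypothetical stand-ins: any pair of classifiers with matching in/out shapes works here.
low = torch.nn.Linear(32, 10)
high = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(), torch.nn.Linear(64, 10))
print(cascade_predict(torch.randn(8, 32), low, high))
```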
{"title":"Cascaded Mixed-Precision Networks","authors":"Xue Geng, Jie Lin, Shaohua Li","doi":"10.1109/ICIP40778.2020.9190760","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190760","url":null,"abstract":"There has been a vast literature on Neural Network Compression, either by quantizing network variables to low precision numbers or pruning redundant connections from the network architecture. However, these techniques experience performance degradation when the compression ratio is increased to an extreme extent. In this paper, we propose Cascaded Mixed-precision Networks (CMNs), which are compact yet efficient neural networks without incurring performance drop. CMN is designed as a cascade framework by concatenating a group of neural networks with sequentially increased bitwidth. The execution flow of CMN is conditional on the difficulty of input samples, i.e., easy examples will be correctly classified by going through extremely low-bitwidth networks, and hard examples will be handled by high-bitwidth networks, so that the average compute is reduced. In addition, weight pruning is incorporated into the cascaded framework and jointly optimized with the mixed-precision quantization. To validate this method, we implemented a 2-stage CMN consisting of a binary neural network and a multi-bit (e.g. 8 bits) neural network. Empirical results on CIFAR-100 and ImageNet demonstrate that CMN performs better than state-of-the-art methods, in terms of accuracy and compute.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130206511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Enhanced Deep Learning Architecture for Classification of Tuberculosis Types From CT Lung Images
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9190815
Xiaohong W. Gao, R. Comley, Maleika Heenaye-Mamode Khan
In this work, an enhanced ResNet deep learning network, depth-ResNet, has been developed to classify the five types of tuberculosis (TB) in lung CT images. Depth-ResNet takes 3D CT images as a whole and processes the volumetric blocks along the depth direction. It builds on the ResNet-50 model to obtain 2D features on each frame and injects depth information at each processing block. As a result, the average classification accuracy is 71.60% for depth-ResNet and 68.59% for ResNet. The datasets are collected from the ImageCLEF 2018 competition, with 1,008 training samples in total, where the top reported accuracy was 42.27%.
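A minimal sketch of the "2D features per CT slice, aggregated along depth" idea; a small stand-in CNN replaces the ResNet-50 backbone, and the paper's per-block depth injection is reduced here to a simple average over slices.

```python
import torch
import torch.nn as nn

class SliceWiseClassifier(nn.Module):
    def __init__(self, num_classes=5):
        super().__init__()
        self.backbone = nn.Sequential(           # stand-in for a 2D ResNet-50
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(16, num_classes)

    def forward(self, volume):                    # volume: (N, D, H, W) CT scans
        n, d, h, w = volume.shape
        slices = volume.reshape(n * d, 1, h, w)   # treat each slice as a 2D image
        feats = self.backbone(slices).reshape(n, d, -1)
        return self.head(feats.mean(dim=1))       # aggregate features along depth

model = SliceWiseClassifier()
print(model(torch.randn(2, 64, 128, 128)).shape)  # -> torch.Size([2, 5])
```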
{"title":"An Enhanced Deep Learning Architecture for Classification of Tuberculosis Types From CT Lung Images","authors":"Xiaohong W. Gao, R. Comley, Maleika Heenaye-Mamode Khan","doi":"10.1109/ICIP40778.2020.9190815","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190815","url":null,"abstract":"In this work, an enhanced ResNet deep learning network, depth-ResNet, has been developed to classify the five types of Tuberculosis (TB) lung CT images. Depth-ResNet takes 3D CT images as a whole and processes the volumatic blocks along depth directions. It builds on the ResNet-50 model to obtain 2D features on each frame and injects depth information at each process block. As a result, the averaged accuracy for classification is 71.60% for depth-ResNet and 68.59% for ResNet. The datasets are collected from the ImageCLEF 2018 competition with 1008 training data in total, where the top reported accuracy was 42.27%.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133958425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gradient Deconfliction-Based Training For Multi-Exit Architectures
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9190812
Xinglu Wang, Yingming Li
Multi-exit architectures, in which a sequence of intermediate classifiers is introduced at different depths of the feature layers, perform adaptive computation by early-exiting “easy” samples to speed up inference. In this paper, we propose a new gradient deconfliction-based training technique for multi-exit architectures. In particular, the conflict between the gradients back-propagated from different classifiers is removed by projecting the gradient from one classifier onto the normal plane of the gradient from the other classifier. Experiments on CIFAR-100 and ImageNet show that the gradient deconfliction-based training strategy significantly improves the performance of state-of-the-art multi-exit neural networks. Moreover, the method requires no architectural modifications, can be effectively combined with other previously proposed training techniques, and further boosts their performance.
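A minimal sketch of the projection step described above: when the gradients from two exits conflict (negative dot product), one gradient is projected onto the normal plane of the other, removing the conflicting component.

```python
import numpy as np

def deconflict(g1: np.ndarray, g2: np.ndarray) -> np.ndarray:
    """Project g1 onto the normal plane of g2 if the two gradients conflict."""
    dot = np.dot(g1, g2)
    if dot < 0:
        g1 = g1 - dot / (np.dot(g2, g2) + 1e-12) * g2
    return g1

g_exit2 = np.array([1.0, 1.0])
print(deconflict(np.array([1.0, -1.0]), g_exit2))   # no conflict (dot = 0): unchanged
print(deconflict(np.array([-1.0, 0.0]), g_exit2))   # conflict removed -> [-0.5, 0.5]
```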
{"title":"Gradient Deconfliction-Based Training For Multi-Exit Architectures","authors":"Xinglu Wang, Yingming Li","doi":"10.1109/ICIP40778.2020.9190812","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190812","url":null,"abstract":"Muiti-exit architectures, in which a sequence of intermediate classifiers are introduced at different depths of the feature layers, perform adaptive computation by early exiting “easy” samples to speed up the inference. In this paper, we propose a new gradient deconfliction-based training technique for multi-exit architectures. In particular, the conflicting between the gradients back-propagated from different classifiers is removed by projecting the gradient from one classifier onto the normal plane of the gradient from the other classifier. Experiments on CFAR-100 and ImageNet show that the gradient deconfliction-based training strategy significantly improves the performance of the state-of-the-art multi-exit neural networks. Moreover, this method does not require within architecture modifications and can be effectively combined with other previously-proposed training techniques and further boosts the performance.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134336483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Edge-Guided Image Downscaling With Adaptive Filtering
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9190972
Dubok Park
In this paper, we propose a novel framework for image downscaling with edge-guided interpolation and adaptive filtering. First, we extract a second-derivative edge-guidance map from the input image. Then, inter-pixels are interpolated via the edge-guidance map and their fractional distance from the input pixels. Finally, adaptive filtering is applied to the expanded pixels to alleviate artifacts while preserving the details and content of the input image. Experimental results validate that the proposed framework achieves content-preserving results while reducing artifacts.
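A rough sketch of the first two steps as described, under assumed details (the exact weighting and the adaptive filter are not specified here): a second-derivative (Laplacian) edge-guidance map is computed, and an inter-pixel between two horizontal neighbours is biased toward the neighbour with the weaker edge response.

```python
import numpy as np

def laplacian_map(img: np.ndarray) -> np.ndarray:
    """Second-derivative (Laplacian magnitude) edge-guidance map."""
    padded = np.pad(img.astype(float), 1, mode="edge")
    return np.abs(4 * padded[1:-1, 1:-1] - padded[:-2, 1:-1] - padded[2:, 1:-1]
                  - padded[1:-1, :-2] - padded[1:-1, 2:])

def interpolate_row_midpoints(img: np.ndarray) -> np.ndarray:
    """Inter-pixels halfway between horizontal neighbours, guided by edge strength."""
    g = laplacian_map(img)
    left, right = img[:, :-1].astype(float), img[:, 1:].astype(float)
    w = (g[:, 1:] + 1e-6) / (g[:, :-1] + g[:, 1:] + 2e-6)  # weight toward the flatter side
    return w * left + (1 - w) * right

img = np.array([[10, 10, 200, 200]], dtype=float)
print(interpolate_row_midpoints(img))  # midpoints avoid blurring across the 10->200 edge
```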
{"title":"Edge-Guided Image Downscaling With Adaptive Filtering","authors":"Dubok Park","doi":"10.1109/ICIP40778.2020.9190972","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190972","url":null,"abstract":"In this paper, we propose a novel framework for image downscaling with edge-guided interpolation and adaptive filtering. First, we extract the second derivative edge-guidance map from an input image. Then, inter-pixels are interpolated via edge-guidance map and portion of distance from the input pixels. Finally, adaptive filtering is applied to the expanded pixels for alleviating artifacts while preserving details and contents of input image. Experimental results validate the proposed framework can achieve content-preserving results while reducing artifacts.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"272 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131480960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust Intensity Image Reconstruction Based On Event Cameras
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9190830
Meng Jiang, Zhou Liu, Bishan Wang, Lei Yu, Wen Yang
The event camera is a novel sensor that records brightness changes in the form of asynchronous events with high temporal resolution, while simultaneously outputting intensity images at a lower frame rate. The recorded events are noisy, and the captured intensity images often suffer from motion blur and noise. Reconstructing high-quality images is therefore of great significance for applying event cameras in computer vision. However, existing reconstruction methods only address the motion blur issue without considering the influence of noise. In this paper, we propose a variational model that uses a spatial-smoothness constraint as regularization to recover clean image frames, at any frame rate, from blurry and noisy camera images and events. We present experimental results on a synthetic dataset as well as a real high-speed, high-dynamic-range dataset to demonstrate that the proposed algorithm is superior to other reconstruction algorithms.
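A toy sketch of the variational structure only: an image is recovered by minimising a data-fidelity term plus a spatial-smoothness regulariser with gradient descent. The paper's event-based data term and blur model are not reproduced; the sketch illustrates how the smoothness constraint enters the optimisation.

```python
import numpy as np

def reconstruct(observed: np.ndarray, lam=0.5, steps=200, lr=0.1) -> np.ndarray:
    """Minimise ||I - observed||^2 + lam * ||grad I||^2 by gradient descent."""
    img = observed.astype(float).copy()
    for _ in range(steps):
        grad = 2 * (img - observed)                       # gradient of the data term
        padded = np.pad(img, 1, mode="edge")              # gradient of the smoothness term
        laplace = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                   + padded[1:-1, :-2] + padded[1:-1, 2:] - 4 * img)
        grad -= 2 * lam * laplace
        img -= lr * grad
    return img

noisy = np.ones((32, 32)) + np.random.default_rng(0).normal(0, 0.3, (32, 32))
print(float(np.std(reconstruct(noisy))), float(np.std(noisy)))  # smoothed vs. noisy
```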
{"title":"Robust Intensity Image Reconstruciton Based On Event Cameras","authors":"Meng Jiang, Zhou Liu, Bishan Wang, Lei Yu, Wen Yang","doi":"10.1109/ICIP40778.2020.9190830","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190830","url":null,"abstract":"The event camera is a novel sensor that records brightness change in the form of asynchronous events with high temporal resolution, and simultaneously outputs intensity images with a lower frame rate. Events recorded by sensors have a lot of noise and the intensity images captured often suffer from motion blur and noise effects. Therefore, to reconstruct high quality images is of great significance for the application of event camera in computer vision. However, the existing reconstruction methods only addressed the motion blur issue without considering the influence of noise. In this paper, we propose a variational model by using spatial smooth constraint regularization to recover clean image frames from blurry and noisy camera images and events at any frame rate. We present experimental results on synthetic dataset as well as real dataset with high speed and high dynamic range to demonstrate that the proposed algorithm is superior to the other reconstruction algorithms.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"21 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131084830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Bayesian View of Frame Interpolation and a Comparison with Existing Motion Picture Effects Tools
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9191152
A. Kokaram, Davinder Singh, Simon Robinson
Frame interpolation is the process of synthesising a new frame in between existing frames of an image sequence. It has emerged as a key module in motion picture effects. Previous work relies either on two-frame interpolation based entirely on optic flow or, more recently, on DNNs. This paper presents a new algorithm based on multiframe motion interpolation, motivated in a Bayesian sense. We also present the first comparison using industrial toolkits used in the post-production industry today. We find that the latest convolutional neural network approaches do not significantly outperform explicit motion-based techniques.
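A minimal sketch of plain motion-compensated two-frame interpolation (not the paper's Bayesian multiframe estimator): given dense flow from frame0 to frame1, the in-between frame at t=0.5 is built by warping both frames halfway along the flow and blending them.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def interpolate_midframe(frame0, frame1, flow, t=0.5):
    """frame0, frame1: (H, W); flow: (H, W, 2) with (dy, dx) from frame0 to frame1."""
    h, w = frame0.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    # Backward-warp each frame toward the intermediate time instant.
    from0 = map_coordinates(frame0, [yy - t * flow[..., 0], xx - t * flow[..., 1]],
                            order=1, mode="nearest")
    from1 = map_coordinates(frame1, [yy + (1 - t) * flow[..., 0], xx + (1 - t) * flow[..., 1]],
                            order=1, mode="nearest")
    return (1 - t) * from0 + t * from1

frame0 = np.zeros((8, 8)); frame0[2, 2] = 1.0
frame1 = np.zeros((8, 8)); frame1[2, 4] = 1.0               # object moved 2 px right
flow = np.zeros((8, 8, 2)); flow[..., 1] = 2.0              # constant horizontal flow
print(np.argwhere(interpolate_midframe(frame0, frame1, flow) > 0.5))  # -> [[2, 3]]
```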
{"title":"A Bayesian View of Frame Interpolation and a Comparison with Existing Motion Picture Effects Tools","authors":"A. Kokaram, Davinder Singh, Simon Robinson","doi":"10.1109/ICIP40778.2020.9191152","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191152","url":null,"abstract":"Frame interpolation is the process of synthesising a new frame in-between existing frames in an image sequence. It has emerged as a key module in motion picture effects. Previous work either relies on two frame interpolation based entirely on optic flow, or recently DNNs. This paper presents a new algorithm based on multiframe motion interpolation motivated in a Bayesian sense. We also present the first comparison using industrial toolkits used in the post production industry today. We find that the latest Convolutional Neural Network approaches do not significantly outperform explicit motion based techniques.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133512843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
X-NET For Single Image Raindrop Removal
Pub Date: 2020-10-01 | DOI: 10.1109/ICIP40778.2020.9191073
Jiamin Lin, Longquan Dai
Photos taken on rainy days are often degraded by raindrops adhering to the camera lens. Removing raindrops from images is a tough task: its difficulties lie in restoring high-frequency information from corrupted images while keeping the colors of the restored images consistent with human perception. To solve these problems, we propose an end-to-end convolutional neural network consisting of X-Net and RAD-Net (Raindrop Automatic Detection Net). X-Net takes advantage of Long Skip Connections and Cross Branch Connections to generate raindrop-free images with sufficient detail. RAD-Net assists X-Net in producing better results by yielding raindrop locations. Extensive experiments show that our approach outperforms state-of-the-art methods quantitatively and qualitatively.
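A highly simplified stand-in (not the actual X-Net) illustrating the long-skip idea: an encoder-decoder whose input is carried over a long skip connection to the output, so the network only has to predict a raindrop correction; the paper's cross-branch connections and RAD-Net guidance are not reproduced.

```python
import torch
import torch.nn as nn

class TinyDerainNet(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
                                     nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
                                     nn.Conv2d(ch, 3, 3, padding=1))

    def forward(self, x):
        residual = self.decoder(self.encoder(x))
        return x + residual            # long skip connection: input + predicted correction

net = TinyDerainNet()
print(net(torch.randn(1, 3, 64, 64)).shape)   # -> torch.Size([1, 3, 64, 64])
```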
{"title":"X-NET For Single Image Raindrop Removal","authors":"Jiamin Lin, Longquan Dai","doi":"10.1109/ICIP40778.2020.9191073","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191073","url":null,"abstract":"Photos taken on rainy days are likely degraded by raindrops adhered to camera lenses. Removing raindrops from images is a tough task. Its difficulties lie in restoring high frequency information from corrupted images while keeping the color of restored images consistent with human perception. To solve these problems, we propose an end-to-end convolutional neural network consisting of X-Net and RAD-Net (Raindrop Automatic Detection Net). X-Net takes advantage of Long Skip Connections and Cross Branch Connections to generate raindrop-free image with enough details. RAD-Net assists X-Net to produce better results by yielding raindrop location. Extensive experiments show our approach outperforms state-of-the-art methods quantitatively and qualitatively.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132153618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}