
Latest Articles from IEEE Transactions on Image Processing

Parallax Tolerant Light Field Stitching for Hand-held Plenoptic Cameras.
IF 10.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2019-10-10 DOI: 10.1109/TIP.2019.2945687
Xin Jin, Pei Wang, Qionghai Dai

Light field (LF) stitching is a potential solution to improve the field of view (FOV) for hand-held plenoptic cameras. Existing LF stitching methods cannot provide accurate registration for scenes with large depth variation. In this paper, a novel LF stitching method is proposed to handle parallax in the LFs more flexibly and accurately. First, a depth layer map (DLM) is proposed to guarantee adequate feature points on each depth layer. For regions of nondeterministic depth, a superpixel layer map (SLM) is proposed based on LF spatial correlation analysis to refine the depth layer assignments. Then, DLM-SLM-based LF registration is proposed to derive the location-dependent homography transforms accurately and to warp each LF to its corresponding position without parallax interference. A 4D graph-cut is further applied to fuse the registration results for higher LF spatial and angular continuity. Horizontal, vertical, and multi-LF stitching are tested for different scenes, demonstrating the superior performance of the proposed method in terms of subjective quality of the stitched LFs, epipolar plane image consistency in the stitched LF, and perspective-averaged correlation between the stitched LF and the input LFs.
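The location-dependent registration at the heart of the method can be illustrated with a minimal numpy sketch: feature matches are grouped by depth layer (here two hypothetical layers), and a homography is fitted per layer with the direct linear transform (DLT). The layer depths, point counts, and transforms below are illustrative, not the paper's setup.

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct linear transform (DLT): fit H such that dst ~ H @ src (homogeneous)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # Null-space solution: right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def warp_points(H, pts):
    """Apply a homography to Nx2 points."""
    ph = np.c_[pts, np.ones(len(pts))] @ H.T
    return ph[:, :2] / ph[:, 2:3]

# Hypothetical scene: matches grouped by depth layer, one homography per layer,
# approximating the location-dependent transforms used for parallax handling.
rng = np.random.default_rng(0)
layers = {}
for depth in (1.0, 3.0):
    H_true = np.array([[1.0, 0.02, 40.0 / depth],   # parallax shrinks with depth
                       [0.01, 1.0, 2.0],
                       [0.0, 0.0, 1.0]])
    src = rng.uniform(0, 500, size=(20, 2))
    layers[depth] = (src, warp_points(H_true, src), H_true)

errors = {d: np.abs(estimate_homography(s, t) - H).max()
          for d, (s, t, H) in layers.items()}
```

With noise-free matches the per-layer homographies are recovered to numerical precision; in practice the paper's pipeline additionally resolves layer membership via the DLM and SLM.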

Citations: 0
Efficient Evaluation of Image Quality via Deep-Learning Approximation of Perceptual Metrics.
IF 10.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2019-10-07 DOI: 10.1109/TIP.2019.2944079
Alessandro Artusi, Francesco Banterle, Fabio Carrara, Alejandro Moreo

Image metrics based on Human Visual System (HVS) play a remarkable role in the evaluation of complex image processing algorithms. However, mimicking the HVS is known to be complex and computationally expensive (both in terms of time and memory), and its usage is thus limited to a few applications and to small input data. All of this makes such metrics not fully attractive in real-world scenarios. To address these issues, we propose Deep Image Quality Metric (DIQM), a deep-learning approach to learn the global image quality feature (mean-opinion-score). DIQM can emulate existing visual metrics efficiently, reducing the computational costs by more than an order of magnitude with respect to existing implementations.
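As a toy illustration of the approximation idea (not the authors' CNN): a slow "perceptual" score that here is a hypothetical weighted combination of cheap statistics can be distilled into a fast surrogate fitted by least squares, standing in for the deep regression to mean opinion scores.

```python
import numpy as np

rng = np.random.default_rng(1)

def slow_metric(ref, img):
    """Stand-in for an expensive perceptual metric (weights are hypothetical)."""
    mse = np.mean((ref - img) ** 2)
    grad_mse = np.mean((np.diff(ref, axis=0) - np.diff(img, axis=0)) ** 2)
    return 0.7 * mse + 0.3 * grad_mse

def features(ref, img):
    """Cheap features the fast surrogate regresses from."""
    return np.array([np.mean((ref - img) ** 2),
                     np.mean((np.diff(ref, axis=0) - np.diff(img, axis=0)) ** 2)])

# Build a small training set of (features, slow-metric score) pairs ...
refs = rng.random((50, 16, 16))
imgs = refs + 0.1 * rng.standard_normal(refs.shape)
X = np.array([features(r, i) for r, i in zip(refs, imgs)])
y = np.array([slow_metric(r, i) for r, i in zip(refs, imgs)])

# ... and fit the surrogate by least squares (the role DIQM's CNN plays).
w, *_ = np.linalg.lstsq(X, y, rcond=None)
approx_error = np.max(np.abs(X @ w - y))
```

Because the toy target is exactly linear in the features, the fit is exact; DIQM's point is that a learned surrogate evaluates orders of magnitude faster than the metric it imitates.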

Citations: 0
Exploiting Block-sparsity for Hyperspectral Kronecker Compressive Sensing: a Tensor-based Bayesian Method.
IF 10.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2019-10-07 DOI: 10.1109/TIP.2019.2944722
Rongqiang Zhao, Qiang Wang, Jun Fu, Luquan Ren

Bayesian methods are attracting increasing attention in the field of compressive sensing (CS), as they are applicable to recovering signals from random measurements. However, these methods have limited use in many tensor-based cases such as hyperspectral Kronecker compressive sensing (HKCS), because they exploit the sparsity in only one dimension. In this paper, we propose a novel Bayesian model for HKCS in an attempt to overcome the above limitation. The model exploits multi-dimensional block-sparsity such that the information redundancies in all dimensions are eliminated. Laplace prior distributions are employed for the sparse coefficients in each dimension, and their coupling is consistent with the multi-dimensional block-sparsity model. Based on the proposed model, we develop a tensor-based Bayesian reconstruction algorithm, which decouples the hyperparameters for each dimension via a low-complexity technique. Experimental results demonstrate that the proposed method provides more accurate reconstruction than existing Bayesian methods at a satisfactory speed. Additionally, the proposed method can not only be used for HKCS but also has the potential to be extended to other multi-dimensional CS applications and to multi-dimensional block-sparsity-based data recovery.
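The Kronecker measurement structure that HKCS relies on can be exploited without ever materializing the Kronecker matrix, which is what makes tensor-based reconstruction tractable. A small numpy check of the identity (A ⊗ B) vec(X) = vec(A X Bᵀ) under row-major vec, with hypothetical dimensions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hyperspectral slice X (spatial x spectral) with separable sensing matrices
# A (spatial) and B (spectral), as in Kronecker compressive sensing.
n, m = 32, 24          # signal dimensions
p, q = 12, 8           # measurement dimensions
A = rng.standard_normal((p, n))
B = rng.standard_normal((q, m))
X = rng.standard_normal((n, m))

# Naive measurement: materialize the (p*q) x (n*m) Kronecker matrix.
y_kron = np.kron(A, B) @ X.ravel()

# Separable measurement: two small matrix products, no Kronecker product formed.
y_sep = (A @ X @ B.T).ravel()
```

The separable form costs O(pnm + pmq) instead of O(pq·nm), and the same trick applies per dimension in higher-order tensor models.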

Citations: 0
Skeleton Filter: A Self-Symmetric Filter for Skeletonization in Noisy Text Images.
IF 10.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2019-10-07 DOI: 10.1109/TIP.2019.2944560
Xiuxiu Bai, Lele Ye, Jihua Zhu, Li Zhu, Taku Komura

Robustly computing the skeletons of objects in natural images is difficult due to the large variations in shape boundaries and the large amount of noise in the images. Inspired by recent findings in neuroscience, we propose the Skeleton Filter, which is a novel model for skeleton extraction from natural images. The Skeleton Filter consists of a pair of oppositely oriented Gabor-like filters; by applying the Skeleton Filter in various orientations to an image at multiple resolutions and fusing the results, our system can robustly extract the skeleton even under highly noisy conditions. We evaluate the performance of our approach using challenging noisy text datasets and demonstrate that our pipeline realizes state-of-the-art performance for extracting the text skeleton. Moreover, the presence of Gabor filters in the human visual system and the simple architecture of the Skeleton Filter can help explain the strong capabilities of humans in perceiving skeletons of objects, even under dramatically noisy conditions.
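A sketch of the oppositely oriented filter pair, assuming an odd (sine-phase) Gabor-like kernel; the parameters are illustrative, not the paper's. Rotating the odd kernel by π flips its sign, which is the self-symmetry the pair exploits when responding on both sides of a stroke:

```python
import numpy as np

def odd_gabor(size, theta, sigma=2.0, wavelength=6.0, gamma=0.5):
    """Odd (sine-phase) Gabor-like kernel oriented at angle theta."""
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    xr = x * np.cos(theta) + y * np.sin(theta)     # along the carrier
    yr = -x * np.sin(theta) + y * np.cos(theta)    # across the carrier
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    return envelope * np.sin(2 * np.pi * xr / wavelength)

# The oppositely oriented pair: the Gaussian envelope is even and the sine
# carrier is odd, so rotating by pi negates the kernel.
theta = np.pi / 6
g_pos = odd_gabor(21, theta)
g_neg = odd_gabor(21, theta + np.pi)
```

Applying such pairs at multiple orientations and resolutions, then fusing the responses, is the multi-scale scheme the abstract describes.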

Citations: 0
Deep Coupled ISTA Network for Multi-modal Image Super-Resolution.
IF 10.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2019-10-03 DOI: 10.1109/TIP.2019.2944270
Xin Deng, Pier Luigi Dragotti

Given a low-resolution (LR) image, multi-modal image super-resolution (MISR) aims to find the high-resolution (HR) version of this image with the guidance of an HR image from another modality. In this paper, we use a model-based approach to design a new deep network architecture for MISR. We first introduce a novel joint multi-modal dictionary learning (JMDL) algorithm to model cross-modality dependency. In JMDL, we simultaneously learn three dictionaries and two transform matrices to combine the modalities. Then, by unfolding the iterative shrinkage and thresholding algorithm (ISTA), we turn the JMDL model into a deep neural network, called the deep coupled ISTA network. Since network initialization plays an important role in deep network training, we further propose a layer-wise optimization algorithm (LOA) to initialize the parameters of the network before running the back-propagation strategy. Specifically, we model network initialization as a multi-layer dictionary learning problem and solve it through convex optimization. The proposed LOA is demonstrated to effectively decrease the training loss and increase the reconstruction accuracy. Finally, we compare our method with other state-of-the-art methods on the MISR task. The numerical results show that our method consistently outperforms others both quantitatively and qualitatively at different upscaling factors for various multi-modal scenarios.
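The ISTA iteration that the network unfolds can be sketched as plain sparse coding (a generic lasso solver, not the paper's coupled multi-modal formulation). Unfolding fixes the iteration count and turns the per-iteration matrices and thresholds into learnable layers:

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding, the shrinkage step of ISTA."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, y, lam, n_iter):
    """Plain ISTA for min_x 0.5 * ||Ax - y||^2 + lam * ||x||_1.
    An unfolded network replaces the fixed matrices here with trained weights."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft(x - (A.T @ (A @ x - y)) / L, lam / L)
    return x

# Hypothetical sparse-recovery instance: 5 nonzeros out of 120 unknowns.
rng = np.random.default_rng(3)
m, n, k = 60, 120, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x_true[support] = rng.choice([-1.0, 1.0], size=k)
y = A @ x_true

x_hat = ista(A, y, lam=1e-3, n_iter=3000)
recovered = {int(i) for i in np.argsort(np.abs(x_hat))[-k:]}
```

Each loop iteration corresponds to one layer of the unfolded network; a few learned layers typically match thousands of fixed iterations.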

Citations: 0
Semi-Supervised Human Detection via Region Proposal Networks Aided by Verification.
IF 10.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2019-10-03 DOI: 10.1109/TIP.2019.2944306
Si Wu, Wenhao Wu, Shiyao Lei, Sihao Lin, Rui Li, Zhiwen Yu, Hau-San Wong

In this paper, we explore how to leverage readily available unlabeled data to improve semi-supervised human detection performance. For this purpose, we specifically modify the region proposal network (RPN) for learning on a partially labeled dataset. Based on commonly observed false positive types, a verification module is developed to assess foreground human objects in the candidate regions to provide an important cue for filtering the RPN's proposals. The remaining proposals with high confidence scores are then used as pseudo annotations for re-training our detection model. To reduce the risk of error propagation in the training process, we adopt a self-paced training strategy to progressively include more pseudo annotations generated by the previous model over multiple training rounds. The resulting detector re-trained on the augmented data can be expected to have better detection performance. The effectiveness of the main components of this framework is verified through extensive experiments, and the proposed approach achieves state-of-the-art detection results on multiple scene-specific human detection benchmarks in the semi-supervised setting.
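The self-paced pseudo-annotation loop can be sketched on a toy problem, with a nearest-centroid "detector" on 1-D features standing in for the RPN-plus-verification pipeline (data, seeds, and thresholds are all hypothetical): confident predictions are admitted as pseudo-labels over progressively more permissive rounds.

```python
import numpy as np

rng = np.random.default_rng(4)

# Unlabeled pool: two well-separated classes; only one seed label per class.
pool = np.concatenate([rng.normal(+2.0, 0.6, 200), rng.normal(-2.0, 0.6, 200)])
truth = np.concatenate([np.ones(200), np.zeros(200)])

labeled_x = np.array([+2.2, -1.9])
labeled_y = np.array([1.0, 0.0])

for threshold in (1.5, 1.0, 0.5):            # self-paced: harder examples later
    c_pos = labeled_x[labeled_y == 1].mean()
    c_neg = labeled_x[labeled_y == 0].mean()
    # Confidence = margin between the distances to the two centroids.
    margin = np.abs(np.abs(pool - c_neg) - np.abs(pool - c_pos))
    pseudo = (np.abs(pool - c_pos) < np.abs(pool - c_neg)).astype(float)
    keep = margin > threshold                # admit only confident pseudo-labels
    labeled_x = np.concatenate([labeled_x[:2], pool[keep]])
    labeled_y = np.concatenate([labeled_y[:2], pseudo[keep]])

# Final "re-trained" model: centroids from seeds plus pseudo-annotations.
c_pos = labeled_x[labeled_y == 1].mean()
c_neg = labeled_x[labeled_y == 0].mean()
pred = (np.abs(pool - c_pos) < np.abs(pool - c_neg)).astype(float)
accuracy = float((pred == truth).mean())
```

Lowering the admission threshold per round mirrors the paper's strategy of progressively including more pseudo annotations while limiting early error propagation.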

Citations: 0
Learning Interleaved Cascade of Shrinkage Fields for Joint Image Dehazing and Denoising.
IF 10.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2019-09-30 DOI: 10.1109/TIP.2019.2942504
Qingbo Wu, Wenqi Ren, Xiaochun Cao

Most existing image dehazing methods deteriorate to different extents when processing hazy inputs with noise. The main reason is that the commonly adopted two-step strategy tends to amplify noise in the inverse operation of division by the transmission. To address this problem, we learn an interleaved Cascade of Shrinkage Fields (CSF) to reduce noise in jointly recovering the transmission map and the scene radiance from a single hazy image. Specifically, an auxiliary shrinkage field (SF) model is integrated into each cascade of the proposed scheme to reduce undesirable artifacts during the transmission estimation. Different from conventional CSF, our learned SF models have special visual patterns, which facilitate the specific task of noise reduction in haze removal. Furthermore, a numerical algorithm is proposed to efficiently update the scene radiance and the transmission map in each cascade. Extensive experiments on synthetic and real-world data demonstrate that the proposed algorithm performs favorably against state-of-the-art dehazing methods on hazy and noisy images.
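As a stand-in for the learned shrinkage functions, a fixed soft shrinkage applied to Fourier coefficients illustrates the transform-domain shrinkage step (1-D signal, hypothetical threshold; the paper's SF models apply learned shrinkage curves to learned 2-D filter responses):

```python
import numpy as np

rng = np.random.default_rng(5)

def shrink(coeffs, t):
    """Soft shrinkage; in a learned shrinkage field this curve is trainable."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

# Smooth 1-D signal plus noise, denoised by shrinking Fourier coefficients.
n = 512
t_axis = np.linspace(0, 1, n, endpoint=False)
clean = np.sin(2 * np.pi * 3 * t_axis) + 0.5 * np.cos(2 * np.pi * 7 * t_axis)
noisy = clean + 0.3 * rng.standard_normal(n)

coeffs = np.fft.rfft(noisy)
denoised = np.fft.irfft(shrink(coeffs.real, 20.0) + 1j * shrink(coeffs.imag, 20.0), n)

mse_noisy = float(np.mean((noisy - clean) ** 2))
mse_denoised = float(np.mean((denoised - clean) ** 2))
```

The interleaving in the paper alternates such shrinkage-based denoising with transmission and radiance updates inside each cascade, so noise is suppressed before the division amplifies it.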

Citations: 0
A Biological Vision Inspired Framework for Image Enhancement in Poor Visibility Conditions.
IF 10.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2019-09-25 DOI: 10.1109/TIP.2019.2938310
Kai-Fu Yang, Xian-Shi Zhang, Yong-Jie Li

Image enhancement is an important pre-processing step for many computer vision applications, especially for scenes in poor visibility conditions. In this work, we develop a unified two-pathway model inspired by biological vision, especially the early visual mechanisms, which contributes to image enhancement tasks including low dynamic range (LDR) image enhancement and high dynamic range (HDR) image tone mapping. First, the input image is separated and sent into two visual pathways: the structure-pathway and the detail-pathway, corresponding to the M- and P-pathways in the early visual system, which code the low- and high-frequency visual information, respectively. In the structure-pathway, an extended biological normalization model is used to integrate global and local luminance adaptation, which can handle visual scenes with varying illumination. Meanwhile, detail enhancement and local noise suppression are achieved in the detail-pathway based on local energy weighting. Finally, the outputs of the structure- and detail-pathways are integrated to achieve low-light image enhancement. In addition, the proposed model can also be used for tone mapping of HDR images with some fine-tuning steps. Extensive experiments on three datasets (two LDR image datasets and one HDR scene dataset) show that the proposed model handles the visual enhancement tasks mentioned above efficiently and outperforms related state-of-the-art methods.
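A minimal sketch of the two-pathway decomposition, with a box-filter split into structure and detail and a gamma curve standing in for the extended normalization model (all parameters are hypothetical, not the paper's):

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter via shifted sums over an edge-padded copy."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode='edge')
    acc = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for dy in range(k):
        for dx in range(k):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / (k * k)

rng = np.random.default_rng(6)
img = 0.2 * rng.random((32, 32))         # a dark, low-visibility input in [0, 0.2]

structure = box_blur(img, 2)             # low-frequency pathway (M-like)
detail = img - structure                 # high-frequency pathway (P-like)
adapted = structure ** 0.4               # gamma curve as a stand-in for the
                                         # global/local luminance adaptation
enhanced = np.clip(adapted + 1.5 * detail, 0.0, 1.0)
```

Brightening only the structure pathway while re-injecting amplified detail is what lets such schemes lift dark regions without washing out edges.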

Citations: 0
Receptive Field Size vs. Model Depth for Single Image Super-resolution.
IF 10.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2019-09-25 DOI: 10.1109/TIP.2019.2941327
Ruxin Wang, Mingming Gong, Dacheng Tao

The performance of single image super-resolution (SISR) has been largely improved by innovative designs of deep architectures. An important claim raised by these designs is that the deep models have a large receptive field size and strong nonlinearity. However, it remains unclear which factor, receptive field size or model depth, is more critical for SISR. To reveal the answer, in this paper we propose a strategy based on dilated convolution to investigate how the two factors affect the performance of SISR. Our findings from exhaustive investigations suggest that SISR is more sensitive to changes in receptive field size than to variations in model depth, and that the model depth must be congruent with the receptive field size to produce improved performance. These findings inspire us to design a shallower architecture that can save computational and memory cost while preserving comparable effectiveness with respect to a much deeper architecture.
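Dilated convolution is what lets receptive field size and depth be varied independently: for a stride-1 stack, the receptive field is rf = 1 + Σᵢ (kᵢ − 1)·dᵢ, so dilation grows the receptive field without adding layers. A minimal sketch:

```python
def receptive_field(layers):
    """Receptive field of a stack of stride-1 convolutions.
    layers: sequence of (kernel_size, dilation) pairs."""
    rf = 1
    for k, d in layers:
        rf += (k - 1) * d
    return rf

# Same depth (3 layers of 3x3), different dilations:
plain = receptive_field([(3, 1)] * 3)                 # no dilation
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])   # dilations 1, 2, 4
```

Comparing networks matched on one factor while varying the other via this formula is the kind of controlled comparison the abstract's dilation-based strategy enables.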

Intermediate Deep Feature Compression: Toward Intelligent Sensing.
IF 10.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2019-09-25 DOI: 10.1109/TIP.2019.2941660
Zhuo Chen, Kui Fan, Shiqi Wang, Lingyu Duan, Weisi Lin, Alex C Kot

Recent advances in hardware technology have made front-end intelligent analysis with deep learning more prevalent and practical. To better enable intelligent sensing at the front end, instead of compressing and transmitting visual signals or the ultimately utilized top-layer deep learning features, we propose to compactly represent and convey intermediate-layer deep learning features, which have high generalization capability, to facilitate collaboration between the front end and the cloud. This strategy strikes a good balance among the computational load, the transmission load, and the generalization ability of cloud servers when deploying deep neural networks for large-scale cloud-based visual analysis. Moreover, the presented strategy makes the standardization of deep feature coding more feasible and promising, as a series of tasks can simultaneously benefit from the transmitted intermediate-layer features. We also present evaluation results for both lossless and lossy deep feature compression, which provide meaningful investigations and baselines for future research and standardization activities.
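The front-cloud collaboration hinges on making intermediate features cheap to transmit. As a minimal illustration of lossy feature coding via uniform scalar quantization (the function names are hypothetical and this is not the codec evaluated in the paper):

```python
def quantize_features(feat, bits=8):
    """Uniform scalar quantization of a flat list of float activations."""
    lo, hi = min(feat), max(feat)
    scale = (hi - lo) / (2 ** bits - 1) or 1.0    # guard against a constant map
    codes = [round((x - lo) / scale) for x in feat]   # ints in [0, 2**bits - 1]
    return codes, lo, scale                           # codes + side info to send

def dequantize_features(codes, lo, scale):
    return [c * scale + lo for c in codes]

feat = [0.0, 0.7, -1.0, 2.0]           # a mock slice of an activation map
codes, lo, scale = quantize_features(feat, bits=8)
rec = dequantize_features(codes, lo, scale)
# True: reconstruction error stays within half a quantization step
print(max(abs(a - b) for a, b in zip(rec, feat)) <= scale / 2 + 1e-9)
```

A real feature codec would add entropy coding and rate control on top, but the pattern of transmitting integer codes plus a little side information (here `lo` and `scale`) is the same.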
