
Latest publications in IEEE Transactions on Image Processing

Inpainting vs denoising for dose reduction in scanning-beam microscopies.
IF 10.6. CAS Tier 1, Computer Science. Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2019-07-17. DOI: 10.1109/TIP.2019.2928133
Toby Sanders, Christian Dwyer

We consider sampling strategies for reducing the radiation dose during image acquisition in scanning-beam microscopies, such as SEM, STEM, and STXM. Our basic assumption is that we may acquire subsampled image data (with some pixels missing) and then inpaint the missing data using a compressed-sensing approach. Our noise model consists of Poisson noise plus random Gaussian noise. We include the possibility of acquiring fully-sampled image data, in which case the inpainting approach reduces to a denoising procedure. We use numerical simulations to compare the accuracy of reconstructed images with the "ground truths." The results generally indicate that, for sufficiently high radiation doses, higher sampling rates achieve greater accuracy, commensurate with well-established literature. However, for very low radiation doses, where the Poisson noise and/or random Gaussian noise begins to dominate, our results indicate that subsampling/inpainting can result in smaller reconstruction errors. We also present an information-theoretic analysis, which allows us to quantify the amount of information gained through the different sampling strategies and enables some broader discussion of the main results.
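The dose/sampling trade-off described above can be illustrated with a minimal NumPy sketch. This is not the paper's reconstruction algorithm: the mean-fill "inpainting" below is only a stand-in for the compressed-sensing solver, and the dose, sampling rate, and readout-noise level are illustrative values of our choosing.

```python
import numpy as np

rng = np.random.default_rng(0)

def acquire(truth, dose, sampling_rate, sigma=0.02):
    """Simulate subsampled acquisition: Poisson shot noise at the given
    per-pixel dose plus additive Gaussian readout noise, recorded only
    on a random subset of pixels. Returns the noisy image and the mask."""
    mask = rng.random(truth.shape) < sampling_rate
    counts = rng.poisson(truth * dose) / dose             # Poisson noise
    noisy = counts + rng.normal(0.0, sigma, truth.shape)  # Gaussian noise
    return np.where(mask, noisy, 0.0), mask

def inpaint_mean(img, mask):
    """Toy stand-in for the inpainting step: fill each missing pixel with
    the mean of the sampled pixels (a real reconstruction would use a
    sparsity-regularized compressed-sensing solver)."""
    fill = img[mask].mean()
    return np.where(mask, img, fill)

truth = np.full((64, 64), 0.5)                # flat "ground truth"
noisy, mask = acquire(truth, dose=100.0, sampling_rate=0.5)
recon = inpaint_mean(noisy, mask)
err = np.sqrt(np.mean((recon - truth) ** 2))  # RMSE vs. ground truth
```

Sweeping `dose` and `sampling_rate` in such a simulation is the kind of experiment the paper uses to compare subsampling/inpainting against full sampling/denoising.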

Citations: 0
Path-Based Dictionary Augmentation: A Framework for Improving k-Sparse Image Processing.
IF 10.6. CAS Tier 1, Computer Science. Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2019-07-15. DOI: 10.1109/TIP.2019.2927331
Tegan H Emerson, Colin Olson, Timothy Doster

We have previously shown that augmenting orthogonal matching pursuit (OMP) with an additional step in the identification stage of each pursuit iteration yields improved k-sparse reconstruction and denoising performance relative to baseline OMP. At each iteration a "path," or geodesic, is generated between the two dictionary atoms that are most correlated with the residual, and from this path a new atom is selected that has a greater correlation with the residual than either of the two bracketing atoms. Here, we provide new computational results illustrating improvements in sparse coding and denoising on canonical datasets using both learned and structured dictionaries. Two methods of constructing a path are investigated for each dictionary type: the Euclidean geodesic formed by a linear combination of the two atoms and the 2-Wasserstein geodesic corresponding to the optimal transport map between the atoms. We prove here the existence of a higher-correlation atom in the Euclidean case under assumptions on the two bracketing atoms and introduce algorithmic modifications to improve the likelihood that the bracketing atoms meet those conditions. Although we demonstrate our augmentation on OMP alone, in general it may be applied to any reconstruction algorithm that relies on the selection and sorting of high-similarity atoms during an analysis or identification phase.
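The Euclidean-geodesic variant of the augmentation can be sketched directly: at each OMP iteration, sample normalized linear combinations along the segment between the two most-correlated atoms and keep the best candidate. This is a simplified reading of the method (the dictionary, signal, and path resolution below are arbitrary choices, and the paper's conditions on the bracketing atoms are not checked).

```python
import numpy as np

def path_augmented_omp(D, y, k, n_path=9):
    """OMP with a path-based identification step: among n_path points on
    the Euclidean geodesic between the two atoms most correlated with the
    residual, select the candidate with the highest residual correlation.
    D is assumed to have unit-norm columns."""
    residual = y.copy()
    atoms = []
    for _ in range(k):
        corr = np.abs(D.T @ residual)
        second, first = np.argsort(corr)[-2:]   # two bracketing atoms
        best, best_c = None, -1.0
        for t in np.linspace(0.0, 1.0, n_path):
            a = (1 - t) * D[:, first] + t * D[:, second]
            a /= np.linalg.norm(a)              # stay on the unit sphere
            c = abs(a @ residual)
            if c > best_c:
                best, best_c = a, c
        atoms.append(best)
        A = np.column_stack(atoms)              # re-fit coefficients
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        residual = y - A @ coef
    return A @ coef                             # k-sparse approximation

rng = np.random.default_rng(1)
D = rng.normal(size=(32, 64))
D /= np.linalg.norm(D, axis=0)
y = 2.0 * D[:, 3] + 0.5 * D[:, 10]              # true 2-sparse signal
approx = path_augmented_omp(D, y, k=2)
```

Because the path endpoints (t = 0 and t = 1) are the bracketing atoms themselves, each iteration does at least as well as a standard OMP identification step.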

Citations: 0
Exemplar-Based Recursive Instance Segmentation With Application to Plant Image Analysis.
IF 10.6. CAS Tier 1, Computer Science. Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2019-07-11. DOI: 10.1109/TIP.2019.2923571
Jin-Gang Yu, Yansheng Li, Changxin Gao, Hongxia Gao, Gui-Song Xia, Zhu Liang Yu, Yuanqing Li

Instance segmentation is a challenging computer vision problem that lies at the intersection of object detection and semantic segmentation. Motivated by plant image analysis in the context of plant phenotyping, a recently emerging application field of computer vision, this paper presents the Exemplar-Based Recursive Instance Segmentation (ERIS) framework. A three-layer probabilistic model is first introduced to jointly represent hypotheses, voting elements, instance labels, and their connections. A recursive optimization algorithm is then developed to infer the maximum a posteriori (MAP) solution, which handles one instance at a time by alternating among three steps: detection, segmentation, and update. The proposed ERIS framework departs from previous work mainly in two respects. First, it is exemplar-based and model-free, achieving instance-level segmentation of a specific object class given only a handful of (typically fewer than 10) annotated exemplars. This merit enables its use when no massive manually labeled dataset is available for training strong classification models, as most existing methods require. Second, instead of attempting to infer the solution in a single shot, which suffers from extremely high computational complexity, our recursive optimization strategy allows for reasonably efficient MAP inference in the full hypothesis space. The ERIS framework is instantiated for the specific application of plant leaf segmentation in this work. Experiments are conducted on public benchmarks to demonstrate the superiority of our method in both effectiveness and efficiency in comparison with the state of the art.
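The one-instance-at-a-time detect/segment/update recursion can be illustrated with a deliberately tiny toy: here "detection" is just picking the first remaining foreground pixel, and a flood fill stands in for the MAP segmentation step. None of this reflects the paper's probabilistic model; it only shows the recursive control flow.

```python
import numpy as np
from collections import deque

def recursive_instances(binary):
    """Toy detect/segment/update loop: repeatedly pick a seed pixel
    (detect), flood-fill its 4-connected instance (segment), then remove
    those pixels before the next pass (update)."""
    remaining = binary.copy()
    labels = np.zeros(binary.shape, dtype=int)
    n = 0
    while remaining.any():
        n += 1
        seed = tuple(np.argwhere(remaining)[0])          # detect
        q = deque([seed])
        while q:                                         # segment
            r, c = q.popleft()
            if (0 <= r < remaining.shape[0] and
                    0 <= c < remaining.shape[1] and remaining[r, c]):
                remaining[r, c] = False                  # update
                labels[r, c] = n
                q.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    return labels, n

img = np.zeros((6, 6), dtype=bool)
img[0:2, 0:2] = True      # instance 1
img[4:6, 3:6] = True      # instance 2
labels, count = recursive_instances(img)   # count == 2
```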

Citations: 0
End-to-End Single Image Fog Removal Using Enhanced Cycle Consistent Adversarial Networks
IF 10.6. CAS Tier 1, Computer Science. Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2019-02-04. DOI: 10.1109/TIP.2020.3007844
Wei Liu, Xianxu Hou, Jiang Duan, G. Qiu
Single image defogging is a classical and challenging problem in computer vision. Existing methods for this problem mainly include handcrafted-prior-based methods that rely on the atmospheric degradation model and learning-based approaches that require paired fog/fog-free training example images. In practice, however, prior-based methods are prone to failure due to their own limitations, and paired training data are extremely difficult to acquire. Moreover, there are few studies on unpaired trainable defogging networks in this field. Thus, inspired by the principle of the CycleGAN network, we have developed an end-to-end learning system that uses unpaired fog and fog-free training images, adversarial discriminators, and cycle consistency losses to automatically construct a fog removal system. Similar to CycleGAN, our system has two transformation paths: one maps fog images to a fog-free image domain, and the other maps fog-free images to a fog image domain. Instead of one-stage mapping, our system uses a two-stage mapping strategy in each transformation path to enhance the effectiveness of fog removal. Furthermore, we make explicit use of prior knowledge in the networks by embedding the atmospheric degradation principle and a sky prior for mapping fog-free images to the fog image domain. In addition, we contribute the first real-world natural fog/fog-free image dataset for defogging research. Our multiple real fog images dataset (MRFID) contains images of 200 natural outdoor scenes. For each scene, there is one clear image and four corresponding foggy images of different fog densities, manually selected from a sequence of images taken by a fixed camera over the course of one year. Qualitative and quantitative comparisons against several state-of-the-art methods on both synthetic and real-world images demonstrate that our approach is effective and performs favorably in recovering a clear image from a foggy image.
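The cycle-consistency idea the system builds on can be written down compactly: an image mapped to the other domain and back should return unchanged. The sketch below uses placeholder scalar "generators" instead of the paper's two-stage CNN generators and omits the adversarial terms entirely; it only shows the loss structure.

```python
import numpy as np

def l1(a, b):
    """Mean absolute error between two images."""
    return np.mean(np.abs(a - b))

def cycle_loss(x_fog, x_clear, G, F):
    """Cycle-consistency objective for G: fog -> clear and F: clear -> fog.
    Each image should survive a round trip through both generators."""
    loss_fog = l1(F(G(x_fog)), x_fog)        # fog -> clear -> fog
    loss_clear = l1(G(F(x_clear)), x_clear)  # clear -> fog -> clear
    return loss_fog + loss_clear

# Toy placeholder generators: an exactly invertible pair, so the cycle
# loss is zero. Real generators are trained to approximate this.
G = lambda x: x * 0.5      # pretend "defog"
F = lambda x: x * 2.0      # pretend "add fog"
x_fog = np.random.rand(8, 8)
x_clear = np.random.rand(8, 8)
loss = cycle_loss(x_fog, x_clear, G, F)
```

Training would minimize this term jointly with the adversarial discriminator losses over unpaired fog and fog-free image sets.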
Citations: 55
Stacked Deconvolutional Network for Semantic Segmentation.
IF 10.6. CAS Tier 1, Computer Science. Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2019-01-25. DOI: 10.1109/TIP.2019.2895460
Jun Fu, Jing Liu, Yuhang Wang, Jin Zhou, Changyong Wang, Hanqing Lu

Recent progress in semantic segmentation has been driven by improving the spatial resolution of feature maps in Fully Convolutional Networks (FCNs). Along this line, we propose a Stacked Deconvolutional Network (SDN) for semantic segmentation. In SDN, multiple shallow deconvolutional networks, called SDN units, are stacked one by one to integrate contextual information and enable fine recovery of localization information. Meanwhile, inter-unit and intra-unit connections are designed to assist network training and enhance feature fusion, since these connections improve the flow of information and gradient propagation throughout the network. In addition, hierarchical supervision is applied during the upsampling process of each SDN unit, which enhances the discrimination of feature representations and benefits network optimization. We carry out comprehensive experiments and achieve new state-of-the-art results on four datasets: PASCAL VOC 2012, CamVid, GATECH, and COCO Stuff. In particular, our best model, without CRF post-processing, achieves an intersection-over-union score of 86.6% on the test set.
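The stacking scheme can be sketched at the shape level: each "unit" downsamples then upsamples, an intra-unit connection adds the unit's input back at its output, and an inter-unit connection feeds an earlier unit's output forward. The real SDN units are shallow encoder-decoder CNNs with learned weights and hierarchical supervision; the arithmetic below is only a structural stand-in.

```python
import numpy as np

def sdn_unit(x, skip=None):
    """Toy 'SDN unit' on a 2-D array with even dimensions: 2x downsample
    (encoder), 2x nearest-neighbor upsample (decoder), plus an intra-unit
    residual connection and an optional inter-unit skip input."""
    down = x[::2, ::2]                                      # encoder
    up = np.repeat(np.repeat(down, 2, axis=0), 2, axis=1)   # decoder
    out = up + x                                            # intra-unit
    if skip is not None:
        out = out + skip                                    # inter-unit
    return out

def stacked_sdn(x, n_units=3):
    """Stack units one by one; later units also receive the first unit's
    output through an inter-unit connection."""
    out = sdn_unit(x)
    first = out
    for _ in range(n_units - 1):
        out = sdn_unit(out, skip=first)
    return out

x = np.random.rand(32, 32)
y = stacked_sdn(x)     # output keeps the input's spatial size
```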

Citations: 0
Monocular Depth Estimation with Augmented Ordinal Depth Relationships.
IF 10.6. CAS Tier 1, Computer Science. Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2018-10-24. DOI: 10.1109/TIP.2018.2877944
Yuanzhouhan Cao, Tianqi Zhao, Ke Xian, Chunhua Shen, Zhiguo Cao, Shugong Xu

Most existing algorithms for depth estimation from single monocular images need large quantities of metric ground-truth depths for supervised learning. We show that relative depth can be an informative cue for metric depth estimation and can be easily obtained from vast stereo videos. Acquiring metric depths from stereo videos is sometimes impracticable due to the absence of camera parameters. In this paper, we propose to improve the performance of metric depth estimation with relative depths collected from stereo movie videos using an existing stereo matching algorithm. We introduce a new "Relative Depth in Stereo" (RDIS) dataset densely labelled with relative depths. We first pretrain a ResNet model on our RDIS dataset. Then we finetune the model on RGB-D datasets with metric ground-truth depths. During our finetuning, we formulate depth estimation as a classification task. This re-formulation scheme enables us to obtain the confidence of a depth prediction in the form of a probability distribution. With this confidence, we propose an information gain loss to make use of the predictions that are close to the ground truth. We evaluate our approach on both indoor and outdoor benchmark RGB-D datasets and achieve state-of-the-art performance.
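The depth-as-classification reformulation can be sketched in a few lines: continuous depth is discretized into bins, and the idea of crediting predictions close to the ground truth is mimicked by a soft label centred on the true bin. The bin range, bin count, and Gaussian bandwidth below are our own illustrative choices, not values from the paper.

```python
import numpy as np

def depth_to_soft_label(depth, bins):
    """Map a continuous depth value to a probability distribution over
    discretized depth bins, peaked at the nearest bin and decaying for
    neighbors (a Gaussian with bandwidth of one bin, our assumption)."""
    true_bin = int(np.argmin(np.abs(bins - depth)))
    idx = np.arange(len(bins))
    weights = np.exp(-0.5 * (idx - true_bin) ** 2)
    return weights / weights.sum()   # sums to 1: a valid distribution

bins = np.linspace(0.5, 10.0, 20)    # candidate metric depths (meters)
label = depth_to_soft_label(3.2, bins)
```

A classifier trained against such targets outputs a per-pixel distribution over depth bins, from which both a depth estimate and its confidence can be read off.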

Citations: 0
Deep Active Learning with Contaminated Tags for Image Aesthetics Assessment.
IF 10.6. CAS Tier 1, Computer Science. Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2018-04-18. DOI: 10.1109/TIP.2018.2828326
Zhenguang Liu, Zepeng Wang, Yiyang Yao, Luming Zhang, Ling Shao

Image aesthetic quality assessment has become an indispensable technique that facilitates a variety of image applications, e.g., photo retargeting and non-realistic rendering. Conventional approaches suffer from the following limitations: 1) the inefficiency of semantically describing images due to inherent tag noise and incompleteness, 2) the difficulty of accurately reflecting how humans actively perceive various regions inside each image, and 3) the challenge of incorporating the aesthetic experiences of multiple users. To solve these problems, we propose a novel semi-supervised deep active learning (SDAL) algorithm, which discovers how humans perceive semantically important regions from a large quantity of images partially assigned with contaminated tags. More specifically, as humans usually attend to the foreground objects before understanding them, we extract a succinct set of BING (binarized normed gradients) [60]-based object patches from each image. To simulate human visual perception, we propose SDAL, which hierarchically learns the human gaze shifting path (GSP) by sequentially linking semantically important object patches from each scene. Notably, SDAL unifies semantically important region discovery and deep GSP feature learning into a principled framework, wherein only a small proportion of tagged images are adopted. Moreover, based on the sparsity penalty, SDAL can optimally abandon noisy or redundant low-level image features. Finally, by leveraging the deeply-learned GSP features, a probabilistic model is developed for image aesthetics assessment, where the experience of multiple professional photographers can be encoded. Besides, auxiliary quality-related features can be conveniently integrated into our probabilistic model. Comprehensive experiments on a series of benchmark image sets have demonstrated the superiority of our method. As a byproduct, eye tracking experiments have shown that GSPs generated by our SDAL are about 93% consistent with real human gaze shifting paths.

引用次数: 0
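The gaze shifting path idea above — sequentially linking semantically important object patches — can be illustrated with a toy greedy ordering: start from the most salient patch, then repeatedly hop to the nearest unvisited one. This is only an illustration of the path-linking concept; the function name, the `(x, y, saliency)` representation, and the nearest-neighbor rule are assumptions of this sketch, not the paper's SDAL procedure.

```python
import math

def gaze_path(patches):
    """Toy gaze-shifting path: start at the highest-saliency patch,
    then greedily hop to the nearest not-yet-visited patch.
    `patches` is a list of (x, y, saliency) tuples."""
    if not patches:
        return []
    remaining = list(patches)
    # Humans tend to attend to the most salient region first.
    current = max(remaining, key=lambda p: p[2])
    remaining.remove(current)
    path = [current]
    while remaining:
        # Jump to the spatially closest unvisited patch.
        nxt = min(remaining,
                  key=lambda p: math.hypot(p[0] - current[0], p[1] - current[1]))
        remaining.remove(nxt)
        path.append(nxt)
        current = nxt
    return path

patches = [(0, 0, 0.2), (5, 5, 0.9), (6, 4, 0.5), (1, 1, 0.1)]
print([p[2] for p in gaze_path(patches)])  # visits the 0.9 patch first
```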
Digital affine shear filter banks with 2-layer structure 两层结构的数字仿射剪切滤波器组
IF 10.6 1区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2017-07-01 DOI: 10.1109/SAMPTA.2017.8024369
Zhihua Che, X. Zhuang
Affine shear tight frames with 2-layer structure are introduced. Characterizations and constructions of smooth affine shear tight frames with 2-layer structure are provided. Digital affine shear filter banks with 2-layer structure are then constructed. The implementation of digital affine shear transforms using the transition and subdivision operators is given. Numerical experiments on image denoising demonstrate the advantage of our digital affine shear filter banks with 2-layer structure.
引用次数: 6
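The transition/subdivision implementation mentioned in the abstract follows the standard two-channel filter-bank pattern: an analysis (transition) step splits a signal into coarse and detail coefficients, and a synthesis (subdivision) step reassembles them. A minimal one-level Haar sketch of that pattern — not the paper's affine shear filters — showing perfect reconstruction:

```python
import math

S = math.sqrt(2.0)

def transition(x):
    """One-level Haar analysis: adjacent pairs -> (coarse, detail).
    Assumes len(x) is even."""
    coarse = [(x[2*i] + x[2*i+1]) / S for i in range(len(x) // 2)]
    detail = [(x[2*i] - x[2*i+1]) / S for i in range(len(x) // 2)]
    return coarse, detail

def subdivision(coarse, detail):
    """One-level Haar synthesis: upsample and recombine the channels."""
    x = []
    for c, d in zip(coarse, detail):
        x.append((c + d) / S)
        x.append((c - d) / S)
    return x

x = [4.0, 2.0, 5.0, 7.0]
c, d = transition(x)
rec = subdivision(c, d)
print([round(v, 10) for v in rec])  # matches x: perfect reconstruction
```

Thresholding the detail coefficients between the two steps is the usual route to the kind of transform-domain denoising the experiments evaluate.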
Improved Denoising via Poisson Mixture Modeling of Image Sensor Noise 基于泊松混合模型的图像传感器噪声改进去噪
IF 10.6 1区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2017-04-01 DOI: 10.1109/TIP.2017.2651365
Jiachao Zhang, Keigo Hirakawa
This paper describes a study aimed at comparing the real image sensor noise distribution to the models of noise often assumed in image denoising designs. A quantile analysis in the pixel, wavelet transform, and variance stabilization domains reveals that the tails of Poisson, signal-dependent Gaussian, and Poisson–Gaussian models are too short to capture real sensor noise behavior. A new Poisson mixture noise model is proposed to correct the mismatch of tail behavior. Based on the fact that noise model mismatch results in image denoising that undersmoothes real sensor data, we propose a Poisson mixture denoising method to remove the denoising artifacts without affecting image details, such as edges and textures. Experiments with real sensor data verify that denoising for real image sensor data is indeed improved by this new technique.
引用次数: 32
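The Poisson–Gaussian noise model and the variance-stabilization domain used in the quantile analysis can be sketched as follows. The generalized Anscombe transform 2√(x + 3/8 + σ²) approximately stabilizes the variance of Poisson counts corrupted by additive Gaussian noise of standard deviation σ to 1 — a standard result, shown here as background for the abstract rather than the paper's mixture model:

```python
import math
import random

random.seed(7)

def sample_poisson(lam):
    """Knuth's Poisson sampler (adequate for moderate lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        k += 1
        p *= random.random()
        if p <= L:
            return k - 1

def noisy_pixel(signal, sigma):
    """Poisson shot noise plus additive Gaussian read noise."""
    return sample_poisson(signal) + random.gauss(0.0, sigma)

def gat(x, sigma):
    """Generalized Anscombe transform: output variance approx. 1."""
    return 2.0 * math.sqrt(max(x + 3.0 / 8.0 + sigma * sigma, 0.0))

sigma = 2.0
samples = [gat(noisy_pixel(50.0, sigma), sigma) for _ in range(20000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(var, 2))  # close to 1, independent of the signal level
```

The paper's point is precisely that on real sensor data the empirical quantiles of such stabilized samples deviate from the Gaussian ideal in the tails, motivating the Poisson mixture model.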
Non-Additive Imprecise Image Super-Resolution in a Semi-Blind Context 半盲环境下的非加性不精确图像超分辨率
IF 10.6 1区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2017-03-01 DOI: 10.1109/TIP.2016.2621414
Fares Graba, F. Comby, O. Strauss
The most effective superresolution methods proposed in the literature require precise knowledge of the so-called point spread function of the imager, while in practice its accurate estimation is nearly impossible. This paper presents a new superresolution method, whose main feature is its ability to account for the scant knowledge of the imager point spread function. This ability is based on representing this imprecise knowledge via a non-additive neighborhood function. The superresolution reconstruction algorithm transfers this imprecise knowledge to output by producing an imprecise (interval-valued) high-resolution image. We propose some experiments illustrating the robustness of the proposed method with respect to the imager point spread function. These experiments also highlight its high performance compared with very competitive earlier approaches. Finally, we show that the imprecision of the high-resolution interval-valued reconstructed image is a reconstruction error marker.
引用次数: 9
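The interval-valued output can be mimicked with elementary interval arithmetic: when the PSF is only known to lie in a set of candidate kernels, filtering with each candidate and taking the pixel-wise min/max yields an imprecise image whose interval width flags uncertain pixels. This is a toy 1-D illustration only — the paper's non-additive neighborhood functions are more general than a finite candidate set, and all names here are this sketch's assumptions:

```python
def convolve1d(signal, kernel):
    """Valid-mode 1-D filtering (correlation; equivalent to convolution
    for the symmetric kernels used below)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def imprecise_blur(signal, candidate_psfs):
    """Pixel-wise [min, max] interval over all candidate PSFs."""
    outs = [convolve1d(signal, psf) for psf in candidate_psfs]
    return [(min(col), max(col)) for col in zip(*outs)]

signal = [0.0, 0.0, 1.0, 0.0, 0.0]          # an impulse
psfs = [[0.25, 0.5, 0.25], [0.2, 0.6, 0.2]]  # two plausible blur kernels
intervals = imprecise_blur(signal, psfs)
print(intervals)
```

Wide intervals mark pixels whose value depends strongly on the unknown PSF, which is the sense in which the abstract's interval width serves as a reconstruction-error marker.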