Latest publications from the 2021 IEEE International Conference on Image Processing (ICIP)

Mesh Classification With Dilated Mesh Convolutions
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506311
Vinit Veerendraveer Singh, Shivanand Venkanna Sheshappanavar, C. Kambhamettu
Unlike images, meshes are irregular and unstructured. Thus, it is not trivial to extend existing image-based deep learning approaches to mesh analysis. In this paper, inspired by dilated convolutions for images, we proffer dilated convolutions for meshes. Our Dilated Mesh Convolution (DMC) unit inflates the kernels' receptive field without increasing the number of learnable parameters. We also propose a Stacked Dilated Mesh Convolution (SDMC) block that stacks DMC units; it considers spatial regions around mesh faces at multiple scales while summarizing neighboring contextual information. We accommodated SDMC in MeshNet to classify 3D meshes. Experimental results demonstrate that this redesigned model significantly improves classification accuracy on multiple datasets. Code is available at https://github.com/VimsLab/DMC.
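As a rough illustration of the dilation idea (not the authors' released implementation; see the linked repository for that), the sketch below gathers, for each mesh face, the faces exactly d hops away in the face-adjacency graph and aggregates them with one shared linear map, so the receptive field grows with d while the parameter count stays fixed. The BFS-based ring lookup and the toy tensor shapes are assumptions for illustration only.

```python
# Minimal sketch of a dilated aggregation over mesh faces (illustrative only;
# the official code is at https://github.com/VimsLab/DMC).
from collections import deque
import torch
import torch.nn as nn


def ring_neighbors(adjacency, face, d):
    """Faces exactly d hops from `face` in the face-adjacency graph (BFS)."""
    visited = {face: 0}
    queue = deque([face])
    ring = []
    while queue:
        f = queue.popleft()
        if visited[f] == d:
            ring.append(f)
            continue  # do not expand past the requested ring
        for g in adjacency[f]:
            if g not in visited:
                visited[g] = visited[f] + 1
                queue.append(g)
    return ring


class DilatedFaceConv(nn.Module):
    """One shared kernel applied to the center face and the mean of its d-ring,
    so increasing d enlarges the receptive field without adding parameters."""

    def __init__(self, in_dim, out_dim, dilation):
        super().__init__()
        self.dilation = dilation
        self.kernel = nn.Linear(2 * in_dim, out_dim)  # acts on [center || ring mean]

    def forward(self, feats, adjacency):
        out = []
        for f in range(feats.shape[0]):
            ring = ring_neighbors(adjacency, f, self.dilation)
            ring_mean = feats[ring].mean(dim=0) if ring else torch.zeros_like(feats[f])
            out.append(self.kernel(torch.cat([feats[f], ring_mean])))
        return torch.stack(out)


# Toy example: 4 faces in a chain, 8-dimensional face features.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
feats = torch.randn(4, 8)
print(DilatedFaceConv(8, 16, dilation=2)(feats, adj).shape)  # torch.Size([4, 16])
```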
Citations: 5
Hierarchical Domain-Consistent Network For Cross-Domain Object Detection
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506743
Yuanyuan Liu, Ziyang Liu, Fang Fang, Zhanghua Fu, Zhanlong Chen
Cross-domain object detection is a very challenging task due to multi-level domain shift in an unseen domain. To address the problem, this paper proposes a hierarchical domain-consistent network (HDCN) for cross-domain object detection, which effectively suppresses pixel-level, image-level, and instance-level domain shift by jointly aligning features at all three levels. First, at the pixel-level alignment stage, a pixel-level subnet with foreground-aware attention learning and pixel-level adversarial learning is proposed to focus on locally transferable foreground information. Then, at the image-level alignment stage, global domain-invariant features are learned from the whole image through image-level adversarial learning. Finally, at the instance-level alignment stage, a prototype graph convolution network is constructed to guarantee distribution alignment of instances by minimizing the distance between prototypes of the same category from different domains. Moreover, to avoid non-convergence during multi-level feature alignment, a domain-consistent loss is proposed to harmonize the adaptation training process. Comprehensive results on various cross-domain detection tasks demonstrate the broad applicability and effectiveness of the proposed approach.
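The pixel-level and image-level stages both rest on adversarial feature alignment; a minimal sketch of that ingredient, a gradient-reversal layer feeding a small domain classifier in the style of DANN, is shown below. The layer sizes are placeholders, this covers only one of the three alignment levels the paper combines, and it is not the authors' implementation.

```python
# Sketch of image-level adversarial alignment with a gradient-reversal layer.
# Shapes and network sizes are illustrative, not the paper's exact design.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # flip the gradient for the backbone


class DomainDiscriminator(nn.Module):
    """Predicts source vs. target domain from a pooled backbone feature."""

    def __init__(self, in_dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, feat, lam=1.0):
        return self.net(GradReverse.apply(feat, lam))


# Toy usage: pooled detector backbone features would replace `feat`.
feat = torch.randn(8, 256, requires_grad=True)   # 4 source + 4 target images
domain = torch.tensor([0.0] * 4 + [1.0] * 4).unsqueeze(1)
disc = DomainDiscriminator()
loss = nn.BCEWithLogitsLoss()(disc(feat), domain)
loss.backward()  # reversed gradients push the backbone toward domain-invariant features
```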
Citations: 2
Robust Multi-Frame Future Prediction By Leveraging View Synthesis
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506508
Kenan E. Ak, Ying Sun, Joo-Hwee Lim
In this paper, we focus on the problem of video prediction, i.e., future frame prediction. Most state-of-the-art techniques focus on synthesizing a single future frame at each step. However, this means the model's own predicted frames are fed back in when synthesizing multi-step predictions, resulting in gradual performance degradation as pixel errors accumulate. To alleviate this issue, we propose a model that can handle multi-step prediction. Additionally, we employ techniques from view synthesis for future frame prediction, even though the two problems are treated independently in the literature. Our proposed method employs multiview camera pose prediction and depth-prediction networks to project the last available frame to the desired future frames via a differentiable point cloud renderer. For the synthesis of moving objects, we utilize an additional refinement stage. In experiments, we show that the proposed framework outperforms state-of-the-art methods on both the KITTI and Cityscapes datasets.
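The core geometric step, projecting the last frame to a future viewpoint using predicted depth and camera pose, can be pictured with plain pinhole-camera math as in the sketch below. The paper uses a differentiable point cloud renderer; the nearest-pixel splatting, the intrinsics, and the toy pose here are simplifying assumptions, not the authors' pipeline.

```python
# Sketch: warp a frame to a future camera pose using per-pixel depth
# (simplified, non-differentiable splatting; illustration only).
import numpy as np


def warp_with_depth(image, depth, K, T_future_from_current):
    """image: (H, W, 3), depth: (H, W), K: (3, 3) intrinsics, T: (4, 4) pose."""
    H, W = depth.shape
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # (3, HW)

    # Back-project pixels into 3D camera coordinates, then move to the future view.
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)                # (3, HW)
    pts_h = np.vstack([pts, np.ones((1, pts.shape[1]))])               # (4, HW)
    pts_future = (T_future_from_current @ pts_h)[:3]

    # Re-project and splat nearest pixels (occlusion handling is ignored here).
    proj = K @ pts_future
    z = proj[2]
    u2 = np.round(proj[0] / z).astype(int)
    v2 = np.round(proj[1] / z).astype(int)
    out = np.zeros_like(image)
    ok = (z > 0) & (u2 >= 0) & (u2 < W) & (v2 >= 0) & (v2 < H)
    out[v2[ok], u2[ok]] = image.reshape(-1, 3)[ok]
    return out


# Toy usage with a flat depth map and a small forward camera motion.
img, dep = np.random.rand(64, 64, 3), np.full((64, 64), 5.0)
K = np.array([[60.0, 0, 32], [0, 60.0, 32], [0, 0, 1]])
T = np.eye(4)
T[2, 3] = -0.5                                   # camera moves 0.5 units forward
print(warp_with_depth(img, dep, K, T).shape)     # (64, 64, 3)
```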
Citations: 1
Deepfake Video Detection Using 3D-Attentional Inception Convolutional Neural Network
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506381
Changlei Lu, B. Liu, Wenbo Zhou, Qi Chu, Nenghai Yu
The current spike of deepfake techniques has received considerable attention due to security concerns. To mitigate the potential risks brought by deepfake techniques, many detection methods have been proposed. However, most existing works merely leverage spatial information from separate frames and ignore valuable inter-frame temporal information. In this paper, we propose a deepfake detection scheme that uses a 3D-attentional inception network. The proposed model captures spatial and temporal information simultaneously through its 3D kernels. Furthermore, channel and spatio-temporal attention modules are applied to improve detection capability. Comprehensive experiments demonstrate that our scheme outperforms state-of-the-art methods.
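A minimal sketch of the two building blocks the abstract names, a 3D convolution over a frame stack plus squeeze-and-excitation-style channel attention, is given below. The exact inception branches and the paper's spatio-temporal attention design are not reproduced, and all layer sizes are placeholders.

```python
# Sketch: 3D convolution over a clip plus channel attention (placeholder sizes).
import torch
import torch.nn as nn


class ChannelAttention3D(nn.Module):
    """Squeeze-and-excitation style re-weighting of 3D feature channels."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                       # x: (N, C, T, H, W)
        w = self.fc(x.mean(dim=(2, 3, 4)))      # global average pool over T, H, W
        return x * w[:, :, None, None, None]


class TinySpatioTemporalBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv3d(3, 16, kernel_size=3, padding=1)  # joint space-time kernel
        self.attn = ChannelAttention3D(16)

    def forward(self, clip):                    # clip: (N, 3, T, H, W)
        return self.attn(torch.relu(self.conv(clip)))


clip = torch.randn(2, 3, 8, 64, 64)             # 2 clips of 8 RGB frames each
print(TinySpatioTemporalBlock()(clip).shape)    # torch.Size([2, 16, 8, 64, 64])
```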
Citations: 8
Improved Multiclass Adaboost For Image Classification: The Role Of Tree Optimization
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506569
Arman Zharmagambetov, Magzhan Gabidolla, M. A. Carreira-Perpiñán
Decision tree boosting is an important and widely recognized method in image classification, despite the dominance of deep learning-based approaches in this area. Provided with good image features, it can produce a powerful model with unique properties such as strong predictive power, scalability, and interpretability. In this paper, we propose a novel tree boosting framework which capitalizes on the idea of using shallow, sparse and yet powerful oblique decision trees (trained with the recently proposed Tree Alternating Optimization algorithm) as the base learners. We empirically show that the resulting model achieves better or comparable performance (both in terms of accuracy and model size) against established boosting algorithms such as gradient boosting or AdaBoost on a number of benchmarks. Further, we show that such trees can directly and efficiently handle multiclass problems without the one-vs-all strategy employed by most practical boosting implementations.
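For orientation, a sketch of multiclass AdaBoost with the SAMME update over shallow trees is shown below, using scikit-learn's axis-aligned trees as stand-ins. The paper's contribution is replacing these weak learners with sparse oblique trees trained by Tree Alternating Optimization, which is not reproduced here.

```python
# Sketch: multiclass AdaBoost (SAMME) with shallow trees as weak learners.
# The paper swaps the axis-aligned trees below for TAO-trained oblique trees.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
K = len(np.unique(y))

w = np.full(len(Xtr), 1.0 / len(Xtr))                 # sample weights
trees, alphas = [], []
for _ in range(50):
    tree = DecisionTreeClassifier(max_depth=3).fit(Xtr, ytr, sample_weight=w)
    pred = tree.predict(Xtr)
    err = np.clip(np.sum(w * (pred != ytr)), 1e-10, 1 - 1e-10)
    alpha = np.log((1 - err) / err) + np.log(K - 1)   # SAMME stage weight
    w *= np.exp(alpha * (pred != ytr))                # up-weight misclassified samples
    w /= w.sum()
    trees.append(tree)
    alphas.append(alpha)

# Weighted vote over the ensemble (no one-vs-all decomposition needed).
votes = np.zeros((len(Xte), K))
for tree, alpha in zip(trees, alphas):
    votes[np.arange(len(Xte)), tree.predict(Xte)] += alpha
print("test accuracy:", np.mean(votes.argmax(axis=1) == yte))
```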
Citations: 14
Reducing Stair Artifacts in CT Reconstruction
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506066
Mark A Wedekind, Eric Oertel, Susana Castillo, M. Magnor
Computed Tomography is increasingly employed for non-destructive evaluation, with the aim of reconstructing a surface mesh of a scanned object from radiographic projections. State-of-the-art algorithms first reconstruct a voxel grid and then extract a surface mesh using existing meshing algorithms, often leading to stair-like aliasing artifacts along the grid axes, due to the grid’s orientation-dependent resolution. We circumvent such artifacts in filtered backprojection reconstructions by optimizing the mesh’s vertex positions using information taken directly from the projections, rather than from a voxel grid. We show that our approach reduces stair artifacts both visibly and measurably, at relatively little additional computational cost. Our method can be tied into existing mesh extraction algorithms and removes stair artifacts almost entirely.
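For context, the conventional pipeline the paper improves on, reconstructing a grid by filtered backprojection and then extracting a surface from that grid, looks roughly like the 2D sketch below (scikit-image, Shepp-Logan phantom). The extracted contour is tied to the reconstruction grid, which is the source of the stair-like artifacts that the paper's projection-driven vertex optimization avoids. This is not the authors' code.

```python
# Sketch of the conventional grid-then-mesh pipeline (2D analogue):
# filtered backprojection onto a pixel grid, then a grid-based contour.
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon
from skimage.measure import find_contours

phantom = shepp_logan_phantom()                            # ground-truth object
theta = np.linspace(0.0, 180.0, 180, endpoint=False)
sinogram = radon(phantom, theta=theta)                     # simulated projections

recon = iradon(sinogram, theta=theta, filter_name="ramp")  # FBP onto a grid
contours = find_contours(recon, level=0.5 * recon.max())   # surface from the grid

# The contour vertices are tied to the reconstruction grid; the paper instead
# adjusts vertex positions using the projection data directly.
print(len(contours), "contour(s), first one has", len(contours[0]), "vertices")
```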
Citations: 0
Fs-Net: Filter Selection Network For Hyperspectral Reconstruction
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506576
Liutao Yang, Zhongnian Li, Zongxiang Pei, Daoqiang Zhang
Optimizing spectral filters for hyperspectral reconstruction has received increasing attention recently. However, current filter selection methods suffer from extremely high computational complexity due to exhaustive optimization. In this paper, in order to reduce the computational complexity, we propose a novel Filter Selection Network (FS-Net) to select filters and learn the reconstruction network simultaneously. Specifically, we propose an end-to-end method that embeds filter selection in FS-Net by setting spectral response functions as the input layer. Furthermore, we propose a non-negative L1 sparse regularization (NN-L1) to select optical filters automatically by sparsifying the input layer. In addition, we develop a two-stage training strategy for adjusting the number of selected filters. Experiments on public datasets show that our proposed method can considerably improve reconstruction quality.
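One way to picture the selection mechanism: a trainable non-negative weight per candidate filter forms the measurement layer, and an L1 penalty on those weights drives most of them to zero so only a few filters survive. The sketch below follows that reading of the abstract; the candidate responses, the clamping used for non-negativity, the penalty weight, and the stubbed-out reconstruction network are all assumptions.

```python
# Sketch: filter selection as a non-negative, L1-sparsified input layer.
# Candidate spectral response functions and network sizes are made up.
import torch
import torch.nn as nn

num_candidates, num_bands = 64, 31
responses = torch.rand(num_candidates, num_bands)        # candidate filters (assumed)


class FilterSelection(nn.Module):
    def __init__(self):
        super().__init__()
        self.scores = nn.Parameter(torch.rand(num_candidates))  # one weight per filter
        self.decoder = nn.Sequential(nn.Linear(num_candidates, 128), nn.ReLU(),
                                     nn.Linear(128, num_bands))  # reconstruction stub

    def forward(self, spectra):                          # spectra: (N, num_bands)
        w = self.scores.clamp(min=0.0)                   # enforce non-negativity
        measurements = spectra @ (w[:, None] * responses).T
        return self.decoder(measurements), w


model = FilterSelection()
spectra = torch.rand(16, num_bands)
recon, w = model(spectra)
loss = nn.functional.mse_loss(recon, spectra) + 1e-2 * w.sum()  # L1 on non-negative weights
loss.backward()
print(recon.shape, int((w > 1e-3).sum()), "filters currently active")
```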
Citations: 3
Strategies of Deep Learning for Tomographic Reconstruction
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506395
Xiaogang Yang, C. Schroer
In this article, we introduce three different strategies of tomographic reconstruction based on deep learning. These algorithms use model-based learning for iterative optimization. We discuss the basic principles of developing these algorithms, and analyze and evaluate their performance both theoretically and on simulated reconstructions. We developed open-source software to run these algorithms in the same framework. In the simulation results, all of these deep learning algorithms showed improvements in reconstruction quality and accuracy, with the strategy based on Generative Adversarial Networks showing a particular advantage.
Citations: 1
Resolution Improvement In FZA Lens-Less Camera By Synthesizing Images Captured With Different Mask-Sensor Distances
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506638
Xiao Chen, Tomoya Nakamura, Xiuxi Pan, Kazuyuki Tajima, K. Yamaguchi, T. Shimano, M. Yamaguchi
The Fresnel zone aperture (FZA) lens-less camera is a class of computational imaging systems that employs an FZA as a coded mask instead of an optical lens. It can perform fast deconvolution reconstruction and supports re-focusing. However, the reconstructed image's spatial resolution is restricted by diffraction when using the conventional method derived from the geometrical-optics model. In a previous study, we quantitatively analyzed the diffraction propagation between mask and sensor and proposed a color-channel synthesis reconstruction method based on wave-optics theory. This study proposes a novel image reconstruction method that does not distort color information, comprehensively synthesizing two images captured with different mask-sensor distances to mitigate the influence of diffraction and improve image resolution. Numerical simulation and optical experiment results confirm that the proposed method can improve the spatial resolution to about twice that of the conventional method based on the geometrical-optics model.
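The fast deconvolution step the abstract refers to can be pictured as a single frequency-domain (Wiener-style) inversion with the mask's point spread function; a generic numpy sketch is below. The idealized zone-plate PSF and the regularization constant are assumptions, and the paper's actual contribution, synthesizing two captures taken at different mask-sensor distances to counter diffraction, is only noted in the closing comment.

```python
# Sketch: generic Wiener-style deconvolution of a lens-less capture.
# The idealized zone-plate PSF and regularization constant are assumptions.
import numpy as np


def wiener_deconvolve(capture, psf, eps=1e-6):
    """Frequency-domain inversion: X = conj(H) * Y / (|H|^2 + eps)."""
    H = np.fft.fft2(np.fft.ifftshift(psf))
    Y = np.fft.fft2(capture)
    return np.real(np.fft.ifft2(np.conj(H) * Y / (np.abs(H) ** 2 + eps)))


# Idealized FZA shadow under the geometrical-optics model: 0.5 * (1 + cos(pi r^2 / b)).
y, x = np.mgrid[-64:64, -64:64]
psf = 0.5 * (1.0 + np.cos(np.pi * (x**2 + y**2) / 80.0))
psf /= psf.sum()

scene = np.random.default_rng(0).random((128, 128))
capture = np.real(np.fft.ifft2(np.fft.fft2(scene) * np.fft.fft2(np.fft.ifftshift(psf))))
recon = wiener_deconvolve(capture, psf)
print("mean abs error on this noiseless toy:", float(np.abs(recon - scene).mean()))

# The paper's point: diffraction makes the real transfer function lose certain
# frequencies; combining two captures taken at different mask-sensor distances
# fills in each other's gaps and roughly doubles the usable resolution.
```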
Citations: 2
Let Them Choose What They Want: A Multi-Task CNN Architecture Leveraging Mid-Level Deep Representations for Face Attribute Classification
Pub Date : 2021-09-19 DOI: 10.1109/ICIP42928.2021.9506456
Zhenduo Chen, Feng Liu, Zhenglai Zhao
Face Attribute Classification (FAC) is an important task in computer vision that aims to predict the facial attributes of a given image. However, the value of mid-level feature information and the correlation between face attributes are often ignored by deep learning-based FAC methods. In order to solve these problems, we propose a novel and effective multi-task CNN architecture. Instead of predicting all 40 attributes together, an attribute grouping strategy is proposed to divide the 40 attributes into 8 correlated task groups. Meanwhile, through the Fusion Layer, mid-level deep representations are fused into the original feature representations to jointly predict the face attributes. Furthermore, Task-unique Attention Modules help learn more task-specific feature representations, yielding higher FAC accuracy. Extensive experiments on the CelebA dataset demonstrate that our method outperforms state-of-the-art FAC methods.
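A rough sketch of the two ideas the abstract highlights, fusing a pooled mid-level feature map into the final representation and predicting attributes through per-group heads, is shown below. The backbone, the 8-way split of the 40 attributes, and the layer widths are placeholders rather than the paper's architecture, and the attention modules are omitted.

```python
# Sketch: shared backbone + mid-level feature fusion + 8 grouped attribute heads.
# Layer sizes and the attribute-to-group split are placeholders.
import torch
import torch.nn as nn

GROUP_SIZES = [5, 5, 5, 5, 5, 5, 5, 5]                     # 8 groups, 40 attributes total


class GroupedFAC(nn.Module):
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.mid = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.top = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Fusion: concatenate pooled mid-level and top-level descriptors per image.
        self.heads = nn.ModuleList(nn.Linear(64 + 128, g) for g in GROUP_SIZES)

    def forward(self, x):
        mid = self.mid(self.stem(x))
        top = self.top(mid)
        fused = torch.cat([self.pool(mid).flatten(1), self.pool(top).flatten(1)], dim=1)
        return torch.cat([head(fused) for head in self.heads], dim=1)  # (N, 40) logits


logits = GroupedFAC()(torch.randn(4, 3, 128, 128))
print(logits.shape)  # torch.Size([4, 40])
```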
Citations: 2