
Latest publications from the 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)

GRNet: Deep Convolutional Neural Networks based on Graph Reasoning for Semantic Segmentation
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301851
Yang Wu, A. Jiang, Yibin Tang, H. Kwan
In this paper, we develop a novel deep-network architecture for semantic segmentation. In contrast to previous work that widely uses dilated convolutions, we employ the original ResNet as the backbone and introduce a multi-scale feature fusion module (MFFM) to extract long-range contextual information and upsample feature maps. Then, a graph reasoning module (GRM) based on a graph convolutional network (GCN) is developed to aggregate semantic information. Our graph reasoning network (GRNet) extracts the global context of input features by modeling graph reasoning in a single framework. Experimental results demonstrate that our approach provides substantial benefits over a strong baseline and achieves superior segmentation performance on two benchmark datasets.
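The abstract does not spell out how the GRM turns feature maps into a graph, but a common way to realize GCN-based graph reasoning on CNN features is to project pixels into a small set of graph nodes, propagate information between nodes, and project back. The sketch below illustrates that pattern in PyTorch; the node count, channel widths, and projection scheme are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class GraphReasoningBlock(nn.Module):
    """Hypothetical GCN-style reasoning over CNN features (illustrative only)."""
    def __init__(self, in_channels, node_channels=64, num_nodes=32):
        super().__init__()
        self.to_assign = nn.Conv2d(in_channels, num_nodes, kernel_size=1)       # pixel-to-node assignment
        self.reduce = nn.Conv2d(in_channels, node_channels, kernel_size=1)      # channel reduction
        self.node_adj = nn.Conv1d(num_nodes, num_nodes, kernel_size=1)          # learned adjacency / message passing
        self.node_update = nn.Conv1d(node_channels, node_channels, kernel_size=1)
        self.expand = nn.Conv2d(node_channels, in_channels, kernel_size=1)      # back to the input width

    def forward(self, x):
        b, c, h, w = x.shape
        assign = torch.softmax(self.to_assign(x).flatten(2), dim=-1)   # (b, N, HW)
        feats = self.reduce(x).flatten(2)                               # (b, C', HW)
        nodes = torch.bmm(assign, feats.transpose(1, 2))                # (b, N, C'): aggregate pixels into nodes
        nodes = self.node_adj(nodes)                                    # propagate information between nodes
        nodes = torch.relu(self.node_update(nodes.transpose(1, 2)))     # (b, C', N): update node features
        out = torch.bmm(nodes, assign).view(b, -1, h, w)                # redistribute node features to pixels
        return x + self.expand(out)                                     # residual fusion with the input
```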
Citations: 2
A review of data preprocessing modules in digital image forensics methods using deep learning
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301880
Alexandre Berthet, J. Dugelay
Access to technologies such as mobile phones contributes to the significant increase in the volume of digital visual data (images and videos). In addition, photo-editing software is becoming increasingly powerful and easy to use. In some cases, these tools can be used to produce forgeries whose objective is to change the semantic meaning of a photo or a video (e.g. fake news). Digital image forensics (DIF) has two main objectives: the detection (and localization) of forgery and the identification of the origin of the acquisition (i.e. sensor identification). Since 2005, many classical methods for DIF have been designed, implemented and tested on several databases. Meanwhile, innovative approaches based on deep learning have emerged in other fields and have surpassed traditional techniques. In the context of DIF, deep learning methods mainly use convolutional neural networks (CNNs) combined with significant preprocessing modules. This is an active domain, and two possible ways of performing preprocessing have been studied: prior to the network or incorporated into it. None of the existing studies on digital image forensics provides a comprehensive overview of the preprocessing techniques used with deep learning methods. Therefore, the core objective of this article is to review the preprocessing modules associated with CNN models.
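As an illustration of the "incorporated into the network" option discussed in the review, the sketch below wires a fixed high-pass residual filter (a classic SRM-style kernel from the forensics and steganalysis literature) into the first, frozen layer of a PyTorch model; the review itself does not prescribe this particular kernel or design.

```python
import torch
import torch.nn as nn

# 5x5 SRM-style high-pass kernel commonly used to expose manipulation residuals.
SRM_KERNEL = torch.tensor([[-1.,  2., -2.,  2., -1.],
                           [ 2., -6.,  8., -6.,  2.],
                           [-2.,  8., -12., 8., -2.],
                           [ 2., -6.,  8., -6.,  2.],
                           [-1.,  2., -2.,  2., -1.]]) / 12.0

class ResidualPreprocessing(nn.Module):
    """Fixed high-pass filtering as a frozen first layer of a forensic CNN (illustrative)."""
    def __init__(self):
        super().__init__()
        self.filter = nn.Conv2d(1, 1, kernel_size=5, padding=2, bias=False)
        with torch.no_grad():
            self.filter.weight.copy_(SRM_KERNEL.view(1, 1, 5, 5))
        self.filter.weight.requires_grad = False   # kept fixed: acts purely as preprocessing

    def forward(self, x):            # x: (B, 1, H, W) grayscale image
        return self.filter(x)        # noise residual passed to the downstream network
```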
Citations: 10
Deep Near Infrared Colorization with Semantic Segmentation and Transfer Learning
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301788
Fengqiao Wang, Lu Liu, Cheolkon Jung
Although near infrared (NIR) images contain no color, they have abundant and clear textures. In this paper, we propose deep NIR colorization with semantic segmentation and transfer learning. NIR images capture the invisible spectrum (700-1000 nm), which is quite different from visible-spectrum images. We employ convolutional layers to build the relationship between single-channel NIR images and three-channel color images, instead of mapping to the Lab or YCbCr color space. Moreover, we use semantic segmentation as global prior information to refine the colorization of smooth object regions. We use a color divergence loss to further optimize the NIR colorization results with good structures and edges. Since the training dataset is not large enough to capture rich color information, we adopt transfer learning to obtain color and semantic information. Experimental results verify that the proposed method produces a natural color image from a single NIR image and outperforms state-of-the-art methods in terms of peak signal-to-noise ratio (PSNR) and structural similarity (SSIM).
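A minimal sketch of the stated design choice, namely convolutional layers that map a single-channel NIR input directly to a three-channel color image rather than to Lab or YCbCr chrominance; the layer widths and depth below are assumptions, and the paper's actual network (with segmentation guidance and the color divergence loss) is considerably larger.

```python
import torch.nn as nn

class NIR2RGB(nn.Module):
    """Toy direct NIR-to-RGB mapping with plain convolutions (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),   # three color channels in [0, 1]
        )

    def forward(self, nir):       # nir: (B, 1, H, W)
        return self.net(nir)      # rgb: (B, 3, H, W)
```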
Citations: 9
Special Cane with Visual Odometry for Real-time Indoor Navigation of Blind People
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301782
Tang Tang, Menghan Hu, Guodong Li, Qingli Li, Jian Zhang, Xiaofeng Zhou, Guangtao Zhai
Indoor navigation is urgently needed by blind people in their everyday lives. In this paper, we design an assistive cane with visual odometry, based on the actual requirements of the blind, to help them navigate indoors safely. Compared to state-of-the-art indoor navigation systems, the proposed device is portable, compact, and adaptable. The main specifications of the system are: the perception range is 0.10 m to 2.10 m in width and 0.08 m to 1.60 m in length; the maximum weight is 2.1 kg; the detection range is 0.15 m to 3.00 m; the operating endurance is about 8 h; and objects lower than 80 cm can be detected. A demo video of the proposed navigation system is available at: https://doi.org/10.6084/m9.figshare.12399572.v1.
Citations: 3
HDR Image Compression with Convolutional Autoencoder
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301853
Fei Han, Jin Wang, Ruiqin Xiong, Qing Zhu, Baocai Yin
As one of the next-generation multimedia technologies, high dynamic range (HDR) imaging has been widely applied. Due to its wider color range, an HDR image imposes a greater compression and storage burden than a traditional LDR image. To solve this problem, this paper proposes a two-layer HDR image compression framework based on convolutional neural networks. The framework is composed of a base layer, which provides backward compatibility with standard JPEG, and an extension layer based on a convolutional variational autoencoder and a post-processing module. The autoencoder mainly includes a nonlinear transform encoder, a binarized quantizer and a nonlinear transform decoder. Compared with traditional codecs, the proposed CNN autoencoder is more flexible and retains more image semantic information, which improves the quality of the decoded HDR image. Moreover, to reduce the compression artifacts and noise of the reconstructed HDR image, a post-processing method based on group convolutional neural networks is designed. Experimental results show that our method outperforms JPEG XT profiles A, B and C and other methods in terms of the HDR-VDP-2 evaluation metric. Meanwhile, our scheme also provides backward compatibility with standard JPEG.
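The abstract names a binarized quantizer inside the extension-layer autoencoder but does not give its form; a common way to train through such a quantizer is a sign function with a straight-through gradient, sketched below in PyTorch as an assumption rather than the paper's exact formulation.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Binarize the latent to {-1, +1}; pass gradients straight through."""
    @staticmethod
    def forward(ctx, x):
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output          # straight-through estimator

def binarize(latent):
    """Quantize the encoder output before entropy coding / storage."""
    return BinarizeSTE.apply(latent)
```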
Citations: 2
DEN: Disentanglement and Enhancement Networks for Low Illumination Images
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301830
Nelson Chong Ngee Bow, Vu-Hoang Tran, Punchok Kerdsiri, Y. P. Loh, Ching-Chun Huang
Though learning-based low-light enhancement methods have achieved significant success, existing methods are still sensitive to noise and prone to unnatural appearance. These problems may come from the lack of structural awareness and the confusion between noise and texture. Thus, we present a low-light image enhancement method that consists of an image disentanglement network and an illumination boosting network. The disentanglement network is first used to decompose the input image into image details and image illumination. The extracted illumination part then goes through a multi-branch enhancement network designed to improve the dynamic range of the image. The multi-branch network extracts multi-level image features and enhances them via numerous subnets. These enhanced features are then fused to generate the enhanced illumination part. Finally, the denoised image details and the enhanced illumination are entangled to produce the normal-light image. Experimental results show that our method can produce visually pleasing images on many public datasets.
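A rough sketch of the described pipeline, assuming simple convolutional blocks: disentangle the input into a detail component and an illumination component, enhance the illumination with several parallel branches, fuse the branch outputs, and recombine with the details. The module shapes and branch count are placeholders; the abstract does not specify the actual architecture.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True))

class DENSketch(nn.Module):
    """Disentangle, enhance illumination with multiple branches, fuse, recombine (illustrative)."""
    def __init__(self, ch=32, num_branches=3):
        super().__init__()
        self.detail_head = conv_block(3, 3)      # stands in for the disentanglement network (details)
        self.illum_head = conv_block(3, ch)      # stands in for the disentanglement network (illumination)
        self.branches = nn.ModuleList(conv_block(ch, ch) for _ in range(num_branches))
        self.fuse = nn.Conv2d(ch * num_branches, 3, kernel_size=1)

    def forward(self, x):                        # x: (B, 3, H, W) low-light image
        detail = self.detail_head(x)
        illum = self.illum_head(x)
        enhanced = torch.cat([branch(illum) for branch in self.branches], dim=1)
        illum_enhanced = self.fuse(enhanced)     # fused multi-branch illumination
        return detail + illum_enhanced           # "entangle" details with the boosted illumination
```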
Citations: 0
Machine Learning for Photometric Redshift Estimation of Quasars with Different Samples
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301849
Yanxia Zhang, Xin Jin, Jingyi Zhang, Yongheng Zhao
We compare the performance of Support Vector Machine, XGBoost, LightGBM, k-Nearest Neighbors, Random Forests and Extra-Trees on the photometric redshift estimation of quasars based on the SDSS_WISE sample. For this sample, LightGBM shows its superiority in speed, while k-Nearest Neighbors, Random Forests and Extra-Trees show better accuracy. Then k-Nearest Neighbors, Random Forests and Extra-Trees are applied to the SDSS, SDSS_WISE, SDSS_UKIDSS, WISE_UKIDSS and SDSS_WISE_UKIDSS samples. The results show that the performance of an algorithm depends on the sample selection, sample size, input pattern and information from different bands; for the same sample, the more information is used, the better the performance obtained, but different algorithms show different accuracy; no single algorithm shows its superiority on every sample.
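A minimal sketch of this kind of comparison using scikit-learn regressors; the feature matrix (e.g. SDSS/WISE magnitudes and colors), hyperparameters, and the MAE metric below are placeholders rather than the paper's actual experimental setup.

```python
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

def compare_regressors(X, z):
    """X: photometric features per quasar, z: spectroscopic redshifts (ground truth)."""
    X_train, X_test, z_train, z_test = train_test_split(X, z, test_size=0.2, random_state=0)
    models = {
        "k-Nearest Neighbors": KNeighborsRegressor(n_neighbors=15),
        "Random Forests": RandomForestRegressor(n_estimators=200, random_state=0),
        "Extra-Trees": ExtraTreesRegressor(n_estimators=200, random_state=0),
    }
    for name, model in models.items():
        model.fit(X_train, z_train)
        pred = model.predict(X_test)
        print(f"{name}: MAE = {mean_absolute_error(z_test, pred):.4f}")
```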
Citations: 1
On 2D-3D Image Feature Detections for Image-To-Geometry Registration in Virtual Dental Model
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301774
Hui-jun Tang, R. T. Hsung, W. Y. Lam, Leo Y. Y. Cheng, E. Pow
3D digital smile design (DSD) has gained great interest in dentistry because it enables esthetic design of teeth and gums. However, the color texture of teeth and gums is often lost or distorted in the digitization process. Recently, the image-to-geometry registration shade mapping (IGRSM) method was proposed for registering color texture from 2D photography onto a 3D mesh model. It allows better control of illumination and color calibration for automatic tooth shade matching. In this paper, we investigate automated techniques to find the correspondences between a 3D tooth model and color intraoral photographs in order to accurately perform IGRSM. We propose to use the tooth cusp tips as the correspondence points for the image-to-geometry registration because they can be reliably detected both in 2D photography and in 3D surface scans. A modified gradient descent method with directional priority (GDDP) and region growing are developed to find the 3D correspondence points. For the 2D image, the tooth-tip contour lines are extracted based on luminosity and chromaticity, and the contour peaks are then detected as the correspondence points. The experimental results show that the proposed method achieves excellent accuracy in detecting the correspondence points between 2D photographs and the 3D tooth model. The average registration error is less than 15 pixels for a 4752×3168 intraoral image.
Citations: 1
Attention-Guided Fusion Network of Point Cloud and Multiple Views for 3D Shape Recognition
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301813
Bo Peng, Zengrui Yu, Jianjun Lei, Jiahui Song
With the dramatic growth of 3D shape data, 3D shape recognition has become a hot research topic in the field of computer vision. How to effectively utilize the multimodal characteristics of 3D shapes has been one of the key problems in boosting the performance of 3D shape recognition. In this paper, we propose a novel attention-guided fusion network of point cloud and multiple views for 3D shape recognition. Specifically, in order to obtain more discriminative descriptors for 3D shape data, an inter-modality attention enhancement module and a view-context attention fusion module are proposed to gradually refine and fuse the features of the point cloud and multiple views. In the inter-modality attention enhancement module, an inter-modality attention mask based on the joint feature representation is computed, so that the features of each modality are enhanced by fusing the correlated information between the two modalities. After that, the view-context attention fusion module is proposed to explore the context information of multiple views and fuse the enhanced features to obtain a more discriminative descriptor for 3D shape data. Experimental results on the ModelNet40 dataset demonstrate that the proposed method achieves promising performance compared with state-of-the-art methods.
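The abstract describes an attention mask computed from the joint feature representation and used to enhance each modality; the sketch below shows one plausible reading of that idea for global point-cloud and view descriptors. The feature dimension and the exact form of the mask and fusion are assumptions.

```python
import torch
import torch.nn as nn

class InterModalityAttention(nn.Module):
    """Re-weight point-cloud and view descriptors with a mask from their joint representation."""
    def __init__(self, dim=512):
        super().__init__()
        self.mask_net = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, f_point, f_view):              # both: (B, dim) global descriptors
        joint = torch.cat([f_point, f_view], dim=1)  # joint feature representation
        mask = self.mask_net(joint)                  # inter-modality attention mask in (0, 1)
        f_point_enh = f_point + mask * f_view        # enhance point features with view cues
        f_view_enh = f_view + mask * f_point         # enhance view features with point cues
        return f_point_enh, f_view_enh
```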
Citations: 4
A semantic labeling framework for ALS point clouds based on discretization and CNN
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301759
Xingtao Wang, Xiaopeng Fan, Debin Zhao
The airborne laser scanning (ALS) point cloud has drawn increasing attention thanks to its capability to quickly acquire large-scale and high-precision ground information. Due to the complexity of the observed scenes and the irregularity of the point distribution, the semantic labeling of ALS point clouds is extremely challenging. In this paper, we introduce an efficient discretization-based framework that follows the geometric character of ALS point clouds, and propose a novel intraclass weighted cross-entropy loss function to address the problem of data imbalance. We evaluate our framework on the ISPRS (International Society for Photogrammetry and Remote Sensing) 3D Semantic Labeling dataset. The experimental results show that the proposed method achieves new state-of-the-art performance in terms of overall accuracy (85.3%) and average F1 score (74.1%).
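The paper's intraclass weighted cross-entropy is not defined in the abstract; as an illustration of class weighting against label imbalance, the sketch below builds a standard inverse-frequency weighted cross-entropy loss in PyTorch, which may differ from the authors' formulation.

```python
import torch
import torch.nn as nn

def make_weighted_ce(labels, num_classes):
    """Build a cross-entropy loss with inverse-frequency class weights.

    labels: 1-D LongTensor holding the class label of every training point.
    """
    counts = torch.bincount(labels, minlength=num_classes).float()
    weights = counts.sum() / (num_classes * counts.clamp(min=1.0))   # rare classes get larger weights
    return nn.CrossEntropyLoss(weight=weights)

# usage sketch:
# criterion = make_weighted_ce(train_labels, num_classes=9)   # 9 ISPRS categories
# loss = criterion(logits, targets)
```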
Citations: 0