
Latest publications from 2020 IEEE International Conference on Image Processing (ICIP)

Block-Size Dependent Overlapped Block Motion Compensation
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191338
Yoshitaka Kidani, Kei Kawamura, Kyohei Unno, S. Naito
Overlapped block motion compensation (OBMC) is one of the inter-prediction tools that improve coding performance. OBMC applied to the various non-square blocks of VVC, which is being standardized by the Joint Video Experts Team (JVET), has been studied as a way to improve coding performance over HEVC. Memory bandwidth, however, is a bottleneck when OBMC is used, and conventional methods have so far not achieved a good trade-off between coding performance and memory bandwidth. In this study, interpolation filters and applicability conditions for OBMC that depend on block size are proposed to achieve the best trade-off. The experimental results show a -0.40% BD-rate gain over the VVC test model 3 for random-access conditions under the JVET common test conditions.
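To make the boundary-blending idea concrete, here is a minimal sketch of OBMC with a block-size gate. The fade weights (JEM-style 1/4 to 1/32) and the `min_size` skip threshold are illustrative assumptions, not the interpolation filters or conditions proposed in the paper.

```python
import numpy as np

def obmc_blend(pred_cur, pred_above, pred_left, min_size=8):
    """Blend the current block's motion-compensated prediction with
    predictions formed using the above/left neighbours' motion vectors,
    with weights fading away from the boundary. The JEM-style weights
    and the small-block skip (to cap memory bandwidth) are illustrative
    assumptions, not the paper's proposed filters or conditions."""
    h, w = pred_cur.shape
    if h < min_size or w < min_size:
        return pred_cur                     # skip OBMC on small blocks
    out = pred_cur.astype(np.float32)
    wts = [1/4, 1/8, 1/16, 1/32]            # boundary rows/columns
    for i, t in enumerate(wts):             # mix in the above-MV prediction
        out[i, :] = (1 - t) * out[i, :] + t * pred_above[i, :]
    for j, t in enumerate(wts):             # mix in the left-MV prediction
        out[:, j] = (1 - t) * out[:, j] + t * pred_left[:, j]
    return np.round(out).astype(pred_cur.dtype)
```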
Citations: 0
Circular Shift: An Effective Data Augmentation Method For Convolutional Neural Network On Image Classification
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191303
Kailai Zhang, Zheng Cao, Ji Wu
In this paper, we present a novel and effective data augmentation method for convolutional neural networks (CNNs) on image classification tasks. CNN-based models such as VGG, ResNet and DenseNet have achieved great success on image classification. Common data augmentation methods such as rotation, cropping and flipping are routinely used with CNNs, especially when training data is scarce. However, in some cases, such as small images or objects with dispersed features, these methods have limitations and can even decrease classification performance. In such cases, an operation that carries lower risk is important for improving performance. Addressing this problem, we design a data augmentation method named circular shift, which provides variation for CNN-based models without losing much information. Three commonly used image datasets are chosen to evaluate the proposed operation, and the experimental results show consistent improvements across different CNN-based models. Moreover, our operation can be added to the current set of augmentation operations to achieve further performance improvement.
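Circular shift itself is straightforward to sketch: the image is rolled along each spatial axis so that pixels leaving one edge re-enter at the opposite edge, perturbing object position without discarding content. The shift range `max_frac` below is an assumed hyper-parameter, not a value from the paper.

```python
import numpy as np

def circular_shift(img, max_frac=0.25, rng=None):
    """Randomly roll an HxWxC image along both spatial axes; pixels that
    leave one edge wrap around to the opposite edge, so no content is
    discarded. max_frac (maximum shift as a fraction of image size) is
    an assumed hyper-parameter."""
    rng = rng if rng is not None else np.random.default_rng()
    h, w = img.shape[:2]
    dy = int(rng.integers(-int(h * max_frac), int(h * max_frac) + 1))
    dx = int(rng.integers(-int(w * max_frac), int(w * max_frac) + 1))
    return np.roll(img, shift=(dy, dx), axis=(0, 1))

# e.g. augmented = circular_shift(np.zeros((32, 32, 3), np.uint8))
```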
Citations: 14
An Image-based Method to Predict Surface Enhanced Raman Spectroscopy Sensor Quality
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190905
Yiming Zuo, Yang Lei, S. Barcelo
Image-based quality control is a powerful tool for non-destructive testing of product quality. Machine vision systems (MVS) often implement image-based machine learning algorithms in an attempt to match human-level accuracy in detecting product defects, for better efficiency and repeatability. Plasmonic sensors, such as those used in Surface Enhanced Raman Spectroscopy (SERS), present a unique challenge for image-based quality control because, in addition to obvious defects such as scratches and missing areas, subtle color changes can also indicate significant changes in sensor performance. As a further challenge, it is not straightforward even for a human expert to distinguish between high- and low-quality sensors based on these subtle color changes. In this paper we show that by extracting image features according to domain knowledge, we can build an image-based method that outperforms human expert prediction. This method enables automated non-destructive SERS sensor quality control and has been implemented successfully on our server.
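The pipeline the abstract implies (hand-crafted, domain-informed color features feeding a standard classifier) can be sketched as follows. The specific features here, per-channel statistics plus channel-ratio hue proxies, and the random-forest choice are assumptions for illustration; the abstract does not disclose the paper's actual feature set.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def color_features(img_rgb):
    """Hand-crafted colour features: per-channel mean/std plus two
    channel-ratio statistics as crude hue proxies, intended to pick up
    subtle colour shifts. These particular features are illustrative
    assumptions, not the paper's actual feature set."""
    img = img_rgb.astype(np.float32)
    feats = []
    for c in range(3):
        feats += [img[..., c].mean(), img[..., c].std()]
    eps = 1e-6
    feats.append((img[..., 0] / (img[..., 1] + eps)).mean())  # R/G ratio
    feats.append((img[..., 2] / (img[..., 1] + eps)).mean())  # B/G ratio
    return np.asarray(feats)

# Hypothetical training loop over labelled sensor images:
#   X = np.stack([color_features(im) for im in sensor_images])
#   clf = RandomForestClassifier(n_estimators=200).fit(X, quality_labels)
```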
Citations: 1
Attention Unet++: A Nested Attention-Aware U-Net for Liver CT Image Segmentation
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190761
Chen Li, Yusong Tan, W. Chen, Xin Luo, Yuanming Gao, Xiaogang Jia, Zhiying Wang
Liver cancer is one of the cancers with the highest mortality. To help doctors diagnose and treat liver lesions, an automatic liver segmentation model is urgently needed, because manual segmentation is time-consuming and error-prone. In this paper, we propose a nested attention-aware segmentation network named Attention UNet++. Our proposed method has a deeply supervised encoder-decoder architecture and a redesigned dense skip connection. Attention UNet++ introduces an attention mechanism between nested convolutional blocks so that features extracted at different levels can be merged with a task-related selection. Besides, owing to the introduction of deep supervision, the prediction speed of the pruned network is accelerated at the cost of a modest performance degradation. We evaluated the proposed model on the MICCAI 2017 Liver Tumor Segmentation (LiTS) Challenge dataset. Attention UNet++ achieved very competitive performance for liver segmentation.
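A minimal sketch of the kind of attention gating that can sit on a skip connection between nested blocks is shown below (Oktay-style additive attention; the authors' exact gate design, channel sizes, and resolution handling are not given in the abstract, so everything here is an assumption). The gate learns a per-pixel weight from the skip features and a gating signal, then rescales the skip features before they are merged.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Additive attention gate: skip features are rescaled by a learned
    per-pixel weight derived from the skip features and a gating signal.
    Assumes both inputs share a spatial resolution; channel sizes are
    illustrative, not the authors' exact design."""
    def __init__(self, ch_skip, ch_gate, ch_mid):
        super().__init__()
        self.w_x = nn.Conv2d(ch_skip, ch_mid, kernel_size=1)
        self.w_g = nn.Conv2d(ch_gate, ch_mid, kernel_size=1)
        self.psi = nn.Conv2d(ch_mid, 1, kernel_size=1)

    def forward(self, x_skip, g):
        a = torch.relu(self.w_x(x_skip) + self.w_g(g))
        alpha = torch.sigmoid(self.psi(a))   # per-pixel gate in (0, 1)
        return x_skip * alpha                # suppress task-irrelevant regions

# x = torch.randn(1, 64, 32, 32); g = torch.randn(1, 128, 32, 32)
# out = AttentionGate(64, 128, 32)(x, g)    # -> (1, 64, 32, 32)
```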
Citations: 69
Quaternion Harris For Multispectral Keypoint Detection
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191302
Giorgos Sfikas, D. Ioannidis, D. Tzovaras
We present a new keypoint detection method that generalizes Harris corners to multispectral images by treating the input as a quaternionic matrix. Standard keypoint detectors run on scalar-valued inputs, neglecting input multimodality and potentially missing highly distinctive features. The proposed detector uses information from all input channels by defining a quaternionic autocorrelation matrix that possesses quaternionic eigenvectors and real eigenvalues, and whose computation also takes channel cross-correlations into account. We have tested the proposed detector on a variety of multispectral images (color, near-infrared) and validated its usefulness.
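A key convenience is that a 2x2 quaternion-Hermitian autocorrelation matrix [[A, B], [B*, C]] with real A, C has real eigenvalues, so det = A*C - |B|^2 and the usual Harris response carry over. The sketch below embeds the three channels in the imaginary part of a quaternion and builds that matrix per pixel; the window size, the constant k, and the gradient operator are standard Harris assumptions rather than the paper's exact pipeline.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def qmul(a, b):
    """Hamilton product of quaternion arrays shaped (..., 4)."""
    w1, x1, y1, z1 = np.moveaxis(a, -1, 0)
    w2, x2, y2, z2 = np.moveaxis(b, -1, 0)
    return np.stack([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2], axis=-1)

def qconj(a):
    return a * np.array([1.0, -1.0, -1.0, -1.0], dtype=a.dtype)

def quaternion_harris(img3, win=5, k=0.04):
    """Harris response for a 3-channel image embedded in the imaginary
    part of a quaternion. The windowed matrix [[A, B], [B*, C]] with
    A, C real is quaternion-Hermitian, so det = A*C - |B|^2 is real;
    B carries the channel cross-correlations."""
    q = np.zeros(img3.shape[:2] + (4,), np.float32)
    q[..., 1:] = img3
    gx, gy = np.gradient(q, axis=1), np.gradient(q, axis=0)
    A = uniform_filter(qmul(gx, qconj(gx))[..., 0], win)   # |Gx|^2, windowed
    C = uniform_filter(qmul(gy, qconj(gy))[..., 0], win)   # |Gy|^2, windowed
    B = uniform_filter(qmul(gx, qconj(gy)), size=(win, win, 1))
    det = A * C - np.sum(B * B, axis=-1)                   # A*C - |B|^2
    return det - k * (A + C) ** 2
```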
Citations: 2
Segmentation Algorithm of the Valid Region in Fisheye Images Using Edge and Region Information
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191294
Tongxin Du, Bin Fang, Mingliang Zhou, Henjun Zhao, Weizhi Xian, X. Wu
In this paper, we propose a method to segment the valid region of fisheye images. First, we construct an objective function with three terms: a region driving term, an edge driving term and a length regularization term. Second, we minimize this objective function with a modified gradient descent method to find the best segmentation. Our method achieves valid-region segmentation by exploiting both region information and edge information. Experiments show that the proposed method can handle blurred edges, halation noise and incomplete valid regions.
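Since the abstract names the three energy terms but not their form, one plausible level-set instantiation (an assumption for illustration, combining a Chan-Vese-style region term, a geodesic edge term and a length penalty) is:

```latex
E(\phi) = \lambda_r \int_\Omega \left[ (I - c_1)^2 H(\phi) + (I - c_2)^2 \big(1 - H(\phi)\big) \right] dx
        + \lambda_e \int_\Omega g\!\left(|\nabla I|\right) \delta(\phi)\, |\nabla\phi| \, dx
        + \mu \int_\Omega \delta(\phi)\, |\nabla\phi| \, dx
```

Here H and δ are the Heaviside and Dirac functions and g is a decreasing edge-stopping function; gradient descent then evolves the level set via ∂φ/∂t = -∂E/∂φ until convergence.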
Citations: 1
Open-Set Metric Learning For Person Re-Identification In The Wild
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190744
Arindam Sikdar, Dibyadip Chatterjee, Arpan Bhowmik, A. Chowdhury
Person re-identification in the wild requires simultaneous (frame-wise) detection and re-identification of persons and has wide utility in practical scenarios. However, such tasks come with an additional open-set re-ID challenge, as probe persons may not necessarily be present in the (frame-wise) dynamic gallery. Traditional, closed-set re-ID systems are not equipped to handle such cases and consequently raise false alarms. To cope with these challenges, open-set metric learning (OSML), based on the concept of the large margin nearest neighbor (LMNN) approach, is proposed. We term our method Open-Set LMNN (OS-LMNN). The goal of separating impostor samples from genuine samples is achieved through a joint optimization of a Weibull distribution and the Mahalanobis metric learned through the OS-LMNN approach. Rejection is performed based on the low probability assigned to the distances of impostor pairs. Exhaustive experiments with other metric learning techniques on the publicly available PRW dataset clearly demonstrate the robustness of our approach.
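One way to read the Weibull-plus-Mahalanobis rejection is as an extreme-value model over impostor distances: fit a Weibull to the distances of known impostor pairs under the learned metric, then accept a probe-gallery match only if its distance falls where impostors are improbable. The sketch below is this reading under assumed conventions; the acceptance rule, threshold and fitting protocol are illustrative, not the paper's exact formulation.

```python
import numpy as np
from scipy.stats import weibull_min

def mahal(x, y, M):
    """Squared Mahalanobis distance under a learned PSD metric M."""
    d = np.asarray(x) - np.asarray(y)
    return float(d @ M @ d)

# Fit a Weibull tail model to distances of known impostor pairs
# (hypothetical arrays; the fitting protocol is an illustrative choice):
#   imp_d = np.array([mahal(a, b, M) for a, b in impostor_pairs])
#   shape, _, scale = weibull_min.fit(imp_d, floc=0.0)

def accept_match(probe, gallery_feat, M, shape, scale, thr=0.05):
    """Accept a probe-gallery match only if its distance is improbably
    small for an impostor pair: a low Weibull CDF value means the pair
    sits well below where impostor distances concentrate."""
    p_impostor = weibull_min.cdf(mahal(probe, gallery_feat, M),
                                 shape, loc=0.0, scale=scale)
    return p_impostor < thr
```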
Citations: 3
Super-Resolution by Image Enhancement Using Texture Transfer
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190844
Jose Jaena Mari Ople, Daniel Stanley Tan, A. Azcarraga, Chao-Lung Yang, K. Hua
Recent deep learning approaches to single-image super-resolution (SISR) can generate high-definition textures for super-resolved (SR) images. However, they tend to hallucinate fake textures and even produce artifacts. An alternative to SISR, reference-based SR (RefSR) approaches use high-resolution (HR) reference (Ref) images to provide HR details that are missing in the low-resolution (LR) input image. We propose a novel framework that leverages existing SISR approaches and enhances them with RefSR. Specifically, we refine the output of SISR methods using neural texture transfer, where HR features are queried from the Ref images. The query is conducted by computing the similarity of textural and semantic features between the input image and the Ref images. The HR features most similar, patch-wise, to the LR image are used to augment the SR image through an augmentation network. In the case of Ref images dissimilar to the LR input image, we prevent performance degradation by including the similarity scores in the input features of the network. Furthermore, we use random texture patches during training to condition our augmentation network not to always trust the queried texture features. Unlike past RefSR approaches, our method can use arbitrary Ref images, and its lower-bound performance is anchored to the SR image. We show that our method drastically improves the performance of the base SISR approach.
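The patch-wise query can be sketched as a dense correlation between LR features and unfolded Ref-feature patches, in the style of SRNTT/TTSR matching (the shapes and feature extraction are assumptions; this is not the authors' exact network). Normalizing the Ref patches makes the per-position argmax equivalent to a cosine-similarity match:

```python
import torch
import torch.nn.functional as F

def query_ref(feat_lr, feat_ref, patch=3):
    """Dense patch query: for each LR-feature position, find the most
    similar Ref-feature patch. Because the unfolded Ref patches are
    L2-normalised before being used as conv kernels, the per-position
    argmax equals a cosine-similarity match. Returns the index of the
    best Ref patch and its score (usable to down-weight poor references).
    Assumed shapes: feat_lr (1, C, H, W), feat_ref (1, C, Hr, Wr)."""
    k = F.unfold(feat_ref, patch, padding=patch // 2)        # (1, C*p*p, N)
    k = k.squeeze(0).t().reshape(-1, feat_lr.size(1), patch, patch)
    k = k / k.flatten(1).norm(dim=1).clamp_min(1e-6).view(-1, 1, 1, 1)
    sim = F.conv2d(feat_lr, k, padding=patch // 2)           # (1, N, H, W)
    score, idx = sim.max(dim=1)          # best-matching Ref patch per pixel
    return idx, score                    # HR Ref patches are gathered at idx
```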
Citations: 4
Semantic-Preserving Image Compression
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191247
Neel Patwa, Nilesh A. Ahuja, Srinivasa Somayazulu, Omesh Tickoo, S. Varadarajan, S. Koolagudi
Video traffic comprises a large majority of the total traffic on the internet today. Uncompressed visual data requires a very high data rate; lossy compression techniques are employed to keep the data rate manageable. Increasingly, a significant amount of the visual data being generated is consumed by analytics (such as classification, detection, etc.) residing in the cloud. Image and video compression can produce visual artifacts, especially at lower data rates, which can cause a significant drop in performance on such analytic tasks. Moreover, standard image and video compression techniques aim to optimize perceptual quality for human consumption by allocating more bits to perceptually significant features of the scene. However, these features are not necessarily the most suitable ones for semantic tasks. We present an approach to compressing visual data that maximizes performance on a given analytic task. We train a deep auto-encoder using a multi-task loss to learn the relevant embeddings. An approximate differentiable model of the quantizer is used during training, which helps boost accuracy during inference. We apply our approach to an image classification problem and show that, for a given level of compression, it achieves higher classification accuracy than classification performed on JPEG-compressed images. Our approach also outperforms the relevant state-of-the-art approach by a significant margin.
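Hard rounding has zero gradient almost everywhere, which is why a differentiable proxy is needed during training. A common stand-in (whether it matches the paper's exact approximation is an assumption) is the straight-through estimator: round in the forward pass, pass gradients through unchanged. The multi-task loss at the bottom uses illustrative names.

```python
import torch
import torch.nn as nn

class STEQuantizer(nn.Module):
    """Straight-through rounding: the forward pass emits round(z), while
    the backward pass treats the op as identity, so gradients flow to
    the encoder. One standard choice; the paper's approximate quantizer
    model may differ."""
    def forward(self, z):
        return z + (torch.round(z) - z).detach()

# Multi-task objective with illustrative names (encoder/decoder/task_head
# are hypothetical modules, lam a hypothetical weight):
#   z_hat = STEQuantizer()(encoder(x))
#   loss = F.mse_loss(decoder(z_hat), x) \
#        + lam * F.cross_entropy(task_head(z_hat), labels)
```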
Citations: 24
Deep Person Identification Using Spatiotemporal Facial Motion Amplification
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191281
K. Gkentsidis, Theodora Pistola, N. Mitianoudis, N. Boulgouris
We explore the capabilities of a new biometric trait based on information extracted through facial motion amplification. Unlike traditional facial biometric traits, the new biometric does not require the visibility of facial features, such as the eyes or nose, that are critical to common facial biometric algorithms. In this paper we propose the formation of a spatiotemporal facial blood-flow map constructed using small-motion amplification. Experiments show that the proposed approach provides significant discriminatory capacity across different training and testing days and can potentially be used in situations where traditional facial biometrics may not be applicable.
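Small-motion amplification is typically done Eulerian-style: band-pass each pixel's intensity over time and add the amplified band back. The sketch below assumes a frame stack of shape (T, H, W); the pass band (roughly heart-rate frequencies) and the gain alpha are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def amplify_motion(frames, fs, lo=0.8, hi=3.0, alpha=20.0):
    """Eulerian-style magnification over a (T, H, W) frame stack:
    temporally band-pass every pixel and add the amplified band back.
    The pass band and gain alpha are illustrative assumptions."""
    b, a = butter(2, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    band = filtfilt(b, a, frames.astype(np.float32), axis=0)
    return frames.astype(np.float32) + alpha * band
```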
Citations: 0