2020 IEEE International Conference on Image Processing (ICIP): Latest Papers

Channel-Grouping Based Patch Swap For Arbitrary Style Transfer
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190962
Yan Zhu, Yi Niu, Fu Li, Chunbo Zou, Guangming Shi
The basic principle of the patch-matching based style transfer is to substitute the patches of the content image feature maps by the closest patches from the style image feature maps. Since the finite features harvested from one single aesthetic style image are inadequate to represent the rich textures of the content natural image, existing techniques treat the full-channel style feature patches as simple signal tensors and create new style feature patches via signal-level fusion. In this paper, we propose a channel-grouping based patch swap technique to group the style feature maps into surface and texture channels, and the new features are created by the combination of these two groups, which can be regarded as a semantic-level fusion of the raw style features. Experimental results demonstrate that the proposed method outperforms the existing techniques in providing more style-consistent textures while keeping the content fidelity.
Citations: 1
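The basic patch-matching principle stated in the abstract above (replace each content feature patch with its closest style feature patch) can be illustrated with a minimal sketch. This is not the authors' method: the channel grouping into surface and texture channels is not shown, and the names `extract_patches` and `patch_swap`, the nested-list layout `feat[c][y][x]`, the cosine-style similarity, and the averaging of overlapping patches are all assumptions made for illustration.

```python
import math

def extract_patches(feat, k):
    """List every k x k patch (stride 1) of a feature map stored as feat[c][y][x]."""
    h, w = len(feat[0]), len(feat[0][0])
    patches = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patches.append([ch[i + di][j + dj]
                            for ch in feat
                            for di in range(k)
                            for dj in range(k)])
    return patches

def patch_swap(content, style, k=3):
    """Replace each content patch by its most similar style patch.

    Similarity is the inner product with the norm-scaled style patch
    (an assumed choice); overlapping pasted patches are averaged.
    """
    c, h, w = len(content), len(content[0]), len(content[0][0])
    style_patches = extract_patches(style, k)
    # Guard against all-zero style patches by falling back to a norm of 1.
    norms = [math.sqrt(sum(v * v for v in p)) or 1.0 for p in style_patches]
    content_patches = extract_patches(content, k)
    out = [[[0.0] * w for _ in range(h)] for _ in range(c)]
    count = [[0] * w for _ in range(h)]
    idx = 0
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            cp = content_patches[idx]
            idx += 1
            scores = [sum(a * b for a, b in zip(cp, sp)) / n
                      for sp, n in zip(style_patches, norms)]
            best = style_patches[scores.index(max(scores))]
            # Paste the chosen style patch at the content patch location.
            pos = 0
            for ch in range(c):
                for di in range(k):
                    for dj in range(k):
                        out[ch][i + di][j + dj] += best[pos]
                        pos += 1
            for di in range(k):
                for dj in range(k):
                    count[i + di][j + dj] += 1
    return [[[out[ch][y][x] / count[y][x] for x in range(w)]
             for y in range(h)] for ch in range(c)]
```

In this sketch, swapping a feature map against itself returns the same map whenever no two of its patches are parallel, which is a quick sanity check for the matching step.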
Tracking Hundreds of People in Densely Crowded Scenes With Particle Filtering Supervising Deep Convolutional Neural Networks
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190953
G. Franchi, Emanuel Aldea, Séverine Dubuisson, I. Bloch
Tracking an entire high-density crowd composed of more than five hundred individuals is a difficult task that has not yet been accomplished. In this article, we propose to track pedestrians using a model composed of a Particle Filter (PF) and three Deep Convolutional Neural Networks (DCNN). The first network is a detector that learns to localize the persons. The second one is a pretrained network that estimates the optical flow, and the last one corrects the flow. Our contribution resides in the way we train this last network by PF supervision, and in the Markov Random Field linking the different tracks.
Citations: 1
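The abstract above does not say how the particle filter combines the detector and optical-flow outputs, so the following is only a generic bootstrap particle filter step under stated assumptions: a one-dimensional position, an optical-flow displacement used as the motion prediction, and a detection used as the observation with a Gaussian likelihood. The function `pf_step` and all of its parameters are hypothetical; the PF supervision of the flow-correction network and the Markov Random Field are not shown.

```python
import math
import random

def pf_step(particles, flow, detection, motion_std=1.0, obs_std=2.0, rng=random):
    """One predict / update / resample step of a bootstrap particle filter (1-D)."""
    # Predict: move every particle by the flow displacement plus Gaussian noise.
    predicted = [p + flow + rng.gauss(0.0, motion_std) for p in particles]
    # Update: weight each particle by a Gaussian likelihood of the detection.
    weights = [math.exp(-0.5 * ((detection - p) / obs_std) ** 2) for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [wt / total for wt in weights]
    # Resample: draw particles with probability proportional to their weight.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    estimate = sum(resampled) / len(resampled)
    return resampled, estimate
```

A tracker built on this idea would call `pf_step` once per frame and per tracked person, feeding the returned particles into the next call.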
The Importance Of Skip Connections In Encoder-Decoder Architectures For Colorectal Polyp Detection
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191310
N. Mulliqi, Sule YAYILGAN YILDIRIM, A. Mohammed, L. Ahmedi, Hao Wang, Ogerta Elezaj, Ø. Hovde
Accurate polyp detection during the colonoscopy procedure impacts colorectal cancer prevention and early detection. In this paper, we investigate the influence of skip connections as the main component of encoder-decoder based convolutional neural network (CNN) architectures for colorectal polyp detection. We conduct experiments on long and short skip connections and further extend the existing architecture by introducing dense lateral skip connections. The proposed segmentation architecture utilizes short skip connections in the contracting path, and dense long and lateral skip connections between the contracting and expanding paths. Results obtained from the MICCAI 2015 Challenge dataset show progressive improvement of the segmentation result with expanded utilization of skip connections. The proposed colorectal polyp segmentation architecture achieves performance comparable to the state of the art with a significantly reduced number of model parameters.
Citations: 4
Complexity Analysis Of VVC Intra Coding
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190970
Mário Saldanha, G. Sanchez, C. Marcon, L. Agostini
Versatile Video Coding (VVC) is the next generation of video coding standards, which was developed to double the coding efficiency over its predecessor High-Efficiency Video Coding (HEVC). Several new coding tools have been investigated and adopted in the VVC Test Model (VTM), whose current version can improve the intra coding efficiency by 24% at the cost of a much higher coding complexity than the HEVC Test Model (HM). Thus, this paper provides a detailed VVC intra coding complexity analysis, which can support upcoming works for finding the most time-consuming tool that could be simplified to achieve a real-time encoder design.
Citations: 22
Cluster Kernel For Learning Similarities Between Symmetric Positive Definite Matrix Time Series
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191149
Sara Akodad, L. Bombrun, Y. Berthoumieu, C. Germain
The launch of the last generation of Earth observation satellites has led to a great improvement in the capabilities of acquiring Earth surface images, providing series of multitemporal images. To process these time series images, many machine learning algorithms have been proposed in the literature such as warping based methods and recurrent neural networks (LSTM,…). Recently, based on an ensemble learning approach, the time series cluster kernel (TCK) has been proposed and has shown competitive results compared to the state of the art. Unfortunately, it does not model the spectral/spatial dependencies. To overcome this problem, this paper aims at extending the TCK approach by modeling the time series of second-order statistical features (SO-TCK). Experiments are conducted on different benchmark datasets, and on land cover classification with remote sensing satellite time series over the Reunion Island.
Citations: 1
Dual Information-Based Background Model For Moving Object Detection
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190811
S. Roy, T. Bouwmans
In this article, a novel pixel-based object detection framework is proposed that leverages two types of pixel-level information to construct the background model. The first type of information is initially derived from intensity histograms over a training set of a few initial video frames; it is then formed by gathering all the minimum and maximum values of contiguous non-zero frequencies of the temporal intensity histogram. The second type of information constitutes a set having only the discrete pixel values. Subsequently, a pixel-level periodic updating scheme is used to make the model robust and flexible enough to recognize and detect foregrounds in various critical background environments. This dual-format model produces effective results compared with many state-of-the-art methods on a large variety of challenging real-life video sequences.
Citations: 1
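The two kinds of per-pixel information named in the abstract above can be sketched for a single pixel as follows. This is an illustrative reading, not the authors' implementation: it assumes 8-bit integer intensities, and the helper names `intensity_intervals`, `build_pixel_model`, and `in_intervals`, as well as the interval-membership test used as a background check, are hypothetical. The periodic updating scheme is not shown.

```python
def intensity_intervals(samples, levels=256):
    """Return (min, max) for every run of contiguous non-zero histogram bins."""
    hist = [0] * levels
    for v in samples:
        hist[v] += 1
    intervals, start = [], None
    for level, freq in enumerate(hist):
        if freq > 0 and start is None:
            start = level                       # a run of occupied bins begins
        elif freq == 0 and start is not None:
            intervals.append((start, level - 1))  # the run ended at the previous bin
            start = None
    if start is not None:
        intervals.append((start, levels - 1))   # the run reaches the last bin
    return intervals

def build_pixel_model(samples, levels=256):
    """Build both information types for one pixel from its training intensities."""
    return intensity_intervals(samples, levels), set(samples)

def in_intervals(value, intervals):
    """Assumed background test: does the value fall inside any stored interval?"""
    return any(lo <= value <= hi for lo, hi in intervals)
```

For example, the training intensities `[10, 11, 12, 50, 51, 200]` give the intervals `[(10, 12), (50, 51), (200, 200)]`, so a new value of 11 would match the model under this assumed test while a value of 100 would not.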
Non-Convex Optimization For Sparse Interferometric Phase Estimation
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191249
Satvik Chemudupati, P. Pokala, C. Seelamantula
We present a new sparsity-based technique for interferometric phase estimation. We consider complex extensions of non-convex regularizers such as the minimax concave penalty (MCP) and smoothly clipped absolute deviation penalty (SCAD) for sparse recovery. We solve the problem of interferometric phase estimation based on complex-domain dictionary learning. We develop an algorithm, namely, improved sparse interferometric phase estimation (iSpInPhase) based on alternating direction method of multipliers (ADMM) and Wirtinger calculus for solving the optimization problem. Wirtinger calculus is employed because the cost functions are nonholomorphic. We evaluate the performance of iSpInPhase on synthetic data, namely, truncated Gaussian elevation and also on mountain terrain data, namely, Long’s peak, for different noise levels. Performance comparisons show that iSpInPhase outperforms the state-of-the-art techniques in terms of standard performance assessment measures.
Citations: 0
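To make the minimax concave penalty mentioned in the abstract above concrete, here is a minimal sketch of the standard MCP and its proximal operator (firm thresholding, valid for `gamma > 1` with unit step size). Applying both to the magnitude of a complex number while keeping its phase is one natural complex extension and is an assumption here; the abstract does not specify the extension used in iSpInPhase. The function names are hypothetical.

```python
def mcp_penalty(t, lam, gamma):
    """Minimax concave penalty evaluated on the magnitude |t| (t may be complex)."""
    r = abs(t)
    if r <= gamma * lam:
        return lam * r - r * r / (2.0 * gamma)
    return 0.5 * gamma * lam * lam   # the penalty is constant beyond gamma * lam

def firm_threshold(z, lam, gamma):
    """Proximal operator of MCP: shrink the magnitude of z, keep its phase."""
    r = abs(z)
    if r <= lam:
        return 0 * z                 # small coefficients are set to zero
    if r <= gamma * lam:
        # Magnitude becomes gamma * (r - lam) / (gamma - 1); the phase z / r is kept.
        return z * (gamma * (r - lam)) / ((gamma - 1.0) * r)
    return z                         # large coefficients pass through unchanged
```

Unlike soft thresholding, this operator leaves large coefficients unbiased, which is the usual motivation for non-convex penalties in sparse recovery. In an ADMM scheme, a shrinkage of this kind would typically appear in the update of the sparse code variable.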
A Fast Lossless Implementation Of The Intra Subpartition Mode For VVC
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191103
Santiago De-Luxán-Hernández, Gayathri Venugopal, Valeri George, H. Schwarz, D. Marpe, T. Wiegand
Lossy compression is the main target of the upcoming video coding standard Versatile Video Coding (VVC). However, lossless coding is supported in VVC by utilizing a certain encoder configuration. Particularly, the Transform Skip Mode (TSM) is always selected at the block level to bypass the transform stage (together with a QP that results in the same output as input at the quantization stage). Consequently, the Intra Subpartition (ISP) coding mode cannot be used for lossless coding, considering that its combination with TSM is not supported in VVC because it does not provide a significant coding benefit for the lossy common test conditions. For this reason, it is proposed to enable such a combination for the benefit of lossless coding. Besides, the encoder search has been optimized to improve the trade-off between compression benefit and encoder run-time. Experimental results show a 0.71% coding gain with a corresponding encoder run-time of 111%.
Citations: 2
Identity-Invariant Facial Landmark Frontalization For Facial Expression Analysis
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190989
Vassilios Vonikakis, Stefan Winkler
We propose a frontalization technique for 2D facial landmarks, designed to aid in the analysis of facial expressions. It employs a new normalization strategy aiming to minimize identity variations, by displacing groups of facial landmarks to standardized locations. The technique operates directly on 2D landmark coordinates, does not require additional feature extraction and as such is computationally light. It achieves considerable improvement over a reference approach, justifying its use as an efficient preprocessing step for facial expression analysis based on geometric features.
Citations: 3
Feature Extraction For Visual Speaker Authentication Against Computer-Generated Video Attacks
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190976
Jun Ma, Shilin Wang, Aixin Zhang, Alan Wee-Chung Liew
Recent research shows that the lip feature can achieve reliable authentication performance with a good liveness detection ability. However, with the development of sophisticated human face generation methods based on deepfake technology, talking videos can be forged with high quality, and the static lip information is not reliable in such cases. To meet this challenge, in this paper, we propose a new deep neural network structure to extract robust lip features against human and Computer-Generated (CG) imposters. Two novel network units, i.e. the feature-level Difference block (Diffblock) and the pixel-level Dynamic Response block (DRblock), are proposed to reduce the influence of the static lip information and to represent the dynamic talking habit information. Experiments on the GRID dataset have demonstrated that the proposed network can extract discriminative and robust lip features and outperform two state-of-the-art visual speaker authentication approaches in both human imposter and CG imposter scenarios.
Citations: 5