
Latest publications from the 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)

Icon Colorization Based On Triple Conditional Generative Adversarial Networks
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301890
Qin-Ru Han, Wenzhe Zhu, Qing Zhu
Current automatic colorization systems suffer from defects such as contour blur, color overflow and color clutter, especially when coloring images with a hollowed-out structure. We propose a model based on triple conditional generative adversarial networks. The generator takes a contour image, a colored reference icon and a colorization mask as inputs, and the network has three discriminators: a structure discriminator trained to judge whether the generated icon has a contour similar to the input icon, a color discriminator that encourages the generated icon to share the color style of the reference icon, and a mask discriminator that distinguishes whether the colorized area of the output matches the input mask. For evaluation, we compared our model with several existing colorization models and also used a questionnaire to collect assessments of the icons generated by different models. The results show that our colorization model obtains better results than the other models in generating both hollowed-out and solid structure icons.
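A minimal PyTorch sketch of the three-discriminator setup described above, assuming simple patch-style discriminators and binary cross-entropy adversarial losses; the module shapes and loss weighting are illustrative assumptions, not the authors' released implementation.

```python
# Triple conditional GAN: one discriminator per condition (structure, color, mask).
import torch
import torch.nn as nn

class SmallDisc(nn.Module):
    """A tiny patch-style discriminator reused for all three conditions."""
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=1, padding=1),
        )
    def forward(self, x):
        return self.net(x)

d_structure = SmallDisc(in_ch=3 + 1)   # generated icon + contour map
d_color     = SmallDisc(in_ch=3 + 3)   # generated icon + colored reference icon
d_mask      = SmallDisc(in_ch=3 + 1)   # generated icon + colorization mask

bce = nn.BCEWithLogitsLoss()

def discriminator_scores(fake_icon, contour, ref_icon, mask):
    """Concatenate the generated icon with each condition and score it."""
    s = d_structure(torch.cat([fake_icon, contour], dim=1))
    c = d_color(torch.cat([fake_icon, ref_icon], dim=1))
    m = d_mask(torch.cat([fake_icon, mask], dim=1))
    return s, c, m

def generator_adv_loss(fake_icon, contour, ref_icon, mask):
    """Generator tries to fool all three discriminators (equal weights assumed)."""
    s, c, m = discriminator_scores(fake_icon, contour, ref_icon, mask)
    ones = lambda t: torch.ones_like(t)
    return bce(s, ones(s)) + bce(c, ones(c)) + bce(m, ones(m))
```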
Citations: 2
Deep Inter Coding with Interpolated Reference Frame for Hierarchical Coding Structure
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301769
Yu Guo, Zizheng Liu, Zhenzhong Chen, Shan Liu
In the hybrid video coding framework, inter prediction is an efficient tool to exploit temporal redundancy. Since the performance of inter prediction depends on the content of the reference frames, coding efficiency can be significantly improved by providing more effective reference frames. In this paper, we propose an enhanced inter coding scheme that generates artificial reference frames with a deep neural network. Specifically, a new reference frame is interpolated from previously reconstructed frames on both sides, which can be regarded as a prediction of the to-be-coded frame. The synthesized frame is merged into the reference picture list for motion estimation to further decrease the prediction residual. We integrate the proposed method into HM-16.20 under the random access configuration. Experimental results show that the proposed method significantly boosts coding performance, providing a 4.6% BD-rate reduction on average compared to the HEVC baseline.
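As a rough illustration of the scheme, the sketch below interpolates a synthetic reference frame from the two reconstructed frames on either side of the to-be-coded frame and merges it into the reference list used for motion estimation; the toy interpolation CNN and frame sizes are assumptions, not the trained network used in the paper.

```python
# Synthesize an extra reference frame and append it to the reference picture list.
import torch
import torch.nn as nn

class FrameInterpolator(nn.Module):
    """Toy CNN that predicts the middle frame from two reconstructed frames."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )
    def forward(self, prev_rec, next_rec):
        return self.net(torch.cat([prev_rec, next_rec], dim=1))

def build_reference_list(prev_rec, next_rec, interpolator):
    """Reference list for the to-be-coded frame in a hierarchical-B structure."""
    synthetic = interpolator(prev_rec, next_rec)   # prediction of the current frame
    return [prev_rec, next_rec, synthetic]         # merged list for motion estimation

# Usage with dummy 64x64 frames:
interp = FrameInterpolator()
f0 = torch.rand(1, 3, 64, 64)
f1 = torch.rand(1, 3, 64, 64)
refs = build_reference_list(f0, f1, interp)
```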
Citations: 3
Improving Compression Artifact Reduction via End-to-End Learning of Side Information
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301805
Haichuan Ma, Dong Liu, Feng Wu
We propose to improve neural network-based compression artifact reduction by transmitting side information for the neural network. The side information consists of artifact descriptors that are obtained by analyzing the original and compressed images in the encoder. In the decoder, the received descriptors are used as additional input to a well-designed conditional post-processing neural network. To reduce the transmission overhead, the entire model is optimized under the rate-distortion constraint via end-to-end learning. Experimental results show that introducing the side information greatly improves the ability of the post-processing neural network, and improves the rate-distortion performance.
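The sketch below illustrates, under assumed layer sizes, the split described above: an encoder-side analysis module condenses the original and compressed images into a small artifact descriptor, and a decoder-side post-processing network is conditioned on that transmitted descriptor.

```python
# Encoder derives side information; decoder uses it to condition the post-filter.
import torch
import torch.nn as nn

class DescriptorEncoder(nn.Module):
    """Analyzes original + compressed images into a compact descriptor."""
    def __init__(self, dim=8):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(6, 16, 3, padding=1), nn.ReLU(),
                                  nn.AdaptiveAvgPool2d(1))
        self.fc = nn.Linear(16, dim)
    def forward(self, original, compressed):
        h = self.conv(torch.cat([original, compressed], dim=1)).flatten(1)
        return self.fc(h)

class ConditionalPostFilter(nn.Module):
    """Restores the compressed image, modulated by the received descriptor."""
    def __init__(self, dim=8):
        super().__init__()
        self.embed = nn.Linear(dim, 16)
        self.body = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.out = nn.Conv2d(16, 3, 3, padding=1)
    def forward(self, compressed, descriptor):
        scale = self.embed(descriptor).unsqueeze(-1).unsqueeze(-1)    # channel-wise modulation
        return compressed + self.out(self.body(compressed) * scale)   # residual restoration

enc, dec = DescriptorEncoder(), ConditionalPostFilter()
orig = torch.rand(1, 3, 64, 64); comp = torch.rand(1, 3, 64, 64)
restored = dec(comp, enc(orig, comp))   # end-to-end trainable under a rate-distortion loss
```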
Citations: 3
An Optimized Video Encoder Implementation with Screen Content Coding Tools
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301875
Xiaozhong Xu, Shitao Wang, Yu Chen, Yiming Li, Qing Zhang, Yushan Zheng, Shan Liu
Screen content video applications require efficient coding of computer-generated material. New screen content coding tools such as intra block copy (IBC) and palette mode (PLT) address this requirement. However, the computational complexity they add on top of already sophisticated video encoders is also challenging. In this paper, we focus on fast and efficient encoder implementations of these screen content coding tools. Improvements to hash-based IBC search, PLT optimization, the mode decision between PLT and intra mode, and other general encoder accelerations for screen content applications are studied and discussed. Experimental results show that with these methods the encoder runs faster than before, while the compression efficiency is almost doubled by the screen content coding tools.
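A minimal sketch of hash-based intra block copy search, one of the accelerations mentioned above: already-reconstructed blocks of the current picture are indexed by a content hash so that an identical block can be found by lookup rather than exhaustive spatial search. The block size and hash function are simplifying assumptions, not the paper's exact design.

```python
# Hash-based IBC search over previously coded blocks of the current picture.
import numpy as np

BLOCK = 8

def block_hash(block: np.ndarray) -> int:
    """Cheap content hash of an 8x8 block (stand-in for CRC-style hashes)."""
    return hash(block.astype(np.uint8).tobytes())

def build_hash_table(frame: np.ndarray, coded_up_to_row: int) -> dict:
    """Index all already-reconstructed 8x8 blocks by their hash."""
    table = {}
    for y in range(0, coded_up_to_row - BLOCK + 1):
        for x in range(0, frame.shape[1] - BLOCK + 1):
            table.setdefault(block_hash(frame[y:y+BLOCK, x:x+BLOCK]), (y, x))
    return table

def ibc_search(frame: np.ndarray, table: dict, y: int, x: int):
    """Return a block vector (dy, dx) to an identical previously coded block, if any."""
    key = block_hash(frame[y:y+BLOCK, x:x+BLOCK])
    if key in table:
        ry, rx = table[key]
        return ry - y, rx - x
    return None

# Usage on a synthetic screen-content-like frame with repeated patterns:
frame = np.tile(np.random.randint(0, 255, (8, 8), dtype=np.uint8), (8, 8))
table = build_hash_table(frame, coded_up_to_row=32)
print(ibc_search(frame, table, y=40, x=16))   # finds a match in the coded region
```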
Citations: 1
A Mixed Appearance-based and Coding Distortion-based CNN Fusion Approach for In-loop Filtering in Video Coding
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301895
Jian Yue, Yanbo Gao, Shuai Li, Menghu Jia
With the success of convolutional neural networks (CNNs) in image denoising and other computer vision tasks, CNNs have been investigated for in-loop filtering in video coding. Many existing methods directly use CNNs as powerful filtering tools without much analysis of their effect. Considering that in-loop filters process reconstructed video frames produced by a fixed pipeline of video coding operations, the coding distortion in the reconstructed frames may share properties that CNNs can learn, beyond the frames simply being noisy images. Therefore, in this paper we first categorize CNN-based filtering into two types of processes, appearance-based CNN filtering and coding distortion-based CNN filtering, and develop a two-stream CNN fusion framework accordingly. In the appearance-based stream, a CNN treats the reconstructed frame as a distorted image and extracts global appearance information to restore the original image. To extract this global information, a CNN with pooling is used first to increase the receptive field, and up-sampling is added at a late stage to produce pixel-level frame information. In contrast, in the coding distortion-based stream, a CNN treats the reconstructed frame as blocks with certain types of distortion, focusing on local information to learn the coding distortion introduced by the fixed video coding pipeline. Finally, the appearance-based and coding distortion-based filtering streams are fused to combine the two aspects of CNN filtering as well as global and local information. To further reduce complexity, the initial and final convolutional layers are shared between the two streams, yielding a mixed CNN. Experiments demonstrate that the proposed method achieves better performance than existing CNN-based filtering methods, with an 11.26% BD-rate saving under the All Intra configuration.
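A minimal sketch of the two-stream layout described above, assuming toy layer counts: the appearance stream downsamples and upsamples to capture global context, the distortion stream stays at full resolution for local coding artifacts, and shared head/tail convolutions fuse the two into a residual in-loop filter.

```python
# Two-stream in-loop filter with shared initial and final convolutions.
import torch
import torch.nn as nn

class AppearanceStream(nn.Module):
    """Global appearance: downsample to enlarge the receptive field, then upsample."""
    def __init__(self):
        super().__init__()
        self.down = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.up = nn.Sequential(nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
                                nn.Conv2d(32, 16, 3, padding=1), nn.ReLU())
    def forward(self, x):
        return self.up(self.down(x))

class DistortionStream(nn.Module):
    """Local coding distortion: full-resolution convolutions only."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
                                  nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
    def forward(self, x):
        return self.body(x)

class TwoStreamInLoopFilter(nn.Module):
    """Shared head/tail convolutions with the two streams fused in between."""
    def __init__(self):
        super().__init__()
        self.head = nn.Conv2d(1, 16, 3, padding=1)   # shared initial layer
        self.appearance = AppearanceStream()
        self.distortion = DistortionStream()
        self.tail = nn.Conv2d(32, 1, 3, padding=1)   # shared final layer
    def forward(self, rec):
        feat = torch.relu(self.head(rec))
        fused = torch.cat([self.appearance(feat), self.distortion(feat)], dim=1)
        return rec + self.tail(fused)                # residual filtering of the luma frame

filtered = TwoStreamInLoopFilter()(torch.rand(1, 1, 64, 64))
```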
Citations: 3
APL: Adaptive Preloading of Short Video with Lyapunov Optimization
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301886
Haodan Zhang, Yixuan Ban, Xinggong Zhang, Zongming Guo, Zhimin Xu, Shengbin Meng, Junlin Li, Yue Wang
Short video applications, like TikTok, have attracted many users across the world. They feed short videos based on users' preferences and allow users to slide away from boring content anywhere and anytime. To reduce loading time and keep playback smooth, most short video apps preload the recommended short videos in advance. However, these apps preload short videos in a fixed size and a fixed order, which can lead to severe playback stalls and large bandwidth waste. To deal with these problems, we present an Adaptive Preloading mechanism for short videos based on Lyapunov optimization, called APL, to achieve a near-optimal playback experience, i.e., maximizing playback smoothness and minimizing bandwidth waste while accounting for users' sliding behaviors. Specifically, we make three technical contributions: (1) we design a novel short video streaming framework that can dynamically preload the recommended short videos before the current video has been completely downloaded; (2) we formulate the preloading problem as a playback experience optimization problem that maximizes playback smoothness and minimizes bandwidth waste; (3) we transform the playback experience optimization over the whole viewing process into a single-step greedy algorithm based on Lyapunov optimization theory to make online decisions during playback. Through extensive experiments on real datasets generously provided by TikTok, we demonstrate that APL can reduce the stall ratio by 81%/12% and bandwidth waste by 11%/31% compared with the no-preloading/fixed-preloading mechanisms.
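A minimal sketch of a single-step greedy decision in the spirit of the Lyapunov drift-plus-penalty formulation above: at each step the preloader picks the video whose next chunk best trades buffer urgency (smoothness) against the bandwidth expected to be wasted if the user slides away. The queue definition, target buffer, slide-probability model and weight V are illustrative assumptions, not the paper's exact formulation.

```python
# Drift-plus-penalty style choice of which video's next chunk to preload.
def choose_preload_target(buffers, slide_prob, chunk_size, V=1.0):
    """
    buffers: dict video_id -> seconds currently buffered (smoothness queues)
    slide_prob: dict video_id -> probability the user slides away before watching it
    chunk_size: seconds of playback added by downloading one chunk
    Returns the video whose next chunk minimizes drift + V * expected wasted bandwidth.
    """
    best_video, best_cost = None, float("inf")
    for vid, b in buffers.items():
        # Drift term: downloading shrinks the buffer deficit, so a small buffer
        # (far from the assumed 10 s target) makes this video more urgent.
        drift = -max(0.0, 10.0 - b) * chunk_size
        # Penalty term: bandwidth is wasted if the user slides away before playback.
        penalty = V * slide_prob[vid] * chunk_size
        cost = drift + penalty
        if cost < best_cost:
            best_video, best_cost = vid, cost
    return best_video

# Usage: prints "current" -- the low-buffer current video is preloaded first,
# because the next recommended video is likely to be skipped anyway.
print(choose_preload_target(buffers={"current": 2.0, "next": 0.0},
                            slide_prob={"current": 0.0, "next": 0.9},
                            chunk_size=1.0, V=5.0))
```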
Citations: 10
A Hybrid Model for Natural Face De-Identification with Adjustable Privacy
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301866
Yunqian Wen, Bo Liu, Rong Xie, Yunhui Zhu, Jingyi Cao, Li Song
As more and more personal photos are shared and tagged on social media, security and privacy protection are becoming an unprecedented focus of attention. Avoiding privacy risks such as unintended verification becomes increasingly challenging. To enable people to upload photos without having to consider these privacy concerns, it is crucial to study techniques that allow individuals to limit the identity information leaked in visual data. In this paper, we propose a novel hybrid model consisting of two stages that generates visually pleasing de-identified face images from a single input. Meanwhile, we preserve visual similarity with the original face to retain data usability. Our approach combines recent advances in GAN-based face generation with well-designed adjustable randomness. In our experiments we show visually pleasing de-identified output of our method that preserves a high similarity to the original image content. Moreover, our method adapts well to verifiers of unknown structure, which further improves its practical value in real life.
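A minimal sketch of the adjustable-privacy idea, under the assumption that identity information lives in a latent code: the face embedding is mixed with noise whose weight is set by a privacy parameter before a generator synthesizes the de-identified face. The toy encoder and generator below are placeholders, not the paper's two-stage model.

```python
# De-identification by mixing a face embedding with noise controlled by a privacy level.
import torch
import torch.nn as nn

class ToyFaceEncoder(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, dim))
    def forward(self, x):
        return self.net(x)

class ToyFaceGenerator(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.fc = nn.Linear(dim, 3 * 64 * 64)
    def forward(self, z):
        return torch.sigmoid(self.fc(z)).view(-1, 3, 64, 64)

def de_identify(image, encoder, generator, privacy_level=0.5):
    """privacy_level in [0, 1]: 0 keeps the identity code, 1 fully randomizes it."""
    z = encoder(image)
    noise = torch.randn_like(z)
    z_mixed = (1.0 - privacy_level) * z + privacy_level * noise   # adjustable randomness
    return generator(z_mixed)

out = de_identify(torch.rand(1, 3, 64, 64), ToyFaceEncoder(), ToyFaceGenerator(), 0.7)
```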
Citations: 5
Quality of Experience Evaluation for Streaming Video Using CGNN
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301799
Zhiming Zhou, Yu Dong, Li Song, Rong Xie, Lin Li, Bing Zhou
A principal contradiction in the video field today lies between the booming demand for evaluating streaming video quality and the low precision of Quality of Experience (QoE) prediction results. In this paper, we propose the Convolutional Neural Network and Gated Recurrent Unit (CGNN)-QoE model, a deep learning QoE model that can accurately predict overall and continuous scores of video streaming services in real time. We further implement state-of-the-art models on the basis of their published works and compare them with our method on six publicly available datasets. In all considered scenarios, CGNN-QoE outperforms existing methods.
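A minimal sketch of a CNN-plus-GRU QoE predictor in the spirit of the CGNN model: per-chunk streaming features pass through 1-D convolutions, a GRU integrates them over time, and a linear head outputs a continuous per-step score plus an overall session score. The feature set and layer sizes are assumptions, not the paper's exact architecture.

```python
# CNN + GRU model producing continuous and overall QoE scores.
import torch
import torch.nn as nn

class CGNNQoE(nn.Module):
    def __init__(self, n_features=4, hidden=32):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv1d(n_features, hidden, 3, padding=1), nn.ReLU(),
                                  nn.Conv1d(hidden, hidden, 3, padding=1), nn.ReLU())
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        # x: (batch, time, features), e.g. bitrate, stall duration, quality switch, buffer level
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.gru(h)
        continuous = self.head(h).squeeze(-1)   # per-time-step QoE score
        overall = continuous.mean(dim=1)        # session-level QoE score
        return continuous, overall

cont, overall = CGNNQoE()(torch.rand(2, 30, 4))   # 2 sessions, 30 chunks, 4 features each
```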
Citations: 3
Application of Brain-Computer Interface and Virtual Reality in Advancing Cultural Experience
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301801
Hao-Lun Fu, Po-Hsiang Fang, Chan-Yu Chi, Chung-ting Kuo, Meng-Hsuan Liu, Howard Muchen Hsu, Cheng-Hsun Hsieh, Sheng-Fu Liang, S. Hsieh, Cheng-Ta Yang
Virtual reality (VR), a computer-generated interactive environment, is provided to a user by projecting a peripheral image onto environmental surfaces. VR has the advantage of enhancing the immersive experience. Nowadays, VR has been widely applied in tourism and cultural experience. On the other hand, the recent integration of electroencephalography-based (EEG-based) brain-computer interfaces (BCIs) and VR is capable of promoting the immersive virtual experience. Therefore, our study aims to propose an integrative framework that implements an EEG-based BCI in a VR game to advance the cultural experience. A room escape game set in a Tainan temple is created. EEG signals are recorded while users are playing the game, and online analyses of the EEG signals are used to interact with the VR display. This integrative framework can result in a better experience than the conventional setup.
Citations: 3
The Hough-Based Multibeamlet Transform
Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301812
A. Lisowska
There are plenty of geometrical multiresolution transforms devoted to efficient edge representation. However, they have two drawbacks. The first is that such transforms represent single-edge (mono edge) models. The second is that they are often based on approximations that are optimal in terms of the Mean Square Error, which does not necessarily lead to optimal edge approximation. In this paper, a multibeamlet transform based on the Hough transform is proposed. This transform is defined to properly detect multiedges present in images. Next, a method of image approximation using the multibeamlet transform is described. Additionally, a modified bottom-up tree pruning algorithm is presented in order to properly approximate images with multibeamlets. The experiments show that this approach leads to image approximations of better quality than the state-of-the-art geometrical multiresolution transforms.
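A minimal sketch of the underlying mechanism, detecting several straight edges ("multiedges") inside one image block with a Hough accumulator; the voting scheme and the way multiple peaks are picked are simplifying assumptions, not the paper's exact multibeamlet construction.

```python
# Hough voting to recover the parameters of several edges inside one binary block.
import numpy as np

def hough_top_lines(block: np.ndarray, n_lines=2, n_theta=90):
    """Return (rho, theta) of the n_lines strongest lines in a binary block."""
    h, w = block.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * diag + 1, n_theta))
    ys, xs = np.nonzero(block)                       # edge pixels cast votes
    for y, x in zip(ys, xs):
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int) + diag
        acc[rhos, np.arange(n_theta)] += 1
    lines = []
    for _ in range(n_lines):                         # pick peaks one by one
        r, t = np.unravel_index(np.argmax(acc), acc.shape)
        lines.append((r - diag, thetas[t]))
        acc[max(0, r - 2):r + 3, :] = 0              # crude non-maximum suppression
    return lines

# Usage: a block containing two edges, one horizontal and one vertical.
block = np.zeros((32, 32), dtype=np.uint8)
block[16, :] = 1      # horizontal edge at y = 16
block[:, 8] = 1       # vertical edge at x = 8
print(hough_top_lines(block, n_lines=2))
```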
Citations: 0