
Latest Publications: 2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA)

Using Adaptive Trackers for Video Face Recognition from a Single Sample Per Person
Francis Charette Migneault, Eric Granger, F. Mokhayeri
Still-to-video face recognition (FR) is an important function in many video surveillance applications, allowing target individuals of interest to be recognized as they appear across a distributed network of cameras. Systems for still-to-video FR match faces captured in videos under challenging conditions against facial models that are often based on a single reference still per individual. To improve robustness to intra-class variations, an adaptive visual tracker is used to learn a diversified face trajectory model for each person appearing in the scene. These appearance models are updated along a trajectory and matched against the reference gallery stills of each individual enrolled in the system. Matching scores per individual are thereby accumulated over successive frames for robust spatio-temporal recognition. In a specific implementation, face trajectory models learned with a STRUCK tracker are compared to reference stills using an ensemble of SVMs per individual, trained a priori to discriminate target reference faces (in gallery stills) from non-target faces (in videos from the operational domain). To represent common pose and illumination variations, domain-specific face synthesis is employed to augment the number of reference stills. Experimental results obtained with this implementation on the Chokepoint video dataset indicate that the proposed system maintains a level of accuracy comparable to state-of-the-art systems at a lower computational complexity.
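To make the score-accumulation step concrete, the sketch below shows a minimal moving-average accumulator over per-frame, per-individual matching scores; the `frame_scores` input format and the window size are illustrative assumptions, not details from the paper.

```python
import numpy as np

def accumulate_scores(frame_scores, window=30):
    """Accumulate per-individual matching scores over successive
    frames of a face trajectory (simple moving average).

    frame_scores: iterable of 1-D arrays, one score per enrolled
    individual per frame (hypothetical input format).
    """
    history = []
    for scores in frame_scores:
        history.append(np.asarray(scores, dtype=float))
        if len(history) > window:
            history.pop(0)
        # averaged score per individual over the trajectory so far
        yield np.mean(history, axis=0)

# toy usage: 3 enrolled individuals, 5 frames of noisy scores
rng = np.random.default_rng(0)
frames = [rng.random(3) for _ in range(5)]
for acc in accumulate_scores(frames):
    print(acc.argmax(), acc.max())
```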
{"title":"Using Adaptive Trackers for Video Face Recognition from a Single Sample Per Person","authors":"Francis Charette Migneault, Eric Granger, F. Mokhayeri","doi":"10.1109/IPTA.2018.8608163","DOIUrl":"https://doi.org/10.1109/IPTA.2018.8608163","url":null,"abstract":"Still-to-video face recognition (FR) is an important function in many video surveillance applications, allowing to recognize target individuals of interest appearing over a distributed network of cameras. Systems for still-to-video FR match faces captured in videos under challenging conditions against facial models, often based on a single reference still per individual. To improve robustness to intra-class variations, an adaptive visual tracker is considered for learning of a diversified face trajectory model for each person appearing in the scene. These appearance models are updated along a trajectory, and matched against the reference gallery stills of each individual enrolled to the system. Matching scores per individual are thereby accumulated over successive frames for robust spatio-temporal recognition. In a specific implementation, face trajectory models learned with a STRUCK tracker are compared to reference stills using an ensemble of SVMs per individual that are trained a priori to discriminate target reference faces (in gallery stills) versus non-target faces (in videos from the operational domain). To represent common pose and illumination variations, domain-specific face synthesis is employed to augment the number of reference stills. Experimental results obtained with this implementation on the Chokepoint video dataset indicate that the proposed system can maintain a comparably high level of accuracy versus state-of-the-art systems, yet requires a lower complexity.","PeriodicalId":272294,"journal":{"name":"2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134297720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FACE - Face At Classroom Environment: Dataset and Exploration
Oscar Karnalim, Setia Budi, Sulaeman Santoso, E. Handoyo, Hapnes Toba, Huyen Nguyen, Vishv M. Malhotra
The rapid development of face detection research has been greatly supported by the availability of large image datasets that provide detailed annotations of faces in images. However, to the best of our knowledge, none of the publicly accessible datasets were created specifically for academic applications. In this paper, we propose a systematic method for building an image dataset tailored to the classroom environment. We also make our dataset and its exploratory analyses publicly available. Studies in computer vision for academic applications, such as automated student attendance systems, would benefit from our dataset.
{"title":"FACE - Face At Classroom Environment: Dataset and Exploration","authors":"Oscar Karnalim, Setia Budi, Sulaeman Santoso, E. Handoyo, Hapnes Toba, Huyen Nguyen, Vishv M. Malhotra","doi":"10.1109/IPTA.2018.8608166","DOIUrl":"https://doi.org/10.1109/IPTA.2018.8608166","url":null,"abstract":"The rapid development in face detection study has been greatly supported by the availability of large image datasets, which provide detailed annotations of faces on images. However, among a number of publicly accessible datasets, to our best knowledge, none of them are specifically created for academic applications. In this paper, we propose a systematic method in forming an image dataset tailored for classroom environment. We also made our dataset and its exploratory analyses publicly available. Studies in computer vision for academic application, such as an automated student attendance system, would benefit from our dataset.","PeriodicalId":272294,"journal":{"name":"2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129508980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
On the use of contextual information for robust colour-based particle filter tracking
Jingjing Xiao, M. Oussalah
Color-based particle filters have emerged as an appealing method for target tracking. As the target may undergo rapid and significant appearance changes, the template (i.e., the scale of the target and its color distribution histogram) also needs to be updated. Traditional updates that ignore contextual information carry a high risk of distorting the model and losing the target. This paper puts forward a new algorithm that uses environmental information to update both the scale of the tracker and the reference appearance model for object tracking in video sequences. The proposal builds on well-established color-based particle filter tracking while differentiating foreground from background particles according to their matching scores. A roaming phenomenon that causes the estimate to shrink and diverge is also investigated. The proposed solution is tested on publicly available benchmark datasets in a comparison with six state-of-the-art trackers. The results demonstrate the feasibility of the proposal and lay the foundations for further research on complex tracking problems.
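As an illustration of the color-based likelihood such trackers build on, here is a minimal sketch of weighting candidate particles by the Bhattacharyya similarity between their color histograms and a reference model. The histogram binning and kernel width are assumptions; the paper's foreground/background differentiation would sit on top of matching scores like these.

```python
import numpy as np

def color_histogram(patch, bins=8):
    """Per-channel color histogram of an RGB patch, L1-normalized."""
    hist = [np.histogram(patch[..., c], bins=bins, range=(0, 256))[0]
            for c in range(3)]
    hist = np.concatenate(hist).astype(float)
    return hist / (hist.sum() + 1e-12)

def particle_weights(patches, ref_hist, sigma=0.2):
    """Weight each particle by the Bhattacharyya similarity between
    its color histogram and the reference appearance model."""
    weights = []
    for patch in patches:
        bc = np.sum(np.sqrt(color_histogram(patch) * ref_hist))
        d = np.sqrt(max(1.0 - bc, 0.0))          # Bhattacharyya distance
        weights.append(np.exp(-d**2 / (2 * sigma**2)))
    w = np.array(weights)
    return w / w.sum()

# toy usage: darker random candidates plus the reference patch itself,
# which should receive the largest normalized weight
rng = np.random.default_rng(1)
ref = rng.integers(0, 256, (32, 32, 3))
cands = [rng.integers(0, 128, (32, 32, 3)) for _ in range(4)] + [ref]
print(particle_weights(cands, color_histogram(ref)))
```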
{"title":"On the use of contextual information for robust colour-based particle filter tracking","authors":"Jingjing Xiao, M. Oussalah","doi":"10.1109/IPTA.2018.8608147","DOIUrl":"https://doi.org/10.1109/IPTA.2018.8608147","url":null,"abstract":"Color-based particle filters have emerged as an appealing method for targets tracking. As the target may undergo rapid and significant appearance changes, the template (i.e. scale of the target, color distribution histogram) also needs to be updated. Traditional updates without learning contextual information may imply a high risk of distorting the model and losing the target. In this paper, a new algorithm utilizing the environmental information to update both the scale of the tracker and the reference appearance model for the purpose of object tracking in video sequences has been put forward. The proposal makes use of the well-established color-based particle filter tracking while differentiating the foreground and background particles according to their matching score. A roaming phenomenon that yields the estimation to shrink and diverge is investigated. The proposed solution is tested using publicly available benchmark datasets where a comparison with six state-of-the-art trackers has been carried out. The results demonstrate the feasibility of the proposal and lie down foundations for further research of complex tracking problems.","PeriodicalId":272294,"journal":{"name":"2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128359265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Detection proposal method based on shallow feature constraints
Hao Chen, Hong Zheng, Ying Deng
Rapid detection of small or non-salient attacking objects is the dominant technical concern in the prevention of airport bird strikes. Based on how the observed object changes as it approaches from far to near, a novel detection proposal method based on shallow feature constraints (ShallowF) is proposed. Specifically, the object is located approximately by means of feature points, narrowing search spaces, reducing the number of sampling frames, and improving the efficiency of detection proposals. Sampling rules are then specified by connected domains and feature points, further narrowing search spaces and reducing the number of sampling frames. Finally, based on the difference between the target contour and the background, the structured edge groups in the bounding boxes are extracted as the scoring basis for target detection, before testing and validation on the COCO Bird Dataset [1] and the VOC2007 Dataset [2]. Compared with the most advanced detection proposal methods, this method improves the accuracy of candidate bounding boxes while reducing their quantity.
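The edge-based scoring idea can be illustrated with a much simpler proxy: score each candidate box by the density of edge energy it encloses. The sketch below uses a plain gradient-magnitude edge map rather than structured edge groups, so it is only a stand-in for the paper's scoring basis, not its actual method.

```python
import numpy as np

def edge_magnitude(gray):
    """Central-difference gradient magnitude as a cheap edge map."""
    gx = np.zeros_like(gray, dtype=float)
    gy = np.zeros_like(gray, dtype=float)
    gx[:, 1:-1] = gray[:, 2:] - gray[:, :-2]
    gy[1:-1, :] = gray[2:, :] - gray[:-2, :]
    return np.hypot(gx, gy)

def score_boxes(gray, boxes):
    """Score each (x0, y0, x1, y1) box by the density of edge energy
    it encloses -- a crude stand-in for structured-edge-group scoring."""
    edges = edge_magnitude(gray.astype(float))
    scores = []
    for x0, y0, x1, y1 in boxes:
        region = edges[y0:y1, x0:x1]
        scores.append(region.mean() if region.size else 0.0)
    return np.array(scores)

# toy usage: the box containing the vertical edge scores higher
img = np.zeros((64, 64))
img[:, 32:] = 1.0                      # one strong vertical edge
print(score_boxes(img, [(0, 0, 24, 24), (16, 16, 48, 48)]))
```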
{"title":"Detection proposal method based on shallow feature constraints","authors":"Hao Chen, Hong Zheng, Ying Deng","doi":"10.1109/IPTA.2018.8608148","DOIUrl":"https://doi.org/10.1109/IPTA.2018.8608148","url":null,"abstract":"Rapid detection of small or non-salient attacking objects constitutes the dominant technical concern for prevention of airport bird strike. According to changes of the object observed from far to near, a novel detection proposal method based on shallow feature constraints (ShallowF) is thus proposed. Specifically, the object is located approximately by virtue of feature points, narrowing search spaces, reducing the number of sampling frames, and improving the efficiency of detection proposals. Then sampling rules are specified by connected domains and feature points, further narrowing search spaces and reducing the number of sampling frames. Finally, based on the difference between the target contour and the background, the structured edge group in the bounding boxes is extracted as the scoring basis for target detection before test and validation on the COCO Bird Dataset [1] and the VOC2007 Dataset [2]. Compared with the most advanced detection proposal methods, this method can improve the accuracy of candidate bounding boxes while reducing their quantity.","PeriodicalId":272294,"journal":{"name":"2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123341377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
MLANs: Image Aesthetic Assessment via Multi-Layer Aggregation Networks
Xuantong Meng, Fei Gao, Shengjie Shi, Suguo Zhu, Jingjie Zhu
Image aesthetic assessment aims at computationally evaluating the quality of images based on artistic perception. Although existing deep learning based approaches have achieved promising performance, they typically use only the high-level features of convolutional neural networks (CNNs) for aesthetic prediction. However, low-level and intermediate-level features are also highly correlated with image aesthetics. In this paper, we propose to use multi-level features from a CNN to learn effective image aesthetic assessment models. Specifically, we extract features from multiple layers and then aggregate them to predict an image aesthetic score. To evaluate its effectiveness, we build three multi-layer aggregation networks (MLANs) based on different baseline networks: MobileNet, VGG16, and Inception-v3, respectively. Experimental results show that aggregating multi-layer features consistently and considerably improves performance. Moreover, MLANs show significant superiority over the previous state of the art in the aesthetic score prediction task.
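A minimal sketch of the aggregation idea follows: pool feature maps from several depths of a CNN and feed the concatenated vector to a regression head. The tiny three-stage backbone here is a placeholder assumption; the paper's MLANs pool layers of MobileNet, VGG16, or Inception-v3.

```python
import torch
import torch.nn as nn

class MultiLayerAggregationNet(nn.Module):
    """Toy version of multi-layer aggregation: globally pool features
    from several depths and regress a single aesthetic score."""
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, 2, 1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(16 + 32 + 64, 1)   # aggregated features

    def forward(self, x):
        f1 = self.stage1(x)          # low-level features
        f2 = self.stage2(f1)         # intermediate-level features
        f3 = self.stage3(f2)         # high-level features
        feats = [self.pool(f).flatten(1) for f in (f1, f2, f3)]
        return self.head(torch.cat(feats, dim=1))  # aesthetic score

model = MultiLayerAggregationNet()
print(model(torch.randn(2, 3, 224, 224)).shape)  # torch.Size([2, 1])
```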
{"title":"MLANs: Image Aesthetic Assessment via Multi-Layer Aggregation Networks","authors":"Xuantong Meng, Fei Gao, Shengjie Shi, Suguo Zhu, Jingjie Zhu","doi":"10.1109/IPTA.2018.8608132","DOIUrl":"https://doi.org/10.1109/IPTA.2018.8608132","url":null,"abstract":"Image aesthetic assessment aims at computationally evaluating the quality of images based on artistic perceptions. Although existing deep learning based approaches have obtained promising performance, they typically use the high-level features in the convolutional neural networks (CNNs) for aesthetic prediction. However, low-level and intermediate-level features are also highly correlated with image aesthetic. In this paper, we propose to use multi-level features from a CNN for learning effective image aesthetic assessment models. Specially, we extract features from multi-layers and then aggregate them for predicting a image aesthetic score. To evaluate its effectiveness, we build three multilayer aggregation networks (MLANs) based on different baseline networks, including MobileNet, VGG16, and Inception-v3, respectively. Experimental results show that aggregating multilayer features consistently and considerably achieved improved performance. Besides, MLANs show significant superiority over previous state-of-the-art in the aesthetic score prediction task.","PeriodicalId":272294,"journal":{"name":"2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132355768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Extracting Painted Pottery Pattern Information Based on Deep Learning
Jinye Peng, Kai Yu, Jun Wang, Qunxi Zhang, Cheng Liu, L. Wang
This paper proposes a method that can effectively recover pattern information from painted pottery. The first step is to image the pottery using hyperspectral imaging techniques. The Minimum Noise Fraction (MNF) transform is then used to reduce the dimensionality of the hyperspectral image and obtain the principal component image. Next, we propose a pattern extraction method based on deep learning, the topic of this paper, to further enhance the process and yield more complete pattern information. Lastly, the pattern information image is fused with a true colour image using an improved sparse representation and detail injection fusion method, producing an image that includes both the pattern and colour information of the painted pottery. The experimental results confirm that this process effectively extracts the pattern information from painted pottery.
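For readers unfamiliar with the MNF step, the sketch below shows a simplified version: estimate the noise covariance from neighboring-pixel differences, whiten the data by it, and diagonalize. The noise estimator and regularization here are assumptions; production MNF implementations differ in detail.

```python
import numpy as np

def mnf(cube, n_components=3):
    """Simplified Minimum Noise Fraction transform.

    cube: (rows, cols, bands) hyperspectral image.
    Returns the first n_components MNF bands as (rows, cols, k).
    """
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands).astype(float)
    X -= X.mean(axis=0)
    # crude noise estimate: differences of horizontal neighbors
    noise = (cube[:, 1:, :] - cube[:, :-1, :]).reshape(-1, bands)
    Cn = np.cov(noise, rowvar=False) + 1e-9 * np.eye(bands)
    # whiten by the noise covariance, then diagonalize the signal
    L = np.linalg.cholesky(Cn)
    Xw = X @ np.linalg.inv(L).T
    _, _, Vt = np.linalg.svd(Xw, full_matrices=False)
    comps = Xw @ Vt[:n_components].T
    return comps.reshape(rows, cols, n_components)

# toy usage on a random 20-band cube
cube = np.random.default_rng(3).random((40, 40, 20))
print(mnf(cube).shape)  # (40, 40, 3)
```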
{"title":"Extracting Painted Pottery Pattern Information Based on Deep Learning","authors":"Jinye Peng, Kai Yu, Jun Wang, Qunxi Zhang, Cheng Liu, L. Wang","doi":"10.1109/IPTA.2018.8608139","DOIUrl":"https://doi.org/10.1109/IPTA.2018.8608139","url":null,"abstract":"This paper proposes a method that can effectively recover pattern information from painted pottery. The first step is to create an image of the pottery using hyperspectral imaging techniques. The Minimum Noise Fraction transform (MNF) is then used to reduce the dimensionality of the hyperspectral image to obtain the principal component image. Next, we propose a pattern extraction method based on deep learning, the topic of this paper, to further enhance the process resulting in more complete pattern information. Lastly, the pattern information image is fused with a true colour image using the improved sparse representation and detail injection fusion method to obtain an image that includes both the pattern and colour information of the painted pottery. The experimental results we observed confirm this process effectively extracts the pattern information from painted pottery.","PeriodicalId":272294,"journal":{"name":"2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122558778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
[Title page]
{"title":"[Title page]","authors":"","doi":"10.1109/ipta.2018.8608137","DOIUrl":"https://doi.org/10.1109/ipta.2018.8608137","url":null,"abstract":"","PeriodicalId":272294,"journal":{"name":"2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"28 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115496328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Developing and Validating a Predictive Model of Measurement Uncertainty for Multi-Beam LiDARs: Application to the Velodyne VLP-16
Q. Péntek, T. Allouis, O. Strauss, C. Fiorio
A key feature of multi-sensor fusion is the ability to associate with each measured value an estimate of its uncertainty. We aim to develop a point-to-pixel association based on UAV-borne LiDAR point clouds and conventional camera data to build digital elevation models in which each 3D point is associated with a color. In this paper, we propose a convenient uncertainty prediction model dedicated to multi-beam LiDAR systems, with new consideration given to the footprints emitted by the laser diode stack. We supplement this proposition with a novel reference-free method for evaluating the model. This evaluation method aims at validating the LiDAR uncertainty prediction model and estimating its resolving power. It is based on two criteria: one for consistency, the other for specificity. We apply this method to the multi-beam Velodyne VLP-16 LiDAR. The sensor's prediction model satisfies the consistency criterion but, as expected, not the specificity criterion. It returns coherently pessimistic predictions with a resolving power upper-bounded by 2 cm at a distance of 5 m.
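A common way to build such a per-point uncertainty model is to propagate range and beam-angle standard deviations through the Jacobian of the spherical-to-Cartesian mapping. The sketch below does exactly that with illustrative noise figures, not the paper's calibrated VLP-16 values.

```python
import numpy as np

def point_covariance(r, azim, elev, sigma_r=0.03,
                     sigma_a=0.003, sigma_e=0.003):
    """Propagate range/azimuth/elevation std devs (m, rad) to a 3x3
    Cartesian covariance via the Jacobian of the spherical-to-XYZ map.
    Noise values are illustrative, not VLP-16 calibration figures."""
    ca, sa = np.cos(azim), np.sin(azim)
    ce, se = np.cos(elev), np.sin(elev)
    # x = r*ce*ca, y = r*ce*sa, z = r*se
    J = np.array([[ce * ca, -r * ce * sa, -r * se * ca],
                  [ce * sa,  r * ce * ca, -r * se * sa],
                  [se,       0.0,          r * ce]])
    S = np.diag([sigma_r**2, sigma_a**2, sigma_e**2])
    return J @ S @ J.T

# per-axis standard deviation of a point measured at 5 m
C = point_covariance(5.0, 0.3, 0.1)
print(np.sqrt(np.diag(C)))
```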
{"title":"DEVELOPING AND VALIDATING A PREDICTIVE MODEL OF MEASUREMENT UNCERTAINTY FOR MULTI-BEAM LIDARS: APPLICATION TO THE VELODYNE VLP-16","authors":"Q. Péntek, T. Allouis, O. Strauss, C. Fiorio","doi":"10.1109/IPTA.2018.8608146","DOIUrl":"https://doi.org/10.1109/IPTA.2018.8608146","url":null,"abstract":"A key feature for multi-sensor fusion is the ability to associate, to each measured value, an estimate of its uncertainty. We aim at developing a point-to-pixel association based on UAV-borne LiDAR point cloud and conventional camera data to build digital elevation models where each 3D point is associated to a color. In this paper, we propose a convenient uncertainty prediction model dedicated to multi-beam LiDAR systems with a new consideration on laser diode stack emitted footprints. We supplement this proposition by a novel reference-free evaluation method of this model. This evaluation method aims at validating the LiDAR uncertainty prediction model and estimating its resolving power. It is based on two criteria: one for consistency, the other for specificity. We apply this method to the multi-beam Velodyne VLP-16 LiDAR. The sensor’s prediction model validates the consistency criterion but, as expected, not the specificity criterion. It returns coherently pessimistic prediction with a resolving power upper bounded by 2 cm at a distance of 5 m.","PeriodicalId":272294,"journal":{"name":"2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130222385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
An experimental investigation on self adaptive facial recognition algorithms using a long time span data set
G. Orrú, G. Marcialis, F. Roli
Nowadays, facial authentication systems are present in many everyday devices. Their performance is influenced by the appearance of the facial traits, which change with many factors such as lighting, pose, variation over time, and occlusion. Adaptive systems follow these variations by updating themselves with images acquired during system operation. Although the literature proposes many possible approaches, their evaluation is often left to datasets not explicitly conceived to simulate a real application scenario. The absence of an appropriate and objective evaluation set is probably why adaptive systems are rarely implemented in real devices. This paper presents a facial dataset acquired from videos on the YouTube platform. The collected images are particularly suitable for evaluating adaptive systems, as they contain many changes over the time sequence. A set of experiments with the most representative self-adaptive approaches recently reported in the literature is also performed and discussed. They provide some initial insights into the pros and cons of adaptive facial authentication systems by considering a medium-to-long-term window of the investigated systems' performance.
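The self-update rule at the core of most adaptive FR systems is easy to sketch: add a new sample to a user's template gallery only when it matches confidently. The cosine-similarity scoring, acceptance threshold, and oldest-first eviction below are generic assumptions rather than any specific algorithm from the paper.

```python
import numpy as np

def self_update(gallery, embedding, accept=0.75, capacity=10):
    """Self-training update rule common to adaptive FR systems: add
    the new face embedding to the user's template gallery when its
    best match score is confidently above the acceptance threshold.
    Embeddings are unit-normalized; cosine similarity is the score."""
    scores = gallery @ embedding
    if scores.max() >= accept:
        gallery = np.vstack([gallery, embedding])
        if len(gallery) > capacity:          # drop the oldest template
            gallery = gallery[1:]
    return gallery, scores.max()

# toy usage: a probe close to an enrolled template gets absorbed
rng = np.random.default_rng(4)
g = rng.normal(size=(3, 128))
g /= np.linalg.norm(g, axis=1, keepdims=True)
probe = g[0] + 0.1 * rng.normal(size=128)
probe /= np.linalg.norm(probe)
g, s = self_update(g, probe)
print(len(g), round(float(s), 3))   # gallery grows from 3 to 4
```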
{"title":"An experimental investigation on self adaptive facial recognition algorithms using a long time span data set","authors":"G. Orrú, G. Marcialis, F. Roli","doi":"10.1109/IPTA.2018.8608134","DOIUrl":"https://doi.org/10.1109/IPTA.2018.8608134","url":null,"abstract":"Nowadays, facial authentication systems are present in many daily life devices. Their performance is influenced by the appearance of the facial trait that changes according to many factors such as lighting, pose, variations over time and obstructions. Adaptive systems follow these variations by updating themselves through images acquired during system operations. Although the literature proposes many possible approaches, their evaluation is often left to data set not explicitly conceived to simulate a real application scenario. The substantial absence of an appropriate and objective evaluation set is probably the motivation of the lack of implementation of adaptive systems in real devices. This paper presents a facial dataset acquired by videos in the YouTube platform. The collected images are particularly suitable for evaluating adaptive systems as they contain many changes during the time-sequence. A set of experiments of the most representative self adaptive approaches recently appeared in the literature is also performed and discussed. They allow to give some initial insights about pros and cons of facial adaptive authentication systems by considering a medium-long term time window of the investigated systems performance.","PeriodicalId":272294,"journal":{"name":"2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123149541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Unsupervised Facial Image De-occlusion with Optimized Deep Generative Models
Lei Xu, Honglei Zhang, Jenni Raitoharju, M. Gabbouj
In recent years, Generative Adversarial Networks (GANs) and various types of Auto-Encoders (AEs) have gained attention for facial image de-occlusion and in-painting tasks. In this paper, we propose a novel unsupervised technique that removes occlusions from facial images and completes the occluded parts simultaneously, using optimized Deep Convolutional Generative Adversarial Networks (DCGANs) in an iterative way. Generally, GANs, as generative models, can estimate the distribution of images using a generator and a discriminator; DCGANs, a variant, were proposed to overcome their instability during training. Existing facial image in-painting methods manually define a block of pixels as the missing part, and the potential content of this block is semantically generated using generative models such as GANs or AEs. In our method, a mask is inferred from an occluded facial image using a novel loss function, and this mask is then used to in-paint the occlusions automatically with pre-trained DCGANs. We evaluate the performance of our method on facial images with various occlusions, such as sunglasses and scarves. The experiments demonstrate that our method can effectively detect certain kinds of occlusions and complete the occluded parts in an unsupervised manner.
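The in-painting step can be sketched as latent-code optimization against a pre-trained generator: minimize reconstruction error on the visible pixels only, then composite the generated content into the masked region. The toy generator below is untrained and the occlusion mask is given rather than inferred, so this only illustrates the mechanism, not the paper's mask-inference loss.

```python
import torch
import torch.nn as nn

# Toy stand-in for a pre-trained DCGAN generator (weights random here;
# in the paper's setting this would be trained on face images).
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(),
                  nn.Linear(256, 32 * 32 * 3), nn.Tanh())

def inpaint(occluded, mask, steps=200, lr=0.05):
    """Optimize a latent code so the generator output matches the
    *unoccluded* pixels; the generated image fills the masked region.
    occluded, mask: (1, 3, 32, 32) tensors, mask=1 on visible pixels."""
    z = torch.zeros(1, 64, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        gen = G(z).view(1, 3, 32, 32)
        # contextual loss on visible pixels only (prior term omitted)
        loss = ((gen - occluded) * mask).pow(2).mean()
        loss.backward()
        opt.step()
    with torch.no_grad():
        gen = G(z).view(1, 3, 32, 32)
    return mask * occluded + (1 - mask) * gen   # composite result

# toy usage: square occlusion in the center of a random "face"
img = torch.rand(1, 3, 32, 32) * 2 - 1
mask = torch.ones(1, 3, 32, 32)
mask[..., 8:24, 8:24] = 0
print(inpaint(img, mask).shape)
```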
{"title":"Unsupervised Facial Image De-occlusion with Optimized Deep Generative Models","authors":"Lei Xu, Honglei Zhang, Jenni Raitoharju, M. Gabbouj","doi":"10.1109/IPTA.2018.8608127","DOIUrl":"https://doi.org/10.1109/IPTA.2018.8608127","url":null,"abstract":"In recent years, Generative Adversarial Networks (GANs) or various types of Auto-Encoders (AEs) have gained attention on facial image de-occlusion and/or in-painting tasks. In this paper, we propose a novel unsupervised technique to remove occlusion from facial images and complete the occluded parts simultaneously with optimized Deep Convolutional Generative Adversarial Networks (DCGANs) in an iterative way. Generally, GANs, as generative models, can estimate the distribution of images using a generator and a discriminator. DCGANs, as its variant, are proposed to conquer its instability during training. Existing facial image in-painting methods manually define a block of pixels as the missing part and the potential content of this block is semantically generated using generative models, such as GANs or AEs. In our method, a mask is inferred from an occluded facial image using a novel loss function, and then this mask is utilized to in-paint the occlusions automatically by pre-trained DCGANs. We evaluate the performance of our method on facial images with various occlusions, such as sunglasses and scarves. The experiments demonstrate that our method can effectively detect certain kinds of occlusions and complete the occluded parts in an unsupervised manner.","PeriodicalId":272294,"journal":{"name":"2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"661 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127589149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6