Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191338
Yoshitaka Kidani, Kei Kawamura, Kyohei Unno, S. Naito
Overlapped block motion compensation (OBMC) is an inter prediction tool that improves coding performance. OBMC applied to various non-square blocks has been studied for VVC, which is being standardized by the Joint Video Experts Team (JVET), to improve coding performance over HEVC. Memory bandwidth, however, is a bottleneck when OBMC is used, and conventional methods have not yet achieved a good trade-off between coding performance and memory bandwidth. In this study, interpolation filters and applicability conditions for OBMC that depend on block size are proposed to achieve the best trade-off. The experimental results show a BD-rate of -0.40% (a coding gain) relative to VVC test model 3 for the random access configuration under the JVET common test conditions.
Title : Block-Size Dependent Overlapped Block Motion Compensation
Venue : 2020 IEEE International Conference on Image Processing (ICIP)
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191303
Kailai Zhang, Zheng Cao, Ji Wu
In this paper, we present a novel and effective data augmentation method for convolutional neural networks (CNNs) on image classification tasks. CNN-based models such as VGG, ResNet and DenseNet have achieved great success on image classification tasks. Common data augmentation methods such as rotation, cropping and flipping are widely used for CNNs, especially when training data are scarce. However, in some cases, such as small images or objects with dispersed features, these methods have limitations and can even decrease classification performance. In such cases, a lower-risk operation is important for improving performance. To address this problem, we design a data augmentation method named circular shift, which provides variation for CNN-based models without losing much information. Three commonly used image datasets are chosen to evaluate the proposed operation, and the experimental results show consistent improvement across different CNN-based models. Moreover, our operation can be added to an existing set of augmentation operations to achieve further performance improvement.
Title : Circular Shift: An Effective Data Augmentation Method For Convolutional Neural Network On Image Classification
Venue : 2020 IEEE International Conference on Image Processing (ICIP)
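The circular shift named in the abstract above can be illustrated with a minimal sketch. It assumes the operation is a cyclic shift along both image axes, so pixels pushed off one edge re-enter at the opposite edge and no pixel value is discarded. The function name `circular_shift` and the offsets `dy` and `dx` are illustrative; the abstract does not state how the shift amounts are chosen.

```python
def circular_shift(image, dy, dx):
    """Cyclically shift a 2-D image (a list of rows) down by dy and right by dx."""
    height = len(image)
    width = len(image[0])
    # Reduce the offsets so that any integer (including negatives) is valid.
    dy %= height
    dx %= width
    # Rotate the row order: rows pushed off the bottom re-enter at the top.
    rows = image[height - dy:] + image[:height - dy]
    # Rotate each row: pixels pushed off the right re-enter on the left.
    return [row[width - dx:] + row[:width - dx] for row in rows]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
shifted = circular_shift(image, dy=1, dx=1)
# shifted == [[9, 7, 8], [3, 1, 2], [6, 4, 5]]
```

Every pixel of the input survives in the output, which is one way to read the claim that the operation "does not lose too much information"; cropping, by contrast, discards the pixels outside the crop window.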
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190905
Yiming Zuo, Yang Lei, S. Barcelo
Image-based quality control is a powerful tool for nondestructive testing of product quality. Machine vision systems (MVS) often implement image-based machine learning algorithms in an attempt to match human-level accuracy in detecting product defects, with better efficiency and repeatability. Plasmonic sensors, such as those used in Surface Enhanced Raman Spectroscopy (SERS), present a unique challenge for image-based quality control because, in addition to obvious defects such as scratches and missing areas, subtle color changes can also indicate significant changes in sensor performance. As a further challenge, it is not straightforward even for a human expert to distinguish between high- and low-quality sensors based on these subtle color changes. In this paper we show that by extracting image features according to domain knowledge, we can build an image-based method that outperforms human expert prediction. This method enables automated nondestructive SERS sensor quality control and has been implemented successfully on our server.
Title : An Image-based Method to Predict Surface Enhanced Raman Spectroscopy Sensor Quality
Venue : 2020 IEEE International Conference on Image Processing (ICIP)
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190761
Chen Li, Yusong Tan, W. Chen, Xin Luo, Yuanming Gao, Xiaogang Jia, Zhiying Wang
Liver cancer is one of the cancers with the highest mortality. To help doctors diagnose and treat liver lesions, an automatic liver segmentation model is urgently needed because manual segmentation is time-consuming and error-prone. In this paper, we propose a nested attention-aware segmentation network named Attention UNet++. The proposed method has a deeply supervised encoder-decoder architecture and redesigned dense skip connections. Attention UNet++ introduces an attention mechanism between nested convolutional blocks so that the features extracted at different levels can be merged with a task-related selection. In addition, thanks to deep supervision, the network can be pruned to accelerate prediction at the cost of modest performance degradation. We evaluated the proposed model on the MICCAI 2017 Liver Tumor Segmentation (LiTS) Challenge dataset. Attention UNet++ achieved very competitive performance for liver segmentation.
Title : Attention Unet++: A Nested Attention-Aware U-Net for Liver CT Image Segmentation
Venue : 2020 IEEE International Conference on Image Processing (ICIP)
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191302
Giorgos Sfikas, D. Ioannidis, D. Tzovaras
We present a new keypoint detection method that generalizes Harris corners for multispectral images by considering the input as a quaternionic matrix. Standard keypoint detectors run on scalar-valued inputs, neglecting input multimodality and potentially missing highly distinctive features. The proposed detector uses information from all channel inputs by defining a quaternionic autocorrelation matrix that possesses quaternionic eigenvectors and real eigenvalues, for the computation of which channel cross-correlations are also taken into account. We have tested the proposed detector on a variety of multispectral images (color, near-infrared), where we have validated its usefulness.
Title : Quaternion Harris For Multispectral Keypoint Detection
Venue : 2020 IEEE International Conference on Image Processing (ICIP)
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191294
Tongxin Du, Bin Fang, Mingliang Zhou, Henjun Zhao, Weizhi Xian, X. Wu
In this paper, we propose a method to segment the valid region of fisheye images. First, we construct an objective function with three terms, which are the region driving term, the edge driving term and the length regularization term. Second, we minimize this objective function by a modified gradient descent method to find the best segmentation result. Our method can achieve valid region segmentation by making use of both region information and edge information. Experiments show that the proposed method can deal with blurred edges, halation noise and incomplete valid region problems.
Title : Segmentation Algorithm of the Valid Region in Fisheye Images Using Edge and Region Information
Venue : 2020 IEEE International Conference on Image Processing (ICIP)
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190744
Arindam Sikdar, Dibyadip Chatterjee, Arpan Bhowmik, A. Chowdhury
Person re-identification in the wild needs to simultaneously (frame-wise) detect and re-identify persons and has wide utility in practical scenarios. However, such tasks come with an additional open-set re-ID challenge, as not all probe persons are necessarily present in the (frame-wise) dynamic gallery. Traditional, closed-set re-ID systems are not equipped to handle such cases and raise several false alarms as a result. To cope with such challenges, open-set metric learning (OSML), based on the large margin nearest neighbor (LMNN) approach, is proposed. We term our method Open-Set LMNN (OS-LMNN). The goal of separating impostor samples from genuine samples is achieved through a joint optimization of the Weibull distribution and the Mahalanobis metric learned through this OS-LMNN approach. Rejection is performed based on low probability over the distance of impostor pairs. Exhaustive experiments against other metric learning techniques on the publicly available PRW dataset clearly demonstrate the robustness of our approach.
Title : Open-Set Metric Learning For Person Re-Identification In The Wild
Venue : 2020 IEEE International Conference on Image Processing (ICIP)
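The two ingredients named in the open-set re-ID abstract above, a Mahalanobis metric and a Weibull distribution over impostor-pair distances, can be sketched as follows. This is only one plausible reading, not the authors' procedure: the matrix `M`, the Weibull `shape` and `scale` parameters, the `threshold`, and the decision rule in `accept_match` are all hypothetical. The rule assumes the Weibull has already been fitted to impostor-pair distances, so a small CDF value means the probe is closer to the gallery item than almost all impostor pairs.

```python
import math


def mahalanobis_distance(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M."""
    diff = [a - b for a, b in zip(x, y)]
    # Compute M @ diff row by row, then take the inner product with diff.
    m_diff = [sum(m_ij * d_j for m_ij, d_j in zip(row, diff)) for row in M]
    return math.sqrt(sum(d * md for d, md in zip(diff, m_diff)))


def weibull_cdf(d, shape, scale):
    """Weibull CDF, 1 - exp(-(d / scale) ** shape), for d >= 0."""
    return 1.0 - math.exp(-((d / scale) ** shape))


def accept_match(probe, gallery_item, M, shape, scale, threshold):
    """Hypothetical rule: accept only if few impostor pairs would be this close."""
    d = mahalanobis_distance(probe, gallery_item, M)
    return weibull_cdf(d, shape, scale) < threshold


identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for a learned metric
near = accept_match([0.0, 0.0], [0.1, 0.0], identity, shape=2.0, scale=1.0, threshold=0.05)
far = accept_match([0.0, 0.0], [3.0, 4.0], identity, shape=2.0, scale=1.0, threshold=0.05)
```

With the identity matrix the metric reduces to Euclidean distance. In the paper the metric and the Weibull parameters are optimized jointly, which this sketch does not attempt.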
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190844
Jose Jaena Mari Ople, Daniel Stanley Tan, A. Azcarraga, Chao-Lung Yang, K. Hua
Recent deep learning approaches to single image super-resolution (SISR) can generate high-definition textures for super-resolved (SR) images. However, they tend to hallucinate fake textures and even produce artifacts. As an alternative to SISR, reference-based SR (RefSR) approaches use high-resolution (HR) reference (Ref) images to provide HR details that are missing in the low-resolution (LR) input image. We propose a novel framework that leverages existing SISR approaches and enhances them with RefSR. Specifically, we refine the output of SISR methods using neural texture transfer, where HR features are queried from the Ref images. The query is conducted by computing the similarity of textural and semantic features between the input image and the Ref images. The HR features most similar, patch-wise, to the LR image are used to augment the SR image through an augmentation network. When the Ref images are dissimilar to the LR input image, we prevent performance degradation by including the similarity scores in the input features of the network. Furthermore, we use random texture patches during training to condition our augmentation network not to always trust the queried texture features. Unlike past RefSR approaches, our method can use arbitrary Ref images, and its lower-bound performance is that of the SR image. We show that our method drastically improves the performance of the base SISR approach.
Title : Super-Resolution by Image Enhancement Using Texture Transfer
Venue : 2020 IEEE International Conference on Image Processing (ICIP)
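The patch-wise query described in the super-resolution abstract above can be sketched as a nearest-neighbor search over feature vectors. The sketch assumes each patch is already represented by a feature vector and uses cosine similarity as the similarity measure; the abstract does not specify the measure, so this choice, along with the names `cosine_similarity` and `query_reference`, is an assumption. The score is returned alongside the match because the abstract says similarity scores are fed to the augmentation network.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def query_reference(input_patches, ref_patches):
    """For each input patch, return (index of the most similar Ref patch, score)."""
    matches = []
    for query in input_patches:
        scores = [cosine_similarity(query, ref) for ref in ref_patches]
        best = max(range(len(scores)), key=scores.__getitem__)
        matches.append((best, scores[best]))
    return matches


input_features = [[1.0, 0.0], [0.0, 2.0]]            # toy input-patch features
ref_features = [[0.0, 1.0], [3.0, 0.0], [1.0, 1.0]]  # toy Ref-patch features
matches = query_reference(input_features, ref_features)
```

A low best score signals that no Ref patch resembles the input patch, which is the situation in which the network should fall back on the SISR output rather than trust the queried texture.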
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191247
Neel Patwa, Nilesh A. Ahuja, Srinivasa Somayazulu, Omesh Tickoo, S. Varadarajan, S. Koolagudi
Video traffic makes up a large majority of the total traffic on the internet today. Uncompressed visual data requires a very large data rate, so lossy compression techniques are employed to keep the data rate manageable. Increasingly, a significant amount of the visual data being generated is consumed by analytics (such as classification and detection) residing in the cloud. Image and video compression can produce visual artifacts, especially at lower data rates, which can result in a significant drop in performance on such analytic tasks. Moreover, standard image and video compression techniques aim to optimize perceptual quality for human consumption by allocating more bits to perceptually significant features of the scene. However, these features may not necessarily be the most suitable ones for semantic tasks. We present here an approach to compressing visual data so as to maximize performance on a given analytic task. We train a deep auto-encoder using a multi-task loss to learn the relevant embeddings. An approximate differentiable model of the quantizer is used during training, which helps boost accuracy during inference. We apply our approach to an image classification problem and show that, for a given level of compression, it achieves higher classification accuracy than that obtained by performing classification on images compressed using JPEG. Our approach also outperforms the relevant state-of-the-art approach by a significant margin.
Title : Semantic-Preserving Image Compression
Venue : 2020 IEEE International Conference on Image Processing (ICIP)
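The idea of training with an approximate differentiable model of the quantizer, mentioned in the compression abstract above, can be illustrated with one widely used stand-in: additive uniform noise during training and hard rounding at inference. The abstract does not say which approximation the authors use, so the noise model here is a substituted technique, and the `quantize` function and its parameters are illustrative.

```python
import random


def quantize(values, training, rng=random):
    """Quantize latent values to integers, or simulate quantization when training.

    Training: add uniform noise in [-0.5, 0.5]. The output stays a smooth
    function of the input, so gradients can pass through it unchanged.
    Inference: round to the nearest integer (Python rounds exact halves to even).
    """
    if training:
        return [v + rng.uniform(-0.5, 0.5) for v in values]
    return [float(round(v)) for v in values]


latents = [0.2, 1.7, -0.6]
hard = quantize(latents, training=False)
soft = quantize(latents, training=True, rng=random.Random(0))
# hard == [0.0, 2.0, -1.0]
```

The noise has the same range as the rounding error, so the decoder sees a comparable perturbation in training and at inference, which is why this kind of stand-in can help accuracy carry over to the hard quantizer.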
Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191281
K. Gkentsidis, Theodora Pistola, N. Mitianoudis, N. Boulgouris
We explore the capabilities of a new biometric trait, which is based on information extracted through facial motion amplification. Unlike traditional facial biometric traits, the new biometric does not require the visibility of facial features, such as the eyes or nose, that are critical in common facial biometric algorithms. In this paper we propose the formation of a spatiotemporal facial blood flow map, constructed using small motion amplification. Experiments show that the proposed approach provides significant discriminatory capacity over different training and testing days and can be potentially used in situations where traditional facial biometrics may not be applicable.
Title : Deep Person Identification Using Spatiotemporal Facial Motion Amplification
Venue : 2020 IEEE International Conference on Image Processing (ICIP)