Giorgos Kordopatis-Zilos, S. Papadopoulos, I. Patras, Y. Kompatsiaris
{"title":"Near-Duplicate Video Retrieval with Deep Metric Learning","authors":"Giorgos Kordopatis-Zilos, S. Papadopoulos, I. Patras, Y. Kompatsiaris","doi":"10.1109/ICCVW.2017.49","DOIUrl":null,"url":null,"abstract":"This work addresses the problem of Near-Duplicate Video Retrieval (NDVR). We propose an effective video-level NDVR scheme based on deep metric learning that leverages Convolutional Neural Network (CNN) features from intermediate layers to generate discriminative global video representations in tandem with a Deep Metric Learning (DML) framework with two fusion variations, trained to approximate an embedding function for accurate distance calculation between two near-duplicate videos. In contrast to most state-of-the-art methods, which exploit information deriving from the same source of data for both development and evaluation (which usually results to dataset-specific solutions), the proposed model is fed during training with sampled triplets generated from an independent dataset and is thoroughly tested on the widely used CC_WEB_VIDEO dataset, using two popular deep CNN architectures (AlexNet, GoogleNet). We demonstrate that the proposed approach achieves outstanding performance against the state-of-the-art, either with or without access to the evaluation dataset.","PeriodicalId":149766,"journal":{"name":"2017 IEEE International Conference on Computer Vision Workshops (ICCVW)","volume":"2002 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"57","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Conference on Computer Vision Workshops (ICCVW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCVW.2017.49","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 57
Abstract
This work addresses the problem of Near-Duplicate Video Retrieval (NDVR). We propose an effective video-level NDVR scheme based on deep metric learning that leverages Convolutional Neural Network (CNN) features from intermediate layers to generate discriminative global video representations in tandem with a Deep Metric Learning (DML) framework with two fusion variations, trained to approximate an embedding function for accurate distance calculation between two near-duplicate videos. In contrast to most state-of-the-art methods, which exploit information deriving from the same source of data for both development and evaluation (which usually results to dataset-specific solutions), the proposed model is fed during training with sampled triplets generated from an independent dataset and is thoroughly tested on the widely used CC_WEB_VIDEO dataset, using two popular deep CNN architectures (AlexNet, GoogleNet). We demonstrate that the proposed approach achieves outstanding performance against the state-of-the-art, either with or without access to the evaluation dataset.