Generalizable and efficient cross-domain person re-identification model using deep metric learning

IET Computer Vision; IF 1.3; Zone 4 (Computer Science); JCR Q4 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE); Pub Date: 2023-06-13; DOI: 10.1049/cvi2.12214

Saba Sadat Faghih Imani, Kazim Fouladi-Ghaleh, Hossein Aghababa

IET Computer Vision, vol. 17, no. 8, pp. 993-1004. Journal Article, published 2023-06-13. Indexed via Semantic Scholar.
Article page: https://onlinelibrary.wiley.com/doi/10.1049/cvi2.12214
PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12214

Citations: 0

Abstract

Most successful person re-identification (re-ID) models rely on supervised training and need a large amount of training data. These models fail to generalise well to unseen, unlabelled test sets. The authors aim to learn a generalisable person re-ID model that uses one labelled source dataset and one unlabelled target dataset during training and generalises well to the target test set. To this end, after feature extraction by a ResNeXt-50 network, the authors optimise the model with three loss functions. (a) One loss function learns the features of the target domain by tuning the distances between target images, which makes the trained model more robust to intra-domain variations in the target domain and helps it generalise to the target test set. (b) One triplet loss considers both the source and target domains, so that the model learns the inter-domain variations between source and target as well as the variations within the target domain. (c) One loss function performs supervised learning on the labelled source domain. Extensive experiments on Market1501 and DukeMTMC re-ID show that the model achieves very competitive performance compared with state-of-the-art models while requiring an acceptable amount of GPU RAM compared with other successful models.
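The abstract names the three loss terms but gives no formulas, so the following is only a minimal sketch of how such an objective could be assembled. It assumes the standard triplet margin loss for term (b) and softmax cross-entropy for term (c); term (a) is left as a placeholder scalar because its form is not described. The function names, the margin of 0.3, the equal weights, and the toy feature vectors are all hypothetical and are not taken from the paper.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    Assumption: the paper's cross-domain triplet term may select and
    weight triplets differently; this is the textbook form only.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]

def total_loss(l_target, l_triplet, l_source, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three terms; the weights are hypothetical."""
    w_a, w_b, w_c = weights
    return w_a * l_target + w_b * l_triplet + w_c * l_source

# Toy 2-D features standing in for ResNeXt-50 embeddings (made up).
anchor = [0.0, 0.0]      # e.g. a source image
positive = [3.0, 4.0]    # same identity as the anchor
negative = [6.0, 8.0]    # different identity, e.g. a target image

l_a = 0.0                                     # placeholder for term (a)
l_b = triplet_loss(anchor, positive, negative)
l_c = cross_entropy([2.0, 0.5, -1.0], label=0)
print(total_loss(l_a, l_b, l_c))
```

In this toy case the negative is already further from the anchor than the positive by more than the margin, so the triplet term is zero and only the supervised term contributes. A real training loop would compute these terms over mini-batches in an autodiff framework.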

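Source journal: IET Computer Vision (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 3.30

Self-citation rate: 11.80%

Articles published:

Review time: 3.4 months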

Journal description: IET Computer Vision seeks original research papers in a wide range of areas of computer vision. The vision of the journal is to publish the highest quality research work that is relevant and topical to the field, without forgetting those works that aim to introduce new horizons and set the agenda for future avenues of research in computer vision.

IET Computer Vision welcomes submissions on the following topics: biologically and perceptually motivated approaches to low level vision (feature detection, etc.); perceptual grouping and organisation; representation, analysis and matching of 2D and 3D shape; shape-from-X; object recognition; image understanding; learning with visual inputs; motion analysis and object tracking; multiview scene analysis; cognitive approaches in low, mid and high level vision; control in visual systems; colour, reflectance and light; statistical and probabilistic models; face and gesture; surveillance; biometrics and security; robotics; vehicle guidance; automatic model acquisition; medical image analysis and understanding; aerial scene analysis and remote sensing; deep learning models in computer vision.

Both methodological and applications orientated papers are welcome. Submitted manuscripts are expected to include a detailed and analytical review of the literature, a state-of-the-art exposition of the proposed original research and its methodology, a thorough experimental evaluation and, last but not least, a comparative evaluation against relevant state-of-the-art methods. Submissions not abiding by these minimum requirements may be returned to authors without being sent to review.

Special Issues, Current Call for Papers:
Computer Vision for Smart Cameras and Camera Networks - https://digital-library.theiet.org/files/IET_CVI_SC.pdf
Computer Vision for the Creative Industries - https://digital-library.theiet.org/files/IET_CVI_CVCI.pdf