{"title":"特征金字塔随机融合网络的可见-红外模态人再识别","authors":"Wang Ronggui, Wang Jing, Yang Juan, Xue Lixia","doi":"10.12086/OEE.2020.190669","DOIUrl":null,"url":null,"abstract":"Existing works in person re-identification only considers extracting invariant feature representations from cross-view visible cameras, which ignores the imaging feature in infrared domain, such that there are few studies on visible-infrared relevant modality. Besides, most works distinguish two-views by often computing the similarity in feature maps from one single convolutional layer, which causes a weak performance of learning features. To handle the above problems, we design a feature pyramid random fusion network (FPRnet) that learns discriminative multiple semantic features by computing the similarities between multi-level convolutions when matching the person. FPRnet not only reduces the negative effect of bias in intra-modality, but also balances the heterogeneity gap between inter-modality, which focuses on an infrared image with very different visual properties. Meanwhile, our work integrates the advantages of learning local and global feature, which effectively solves the problems of visible-infrared person re-identification. Extensive experiments on the public SYSU-MM01 dataset from aspects of mAP and convergence speed, demonstrate the superiorities in our approach to the state-of-the-art methods. 
Furthermore, FPRnet also achieves competitive results with 32.12% mAP recognition rate and much faster convergence.","PeriodicalId":39552,"journal":{"name":"光电工程","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2020-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Feature pyramid random fusion network for visible-infrared modality person re-identification\",\"authors\":\"Wang Ronggui, Wang Jing, Yang Juan, Xue Lixia\",\"doi\":\"10.12086/OEE.2020.190669\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Existing works in person re-identification only considers extracting invariant feature representations from cross-view visible cameras, which ignores the imaging feature in infrared domain, such that there are few studies on visible-infrared relevant modality. Besides, most works distinguish two-views by often computing the similarity in feature maps from one single convolutional layer, which causes a weak performance of learning features. To handle the above problems, we design a feature pyramid random fusion network (FPRnet) that learns discriminative multiple semantic features by computing the similarities between multi-level convolutions when matching the person. FPRnet not only reduces the negative effect of bias in intra-modality, but also balances the heterogeneity gap between inter-modality, which focuses on an infrared image with very different visual properties. Meanwhile, our work integrates the advantages of learning local and global feature, which effectively solves the problems of visible-infrared person re-identification. Extensive experiments on the public SYSU-MM01 dataset from aspects of mAP and convergence speed, demonstrate the superiorities in our approach to the state-of-the-art methods. 
Furthermore, FPRnet also achieves competitive results with 32.12% mAP recognition rate and much faster convergence.\",\"PeriodicalId\":39552,\"journal\":{\"name\":\"光电工程\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"光电工程\",\"FirstCategoryId\":\"1087\",\"ListUrlMain\":\"https://doi.org/10.12086/OEE.2020.190669\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Engineering\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"光电工程","FirstCategoryId":"1087","ListUrlMain":"https://doi.org/10.12086/OEE.2020.190669","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Engineering","Score":null,"Total":0}
Feature pyramid random fusion network for visible-infrared modality person re-identification
Existing work on person re-identification considers only the extraction of invariant feature representations across views from visible-light cameras, ignoring imaging characteristics in the infrared domain; as a result, there are few studies on the visible-infrared cross-modality setting. Moreover, most methods distinguish the two views by computing similarity on feature maps from a single convolutional layer, which limits the discriminative power of the learned features. To address these problems, we design a feature pyramid random fusion network (FPRnet) that learns discriminative multi-level semantic features by computing similarities between multi-level convolutional features when matching persons. FPRnet not only reduces the negative effect of intra-modality variation but also narrows the heterogeneity gap between modalities, which is critical for infrared images with very different visual properties. Meanwhile, our work integrates the advantages of learning both local and global features, effectively addressing visible-infrared person re-identification. Extensive experiments on the public SYSU-MM01 dataset, evaluated in terms of mAP and convergence speed, demonstrate the superiority of our approach over state-of-the-art methods. Furthermore, FPRnet achieves competitive results with a 32.12% mAP and much faster convergence.
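The abstract does not specify FPRnet's architecture, but the core idea of randomly fusing features from multiple pyramid levels into one person descriptor can be sketched as follows. This is a minimal illustrative sketch under assumed conventions (global average pooling per level, L2 normalization, concatenation of a random subset of levels); the function name, channel counts, and fusion rule are hypothetical, not taken from the paper.

```python
import numpy as np

def random_fusion_descriptor(feature_maps, rng, keep_prob=0.5):
    """Fuse multi-level conv feature maps into a single descriptor.

    feature_maps: list of (C, H, W) arrays from different conv stages
        (channel counts and spatial sizes may differ across levels).
    A random subset of pyramid levels is kept (at least one); each kept
    level is global-average-pooled to a (C,) vector, L2-normalized, and
    the vectors are concatenated.
    """
    # Randomly decide which pyramid levels participate in the fusion.
    mask = rng.random(len(feature_maps)) < keep_prob
    if not mask.any():
        mask[rng.integers(len(feature_maps))] = True  # keep at least one level

    pooled = []
    for keep, fmap in zip(mask, feature_maps):
        if not keep:
            continue
        v = fmap.mean(axis=(1, 2))            # global average pool -> (C,)
        v = v / (np.linalg.norm(v) + 1e-12)   # L2-normalize each level
        pooled.append(v)
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
maps = [rng.standard_normal((256, 32, 16)),   # shallow stage (hypothetical sizes)
        rng.standard_normal((512, 16, 8)),    # middle stage
        rng.standard_normal((1024, 8, 4))]    # deep stage
desc = random_fusion_descriptor(maps, rng)
```

Descriptors from a visible image and an infrared image could then be compared with a cosine or Euclidean distance; sampling a different level subset on each forward pass acts as a regularizer over which semantic levels drive the match.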