{"title":"基于多阶语义融合的行人多目标跟踪网络研究","authors":"Cong Liu, Chao Han","doi":"10.3390/wevj14100272","DOIUrl":null,"url":null,"abstract":"Aiming at the problem of insufficient tracking accuracy caused by object occlusion in the process of multi-object tracking, this paper proposes a multi-order semantic fusion pedestrian multi-object tracking network. Firstly, the feature pyramid attention module is used in the backbone network to enlarge the receptive field and obtain more abundant feature information to improve the detection accuracy of different scale objects. Secondly, a size-aware module is integrated into the pedestrian re-identification branch network to fuse semantic features from different resolutions and extract more basic pedestrian features, thereby improving the tracking accuracy. Finally, the detection head is reconstructed and the small object detection layer is fused to make the proposed network adapt to objects of different sizes. Experiments on the MOT16 and MOT17 datasets show that the multi-object tracking accuracy of the proposed network reaches 75.4% (MOT16) and 74.3% (MOT17), which effectively deals with the problem of low tracking accuracy caused by occlusion in the field of autonomous driving, and achieves good tracking results. The network proposed in this paper improves the tracking accuracy of pedestrians and provides a basis for further practical applications.","PeriodicalId":38979,"journal":{"name":"World Electric Vehicle Journal","volume":"14 1","pages":"0"},"PeriodicalIF":2.6000,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Research on Pedestrian Multi-Object Tracking Network Based on Multi-Order Semantic Fusion\",\"authors\":\"Cong Liu, Chao Han\",\"doi\":\"10.3390/wevj14100272\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Aiming at the problem of insufficient tracking accuracy caused by object occlusion in the process of multi-object tracking, this paper proposes a multi-order semantic fusion pedestrian multi-object tracking network. Firstly, the feature pyramid attention module is used in the backbone network to enlarge the receptive field and obtain more abundant feature information to improve the detection accuracy of different scale objects. Secondly, a size-aware module is integrated into the pedestrian re-identification branch network to fuse semantic features from different resolutions and extract more basic pedestrian features, thereby improving the tracking accuracy. Finally, the detection head is reconstructed and the small object detection layer is fused to make the proposed network adapt to objects of different sizes. Experiments on the MOT16 and MOT17 datasets show that the multi-object tracking accuracy of the proposed network reaches 75.4% (MOT16) and 74.3% (MOT17), which effectively deals with the problem of low tracking accuracy caused by occlusion in the field of autonomous driving, and achieves good tracking results. The network proposed in this paper improves the tracking accuracy of pedestrians and provides a basis for further practical applications.\",\"PeriodicalId\":38979,\"journal\":{\"name\":\"World Electric Vehicle Journal\",\"volume\":\"14 1\",\"pages\":\"0\"},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2023-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"World Electric Vehicle Journal\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/wevj14100272\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"World Electric Vehicle Journal","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/wevj14100272","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Research on Pedestrian Multi-Object Tracking Network Based on Multi-Order Semantic Fusion
Aiming at the problem of insufficient tracking accuracy caused by object occlusion in the process of multi-object tracking, this paper proposes a multi-order semantic fusion pedestrian multi-object tracking network. Firstly, the feature pyramid attention module is used in the backbone network to enlarge the receptive field and obtain more abundant feature information to improve the detection accuracy of different scale objects. Secondly, a size-aware module is integrated into the pedestrian re-identification branch network to fuse semantic features from different resolutions and extract more basic pedestrian features, thereby improving the tracking accuracy. Finally, the detection head is reconstructed and the small object detection layer is fused to make the proposed network adapt to objects of different sizes. Experiments on the MOT16 and MOT17 datasets show that the multi-object tracking accuracy of the proposed network reaches 75.4% (MOT16) and 74.3% (MOT17), which effectively deals with the problem of low tracking accuracy caused by occlusion in the field of autonomous driving, and achieves good tracking results. The network proposed in this paper improves the tracking accuracy of pedestrians and provides a basis for further practical applications.