{"title":"Multi-Object Tracking based on RGB-D Sensors","authors":"Keliang Zhu, Xuemei Shi, Tianzhong Zhang, Huasong Song, Jinlin Xu, Liangfeng Chen","doi":"10.1145/3585967.3585990","DOIUrl":null,"url":null,"abstract":"Multi-object tracking (MOT) with a 2D camera that lacks depth information is often inaccurate. In this paper, we propose an MOT method based on a sensor suite that combines a camera with an ultra-wideband (UWB) radar and thus plays a role similar to a depth (RGB-D) camera. First, we build a backbone network to extract feature maps from the video frames captured by the camera. Then, we combine Faster R-CNN with a re-ID branch to detect objects and predict each object's category, coordinates, and ID. To track objects, we construct a similarity matrix that associates the detected objects with their historical trajectories. Each element of the matrix is the intersection over union (IoU) between an object and one of its two types of associated trajectories, which are derived from the image data and the UWB localization data, respectively. Finally, the trajectories are updated from both trajectory types, and the recognition network is updated with the localization loss. Experimental results show that our method achieves multi-object recognition and tracking and outperforms previous methods by a large margin on several public datasets.","PeriodicalId":275067,"journal":{"name":"Proceedings of the 2023 10th International Conference on Wireless Communication and Sensor Networks","volume":"49 12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2023 10th International Conference on Wireless Communication and Sensor Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3585967.3585990","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multi-object tracking (MOT) with a 2D camera that lacks depth information is often inaccurate. In this paper, we propose an MOT method based on a sensor suite that combines a camera with an ultra-wideband (UWB) radar and thus plays a role similar to a depth (RGB-D) camera. First, we build a backbone network to extract feature maps from the video frames captured by the camera. Then, we combine Faster R-CNN with a re-ID branch to detect objects and predict each object's category, coordinates, and ID. To track objects, we construct a similarity matrix that associates the detected objects with their historical trajectories. Each element of the matrix is the intersection over union (IoU) between an object and one of its two types of associated trajectories, which are derived from the image data and the UWB localization data, respectively. Finally, the trajectories are updated from both trajectory types, and the recognition network is updated with the localization loss. Experimental results show that our method achieves multi-object recognition and tracking and outperforms previous methods by a large margin on several public datasets.
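
The IoU-based association step described above can be illustrated with a minimal sketch. The abstract does not say how the image-based and UWB-based IoU values are combined or how matches are selected, so everything beyond the IoU formula itself is an assumption made for illustration: the weighted average controlled by the hypothetical parameter `w_image`, the greedy matching with the hypothetical threshold `min_sim`, and the premise that UWB positions have already been projected into image-plane boxes. The function names are likewise illustrative and not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image coordinates


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def similarity_matrix(
    detections: List[Box],
    image_tracks: List[Box],
    uwb_tracks: List[Box],
    w_image: float = 0.5,
) -> List[List[float]]:
    """Rows are detections, columns are trajectories.

    Assumption: each trajectory j has an image-based box image_tracks[j] and
    a UWB-based box uwb_tracks[j]; the two IoU values are blended with a
    weighted average (not specified in the source).
    """
    if len(image_tracks) != len(uwb_tracks):
        raise ValueError("each trajectory needs both an image box and a UWB box")
    return [
        [
            w_image * iou(det, img_box) + (1.0 - w_image) * iou(det, uwb_box)
            for img_box, uwb_box in zip(image_tracks, uwb_tracks)
        ]
        for det in detections
    ]


def greedy_match(sim: List[List[float]], min_sim: float = 0.3) -> List[Tuple[int, int]]:
    """Assumed association rule: repeatedly take the highest remaining score."""
    candidates = sorted(
        ((score, i, j) for i, row in enumerate(sim) for j, score in enumerate(row)),
        reverse=True,
    )
    used_dets, used_tracks, matches = set(), set(), []
    for score, i, j in candidates:
        if score < min_sim:
            break  # all remaining candidates are weaker than the threshold
        if i in used_dets or j in used_tracks:
            continue
        used_dets.add(i)
        used_tracks.add(j)
        matches.append((i, j))
    return sorted(matches)


if __name__ == "__main__":
    detections = [(0, 0, 10, 10), (20, 20, 30, 30)]
    image_tracks = [(21, 21, 31, 31), (1, 1, 11, 11)]
    uwb_tracks = [(20, 20, 30, 30), (0, 0, 10, 10)]
    sim = similarity_matrix(detections, image_tracks, uwb_tracks)
    print(greedy_match(sim))  # [(0, 1), (1, 0)]
```

Greedy matching is used here only to keep the sketch dependency-free; an optimal assignment solver such as the Hungarian algorithm could be substituted without changing how the similarity matrix is built.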