Vehicle detection algorithm based on multi-scale features and normalization attention model
Authors: Yu-Shuai Duan, Huarong Xu, Lifen Weng
Published in: Proceedings of the 2022 2nd International Conference on Control and Intelligent Robotics, 2022-06-24
DOI: https://doi.org/10.1145/3548608.3559196
Vehicle detection is a key technology in the perception module of autonomous driving. In complex scenes it must acquire the position and distance of surrounding vehicles accurately and in real time to keep passengers safe. The CenterNet algorithm performs well in vehicle detection and achieves a good trade-off between accuracy and speed, but the network extracts target features only from the last layer of the feature map, which leads to missed and false detections. This paper therefore proposes a Vehicle-CenterNet detection model. It obtains more detailed information by modifying the original ResNet: hierarchical connections are constructed within a single residual block, and the receptive field of each layer is enlarged by stacking convolution operators. In addition, the Mish activation function replaces ReLU; because Mish is smooth, information propagates through the network more effectively, which improves accuracy and generalization. A normalization-based attention module (NAM) is also incorporated to suppress non-target features and further improve detection accuracy. Experimental results on the VOC and KITTI datasets show that the proposed method improves both mean average precision (mAP) and F1 score to varying degrees, and that its overall performance is better than that of the original CenterNet algorithm.
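
To illustrate the activation swap mentioned in the abstract, the following is a minimal sketch of Mish next to ReLU, using the standard definition Mish(x) = x * tanh(ln(1 + e^x)). It is not the authors' code: the function names `softplus`, `mish`, and `relu` are chosen here for illustration, and the sketch works on scalar floats rather than the feature tensors a real detector would use.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable ln(1 + e^x): exp is only ever taken of a
    # non-positive number, so it cannot overflow for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Mish(x) = x * tanh(softplus(x)). Smooth everywhere, and slightly
    # negative for negative inputs instead of being clamped to zero.
    return x * math.tanh(softplus(x))


def relu(x: float) -> float:
    # ReLU(x) = max(x, 0), shown for comparison; it has a kink at x = 0.
    return max(x, 0.0)


if __name__ == "__main__":
    # Print both activations side by side on a few sample inputs.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  mish={mish(x):.4f}")
```

Unlike ReLU, Mish passes a small negative signal for negative inputs and has a continuous derivative at zero, which is the smoothness property the abstract refers to. For large positive inputs the two functions are practically identical.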