An Improved SSD for small target detection
Xiang Li, Haibo Luo
2021 6th International Conference on Multimedia and Image Processing, published 2021-01-08
DOI: https://doi.org/10.1145/3449388.3449391
SSD is a heuristic one-stage target detection approach. Although it achieves impressive results in general target detection, it still struggles with small-size objects and precise localization. In this paper, we propose an improved SSD that focuses on small-size target detection. We add a shallow, high-resolution feature to the hierarchical detection features used for prediction. We then fuse the detection features (including the shallow, high-resolution one) into a feature pyramid through convolution layers and upsampling operations that pass information from the deep features to the shallow ones, enriching the semantic information of the shallow features. To make the network easier to converge, we apply L2 normalization to the bottom detection feature of the feature pyramid, balancing the norms across the pyramid features. Experimental results on the VEDAI dataset show that the proposed method clearly improves on the original SSD for small target detection.
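The top-down fusion and L2 normalization described above can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names are hypothetical, feature maps are plain nested lists in `[channel][row][column]` order, nearest-neighbour upsampling and element-wise addition are assumed as the merge rule (the abstract does not specify one), and the learned convolution layers are omitted entirely.

```python
import math


def l2_normalize(feat, scale=1.0, eps=1e-10):
    """L2-normalize a [C][H][W] feature map across channels at each position.

    `scale` stands in for a learnable rescaling factor (an assumption here).
    """
    channels, height, width = len(feat), len(feat[0]), len(feat[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for y in range(height):
        for x in range(width):
            norm = math.sqrt(sum(feat[c][y][x] ** 2 for c in range(channels))) + eps
            for c in range(channels):
                out[c][y][x] = scale * feat[c][y][x] / norm
    return out


def upsample_nearest(feat, factor=2):
    """Nearest-neighbour upsampling of a [C][H][W] feature map."""
    return [
        [[row[x // factor] for x in range(len(row) * factor)]
         for row in channel for _ in range(factor)]
        for channel in feat
    ]


def fuse_top_down(shallow, deep, factor=2):
    """Add the upsampled deep feature to the shallow one (assumed merge rule)."""
    up = upsample_nearest(deep, factor)
    return [
        [[s + u for s, u in zip(s_row, u_row)] for s_row, u_row in zip(s_ch, u_ch)]
        for s_ch, u_ch in zip(shallow, up)
    ]


# Toy example: a 2-channel 1x1 deep feature and a 2-channel 2x2 shallow feature.
deep = [[[1.0]], [[2.0]]]
shallow = [[[2.0, 2.0], [2.0, 2.0]], [[2.0, 2.0], [2.0, 2.0]]]

fused = fuse_top_down(shallow, deep)   # deep information flows into the shallow map
normed = l2_normalize(fused)           # each position now has roughly unit channel norm
print(fused)
print(normed)
```

In a real detector the same steps would run on tensors in a deep learning framework, with convolutions before and after the merge. The sketch only shows the data flow: deep features are upsampled to the shallow resolution, merged, and the bottom pyramid level is then normalized so that its magnitude is comparable to the other levels.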