Solomon Negussie Tesema, E. Bourennane
DOI: 10.1109/IEMCON51383.2020.9284923
2020 11th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), pp. 0534-0539, 2020-11-04
DenseYOLO: Yet Faster, Lighter and More Accurate YOLO
An object detector should be not only accurate but also light and fast. However, current object detectors tend to be either inaccurate when lightweight or slow and heavy when accurate, so determining a tolerable trade-off between the speed and accuracy of an object detector is not a simple task. One object detector with a commendable balance of speed and accuracy is YOLOv2, which performs detection by dividing an input image into a grid and training each grid cell to predict a certain number of objects. In this paper we propose a new approach that makes YOLOv2 even faster and more accurate. We re-purpose YOLOv2 into a dense object detector by using fine-grained grids, where each cell predicts only one object together with its class and objectness confidence score. Our approach also trains the system to learn to pick the best-fitting anchor box, instead of the fixed anchor assignment that YOLOv2 uses during ground-truth annotation. We also introduce a new loss function to counter the overwhelming imbalance between the number of grid cells responsible for detecting an object and those that are not.
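The grid-based assignment the abstract describes can be illustrated with a minimal sketch. This is not the authors' code: the function name `assign_targets`, the centre-cell assignment rule, and the tie-break of keeping the larger box when two centres collide are all assumptions made for illustration of a fine-grained grid where each cell is responsible for at most one object.

```python
def assign_targets(boxes, S):
    """Map ground-truth boxes (cx, cy, w, h, all normalized to [0, 1])
    to cells of an S x S grid.

    Returns a dict {(row, col): box}. Each cell holds at most one object,
    identified by the cell containing the box centre; if two centres fall
    in the same cell, the larger box is kept (an assumed tie-break).
    """
    targets = {}
    for box in boxes:
        cx, cy, w, h = box
        col = min(int(cx * S), S - 1)  # clamp so cx == 1.0 stays in-grid
        row = min(int(cy * S), S - 1)
        key = (row, col)
        if key not in targets or w * h > targets[key][2] * targets[key][3]:
            targets[key] = box
    return targets
```

A finer grid (larger S) shrinks each cell, so fewer object centres collide, which is the motivation for the dense, one-object-per-cell formulation.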
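The abstract also mentions a loss that balances the many "no-object" cells against the few object-bearing ones. The paper's exact loss is not given here; the sketch below shows one generic way such balancing can work, down-weighting the summed negative loss by the positive-to-negative ratio. The function name and the weighting scheme are assumptions, not the authors' formulation.

```python
def balanced_objectness_loss(preds, labels):
    """Squared-error objectness loss with negatives rescaled.

    preds, labels: flat per-cell objectness scores and 0/1 targets.
    With fine-grained grids negatives vastly outnumber positives, so
    the summed negative loss is scaled by n_pos / n_neg (an assumed
    balancing weight) to keep both sides comparable.
    """
    pos = [(p - t) ** 2 for p, t in zip(preds, labels) if t == 1]
    neg = [(p - t) ** 2 for p, t in zip(preds, labels) if t == 0]
    if not neg:
        return sum(pos)
    w = max(len(pos), 1) / len(neg)
    return sum(pos) + w * sum(neg)
```

Without such a weight, a detector trained on thousands of mostly empty cells can minimize the loss simply by predicting "no object" everywhere.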