Evaluation of focal loss based deep neural networks for traffic sign detection

Deepika Kamboj, Sharda Vashisth, Sumeet Saurav
DOI: 10.1080/19479832.2022.2086304
Journal: International Journal of Image and Data Fusion (JCR Q3, Remote Sensing; Impact Factor 1.8)
Published: 2022-06-21 (Journal Article)
Citations: 0

Abstract

With advancements in autonomous driving, demand for stringent and computationally efficient traffic sign detection systems has increased. However, bringing such a system to a deployable level requires handling critical accuracy and processing speed issues. A focal loss-based single-stage object detector, i.e. RetinaNet, is used as a trade-off between accuracy and processing speed, as it handles the class imbalance problem of single-stage detectors and is thus suitable for traffic sign detection (TSD). We assessed the detector’s performance by combining it with various feature extractors, namely ResNet-50, ResNet-101, and ResNet-152, on three publicly available TSD benchmark datasets. The performance comparison across backbones covers evaluation parameters such as mean average precision (mAP), memory allocation, running time, and floating-point operations. From the evaluation results, we found that the RetinaNet detector with the ResNet-152 backbone obtains the best mAP, while the ResNet-101 variant strikes the best trade-off between accuracy and execution time. Benchmarking on multiple datasets makes it possible to analyse how the detector generalises across TSD benchmarks. Among the three feature extractors, the RetinaNet model trained with the ResNet-50 backbone is the most memory-efficient, making it an optimal choice for deployment on low-cost embedded devices.
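The class-imbalance fix the abstract refers to is the focal loss of RetinaNet (Lin et al.), which down-weights the loss on well-classified examples so the many easy background anchors of a single-stage detector no longer dominate training. A minimal scalar sketch for the binary case follows; the function name and default hyperparameters (alpha = 0.25, gamma = 2) are illustrative, not taken from the paper.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t).

    p: predicted probability of the positive class, y: ground-truth label (1 or 0).
    With gamma = 0 and alpha = 1 this reduces to plain cross-entropy; as gamma
    grows, the (1 - p_t)^gamma factor shrinks the loss of easy examples
    (p_t close to 1) far more than that of hard ones.
    """
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

An easy positive (p = 0.9) thus contributes a loss orders of magnitude smaller than a hard one (p = 0.1), which is what lets RetinaNet train on all anchors without the foreground/background imbalance overwhelming the gradient.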
Source journal metrics
CiteScore: 5.00
Self-citation rate: 0.00%
Articles per year: 10
Journal description
International Journal of Image and Data Fusion provides a single source of information for all aspects of image and data fusion methodologies, developments, techniques and applications. Image and data fusion techniques are important for combining the many sources of satellite, airborne and ground based imaging systems, and integrating these with other related data sets for enhanced information extraction and decision making. Image and data fusion aims at the integration of multi-sensor, multi-temporal, multi-resolution and multi-platform image data, together with geospatial data, GIS, in-situ, and other statistical data sets for improved information extraction, as well as to increase the reliability of the information. This leads to more accurate information that provides for robust operational performance, i.e. increased confidence, reduced ambiguity and improved classification enabling evidence based management. The journal welcomes original research papers, review papers, shorter letters, technical articles, book reviews and conference reports in all areas of image and data fusion including, but not limited to, the following aspects and topics:
• Automatic registration/geometric aspects of fusing images with different spatial, spectral, temporal resolutions; phase information; or acquired in different modes
• Pixel, feature and decision level fusion algorithms and methodologies
• Data assimilation: fusing data with models
• Multi-source classification and information extraction
• Integration of satellite, airborne and terrestrial sensor systems
• Fusing temporal data sets for change detection studies (e.g. for Land Cover/Land Use Change studies)
• Image and data mining from multi-platform, multi-source, multi-scale, multi-temporal data sets (e.g. geometric information, topological information, statistical information, etc.)