Sign Recognition - How well does Single Shot Multibox Detector sum up? A Quantitative Study

Manikandan Ravikiran
Published in: 2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)
Publication date: 2018-10-01
DOI: 10.1109/AIPR.2018.8707409 (https://doi.org/10.1109/AIPR.2018.8707409)
Citations: 1

Abstract

Deep learning for traffic sign detection and recognition (TSDR) has been widely explored in recent years, owing to its ability to produce state-of-the-art results and to the availability of public datasets. Two families of detection architectures are currently being developed: single shot and region proposal based approaches. Although single shot methods seem adequate for traffic sign detection, very few works to date have investigated this hypothesis quantitatively; most focus on region proposal based detection architectures. Moreover, given the complexity of the TSDR task and the limited performance of region proposal based approaches, a quantitative study of the single shot method is warranted, as it would reveal the method's strengths and weaknesses for TSDR. In this paper, we therefore revisit the topic through a quantitative evaluation of the state-of-the-art Single Shot Multibox Detector (SSD) on multiple standard benchmarks. More specifically, we try to quantify 1) the performance of SSD on multiple existing TSDR benchmarks, namely GTSDB, STSDB and BTSDB; 2) the generalization of SSD across datasets; 3) the impact of class overlap on SSD's performance; and 4) the performance of SSD with datasets synthetically generated from Wikipedia images. Through our study, we show that 1) SSD can reach performance above 0.92 AUC for TSDR across standard benchmarks, and in the process we introduce new benchmarks for Romania (RTSDB) and Finland (FTSDB) in line with GTSDB; 2) an SSD model pretrained on GTSDB generalizes well to BTSDB and RTSDB, with an average AUC of 0.90, and comparatively less well to the Sweden and Finland datasets. We find scale selection and information loss to be the primary reasons for the limited generalization. To address these issues, we propose a convex optimization-based scale selection and Skip SSD, an architecture built on the concept of feature reuse, leading to improved generalization. We also show that 3) an SSD model augmented with a small synthetically generated dataset produces close to state-of-the-art accuracy across GTSDB, STSDB and BTSDB; and 4) class overlap is indeed a challenging problem that remains to be addressed even for SSD. Finally, we present detailed experiments and summarize our practical findings for those interested in getting the most out of SSD for TSDR.
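The abstract names scale selection as one cause of limited generalization but does not describe the convex optimization-based method itself. As background only, the sketch below shows the linear default-box scale rule from the original SSD formulation, s_k = s_min + (s_max - s_min) * (k - 1) / (m - 1), which is the kind of hand-set choice a learned or optimized scale selection would replace. The function name and the default values 0.2 and 0.9 are assumptions for illustration, not the configuration used in this study.

```python
from typing import List


def ssd_default_scales(num_maps: int, s_min: float = 0.2, s_max: float = 0.9) -> List[float]:
    """Linearly spaced default-box scales, one per feature map.

    Implements the baseline SSD rule, not the paper's optimized selection:
        s_k = s_min + (s_max - s_min) * (k - 1) / (m - 1),  k = 1..m
    Scales are fractions of the input image side.
    """
    if num_maps < 1:
        raise ValueError("num_maps must be at least 1")
    if num_maps == 1:
        return [s_min]
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * k for k in range(num_maps)]


def smallest_box_side(scales: List[float], image_side: int) -> float:
    """Side length in pixels of the smallest square default box."""
    return min(scales) * image_side


if __name__ == "__main__":
    scales = ssd_default_scales(6)
    print([round(s, 2) for s in scales])
    print(round(smallest_box_side(scales, 300), 1))
```

With six feature maps and a hypothetical 300-pixel input, the smallest default box covers a fifth of the image side, so a dataset whose signs are mostly smaller than that would be poorly matched by these fixed scales. This is one plausible reading of why scale selection matters when moving between datasets.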