Fine-Tuning Deep Learning Models for Pedestrian Detection

Boletim De Ciencias Geodesicas (IF 0.5, Q3, Earth and Planetary Sciences) | Pub Date: 2021-06-04 | DOI: 10.1590/S1982-21702021000200013

C. Amisse, M. E. Jijón-Palma, J. Centeno


Citations: 9


FINE-TUNING DEEP LEARNING MODELS FOR PEDESTRIAN DETECTION

Abstract: Object detection in high-resolution images is a new challenge facing the remote sensing community, driven by the introduction of unmanned aerial vehicles and monitoring cameras. One area of interest is detecting and tracking people in these images. Unlike general objects, pedestrians adopt different poses and undergo constant morphological changes while moving, so the task needs an intelligent solution. Fine-tuning has attracted great interest among researchers because it allows convolutional networks to be retrained for many interesting applications. Fine-tuned models have shown state-of-the-art performance in object classification, detection, and segmentation. In the present work, we evaluate the performance of fine-tuned models under varying amounts of training data by comparing Faster Region-based Convolutional Neural Network (Faster R-CNN) Inception v2, Single Shot MultiBox Detector (SSD) Inception v2, and SSD Mobilenet v2. To this end, we examine the effect of varying the training data on performance metrics such as accuracy, precision, F1-score, and recall. Testing the detectors showed that precision and recall are more sensitive to variation in the amount of training data. Across five variations of the amount of training data, we observe that proportions of 60%-80% consistently achieve highly comparable performance. In every variation, Faster R-CNN Inception v2 outperforms SSD Inception v2 and SSD Mobilenet v2 on the evaluated metrics, although the SSD models converge relatively quickly during the training phase. Overall, partitioning 80% of the total data for fine-tuning trained models produces efficient detectors even with only 700 data samples.
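The evaluation described in the abstract rests on two steps: partitioning the data at a chosen training proportion, and computing accuracy, precision, recall, and F1-score from detection counts. The following is a minimal Python sketch of both, not the authors' code. The function names `split_by_fraction` and `detection_metrics`, the random shuffle, and the fixed seed are assumptions made for illustration, and the usage lines assume that the 700 samples are the total pool, which the abstract does not state explicitly.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_by_fraction(
    samples: Sequence, train_fraction: float, seed: int = 0
) -> Tuple[List, List]:
    """Shuffle the samples and keep `train_fraction` of them for fine-tuning.

    Hypothetical helper: the abstract does not say how the split was drawn.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def detection_metrics(tp: int, fp: int, fn: int, tn: int = 0) -> Dict[str, float]:
    """Standard metric definitions from true/false positive/negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Usage: an 80% training split of an assumed pool of 700 sample identifiers.
train_ids, test_ids = split_by_fraction(range(700), 0.8)
print(len(train_ids), len(test_ids))

# Illustrative counts only, not results from the paper.
print(detection_metrics(tp=80, fp=20, fn=20))
```

Precision and recall depend directly on false positives and false negatives respectively, which is one reason they can react more strongly than accuracy when the training set shrinks.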


Source journal

Boletim De Ciencias Geodesicas (Earth and Planetary Sciences: General Earth and Planetary Sciences)

CiteScore: 1.70

Self-citation rate: 20.00%

Review time: 3 months

Journal description: The Boletim de Ciências Geodésicas publishes original papers in the Geodetic Sciences and related areas (Geodesy, Photogrammetry and Remote Sensing, Cartography and Geographic Information Systems). Submitted articles must be unpublished and must not be under consideration for publication in any other journal. Previous publication of the paper in conference proceedings does not violate the originality requirement. Articles should preferably be written in English.