Research on Pedestrian Detection Based on Multimodal Information Fusion

Information Technology and Control | IF 2.0 | Zone 4 (Computer Science) | Q3 (Automation & Control Systems) | Pub Date: 2023-12-22 | DOI: 10.5755/j01.itc.52.4.33766

Xiaoping Yang, Zhehong Li, Yuan Liu, Ran Huang, Kai Tan, Lin Huang

Citations: 0

Research on Pedestrian Detection Based on Multimodal Information Fusion

Pedestrian detection in autonomous driving systems is vulnerable to the external environment, and detectors based on a single sensor modality perform poorly under large, unconstrained changes in illumination. To address these problems, this paper proposes a dual-modality pedestrian detection method that fuses visible-light and thermal infrared images. First, 1 × 1 convolutions and dilated convolutions are introduced into the residual network, and ROI Align replaces ROI Pooling for mapping candidate boxes onto the feature layer, in order to optimize Faster R-CNN. Second, the generalized intersection over union (GIoU) loss is used as the loss function for bounding-box localization regression. Finally, based on the improved Faster R-CNN, four multimodal neural network structures are designed to fuse visible and thermal infrared images. Experimental results on the KAIST dataset show that the proposed method outperforms current mainstream detection algorithms: its AP is 8.38 percentage points higher than that of the conventional ACF + T + THOG pedestrian detector, and its miss rate is 5.34 percentage points lower than that of the visible-light pedestrian detector.
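The abstract names the generalized intersection over union (GIoU) as the loss for box localization regression. The following is a minimal sketch of the standard GIoU definition, not the authors' implementation. It assumes axis-aligned boxes given as (x1, y1, x2, y2); the names `box_area`, `giou`, and `giou_loss` are illustrative, and the zero return value for fully degenerate boxes is a convention chosen only for this sketch.

```python
from typing import Tuple

# Axis-aligned box as (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2 (assumed layout).
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero area."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def giou(pred: Box, target: Box) -> float:
    """Generalized IoU: IoU minus the empty fraction of the enclosing box."""
    # Overlap rectangle (zero area if the boxes do not intersect).
    inter = box_area((
        max(pred[0], target[0]), max(pred[1], target[1]),
        min(pred[2], target[2]), min(pred[3], target[3]),
    ))
    union = box_area(pred) + box_area(target) - inter
    # Smallest axis-aligned box enclosing both inputs.
    enclose = box_area((
        min(pred[0], target[0]), min(pred[1], target[1]),
        max(pred[2], target[2]), max(pred[3], target[3]),
    ))
    if union <= 0.0 or enclose <= 0.0:
        return 0.0  # Both boxes have zero area; convention for this sketch only.
    return inter / union - (enclose - union) / enclose


def giou_loss(pred: Box, target: Box) -> float:
    """Regression loss 1 - GIoU: 0 for a perfect match, larger for worse boxes."""
    return 1.0 - giou(pred, target)
```

Unlike plain IoU, which is 0 for every pair of non-overlapping boxes, GIoU still changes with the distance between them. For instance, two unit squares separated by a one-unit horizontal gap, `(0, 0, 1, 1)` and `(2, 0, 3, 1)`, have IoU 0 but GIoU -1/3, so the loss still gives a useful signal when the predicted box misses the target entirely.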

Source journal

Information Technology and Control (Engineering & Technology - Computer Science: Artificial Intelligence)

CiteScore

2.70

Self-citation rate

9.10%

Number of articles published

Review time

12 months

Journal description: The journal covers a wide range of problems in computer science and control systems, including:
- Software and hardware engineering
- Management systems engineering
- Information systems and databases
- Embedded systems
- Physical systems modelling and application
- Computer networks and cloud computing
- Data visualization
- Human-computer interface
- Computer graphics, visual analytics, and multimedia systems