Towards real-time video analysis of flooded areas: redundancy-based accelerator for object detection models

IF 2.9; Zone 4, Computer Science; JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Journal of Real-Time Image Processing; Pub Date: 2024-06-25; DOI: 10.1007/s11554-024-01490-0

Shubhasree AV, Praveen Sankaran, Raghu C.V


Citations: 0

Abstract



Towards real-time video analysis of flooded areas: redundancy-based accelerator for object detection models

The state of Kerala in India has seen multiple instances of intense cyclones in recent years, resulting in heavy flooding. One of the biggest challenges faced by rescuers is the accessibility to flooded areas and buildings during rescue operations. In such scenarios, unmanned aerial vehicles (UAVs) can deliver reliable aerial visual data to aid planning and operations during rescue. Object detectors based on deep learning methods provide an effective solution to automate the process of detecting relevant information from image/video data. These models are complex and resource-hungry, leading to severe speed constraints during field operations. The pixel displacement algorithm (PDA), a portable and effective technique, is developed in this work to speed up object detection models on devices with limited resources, such as edge devices. This method can be integrated with all object detection models to speed up the inference time. The proposed method is combined with multiple object detection models in this work to show its effectiveness. The YOLOv4 model combined with the proposed method outperformed the AP50 performance of the YOLOv4-tiny model by 6% while maintaining the same processing time. This approach gave almost 10× speed improvement to Jetson Nano at an accuracy cost of 3% when compared to YOLOv4. Further, a model to predict maximum pixel shift with respect to frame skip is proposed using parameters such as the altitude and velocity of the UAV and the tilt of the camera. Accurate prediction of pixel shift leads to a reduced search area, leading to reduced inference time. The effectiveness of the proposed model was tested against annotated locations, and it was found that the method was able to predict the search area for each test video segment with a high degree of accuracy.


Source journal

Journal of Real-Time Image Processing (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; ENGINEERING, ELECTRICAL & ELECTRONIC)

CiteScore: 6.80

Self-citation rate: 6.70%

Articles published:

Review time: 6 months

Journal description: Due to rapid advancements in integrated circuit technology, the rich theoretical results that have been developed by the image and video processing research community are now being increasingly applied in practical systems to solve real-world image and video processing problems. Such systems involve constraints placed not only on their size, cost, and power consumption, but also on the timeliness of the image data processed. Examples of such systems are mobile phones, digital still/video/cell-phone cameras, portable media players, personal digital assistants, high-definition television, video surveillance systems, industrial visual inspection systems, medical imaging devices, vision-guided autonomous robots, spectral imaging systems, and many other real-time embedded systems. In these real-time systems, strict timing requirements demand that results are available within a certain interval of time as imposed by the application. It is often the case that an image processing algorithm is developed and proven theoretically sound, presumably with a specific application in mind, but its practical applications and the detailed steps, methodology, and trade-off analysis required to achieve its real-time performance are not fully explored, leaving these critical and usually non-trivial issues for those wishing to employ the algorithm in a real-time system. The Journal of Real-Time Image Processing is intended to bridge the gap between the theory and practice of image processing, serving the greater community of researchers, practicing engineers, and industrial professionals who deal with designing, implementing or utilizing image processing systems which must satisfy real-time design constraints.