Exploring automated object detection methods for manholes using classical computer vision and deep learning

Machine Graphics and Vision, Pub Date: 2023-03-07, DOI: 10.22630/mgv.2023.32.1.2

S. Rao, Nitya Mitnala


Citations: 0

Abstract



Exploring automated object detection methods for manholes using classical computer vision and deep learning

Open, broken, and improperly closed manholes can pose problems for autonomous vehicles and thus need to be included in obstacle avoidance and lane-changing algorithms. In this work, we propose and compare multiple approaches for manhole localization and classification like classical computer vision, convolutional neural networks like YOLOv3 and YOLOv3-Tiny, and vision transformers like YOLOS and ViT. These are analyzed for speed, computational complexity, and accuracy in order to determine the model that can be used with autonomous vehicles. In addition, we propose a size detection pipeline using classical computer vision to determine the size of the hole in an improperly closed manhole with respect to the manhole itself. The evaluation of the data showed that convolutional neural networks are currently better for this task, but vision transformers seem promising.
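The abstract does not say how the size of the hole is measured relative to the manhole, so the following is only a minimal sketch under stated assumptions, not the authors' pipeline. It assumes that some earlier classical-vision step (for example thresholding or contour extraction) has already produced two binary masks, one for the manhole and one for the uncovered hole, and it expresses the relative size as a ratio of pixel areas. The names `hole_to_manhole_ratio` and `disk_mask` are hypothetical helpers introduced for illustration.

```python
import numpy as np


def hole_to_manhole_ratio(manhole_mask, hole_mask):
    """Return the hole area as a fraction of the manhole area.

    Both arguments are 2-D boolean arrays of the same shape (assumed to come
    from an earlier segmentation step that is not shown here).
    """
    manhole_mask = np.asarray(manhole_mask, dtype=bool)
    # Only count hole pixels that lie inside the manhole region.
    hole_mask = np.asarray(hole_mask, dtype=bool) & manhole_mask
    manhole_area = int(manhole_mask.sum())
    if manhole_area == 0:
        raise ValueError("manhole mask is empty")
    return float(hole_mask.sum()) / manhole_area


def disk_mask(shape, center, radius):
    """Build a filled circular mask; used here only to create synthetic data."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2


# Synthetic example: a manhole of radius 80 px with an off-centre opening of
# radius 40 px that lies entirely inside it.
manhole = disk_mask((200, 200), (100, 100), 80)
hole = disk_mask((200, 200), (100, 120), 40)
ratio = hole_to_manhole_ratio(manhole, hole)
print(f"hole covers about {ratio:.0%} of the manhole")
```

A pixel-area ratio is scale-free, so it does not require camera calibration, but it ignores perspective distortion; a real road-scene pipeline would likely need to correct for the viewing angle first.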


Source journal

Machine Graphics and Vision (Computer Science: Computer Graphics and Computer-Aided Design)

CiteScore: 0.40

Self-citation rate: 0.00%

About the journal: Machine GRAPHICS & VISION (MGV) is a refereed international journal, published quarterly, that provides a forum for scientific exchange and an authoritative source of information on pictorial information exchange between computers and their environment, including applications of visual and graphical computer systems. The journal concentrates on theoretical and computational models underlying computer-generated, analysed, or otherwise processed imagery, in particular:
- image processing
- scene analysis, modeling, and understanding
- machine vision
- pattern matching and pattern recognition
- image synthesis, including three-dimensional imaging and solid modeling
