A Review of Machine Learning and Deep Learning for Object Detection, Semantic Segmentation, and Human Action Recognition in Machine and Robotic Vision

Technologies. Pub Date: 2024-01-23. DOI: 10.3390/technologies12020015

Nikoleta Manakitsa, George S. Maraslidis, L. Moysis, G. Fragulis



Abstract

Machine vision, an interdisciplinary field that aims to replicate human visual perception in computers, has progressed rapidly. This paper traces the origins of machine vision, from early image processing algorithms to its convergence with computer science, mathematics, and robotics, resulting in a distinct branch of artificial intelligence. The integration of machine learning techniques, particularly deep learning, has driven its growth and adoption in everyday devices. This study focuses on the objectives of computer vision systems: replicating human visual capabilities including recognition, comprehension, and interpretation. Notably, image classification, object detection, and image segmentation are crucial tasks requiring robust mathematical foundations. Despite these advances, challenges persist, such as clarifying terminology related to artificial intelligence, machine learning, and deep learning. Precise definitions and interpretations are vital for establishing a solid research foundation. The evolution of machine vision reflects an ambitious journey to emulate human visual perception. Interdisciplinary collaboration and the integration of deep learning techniques have propelled remarkable advancements in emulating human behavior and perception. Through this research, the field of machine vision continues to shape the future of computer systems and artificial intelligence applications.


