Football referee gesture recognition algorithm based on YOLOv8s

IF 2.1; Zone 4 (Medicine); JCR Q2, MATHEMATICAL & COMPUTATIONAL BIOLOGY; Frontiers in Computational Neuroscience; Pub Date: 2024-02-07; DOI: 10.3389/fncom.2024.1341234

Zhiyuan Yang, Yuanyuan Shen, Yanfei Shen

Citations: 0

Abstract

Gestures are a crucial means of communication between individuals and between humans and machines. In football matches, referees communicate their decisions through gestures. Because of the diversity and complexity of referees’ gestures and interference from players, spectators, and camera angles, automated football referee gesture recognition (FRGR) is a challenging task. Existing methods based on visual sensors often fail to deliver satisfactory performance. To tackle FRGR, we develop a deep learning model based on YOLOv8s that integrates three improvement and optimization strategies. First, a Global Attention Mechanism (GAM) is employed to direct the model’s attention to the hand gestures and minimize background interference. Second, a P2 detection head is integrated into the YOLOv8s model to improve the accuracy of detecting small, distant objects. Third, a new loss function based on the Minimum Point Distance Intersection over Union (MPDIoU) is used to make effective use of anchor boxes that have the same shape but different sizes. Experiments are conducted on a dataset of 1,200 images covering six hand gestures. The proposed method is compared with seven existing models and 10 optimization models. It achieves a precision of 89.3%, a recall of 88.9%, a mAP@0.5 of 89.9%, and a mAP@0.5:0.95 of 77.3%. These values are approximately 1.4%, 2.0%, 1.1%, and 5.4% higher than those of the latest YOLOv8s, respectively. The proposed method shows promising prospects for automated gesture recognition in football matches.
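To make the third strategy concrete, the following is a minimal sketch of an MPDIoU-style loss for a single pair of boxes, following the commonly published definition of MPDIoU: IoU minus the squared distances between the two top-left corners and between the two bottom-right corners, each normalized by the squared image diagonal. It is not taken from the paper's code. The function names `mpdiou` and `mpdiou_loss`, the `(x1, y1, x2, y2)` box layout, and the 100 x 100 image size are assumptions chosen for illustration.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU between two boxes given as (x1, y1, x2, y2) corner coordinates.

    Illustrative sketch only; names and box layout are assumptions.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU: overlap area divided by union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners of the two boxes.
    d1_sq = (bx1 - ax1) ** 2 + (by1 - ay1) ** 2  # top-left corners
    d2_sq = (bx2 - ax2) ** 2 + (by2 - ay2) ** 2  # bottom-right corners

    # Normalize by the squared diagonal of the input image.
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(pred, target, img_w, img_h):
    """Loss to minimize: 0 when the boxes coincide, larger as they diverge."""
    return 1.0 - mpdiou(pred, target, img_w, img_h)


if __name__ == "__main__":
    # Concentric boxes with the same aspect ratio but different sizes.
    target = (40, 40, 60, 60)
    pred = (30, 30, 70, 70)
    print(round(mpdiou_loss(pred, target, 100, 100), 4))  # approximately 0.77
```

In this example the predicted and target boxes share the same center and aspect ratio, so penalties based only on center distance or aspect ratio contribute nothing beyond plain IoU. The corner-distance terms still add a penalty that shrinks as the prediction approaches the target size, which is one reading of the abstract's remark about boxes with the same shape but different sizes.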

Source journal

Frontiers in Computational Neuroscience (MATHEMATICAL & COMPUTATIONAL BIOLOGY; NEUROSCIENCES)

CiteScore: 5.30

Self-citation rate: 3.10%

Articles published: 166

Review time: 6-12 weeks

About the journal: Frontiers in Computational Neuroscience is a first-tier electronic journal devoted to promoting theoretical modeling of brain function and fostering interdisciplinary interactions between theoretical and experimental neuroscience. Progress in understanding the amazing capabilities of the brain is still limited, and we believe that it will only come with deep theoretical thinking and mutually stimulating cooperation between different disciplines and approaches. We therefore invite original contributions on a wide range of topics that present the fruits of such cooperation, or provide stimuli for future alliances. We aim to provide an interactive forum for cutting-edge theoretical studies of the nervous system, and for promulgating the best theoretical research to the broader neuroscience community. Models of all styles and at all levels are welcome, from biophysically motivated realistic simulations of neurons and synapses to high-level abstract models of inference and decision making. While the journal is primarily focused on theoretically based and driven research, we welcome experimental studies that validate and test theoretical conclusions. Also: comp neuro