Rafael Martínez;Álvaro Llorente;Alberto del Rio;Javier Serrano;David Jimenez
{"title":"Performance Evaluation of YOLOv8-Based Bib Number Detection in Media Streaming Race","authors":"Rafael Martínez;Álvaro Llorente;Alberto del Rio;Javier Serrano;David Jimenez","doi":"10.1109/TBC.2024.3414656","DOIUrl":null,"url":null,"abstract":"The evolution of telecommunication networks unlocks new possibilities for multimedia services, including enriched and personalized experiences. However, ensuring high Quality of Service and Quality of Experience requires intelligent solutions at the edge. This study investigates the real-time detection of race bib numbers using YOLOv8, a state-of-the-art object detection framework, within the context of 5G/6G edge computing. We train (BDBD and SVHN datasets) and analyze various YOLOv8 models (nano to extreme) across two diverse racing datasets (TGCRBNW and RBNR), encompassing varied environmental conditions (daytime and nighttime). Our assessment focuses on key performance metrics, including processing time, efficiency, and accuracy. For instance, on the TGCRBNW dataset, the extreme-sized model shows a noticeable reduction in prediction time when the more powerful GPU is used, with times decreasing from 1,161 to 54 seconds on a desktop computer. Similarly, on the RBNR dataset, the extreme-sized model exhibits a significant reduction in prediction time from 373 to 15 seconds when using the more powerful GPU. In terms of accuracy, we found varying performance across scenarios and datasets. For example, not good enough results are obtained in most scenarios on the TGCRBNW dataset (lower than 50% in all sets and models), while YOLOv8m obtain the high accuracy in several scenarios on the RBNR dataset (almost 80% of accuracy in the best set). Variability in prediction times was observed between different computer architectures, highlighting the importance of selecting appropriate hardware for specific tasks. These results emphasize the importance of aligning computational resources with the demands of real-world tasks to achieve timely and accurate predictions.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 3","pages":"1126-1138"},"PeriodicalIF":3.2000,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10591494","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Broadcasting","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10591494/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0
Abstract
The evolution of telecommunication networks unlocks new possibilities for multimedia services, including enriched and personalized experiences. However, ensuring high Quality of Service and Quality of Experience requires intelligent solutions at the edge. This study investigates the real-time detection of race bib numbers using YOLOv8, a state-of-the-art object detection framework, within the context of 5G/6G edge computing. We train (BDBD and SVHN datasets) and analyze various YOLOv8 models (nano to extreme) across two diverse racing datasets (TGCRBNW and RBNR), encompassing varied environmental conditions (daytime and nighttime). Our assessment focuses on key performance metrics, including processing time, efficiency, and accuracy. For instance, on the TGCRBNW dataset, the extreme-sized model shows a noticeable reduction in prediction time when the more powerful GPU is used, with times decreasing from 1,161 to 54 seconds on a desktop computer. Similarly, on the RBNR dataset, the extreme-sized model exhibits a significant reduction in prediction time from 373 to 15 seconds when using the more powerful GPU. In terms of accuracy, we found varying performance across scenarios and datasets. For example, not good enough results are obtained in most scenarios on the TGCRBNW dataset (lower than 50% in all sets and models), while YOLOv8m obtain the high accuracy in several scenarios on the RBNR dataset (almost 80% of accuracy in the best set). Variability in prediction times was observed between different computer architectures, highlighting the importance of selecting appropriate hardware for specific tasks. These results emphasize the importance of aligning computational resources with the demands of real-world tasks to achieve timely and accurate predictions.
期刊介绍:
The Society’s Field of Interest is “Devices, equipment, techniques and systems related to broadcast technology, including the production, distribution, transmission, and propagation aspects.” In addition to this formal FOI statement, which is used to provide guidance to the Publications Committee in the selection of content, the AdCom has further resolved that “broadcast systems includes all aspects of transmission, propagation, and reception.”