Deep Learning for Semantic Segmentation of Football Match Image
Authors: Yutian Wu, Wuqi Zhao, Chen-Chun Huang, Yaming Xi, Qing Li, Heng Wang
Venue: 2023 2nd International Conference on Innovations and Development of Information Technologies and Robotics (IDITR)
Publication date: 2023-05-01
DOI: 10.1109/IDITR57726.2023.10145987 (https://doi.org/10.1109/IDITR57726.2023.10145987)
Citation count: 0
Abstract
As one of the most popular sports, football has seen steady growth and advances in technology, and combining football with artificial intelligence is expected to enable intelligent match analysis. Image semantic segmentation is an important basis for image analysis and understanding. This paper proposes a deep learning-based image segmentation model for pixel-level classification of video frames from football matches. Every pixel of a football video frame is classified into one of 10 classes, e.g., players, ball, goal bar, and several background scenes. We first test a variety of CNN architectures and pre-trained models and select the MobileNet-UNet architecture as our baseline. We note that the data distribution in football scene segmentation is severely imbalanced. To address this problem, a weighted multi-class cross-entropy loss is adopted in training MobileNet-UNet to redistribute the weights of the classification loss, focusing on smaller foreground object classes and improving segmentation accuracy. We also propose using image transformations and a random mixture sampling technique for training data augmentation to reduce model overfitting. The model is trained and validated on the well-annotated Football Semantic Segmentation Open Dataset. The best proposed model achieves a frequency-weighted IoU of 0.96 and a mean IoU of 0.90 on the validation set.
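The two reported metrics can be illustrated with a short sketch, assuming the standard definitions: per-class IoU is TP / (TP + FP + FN), mean IoU averages it over classes, and frequency-weighted IoU weights each class IoU by that class's share of ground-truth pixels. The abstract does not spell out these formulas, and the function names and toy data below are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts pixels with ground-truth class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def iou_scores(cm):
    """Return (mean IoU, frequency-weighted IoU) from a confusion matrix."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    ious, freqs = [], []
    for c in range(k):
        tp = cm[c][c]
        gt = sum(cm[c])                          # ground-truth pixels of class c
        pred = sum(cm[r][c] for r in range(k))   # pixels predicted as class c
        union = gt + pred - tp
        if union == 0:
            continue  # class absent from both labels and predictions
        ious.append(tp / union)
        freqs.append(gt / total)
    mean_iou = sum(ious) / len(ious)
    fw_iou = sum(f * i for f, i in zip(freqs, ious))
    return mean_iou, fw_iou


# Toy example: one of the two rare-class pixels is misclassified as background.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 9 + [1] * 1
mean_iou, fw_iou = iou_scores(confusion_matrix(y_true, y_pred, num_classes=2))
print(round(mean_iou, 3), round(fw_iou, 3))
```

In this toy case the frequency-weighted score is higher than the mean score because the error falls on the rare class, which is consistent with frequency-weighted IoU exceeding mean IoU when small foreground classes are the hardest to segment.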