NPIPVis: A visualization system involving NBA visual analysis and integrated learning model prediction
Zhuo Shi, Mingrui Li, Meng Wang, Jing Shen, Wei Chen, Xiaonan Luo
Virtual Reality Intelligent Hardware, vol. 4, no. 5, pp. 444-458, published 2022-10-01. DOI: 10.1016/j.vrih.2022.08.008
https://www.sciencedirect.com/science/article/pii/S2096579622000833
Abstract
Background
Data-driven event analysis has gradually become the backbone of modern competitive sports analysis, and sports data analysis tasks increasingly rely on computer vision and machine-learning models. Existing sports visualization systems focus on player–team data. They do not present team season win–loss data or game time-series data intuitively, and they neglect the prediction of all-star players.
Methods
This study designed NPIPVis, an interactive visualization system that uses parallel aggregated ordered hypergraph (PAOH) dynamic hypergraphs, Calliope data-story visualization technology, and iStoryline narrative visualization technology to visualize the regular statistics and game-time data of players and teams. NPIPVis includes dynamic hypergraphs of a team's wins and losses and narrative visualization components for game plots. In addition, an all-star player prediction model based on ensemble (integrated) learning, SRR-voting, was proposed. Starting from the existing minority and majority samples, it uses the synthetic minority oversampling technique (SMOTE) to generate samples and RandomUnderSampler to eliminate samples, which balances the number of all-star and average players in the datasets. Next, a random forest algorithm was introduced to extract and construct player features, and it was combined with a voting ensemble model to predict the all-star players. GridSearchCV was used to optimize the hyperparameters of each model in the ensemble, together with five-fold cross-validation to improve the generalization ability of the model. Finally, SHapley Additive exPlanations (SHAP) was introduced to enhance the interpretability of the model.
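The class-balancing step described in the Methods can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `smote_like_oversample`, `random_undersample`, and `balance` are hypothetical helper names, the neighbour count `k`, the `target` class size, and the toy 2-D data are made up, and the interpolation is a simplified stand-in for SMOTE rather than a library call.

```python
import random
from typing import List, Sequence, Tuple

Point = List[float]


def smote_like_oversample(minority: Sequence[Point], n_new: int, k: int,
                          rng: random.Random) -> List[Point]:
    """Create n_new synthetic points, each on the segment between a random
    minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")

    def sq_dist(a: Point, b: Point) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


def random_undersample(majority: Sequence[Point], n_keep: int,
                       rng: random.Random) -> List[Point]:
    """Keep n_keep majority points chosen uniformly without replacement."""
    return rng.sample(list(majority), n_keep)


def balance(majority: Sequence[Point], minority: Sequence[Point], target: int,
            k: int = 3, seed: int = 0) -> Tuple[List[Point], List[Point]]:
    """Grow the minority class and shrink the majority class to `target`."""
    if not len(minority) <= target <= len(majority):
        raise ValueError("target must lie between the two class sizes")
    rng = random.Random(seed)
    grown = list(minority) + smote_like_oversample(
        minority, target - len(minority), k, rng)
    shrunk = random_undersample(majority, target, rng)
    return shrunk, grown


# Toy data (made up): 20 "average" players and 4 "all-star" players.
data_rng = random.Random(42)
majority = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(20)]
minority = [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(4)]

maj_bal, min_bal = balance(majority, minority, target=10)
print(len(maj_bal), len(min_bal))  # 10 10
```

In practice, resampling is usually applied only to the training folds, so that synthetic samples never leak into the validation data used by the cross-validated hyperparameter search.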
Results
Experiments comparing the SRR-voting model with six common models show significantly improved accuracy, F1-score, and recall, which verifies the effectiveness and practicality of the SRR-voting model.
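For readers unfamiliar with the three reported metrics, the sketch below computes them from confusion-matrix counts. The function name `classification_metrics` and all counts are hypothetical and chosen only for illustration; they are not results from the paper. The first case shows why accuracy alone can mislead when all-star players are rare.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for the positive (all-star) class."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical case 1: 100 players, 5 all-stars, model predicts "average"
# for everyone. Accuracy looks high, but no all-star is found.
baseline = classification_metrics(tp=0, fp=0, fn=5, tn=95)

# Hypothetical case 2: a model that finds 4 of the 5 all-stars
# at the cost of 2 false alarms.
better = classification_metrics(tp=4, fp=2, fn=1, tn=93)

print(baseline)
print(better)
```

The always-"average" model in the first case reaches 0.95 accuracy with zero recall and zero F1, which is why recall and F1-score are informative alongside accuracy on imbalanced data.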
Conclusions
This study combines data visualization and machine learning to design a National Basketball Association (NBA) data visualization system that helps a general audience visualize game data and predict all-star players. The approach can also be extended to other sports events or related fields.