Jesse Davis, Lotte Bransen, Laurens Devos, Arne Jaspers, Wannes Meert, Pieter Robberechts, Jan Van Haaren, Maaike Van Roy
{"title":"Methodology and evaluation in sports analytics: challenges, approaches, and lessons learned","authors":"Jesse Davis, Lotte Bransen, Laurens Devos, Arne Jaspers, Wannes Meert, Pieter Robberechts, Jan Van Haaren, Maaike Van Roy","doi":"10.1007/s10994-024-06585-0","DOIUrl":null,"url":null,"abstract":"<p>There has been an explosion of data collected about sports. Because such data is extremely rich and complex, machine learning is increasingly being used to extract actionable insights from it. Typically, machine learning is used to build models and indicators that capture the skills, capabilities, and tendencies of athletes and teams. Such indicators and models are in turn used to inform decision-making at professional clubs. Designing these indicators requires paying careful attention to a number of subtle issues from a methodological and evaluation perspective. In this paper, we highlight these challenges in sports and discuss a variety of approaches for handling them. Methodologically, we highlight that dependencies affect how to perform data partitioning for evaluation as well as the need to consider contextual factors. From an evaluation perspective, we draw a distinction between evaluating the developed indicators themselves versus the underlying models that power them. We argue that both aspects must be considered, but that they require different approaches. We hope that this article helps bridge the gap between traditional sports expertise and modern data analytics by providing a structured framework with practical examples.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"26 1","pages":""},"PeriodicalIF":4.3000,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10994-024-06585-0","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
There has been an explosion of data collected about sports. Because such data is extremely rich and complex, machine learning is increasingly being used to extract actionable insights from it. Typically, machine learning is used to build models and indicators that capture the skills, capabilities, and tendencies of athletes and teams. Such indicators and models are in turn used to inform decision-making at professional clubs. Designing these indicators requires paying careful attention to a number of subtle issues from a methodological and evaluation perspective. In this paper, we highlight these challenges in sports and discuss a variety of approaches for handling them. Methodologically, we highlight that dependencies affect how to perform data partitioning for evaluation as well as the need to consider contextual factors. From an evaluation perspective, we draw a distinction between evaluating the developed indicators themselves versus the underlying models that power them. We argue that both aspects must be considered, but that they require different approaches. We hope that this article helps bridge the gap between traditional sports expertise and modern data analytics by providing a structured framework with practical examples.
期刊介绍:
Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.