Pub Date: 2021-08-10 DOI: 10.1515/jqas-2021-frontmatter3
{"title":"Frontmatter","doi":"10.1515/jqas-2021-frontmatter3","DOIUrl":"https://doi.org/10.1515/jqas-2021-frontmatter3","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":"40 1"},"PeriodicalIF":0.8,"publicationDate":"2021-08-10","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"74472078"}
Abstract In tennis, the Australian Open, French Open, Wimbledon, and US Open are the four most prestigious events (Grand Slams). These four Grand Slams differ in the composition of the court surfaces, when they are played in the year, and which city hosts the players. Individual Grand Slams come with different expectations, and it is often thought that some players achieve better results at some Grand Slams than others. It is also thought that differences in results may be attributed, at least partially, to surface type of the courts. For example, Rafael Nadal, Roger Federer, and Serena Williams have achieved their best results on clay, grass, and hard courts, respectively. This paper explores differences among Grand Slams, while adjusting for confounders such as tour, competitor strength, and player attributes. More specifically, we examine the effect of the Grand Slam on player performance for matches from 2013 to 2019. We take two approaches to modeling these data: (1) a mixed-effects model accounting for both player and tournament features and (2) models that emphasize individual performance. We identify differences across the Grand Slams at both the tournament and individual player level.
{"title":"Opening up the court: analyzing player performance across tennis Grand Slams","authors":"Shannon K. Gallagher, K. Frisoli, Amanda Luby","doi":"10.1515/jqas-2019-0015","DOIUrl":"https://doi.org/10.1515/jqas-2019-0015","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":"24 1","pages":"255 - 271"},"PeriodicalIF":0.8,"publicationDate":"2021-07-06","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"87480238"}
Gregory Steeger, Johnathon Dulin, Gerardo O. Gonzalez
Abstract The Saint Louis Blues were hot at the end of the 2018–2019 National Hockey League season, winning eleven games in a row in January and February, and eight of their last ten. They parlayed this momentum into their first Stanley Cup Championship in franchise history. Or did they? Did the series of wins at the end of the season give the Blues the momentum needed to reach the pinnacle of the sport on June 12th, or was the Blues’ path to victory the confluence of a series of random events that fell in their favor? In this paper we apply entropy as an unbiased measure to further refute the idea of momentum in sports. We show that game outcomes do not depend on the outcomes of previous games and conclude that the theory of momentum, across the season, is a fallacy that should not affect behavior.
{"title":"Winning and losing streaks in the National Hockey League: are teams experiencing momentum or are games a sequence of random events?","authors":"Gregory Steeger, Johnathon Dulin, Gerardo O. Gonzalez","doi":"10.1515/jqas-2020-0077","DOIUrl":"https://doi.org/10.1515/jqas-2020-0077","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":"36 1","pages":"155 - 170"},"PeriodicalIF":0.8,"publicationDate":"2021-05-07","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"79813151"}
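The abstract above does not say which entropy measure the authors use, so the following is only a minimal sketch of one plausible reading: compare the Shannon entropy of a team's win/loss outcomes with the entropy of each outcome conditioned on the previous game. If the previous result carried information (momentum), the conditional entropy would fall noticeably below the marginal entropy. The function names and the 'W'/'L' encoding are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy in bits of the distribution implied by raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def marginal_entropy(outcomes):
    """H(X): entropy of game outcomes, ignoring their order."""
    return shannon_entropy(list(Counter(outcomes).values()))


def conditional_entropy(outcomes):
    """H(X_t | X_{t-1}): entropy of an outcome given the previous game's outcome."""
    symbols = sorted(set(outcomes))
    pairs = Counter(zip(outcomes, outcomes[1:]))   # (previous, current) counts
    prev_counts = Counter(outcomes[:-1])           # how often each symbol is a predecessor
    n = len(outcomes) - 1                          # number of consecutive pairs
    h = 0.0
    for prev, n_prev in prev_counts.items():
        row = [pairs[(prev, cur)] for cur in symbols]
        h += (n_prev / n) * shannon_entropy(row)
    return h


# Hypothetical, fully predictable sequence (not real NHL data).
season = list("WLWLWLWL")
print(marginal_entropy(season), conditional_entropy(season))
```

For this strictly alternating toy sequence the marginal entropy is 1 bit while the conditional entropy is 0, the extreme case of dependence on the previous game. For a sequence with no momentum, the two values would be close to each other.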
Abstract Previously published statistical analyses of NCAA Division I Men’s Tournament (“March Madness”) game outcomes have revealed that the relationship between tournament seed and the time-aggregated number of third-round (“Sweet 16”) appearances for the middle half of the seeds exhibits a statistically and practically significant departure from monotonicity. In particular, the 8- and 9-seeds combined appear less often than any one of seeds 10–12. In this article, we show that a similar “middle-seed anomaly” also occurs in the NCAA Division I Women’s Tournament but does not occur in two other major sports tournaments that are similar in structure to March Madness. We offer explanations for the presence of a middle-seed anomaly in the NCAA basketball tournaments, and its absence in the others, that are based on the combined effects of the functional form of the relationship between team strength and seed specific to each tournament, the degree of parity among teams, and certain elements of tournament structure. Although these explanations account for the existence of middle-seed anomalies in the NCAA basketball tournaments, their larger-than-expected magnitudes, which arise mainly from the overperformance of seeds 10–12 in the second round, remain enigmatic.
{"title":"The middle-seed anomaly: why does it occur in some sports tournaments but not others?","authors":"D. Zimmerman, Hong Beng Lim","doi":"10.1515/jqas-2020-0065","DOIUrl":"https://doi.org/10.1515/jqas-2020-0065","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":"73 1","pages":"171 - 185"},"PeriodicalIF":0.8,"publicationDate":"2021-05-07","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"82437268"}
Fernando Delbianco, Federico Fioravanti, F. Tohmé
Abstract The COVID-19 pandemic forced almost all professional and amateur sports to be played without crowds in attendance. Thus, it induced a large-scale natural experiment on the impact of social pressure on decision making and behavior in sports fields. Using a data set of 1027 rugby union matches from 11 tournaments in 10 countries, we find that home teams won fewer matches and their point difference decreased during the pandemic, shedding light on the impact of crowd attendance on the home advantage of sports teams.
{"title":"Home advantage and crowd attendance: evidence from rugby during the Covid 19 pandemic","authors":"Fernando Delbianco, Federico Fioravanti, F. Tohmé","doi":"10.1515/jqas-2021-0044","DOIUrl":"https://doi.org/10.1515/jqas-2021-0044","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":"90 1","pages":"15 - 26"},"PeriodicalIF":0.8,"publicationDate":"2021-05-04","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"76734425"}
Abstract In this work, we deal with the problem of rating in sports, where the skills of the players/teams are inferred from the observed outcomes of the games. Our focus is on on-line rating algorithms that estimate skills after each new game by exploiting probabilistic models that (i) relate the skills to the outcome of the game and (ii) describe how the skills evolve in time. We propose a Bayesian approach which may be seen as an approximate Kalman filter and which is generic in the sense that it can be used with any skills-outcome model and can be applied in individual as well as group sports. We show how the well-known Elo, Glicko, and TrueSkill algorithms may be seen as instances of the one-fits-all approach we propose. To clarify the conditions under which the gains of the Bayesian approach over simpler solutions can actually materialize, we critically compare the known and new algorithms by means of numerical examples using synthetic and empirical data.
{"title":"Simplified Kalman filter for on-line rating: one-fits-all approach","authors":"L. Szczecinski, Raphaëlle Tihon","doi":"10.1515/jqas-2021-0061","DOIUrl":"https://doi.org/10.1515/jqas-2021-0061","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":"34 1"},"PeriodicalIF":0.8,"publicationDate":"2021-04-28","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"83750473"}
Abstract In 2019, Eliud Kipchoge ran a sub-two-hour marathon wearing Nike’s Alphafly shoes. Despite being the fastest marathon time ever recorded, it was not officially recognized because race conditions were tightly controlled to maximize his success. In addition, Kipchoge’s use of Alphafly shoes was controversial, with some experts claiming that they might have provided an unfair competitive advantage. In this work, we assess the potential influence of advanced footwear technology and the likelihood of a sub-two-hour marathon in official races by studying the evolution of top running performances from 2001 to 2019 for long distances ranging from 10 km to the marathon. The analysis is performed using extreme value theory, a field of statistics dealing with the analysis of rare events. We find significant evidence of a performance-enhancement effect, with a 10% increase in the probability that a new world record in the men’s marathon is set in 2021. However, the results suggest that achieving a sub-two-hour marathon in an official race in 2021 is still very unlikely, and that the probability exceeds 10% only by 2025.
{"title":"Influence of advanced footwear technology on sub-2 hour marathon and other top running performances","authors":"Andreu Arderiu, Raphaël de Fondeville","doi":"10.1515/jqas-2021-0043","DOIUrl":"https://doi.org/10.1515/jqas-2021-0043","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":"82 1","pages":"73 - 86"},"PeriodicalIF":0.8,"publicationDate":"2021-04-17","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"77299328"}
Abstract With the vast amount of data collected on football and the growth of computing power, many games involving decision choices can be optimized. The underlying rule is the maximization of an expected utility of outcomes and the law of large numbers. The data available allows one to compute with high accuracy the probabilities of outcomes of actions, and the well defined points system in the game allows for a specification of the terminal utilities. With some well established decision theory we can optimize choices for each single play level. A full exposition of the theory and analysis is presented in the paper.
{"title":"A reinforcement learning based approach to play calling in football","authors":"Preston Biro, S. Walker","doi":"10.1515/jqas-2021-0029","DOIUrl":"https://doi.org/10.1515/jqas-2021-0029","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":"9 1","pages":"97 - 112"},"PeriodicalIF":0.8,"publicationDate":"2021-03-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"78506004"}
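The decision rule named in the abstract above, maximizing the expected utility of outcomes, can be sketched in a few lines. This is not the authors' model: the action names, probabilities, and utilities below are made-up placeholders, whereas the paper estimates such quantities from play data.

```python
def expected_utility(outcomes):
    """Expected utility of an action given (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)


def best_action(actions):
    """Return the name of the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Hypothetical fourth-down choice; every number here is illustrative only.
actions = {
    "go_for_it": [(0.5, 3.0), (0.5, -2.0)],  # convert vs. turnover on downs
    "punt": [(1.0, 0.2)],                    # single, certain outcome
}

for name, outcomes in actions.items():
    print(name, expected_utility(outcomes))
print("choice:", best_action(actions))
```

With these placeholder numbers the rule picks "go_for_it" (expected utility 0.5 against 0.2). The choice depends entirely on the estimated probabilities, which is why the abstract stresses the accuracy that large data sets allow.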
Pub Date: 2021-01-11 DOI: 10.1515/jqas-2021-frontmatter1
{"title":"Frontmatter","doi":"10.1515/jqas-2021-frontmatter1","DOIUrl":"https://doi.org/10.1515/jqas-2021-frontmatter1","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":"26 1"},"PeriodicalIF":0.8,"publicationDate":"2021-01-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"87316480"}
Abstract The Elo rating system, originally designed for rating chess players, has since become a popular way to estimate competitors’ time-varying skills in many sports. Though the self-correcting Elo algorithm is simple and intuitive, it lacks a probabilistic justification which can make it hard to extend. In this paper, we present a simple connection between approximate Bayesian posterior mode estimation and Elo. We provide a novel justification of the approximations made by linking Elo to steady-state Kalman filtering. Our second key contribution is to observe that the derivation suggests a straightforward procedure for extending Elo. We use the procedure to derive versions of Elo incorporating margins of victory, correlated skills across different playing surfaces, and differing skills by tournament level in tennis. Combining all these extensions results in the most complete version of Elo presented for the sport yet. We evaluate the derived models on two seasons of men’s professional tennis matches (2018 and 2019). The best-performing model was able to predict matches with higher accuracy than both Elo and Glicko (65.8% compared to 63.7 and 63.5%, respectively) and a higher mean log-likelihood (−0.615 compared to −0.632 and −0.633, respectively), demonstrating the proposed model’s ability to improve predictions.
{"title":"How to extend Elo: a Bayesian perspective","authors":"Martin Ingram","doi":"10.1515/JQAS-2020-0066","DOIUrl":"https://doi.org/10.1515/JQAS-2020-0066","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":"79 1","pages":"203 - 219"},"PeriodicalIF":0.8,"publicationDate":"2021-01-06","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"90609468"}
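For readers unfamiliar with the baseline that this paper extends, here is a minimal sketch of the standard Elo update. It shows only the classic algorithm, not the margin-of-victory, surface, or tournament-level extensions the abstract describes. The logistic scale of 400 and the step size `k=32` are conventional chess values assumed here for illustration, and the function names are likewise hypothetical.

```python
def elo_expected(r_a, r_b, scale=400.0):
    """Predicted probability that player A beats player B under standard Elo."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))


def elo_update(r_a, r_b, score_a, k=32.0):
    """Return updated (r_a, r_b) after one match.

    score_a is 1.0 if A won and 0.0 if A lost. The winner gains exactly
    what the loser gives up, scaled by how surprising the result was.
    """
    delta = k * (score_a - elo_expected(r_a, r_b))
    return r_a + delta, r_b - delta


# Two equally rated players; A wins.
print(elo_update(1500.0, 1500.0, 1.0))
```

The "self-correcting" behavior mentioned in the abstract is visible in `delta`: an upset produces a large rating change, while an expected result produces a small one.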