Drawing on the golf-related example of regression to the mean presented by Kahneman in his best-selling book, Thinking, Fast and Slow, this study shows how the regression-to-the-mean phenomenon is revealed in first- and second-round scoring in 11 different golfer populations, ranging from golfers with the highest level of skill (professional golfers on the PGA TOUR) to amateur groups of much lower skill. Using the mathematics of truncated normal distributions, the study introduces a new method for estimating the mix between variation in scoring due to differences in player skill and that due to luck. Estimates of the skill/luck mix are very close to those obtained using the regression-based methodology of Morrison and are nearly identical to those implied by fixed effects regression models where fixed player and round effects are estimated simultaneously. The study also sheds light on the “paradox of skill,” originally suggested by Gould and developed further by Mauboussin, as it relates to golf: luck plays a more important role in determining player scores in higher-skilled golfer groups than in lower-skilled groups.
Citation: Richard J. Rendleman. "The relative roles of skill and luck within 11 different golfer populations." Journal of Quantitative Analysis in Sports, pp. 237-254. Published 2020-09-25. DOI: 10.1515/JQAS-2019-0028 (https://doi.org/10.1515/JQAS-2019-0028).
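The regression-to-the-mean idea in the golf abstract above can be illustrated with a minimal sketch. It assumes the textbook two-component model (score = skill + luck, with luck drawn independently each round and both components normal), which is not the truncated-normal estimator the study introduces. The function names and all numeric values below are hypothetical and chosen only for illustration.

```python
"""Sketch: regression to the mean under score = skill + luck (illustrative only)."""


def skill_share(var_skill: float, var_luck: float) -> float:
    """Fraction of single-round score variance due to skill.

    Under the assumed model this also equals the correlation between a
    player's first- and second-round scores.
    """
    total = var_skill + var_luck
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_skill / total


def expected_second_round(first_round: float, group_mean: float,
                          var_skill: float, var_luck: float) -> float:
    """Expected round-2 score given the round-1 score.

    The round-1 deviation from the group mean is shrunk by the skill share,
    so the more luck-driven the scores, the stronger the pull to the mean.
    """
    rho = skill_share(var_skill, var_luck)
    return group_mean + rho * (first_round - group_mean)


# Hypothetical inputs, NOT taken from the paper.
GROUP_MEAN = 72.0   # assumed average score in the group
VAR_SKILL = 1.0     # assumed variance of true skill across players
VAR_LUCK = 8.0      # assumed variance of round-to-round luck

if __name__ == "__main__":
    for first in (66.0, 72.0, 78.0):
        second = expected_second_round(first, GROUP_MEAN, VAR_SKILL, VAR_LUCK)
        print(f"round 1 = {first:.0f}, expected round 2 = {second:.2f}")
```

With these assumed variances, skill accounts for one ninth of the score variance, so a player six strokes better than the group mean in round one is expected to be only about two thirds of a stroke better in round two.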
C. Avinash, DiPietro Loretta, Young Heather, Elmi Angelo
In assessments of sports-related injury severity, time loss (TL) is measured as a count of days lost to injury and analyzed using ordinal cut points. This approach ignores various athlete- and event-specific factors that determine the severity of an injury. We present a conceptual framework for modeling this outcome using univariate random effects count or survival regression. Using a sample of US collegiate soccer-related injury observations, we fit random effects Poisson and Weibull regression models to perform “severity-adjusted” evaluations of TL, and use our models to make inferences regarding the recovery process. Injury site, injury mechanism and injury history emerged as the strongest predictors in our sample. In comparing random and fixed effects models, we noted that the incorporation of the random effect attenuated associations between most observed covariates and TL, and model fit statistics revealed that the random effects models (AIC_Poisson = 51875.20; AIC_Weibull-AFT = 51113.00) improved model fit over the fixed effects models (AIC_Poisson = 160695.20; AIC_Weibull-AFT = 53179.00). Our analyses serve as a useful starting point for modeling how TL may actually occur when a player is injured, and suggest that random effects or frailty-based approaches can help isolate the effect of potential determinants of TL.
Citation: C. Avinash, DiPietro Loretta, Young Heather, Elmi Angelo. "Modeling time loss from sports-related injuries using random effects models: an illustration using soccer-related injury observations." Journal of Quantitative Analysis in Sports, pp. 221-235. Published 2020-09-25. DOI: 10.1515/JQAS-2019-0030 (https://doi.org/10.1515/JQAS-2019-0030).
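The model-fit comparison reported in the injury abstract above rests on the Akaike information criterion, where a lower value indicates a better trade-off between fit and complexity. The sketch below restates the standard definition and computes the within-family differences from the four AIC values quoted in the abstract. The function and variable names are hypothetical, and the log-likelihood used in the final comment is an arbitrary illustration, not a figure from the paper.

```python
"""Sketch: comparing fixed and random effects models by AIC (lower is better)."""


def aic(log_likelihood: float, n_params: int) -> float:
    """Standard definition: AIC = 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood


# AIC values as quoted in the abstract.
REPORTED_AIC = {
    "Poisson": {"fixed": 160695.20, "random": 51875.20},
    "Weibull-AFT": {"fixed": 53179.00, "random": 51113.00},
}


def aic_improvement(reported: dict) -> dict:
    """AIC reduction from adding the random effect, within each model family."""
    return {
        family: round(values["fixed"] - values["random"], 2)
        for family, values in reported.items()
    }


if __name__ == "__main__":
    for family, drop in aic_improvement(REPORTED_AIC).items():
        print(f"{family}: AIC falls by {drop:.2f} with the random effect")
    # Arbitrary illustration of the definition itself (not from the paper):
    print(aic(log_likelihood=-100.0, n_params=3))
```

The differences are taken only within each model family, since the sketch makes no assumption about whether the Poisson and Weibull likelihoods are directly comparable.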
Vast data on eSports should be easily accessible but often is not. League of Legends (LoL) only has rudimentary statistics such as levels, items, gold, and deaths. We present a new way to capture more useful data. We track every champion’s location multiple times every second. We track every ability cast and attack made, all damages caused and avoided, vision, health, mana, and cooldowns. We track continuously, invisibly, remotely, and live. Using a combination of computer vision, dynamic client hooks, machine learning, visualization, logistic regression, large-scale cloud computing, and fast and frugal trees, we generate this new high-frequency data on millions of ranked LoL games, calibrate an in-game win probability model, develop enhanced definitions for standard metrics, introduce dozens more advanced metrics, automate player improvement analysis, and apply a new player-evaluation framework on the basic and advanced stats. How much does an individual contribute to a team’s performance? We find that individual actions conditioned on changes to estimated win probability correlate almost perfectly to team performance: regular kills and deaths do not nearly explain as much as smart kills and worthless deaths. Our approach offers applications for other eSports and traditional sports. All the code is open-sourced.
Citation: Philip Z. Maymin. "Smart kills and worthless deaths: eSports analytics for League of Legends." Journal of Quantitative Analysis in Sports, pp. 11-27. Published 2020-09-21. DOI: 10.1515/jqas-2019-0096 (https://doi.org/10.1515/jqas-2019-0096).
In this paper, we investigate the correlation between the main physical characteristics of eight variants of football and hockey (such as field size, goal size, player velocity, ball velocity, player density, and game duration) and the resulting average numbers of goals scored per game. To do so, the Pi-theorem in physics is extended to sport science and a non-dimensional parameter of interest is defined. It is based on the ratio between the duration of the game and the order of magnitude of the time needed to cross the midfield, which depends on the average velocity of the ball and the players, the player density and the size of the goals. An excellent correlation is found between the proposed parameter and the average number of goals scored per game during recent international competitions. Using the derived correlation, the effect of any modification of the main characteristics of football and hockey (and their variants) on the scoring pace can be assessed. For instance, it can be predicted that decreasing the length of football fields by 20 m would raise the average number of goals scored to 3.6 (±0.6) per game, versus the 2.6 goals scored during the most recent men’s World Cup.
Citation: J. Blondeau. "The influence of field size, goal size and number of players on the average number of goals scored per game in variants of football and hockey: the Pi-theorem applied to team sports." Journal of Quantitative Analysis in Sports, pp. 145-154. Published 2020-09-21. DOI: 10.1515/JQAS-2020-0009 (https://doi.org/10.1515/JQAS-2020-0009).
Elizabeth L. Bouzarth, B. Grannan, John M Harris, A. Hartley, K. Hutson, E. Morton
Defensive repositioning strategies (shifts) have become more prevalent in Major League Baseball in recent years. In 2018, batters faced some form of the shift in 34% of their plate appearances (Sawchik, Travis. 2019. “Don’t Worry, MLB–Hitters Are Killing The Shift On Their Own.” FiveThirtyEight, January 17, 2019. Also available at fivethirtyeight.com/features/dont-worry-mlb-hitters-are-killing-the-shift-on-their-own/). Most teams use a shift that overloads one side of the infield and adjusts the positioning of the outfield. In this work we describe a mathematical approach to the positioning of players over the entire field of play without the limitations of traditional positions or current methods of shifting. The model uses historical data for individual batters, and it leaves open the possibility of fewer than four infielders. The model also incorporates risk penalties for positioning players too far from areas of the field in which extra-base hits are more likely. This work is meant to serve as a decision-making tool for coaches and managers to best use their defensive assets. Our simulations show that an optimal positioning with three infielders lowered predicted batting average on balls in play (BABIP) by 5.9% for right-handers and by 10.3% for left-handers on average when compared to a standard four-infielder placement of players.
Citation: Elizabeth L. Bouzarth, B. Grannan, John M Harris, A. Hartley, K. Hutson, E. Morton. "Swing shift: a mathematical approach to defensive positioning in baseball." Journal of Quantitative Analysis in Sports, pp. 47-55. Published 2020-09-15. DOI: 10.1515/jqas-2020-0027 (https://doi.org/10.1515/jqas-2020-0027).
The action of crossing the ball in soccer has a long history as an effective tactic for producing goals. Lately, the benefit of crossing the ball has come under question, and alternative strategies have been suggested. This paper utilizes player tracking data to explore crossing at a deeper level. First, we investigate the spatio-temporal conditions that lead to crossing. Then we introduce an intended target model that investigates crossing success. Finally, a contextual analysis is provided that assesses the benefits of crossing in various situations. The analysis is based on causal inference techniques and suggests that crossing remains an effective tactic in particular contexts.
Citation: Lucas Y. Wu, Aaron Danielson, X. J. Hu, T. Swartz. "A contextual analysis of crossing the ball in soccer." Journal of Quantitative Analysis in Sports, pp. 57-66. Published 2020-09-09. DOI: 10.1515/jqas-2020-0060 (https://doi.org/10.1515/jqas-2020-0060).
This paper investigates the fouling time distribution of players in the National Basketball Association. A Bayesian analysis is presented based on the assumption that fouling time distributions follow a gamma distribution. Various insights are obtained including the observation that players accumulate fouls at a rate that increases with the current number of fouls. We demonstrate possible ways to incorporate the fouling time distributions to provide decision support to coaches in the management of playing time.
Citation: Dani Chu. "Foul accumulation in the NBA." Journal of Quantitative Analysis in Sports, pp. 301-309. Published 2020-09-09. DOI: 10.1515/jqas-2019-0119 (https://doi.org/10.1515/jqas-2019-0119).
Citation: "Frontmatter." Journal of Quantitative Analysis in Sports. Published 2020-09-01. DOI: 10.1515/jqas-2020-frontmatter3 (https://doi.org/10.1515/jqas-2020-frontmatter3).
Daniele Gambarelli, G. Gambarelli, Dries R. Goossens
Citation: Daniele Gambarelli, G. Gambarelli, Dries R. Goossens. "Corrigendum to: Offensive or defensive play in soccer: a game-theoretical approach." Journal of Quantitative Analysis in Sports, p. 343. Published 2020-08-24. DOI: 10.1515/jqas-2020-0080 (https://doi.org/10.1515/jqas-2020-0080).
The recently concluded 2019 World Swimming Championships was another major swimming competition in which athletes made great progress in many events. However, some world records set 10 years ago, in the era of high-tech swimsuits, remained untouched. Given the advances in technical skills and training methods over the past decade, the inability to break those world records is a strong indication that records set with the swimsuit bonus cannot reflect the real progress achieved by athletes over time. Many swimming professionals and enthusiasts are eager to know what the real world records would be had high-tech swimsuits never been allowed. This paper attempts to restore the real world records in Men’s swimming without high-tech swimsuits by integrating various advanced methods in probabilistic modeling and optimization. Through the modeling and separation of swimsuit bias, natural improvement, and athletes’ intrinsic performance, this paper provides optimal estimates and 95% confidence intervals for the real world records. The proposed methodology can also be applied to a variety of similar studies with multi-factor considerations.
Citation: Zhenyu Gao, Yixing Li, Zhengxin Wang. "Restoring the real world records in Men’s swimming without high-tech swimsuits." Journal of Quantitative Analysis in Sports, pp. 291-300. Published 2020-08-10. DOI: 10.1515/jqas-2019-0087 (https://doi.org/10.1515/jqas-2019-0087).