Abstract This paper investigates the fouling time distribution of players in the National Basketball Association. A Bayesian analysis is presented based on the assumption that fouling time distributions follow a gamma distribution. Various insights are obtained including the observation that players accumulate fouls at a rate that increases with the current number of fouls. We demonstrate possible ways to incorporate the fouling time distributions to provide decision support to coaches in the management of playing time.
{"title":"Foul accumulation in the NBA","authors":"Dani Chu","doi":"10.1515/jqas-2019-0119","DOIUrl":"https://doi.org/10.1515/jqas-2019-0119","url":null,"abstract":"Abstract This paper investigates the fouling time distribution of players in the National Basketball Association. A Bayesian analysis is presented based on the assumption that fouling time distributions follow a gamma distribution. Various insights are obtained including the observation that players accumulate fouls at a rate that increases with the current number of fouls. We demonstrate possible ways to incorporate the fouling time distributions to provide decision support to coaches in the management of playing time.","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2020-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87587976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
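The gamma assumption in the abstract above can be illustrated with a small conjugate update. This is only a sketch under stated assumptions, not the paper's model: it assumes the gamma shape is known, places a Gamma(a, b) belief on the rate, and uses invented minutes-to-next-foul data. The names `RateBelief` and `fouling_rates_by_foul_count` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateBelief:
    """Gamma(a, b) belief about the rate of a gamma-distributed fouling time."""
    a: float
    b: float

    def update(self, shape, times):
        # Conjugate update for a gamma likelihood with known shape:
        # a grows by shape * n, b grows by the total observed time.
        return RateBelief(self.a + shape * len(times), self.b + sum(times))

    def mean_rate(self):
        return self.a / self.b


def fouling_rates_by_foul_count(times_by_fouls, shape, prior):
    """Posterior mean fouling rate for each current foul count."""
    return {
        fouls: prior.update(shape, times).mean_rate()
        for fouls, times in sorted(times_by_fouls.items())
    }


# Hypothetical minutes played before the next foul, keyed by current foul count.
# These numbers are invented for illustration and are not evidence of anything.
observed = {0: [14.0, 18.0, 16.0], 1: [11.0, 13.0], 2: [8.0, 9.0]}
rates = fouling_rates_by_foul_count(observed, shape=2.0, prior=RateBelief(1.0, 10.0))
```

Comparing `rates[0]`, `rates[1]`, and `rates[2]` is the kind of check that would reveal whether the fouling rate rises with the current foul count, which is the pattern the abstract reports.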
Pub Date: 2020-09-01 DOI: 10.1515/jqas-2020-frontmatter3
{"title":"Frontmatter","authors":"","doi":"10.1515/jqas-2020-frontmatter3","DOIUrl":"https://doi.org/10.1515/jqas-2020-frontmatter3","url":null,"abstract":"","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85251101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniele Gambarelli, G. Gambarelli, Dries R. Goossens
{"title":"Corrigendum to: Offensive or defensive play in soccer: a game-theoretical approach","authors":"Daniele Gambarelli, G. Gambarelli, Dries R. Goossens","doi":"10.1515/jqas-2020-0080","DOIUrl":"https://doi.org/10.1515/jqas-2020-0080","url":null,"abstract":"","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2020-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76920080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract The recently concluded 2019 World Swimming Championships was another major competition in which athletes made great progress in many events. However, some world records set 10 years ago, in the era of high-tech swimsuits, remain untouched. Given the advances in technique and training methods over the past decade, the inability to break those records strongly indicates that records set with the swimsuit bonus do not reflect the real progress athletes have made. Many swimming professionals and enthusiasts want an estimate of what the world records would be had high-tech swimsuits never been allowed. This paper attempts to restore the real world records in Men’s swimming without high-tech swimsuits by integrating various advanced methods in probabilistic modeling and optimization. Through the modeling and separation of swimsuit bias, natural improvement, and athletes’ intrinsic performance, the paper provides optimal estimates and 95% confidence intervals for the real world records. The proposed methodology can also be applied to a variety of similar studies with multi-factor considerations.
{"title":"Restoring the real world records in Men’s swimming without high-tech swimsuits","authors":"Zhenyu Gao, Yixing Li, Zhengxin Wang","doi":"10.1515/jqas-2019-0087","DOIUrl":"https://doi.org/10.1515/jqas-2019-0087","url":null,"abstract":"Abstract The recently concluded 2019 World Swimming Championships was another major swimming competition that witnessed some great progresses achieved by human athletes in many events. However, some world records created 10 years ago back in the era of high-tech swimsuits remained untouched. With the advancements in technical skills and training methods in the past decade, the inability to break those world records is a strong indication that records with the swimsuit bonus cannot reflect the real progressions achieved by human athletes in history. Many swimming professionals and enthusiasts are eager to know a measure of the real world records had the high-tech swimsuits never been allowed. This paper attempts to restore the real world records in Men’s swimming without high-tech swimsuits by integrating various advanced methods in probabilistic modeling and optimization. Through the modeling and separation of swimsuit bias, natural improvement, and athletes’ intrinsic performance, the result of this paper provides the optimal estimates and the 95% confidence intervals for the real world records. 
The proposed methodology can also be applied to a variety of similar studies with multi-factor considerations.","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2020-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91388380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract Analytics and professional sports have become linked over the past several years, but little attention has been paid to the growing field of esports within the sports analytics community. We seek to apply an Adjusted Plus Minus (APM) model, an accepted analytic approach used in traditional sports like hockey and basketball, to one particular esports game: Defense of the Ancients 2 (Dota 2). As with traditional sports, we show how APM metrics developed with Bayesian hierarchical regression can be used to quantify individual player contributions to their teams and, ultimately, use this player-level information to predict game outcomes. In particular, we first provide evidence that gold can be used as a continuous proxy for wins to evaluate a team’s performance, and then use a Bayesian APM model to estimate how players contribute to their team’s gold differential. We demonstrate that this APM model outperforms models based on common team-level statistics (often referred to as “box score statistics”). Beyond the specifics of our modeling approach, this paper serves as an example of the potential utility of applying analytical methodologies from traditional sports analytics to esports.
{"title":"A Bayesian adjusted plus-minus analysis for the esport Dota 2","authors":"Nicholas J. Clark, Brian Macdonald, Ian Kloo","doi":"10.1515/jqas-2019-0103","DOIUrl":"https://doi.org/10.1515/jqas-2019-0103","url":null,"abstract":"Abstract Analytics and professional sports have become linked over the past several years, but little attention has been paid to the growing field of esports within the sports analytics community. We seek to apply an Adjusted Plus Minus (APM) model, an accepted analytic approach used in traditional sports like hockey and basketball, to one particular esports game: Defense of the Ancients 2 (Dota 2). As with traditional sports, we show how APM metrics developed with Bayesian hierarchical regression can be used to quantify individual player contributions to their teams and, ultimately, use this player-level information to predict game outcomes. In particular, we first provide evidence that gold can be used as a continuous proxy for wins to evaluate a team’s performance, and then use a Bayesian APM model to estimate how players contribute to their team’s gold differential. We demonstrate that this APM model outperforms models based on common team-level statistics (often referred to as “box score statistics”). 
Beyond the specifics of our modeling approach, this paper serves as an example of the potential utility of applying analytical methodologies from traditional sports analytics to esports.","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2020-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86655099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract We study the stability of a time-aware version of the popular Massey method, previously introduced by Franceschet, M., E. Bozzo, and P. Vidoni. 2017. “The Temporalized Massey’s Method.” Journal of Quantitative Analysis in Sports 13: 37–48, for rating teams in sport competitions. To this end, we embed the temporal Massey method in the theory of time-varying averaging algorithms, which are dynamic systems mainly used in control theory for multi-agent coordination. We also introduce a parametric family of Massey-type methods and show that the original and time-aware Massey versions are, in some sense, particular instances of it. Finally, we discuss the key features of this general family of rating procedures, focusing on inferential and predictive issues and on sensitivity to upsets and modifications of the schedule.
{"title":"A parametric family of Massey-type methods: inference, prediction, and sensitivity","authors":"E. Bozzo, P. Vidoni, Massimo Franceschet","doi":"10.1515/jqas-2019-0071","DOIUrl":"https://doi.org/10.1515/jqas-2019-0071","url":null,"abstract":"Abstract We study the stability of a time-aware version of the popular Massey method, previously introduced by Franceschet, M., E. Bozzo, and P. Vidoni. 2017. “The Temporalized Massey’s Method.” Journal of Quantitative Analysis in Sports 13: 37–48, for rating teams in sport competitions. To this end, we embed the temporal Massey method in the theory of time-varying averaging algorithms, which are dynamic systems mainly used in control theory for multi-agent coordination. We also introduce a parametric family of Massey-type methods and show that the original and time-aware Massey versions are, in some sense, particular instances of it. Finally, we discuss the key features of this general family of rating procedures, focusing on inferential and predictive issues and on sensitivity to upsets and modifications of the schedule.","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2020-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78343712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
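For readers unfamiliar with the method being generalized, the original (non-temporal) Massey method can be sketched as a least-squares problem: ratings `r` solve `M r = p`, where `M` counts games between teams and `p` holds cumulative point differentials. The sketch below is not the time-aware or parametric version studied in the paper. It uses the common convention of replacing the last equation with a sum-to-zero constraint, and the function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def massey_ratings(teams, games):
    """games: iterable of (team_a, team_b, margin), margin = a's points minus b's."""
    idx = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    m = [[0.0] * n for _ in range(n)]
    p = [0.0] * n
    for team_a, team_b, margin in games:
        i, j = idx[team_a], idx[team_b]
        m[i][i] += 1.0
        m[j][j] += 1.0
        m[i][j] -= 1.0
        m[j][i] -= 1.0
        p[i] += margin
        p[j] -= margin
    # M is singular, so fix the scale by forcing the ratings to sum to zero.
    m[-1] = [1.0] * n
    p[-1] = 0.0
    return dict(zip(teams, solve(m, p)))


ratings = massey_ratings(["A", "B", "C"], [("A", "B", 3), ("B", "C", 2), ("A", "C", 5)])
```

In this toy schedule the margins are mutually consistent, so rating differences reproduce them exactly; with real results the ratings are the least-squares compromise.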
Pub Date: 2020-06-25 DOI: 10.1515/jqas-2020-frontmatter2
{"title":"Frontmatter","authors":"","doi":"10.1515/jqas-2020-frontmatter2","DOIUrl":"https://doi.org/10.1515/jqas-2020-frontmatter2","url":null,"abstract":"","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2020-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75262081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lopez (2020) demonstrates clearly how the lack of precise, high-quality data can lead to imprecise results or analyses. In particular, this paper shows that once you know the precise distance to the first down line (“yards to go”) rather than just the integer-valued distances provided in the NFL’s play-by-play data, the decisions made by coaches are more closely in line with what we would expect from rational, data-driven decision-makers in their situation. However, from an NFL team’s perspective, it is unclear if player-tracking data was necessary to help individual coaches in this particular case. Could NFL teams and coaches make approximately the same decisions from a model trained on only play-by-play data, but evaluated in real-time with more precise inputs for yards to go? Fourth-down decisions are typically analyzed with expected points models and/or win probability models (Romer 2006). When making fourth-down decisions, analysts contend that NFL teams should input their current game situation into one of these models (including information such as the down, distance, yard line, score differential, time remaining, etc.), and analyze the output. If the model’s computed win probability for a given situation is maximized by “going for it,” the coach should leave the offense on the field; if win probability is maximized by punting, the coach should elect to punt; and if it is maximized by attempting a field goal, the coach should put his field goal unit on the field. Yurko, Horowitz and Ventura (2019) provide a detailed explanation of how to build expected points and win probability models, but briefly, the expected points model is a linear model (specifically, a multinomial logistic regression model), and the win probability model is a generalized additive model. Importantly, although only integer-valued distances (“yards to go”) are provided in the
{"title":"What will we unlearn next? The implications of Lopez (2020)","authors":"Samuel L. Ventura","doi":"10.1515/jqas-2020-0056","DOIUrl":"https://doi.org/10.1515/jqas-2020-0056","url":null,"abstract":"Lopez (2020) demonstrates clearly how the lack of precise, high-quality data can lead to imprecise results or analyses. In particular, this paper shows that once you know the precise distance to the first down line (“yards to go”) rather than just the integer-valued distances provided in the NFL’s play-by-play data, the decisions made by coaches are more closely in line with what we would expect from rational, data-driven decision-makers in their situation. However, from an NFL team’s perspective, it is unclear if player-tracking data was necessary to help individual coaches in this particular case. Could NFL teams and coaches make approximately the same decisions from a model trained on only play-by-play data, but evaluated in real-time with more precise inputs for yards to go? Fourth-down decisions are typically analyzed with expected points models and/or win probability models (Romer 2006). When making fourth-down decisions, analysts contend that NFL teams should input their current game situation into one of these models (including information such as the down, distance, yard line, score differential, time remaining, etc.), and analyze the output. If the model’s computed win probability for a given situation is maximized by “going for it,” the coach should leave the offense on the field; if win probability is maximized by punting, the coach should elect to punt; and if it is maximized by attempting a field goal, the coach should put his field goal unit on the field. Yurko, Horowitz and Ventura (2019) provide a detailed explanation of how to build expected points and win probability models, but briefly, the expected points model is a linear model (specifically, a multinomial logistic regression model), and the win probability model is a generalized additive model. 
Importantly, although only integer-valued distances (“yards to go”) are provided in the","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2020-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/jqas-2020-0056","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72419915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
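The fourth-down decision rule described in this entry (pick the action with the highest model win probability) can be sketched in a few lines. Everything numeric here is an assumption for illustration: `conversion_prob` is a made-up logistic curve, not the model of Yurko, Horowitz and Ventura (2019), and the win probabilities for each outcome are invented. The point is only to show how evaluating the same rule with a precise rather than an integer "yards to go" can change the recommended call.

```python
import math


def conversion_prob(yards_to_go):
    """Hypothetical chance of converting; coefficients are made up."""
    return 1.0 / (1.0 + math.exp(-(1.0 - 0.5 * yards_to_go)))


def win_probabilities(yards_to_go):
    """Hypothetical win probability of each fourth-down option."""
    p = conversion_prob(yards_to_go)
    return {
        # Invented values: 0.60 after a conversion, 0.40 after a failed attempt.
        "go for it": p * 0.60 + (1.0 - p) * 0.40,
        "punt": 0.53,
        "field goal": 0.50,
    }


def best_call(yards_to_go):
    """Return the option that maximizes the (hypothetical) win probability."""
    options = win_probabilities(yards_to_go)
    return max(options, key=options.get)


call_from_play_by_play = best_call(1.0)  # integer distance, as in play-by-play data
call_from_tracking = best_call(0.4)      # precise distance, as from tracking data
```

With these invented inputs the two calls differ, which mirrors the question the text raises about evaluating a play-by-play model with more precise distance inputs.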
Erin M. Schliep, Toryn L. J. Schafer, Matt J. Hawkey
Abstract Subjective wellness data can provide important information on the well-being of athletes and be used to maximize player performance and to detect and prevent injury. Wellness data, which are often ordinal and multivariate, include metrics relating to the physical, mental, and emotional status of the athlete. Training and recovery can have significant short- and long-term effects on athlete wellness, and these effects can vary across individuals. We develop a joint multivariate latent factor model for ordinal response data to investigate the effects of training and recovery on athlete wellness. We use a latent factor distributed lag model to capture the cumulative effects of training and recovery through time. Current efforts using subjective wellness data have averaged over these metrics to create a univariate summary of wellness; however, this approach can mask important information in the data. Our multivariate model leverages each ordinal variable and can be used to identify the relative importance of each in monitoring athlete wellness. The model is applied to professional referee daily wellness, training, and recovery data collected across two Major League Soccer seasons.
{"title":"Distributed lag models to identify the cumulative effects of training and recovery in athletes using multivariate ordinal wellness data","authors":"Erin M. Schliep, Toryn L. J. Schafer, Matt J. Hawkey","doi":"10.1515/jqas-2020-0051","DOIUrl":"https://doi.org/10.1515/jqas-2020-0051","url":null,"abstract":"Abstract Subjective wellness data can provide important information on the well-being of athletes and be used to maximize player performance and detect and prevent against injury. Wellness data, which are often ordinal and multivariate, include metrics relating to the physical, mental, and emotional status of the athlete. Training and recovery can have significant short- and long-term effects on athlete wellness, and these effects can vary across individual. We develop a joint multivariate latent factor model for ordinal response data to investigate the effects of training and recovery on athlete wellness. We use a latent factor distributed lag model to capture the cumulative effects of training and recovery through time. Current efforts using subjective wellness data have averaged over these metrics to create a univariate summary of wellness, however this approach can mask important information in the data. Our multivariate model leverages each ordinal variable and can be used to identify the relative importance of each in monitoring athlete wellness. 
The model is applied to professional referee daily wellness, training, and recovery data collected across two Major League Soccer seasons.","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2020-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89955979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
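The idea of a distributed lag, in which today's effect is a weighted sum of current and past training loads, can be shown with a minimal sketch. This is a generic finite distributed lag with fixed, invented weights; it is not the latent factor model for ordinal data developed in the paper, and `distributed_lag` is a hypothetical name.

```python
def distributed_lag(loads, weights):
    """Cumulative effect at each day: sum over lags of weights[lag] * loads[t - lag]."""
    effects = []
    for t in range(len(loads)):
        total = 0.0
        for lag, weight in enumerate(weights):
            if t - lag >= 0:  # days before the first observation contribute nothing
                total += weight * loads[t - lag]
        effects.append(total)
    return effects


# Hypothetical daily training loads and lag weights (lag 0, 1, 2 days).
daily_loads = [10.0, 0.0, 0.0, 5.0]
lag_weights = [0.5, 0.3, 0.2]
cumulative_effect = distributed_lag(daily_loads, lag_weights)
```

A single hard session on the first day keeps contributing on the following two days with shrinking weight, which is the carry-over behavior a distributed lag is meant to capture.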
Abstract Betting odds are generally considered to represent accurate reflections of the underlying probabilities for the outcomes of sporting events. There are, however, known to be a number of inherent biases such as the favorite-longshot bias in which outsiders are generally priced with poorer value odds than favorites. Using data from European soccer matches, this paper demonstrates the existence of another bias in which the match odds overreact to favorable and unfavorable runs of results. A statistic is defined, called the Combined Odds Distribution (COD) statistic, which measures the performance of a team relative to expectations given their odds over previous matches. Teams that overperform expectations tend to have a high COD statistic and those that underperform tend to have a low COD statistic. Using data from twenty different leagues over twelve seasons, it is shown that teams with a low COD statistic tend to be assigned more generous odds by bookmakers. This can be exploited and a sustained and robust profit can be made. It is suggested that the bias in the odds can be explained in the context of the “hot hand fallacy”, in which gamblers overestimate variation in the ability of each team over time.
{"title":"Profiting from overreaction in soccer betting odds","authors":"E. Wheatcroft","doi":"10.1515/jqas-2019-0009","DOIUrl":"https://doi.org/10.1515/jqas-2019-0009","url":null,"abstract":"Abstract Betting odds are generally considered to represent accurate reflections of the underlying probabilities for the outcomes of sporting events. There are, however, known to be a number of inherent biases such as the favorite-longshot bias in which outsiders are generally priced with poorer value odds than favorites. Using data from European soccer matches, this paper demonstrates the existence of another bias in which the match odds overreact to favorable and unfavorable runs of results. A statistic is defined, called the Combined Odds Distribution (COD) statistic, which measures the performance of a team relative to expectations given their odds over previous matches. Teams that overperform expectations tend to have a high COD statistic and those that underperform tend to have a low COD statistic. Using data from twenty different leagues over twelve seasons, it is shown that teams with a low COD statistic tend to be assigned more generous odds by bookmakers. This can be exploited and a sustained and robust profit can be made. 
It is suggested that the bias in the odds can be explained in the context of the “hot hand fallacy”, in which gamblers overestimate variation in the ability of each team over time.","PeriodicalId":16925,"journal":{"name":"Journal of Quantitative Analysis in Sports","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2020-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78433311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
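As background to the claim that odds reflect outcome probabilities, the sketch below converts decimal odds into implied probabilities by removing the bookmaker's margin (the overround) proportionally. Proportional normalization is one common convention and is assumed here; the abstract does not define the COD statistic, so this sketch does not attempt to reproduce it. The function names are hypothetical.

```python
def implied_probabilities(decimal_odds):
    """Map each outcome's decimal odds to a probability that sums to one."""
    raw = {outcome: 1.0 / odds for outcome, odds in decimal_odds.items()}
    overround = sum(raw.values())  # exceeds 1 when the bookmaker takes a margin
    return {outcome: value / overround for outcome, value in raw.items()}


def expected_profit(probability, decimal_odds):
    """Expected profit of a unit stake, given a believed win probability."""
    return probability * decimal_odds - 1.0


# Hypothetical match odds for home win, draw, and away win.
match_odds = {"home": 1.9, "draw": 3.5, "away": 4.0}
probabilities = implied_probabilities(match_odds)
```

A bet has positive expected value only when the bettor's own probability estimate exceeds the reciprocal of the decimal odds, so `expected_profit(0.30, 4.0)` is positive while `expected_profit(0.20, 4.0)` is negative.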