Pub Date: 2022-07-11; DOI: 10.1007/s10182-022-00453-9
Andreas Groll, Dominik Liebl
Triggered by advances in data-gathering technologies, the use of statistical analyses, predictions and modeling techniques in sports has attracted rapidly growing interest over the last decades. Today, professional sports teams have access to precise player positioning data, and sports scientists design experiments involving non-standard data structures such as movement trajectories. This special issue on statistics in sports is dedicated to further fostering the development of statistics and its applications in sports. The contributed articles address a wide range of statistical problems, such as statistical methods for predicting game outcomes, preventing sports injuries, analyzing sports science data from movement laboratories, and measuring and evaluating player performance. Finally, the impacts of the SARS-CoV-2 pandemic on the sports framework are also investigated.
Title: Editorial special issue: Statistics in sports. Journal: AStA Advances in Statistical Analysis, vol. 107 (1-2), pp. 1–7.
In this contribution, we investigate the importance of Oliver’s Four Factors, proposed in the literature to identify a basketball team’s strengths and weaknesses in terms of shooting, turnovers, rebounding and free throws, as success drivers of a basketball game. To investigate the role of each factor in a team’s success in a match, we applied the MOdel-Based recursive partitioning (MOB) algorithm to real data on 19,138 matches from 16 National Basketball Association (NBA) regular seasons (2004–2005 to 2019–2020). Instead of fitting one global Generalized Linear Model (GLM) to all observations, MOB partitions the observations according to selected partitioning variables and estimates several ad hoc local GLMs for subgroups of observations. The manuscript’s aim is twofold: (1) to deal with (quasi-)separation problems that cause convergence problems in the numerical solution of Maximum Likelihood (ML) estimation in MOB, we propose a methodological extension of GLM-based recursive partitioning from standard ML estimation to bias-reduced (BR) estimation; and (2) we apply the BR-based GLM trees to basketball analytics. The resulting models are very easy to interpret and can provide useful support for coaching staff decisions.
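To make the four quantities concrete, here is a minimal Python sketch that computes them from one team's box-score totals. The formulas are the commonly cited box-score definitions (including the conventional 0.44 free-throw weight), which is an assumption: the abstract does not state which variants the authors used. The `BoxScore` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BoxScore:
    """Hypothetical container for one team's totals in one game."""
    fgm: int      # field goals made
    fga: int      # field goals attempted
    fg3m: int     # three-point field goals made
    ftm: int      # free throws made
    fta: int      # free throws attempted
    tov: int      # turnovers
    orb: int      # offensive rebounds
    opp_drb: int  # opponent's defensive rebounds


def four_factors(b: BoxScore) -> dict:
    """Return the Four Factors under the commonly cited definitions (assumed)."""
    return {
        # Shooting: effective field goal percentage (a three counts 1.5 times)
        "efg_pct": (b.fgm + 0.5 * b.fg3m) / b.fga,
        # Turnovers: turnovers per estimated possession
        "tov_pct": b.tov / (b.fga + 0.44 * b.fta + b.tov),
        # Rebounding: share of available offensive rebounds secured
        "orb_pct": b.orb / (b.orb + b.opp_drb),
        # Free throws: free throws made per field goal attempt
        "ft_rate": b.ftm / b.fga,
    }


if __name__ == "__main__":
    game = BoxScore(fgm=40, fga=80, fg3m=10, ftm=15, fta=20,
                    tov=12, orb=10, opp_drb=30)
    for name, value in four_factors(game).items():
        print(f"{name}: {value:.4f}")
```

In a setup like the one described, values of this kind (for a team and its opponent) would serve as regressors or partitioning variables; the sketch covers only the feature computation, not the MOB or bias-reduced fitting.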
Title: Integration of model-based recursive partitioning with bias reduction estimation: a case study assessing the impact of Oliver’s four factors on the probability of winning a basketball game. Authors: Manlio Migliorati, Marica Manisera, Paola Zuccolotto. Pub Date: 2022-07-04; DOI: 10.1007/s10182-022-00456-6. Journal: AStA Advances in Statistical Analysis, vol. 107 (1-2), pp. 271–293.
Pub Date: 2022-06-18; DOI: 10.1007/s10182-022-00451-x
Michael Höhle
We comment on the paper by Jahn et al. (On the role of data, statistics and decisions in a pandemic, 2022).
Title: Comment “On the role of data, statistics and decisions in a pandemic” by Jahn et al. Journal: AStA Advances in Statistical Analysis, vol. 106 (3), pp. 383–386.
Pub Date: 2022-06-17; DOI: 10.1007/s10182-022-00452-w
Claus Thorn Ekstrøm, Andreas Kryger Jensen
Many popular sports involve matches between two teams or players in which each side can score points throughout the match. While the overall match winner and result are interesting, they convey little information about the underlying scoring trends during the match. Modeling approaches that accommodate a finer granularity of the score difference throughout the match are needed to evaluate in-game strategies and to discuss scoring streaks, team strengths, and other aspects of the game. We propose a latent Gaussian process to model the score difference between two teams and introduce the Trend Direction Index as an easily interpretable probabilistic measure of the current trend in the match as well as a measure of post-game trend evaluation. In addition, we propose the Excitement Trend Index (the expected number of monotonicity changes in the running score difference) as a measure of overall game excitement. Our proposed methodology is applied to all 1143 matches from the 2019–2020 National Basketball Association season. We show how the trends can be interpreted in individual games and how the excitement score can be used to cluster teams according to how exciting they are to watch.
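As a rough illustration of what a "monotonicity change" in the running score difference means, the following Python sketch counts direction changes in an observed score-difference sequence. This is only a naive empirical analogue, stated as an assumption: the index described above is an expectation under a latent Gaussian process model, which this sketch does not implement. The function name and the example data are hypothetical.

```python
def count_monotonicity_changes(score_diff):
    """Count how often the running score difference switches between
    increasing and decreasing. Flat steps are ignored."""
    changes = 0
    prev_sign = 0  # 0 means no direction has been observed yet
    for earlier, later in zip(score_diff, score_diff[1:]):
        step = later - earlier
        if step == 0:
            continue  # no movement, so the direction is unchanged
        sign = 1 if step > 0 else -1
        if prev_sign != 0 and sign != prev_sign:
            changes += 1
        prev_sign = sign
    return changes


if __name__ == "__main__":
    # Hypothetical running score difference (home minus away) over time
    diff = [0, 2, 4, 3, 1, 1, 3, 6, 5]
    print(count_monotonicity_changes(diff))
```

A raw count like this is sensitive to every single basket, which is one reason a smoothed latent trend, as proposed in the abstract, is a more natural basis for such a measure.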
Title: Having a ball: evaluating scoring streaks and game excitement using in-match trend estimation. Journal: AStA Advances in Statistical Analysis, vol. 107 (1-2), pp. 295–311.
Pub Date: 2022-06-10; DOI: 10.1007/s10182-022-00450-y
Ursula Berger, Göran Kauermann, Helmut Küchenhoff
The authors make an important contribution by presenting a comprehensive and thoughtful overview of the many different aspects of data, statistics and data analyses during the recent COVID-19 pandemic, discussing all relevant topics. The paper certainly provides a very valuable reflection on what has been done, what could have been done and what needs to be done. We contribute a few comments and some additional issues. We do not discuss all chapters of Jahn et al. (AStA Adv Stat Anal, 2022. 10.1007/s10182-022-00439-7), but focus on those where our personal views and experiences might add further aspects.
Title: Discussion on On the role of data, statistics and decisions in a pandemic. Journal: AStA Advances in Statistical Analysis, vol. 106 (3), pp. 387–390.
Pub Date: 2022-06-09; DOI: 10.1007/s10182-022-00449-5
Sebastian Contreras, Jonas Dehning, Viola Priesemann
Title: Describing a landscape we are yet discovering. Journal: AStA Advances in Statistical Analysis, vol. 106 (3), pp. 399–402.
Pub Date: 2022-06-02; DOI: 10.1007/s10182-022-00448-6
Rodolfo Metulini, Giorgio Gnecco, Francesco Biancalani, Massimo Riccaboni
Multi-regional input–output (I/O) matrices describe the networks of within- and cross-country economic relations. In the context of I/O analysis, the data collection methodology adopted by national statistical offices makes it difficult to obtain reliable data in a timely fashion, which makes the reconstruction of (parts of) the I/O matrices of particular interest. In this work, we propose a method combining hierarchical clustering and matrix completion with a LASSO-like nuclear norm penalty to predict missing entries of a partially unknown I/O matrix. Through analyses based on both real-world and synthetic I/O matrices, we study the effectiveness of the proposed method in predicting missing values from both previous years’ data and current data related to countries similar to the one for which current data are obscured. To show the usefulness of our method, an application based on World Input–Output Database (WIOD) tables, which are an example of industry-by-industry I/O tables, is provided. Strong similarities in structure between WIOD and other I/O tables are also found, which makes the proposed approach easily generalizable to them.
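For orientation, one standard way to write matrix completion with a nuclear norm penalty is given below. This is a generic textbook formulation offered as an assumption, and the objective actually used in the paper may differ in its details. The symbols are introduced here for illustration: $M$ is the partially observed I/O matrix, $\Omega$ the index set of observed entries, $\lambda \ge 0$ a tuning parameter, and $\sigma_k(X)$ the singular values of $X$.

```latex
\hat{X} \;=\; \arg\min_{X \in \mathbb{R}^{m \times n}}
  \; \frac{1}{2} \sum_{(i,j) \in \Omega} \bigl( M_{ij} - X_{ij} \bigr)^{2}
  \;+\; \lambda \, \lVert X \rVert_{*},
\qquad
\lVert X \rVert_{*} \;=\; \sum_{k} \sigma_{k}(X)
```

The penalty is "LASSO-like" in the sense that the nuclear norm is the $\ell_1$ norm of the vector of singular values, so larger $\lambda$ shrinks singular values toward zero and favors low-rank reconstructions.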
Title: Hierarchical clustering and matrix completion for the reconstruction of world input–output tables. Journal: AStA Advances in Statistical Analysis, vol. 107 (3), pp. 575–620.
Pub Date: 2022-05-21; DOI: 10.1007/s10182-022-00447-7
Walter J. Radermacher
In the Corona pandemic, it became strikingly clear how much good-quality statistics are needed, and at the same time how unsuccessful we are at providing such statistics despite the existing technical and methodological possibilities and diverse data sources. It is therefore long overdue to get to the bottom of the causes of these issues and to learn from the findings. This sets a high aspiration: first, a diagnosis should be carried out in which the causes of the deficiencies, together with their interactions, are identified as broadly as possible. Second, such a broad diagnosis should result in a therapy that includes a coherent strategy that can be generalised, i.e. that goes beyond the Corona pandemic.
Title: Comment on: On the role of data, statistics and decisions in a pandemic statistics for climate protection and health—dare (more) progress! Journal: AStA Advances in Statistical Analysis, vol. 106 (3), pp. 391–397.
Pub Date: 2022-05-11; DOI: 10.1007/s10182-022-00446-8
Angel G. Angelov, Magnus Ekström
The paper explores a testing problem that involves four hypotheses: based on observations of two random variables X and Y, we wish to discriminate between four possibilities, namely identical survival functions, stochastic dominance of X over Y, stochastic dominance of Y over X, or crossing survival functions. Four-decision testing procedures for repeated measurements data are proposed. The tests are based on a permutation approach and do not rely on distributional assumptions. One-sided versions of the Cramér–von Mises, Anderson–Darling, and Kolmogorov–Smirnov statistics are utilized. The consistency of the tests is proven. A simulation study shows good power properties and control of false-detection errors. The suggested tests are applied to data from a psychophysical experiment.
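The basic ingredients mentioned above (a one-sided Kolmogorov–Smirnov-type statistic and a permutation scheme for paired data) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' four-decision procedure: it runs a single one-sided test, assumes each subject contributes one (X, Y) pair, and permutes by swapping the two values within a pair. All function names are hypothetical.

```python
import random


def ecdf(sample, t):
    """Empirical distribution function of `sample` evaluated at t."""
    return sum(v <= t for v in sample) / len(sample)


def one_sided_ks(x, y):
    """One-sided KS-type statistic max_t [F_Y(t) - F_X(t)].
    Large values suggest that X tends to exceed Y."""
    grid = sorted(set(x) | set(y))
    return max(ecdf(y, t) - ecdf(x, t) for t in grid)


def paired_permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Permutation p-value obtained by randomly swapping X and Y within
    each pair (assumes exchangeability within pairs under the null)."""
    rng = random.Random(seed)
    observed = one_sided_ks(x, y)
    hits = 0
    for _ in range(n_perm):
        px, py = [], []
        for a, b in zip(x, y):
            if rng.random() < 0.5:
                a, b = b, a  # swap the pair's two measurements
            px.append(a)
            py.append(b)
        if one_sided_ks(px, py) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    y = list(range(1, 11))       # hypothetical measurements under condition Y
    x = [v + 20 for v in y]      # condition X shifted clearly upward
    print(one_sided_ks(x, y))
    print(paired_permutation_pvalue(x, y))
```

A four-decision procedure of the kind described would combine tests in both directions to distinguish equality, dominance in either direction, and crossing; that combination step is deliberately left out here.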
Title: Tests of stochastic dominance with repeated measurements data. Journal: AStA Advances in Statistical Analysis, vol. 107 (3), pp. 443–467.