
Asta-Advances in Statistical Analysis: Latest Articles

Regional now- and forecasting for data reported with delay: toward surveillance of COVID-19 infections
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2022-01-18, DOI: 10.1007/s10182-021-00433-5
Giacomo De Nicola, Marc Schneble, Göran Kauermann, Ursula Berger

Governments around the world continue to act to contain and mitigate the spread of COVID-19. The rapidly evolving situation compels officials and executives to continuously adapt policies and social distancing measures depending on the current state of the spread of the disease. In this context, it is crucial for policymakers to have a firm grasp on what the current state of the pandemic is, and to envision how the number of infections is going to evolve over the next days. However, as in many other situations involving compulsory registration of sensitive data, cases are reported with delay to a central register, with this delay deferring an up-to-date view of the state of things. We provide a stable tool for monitoring current infection levels as well as predicting infection numbers in the immediate future at the regional level. We accomplish this through nowcasting of cases that have not yet been reported as well as through predictions of future infections. We apply our model to German data, for which our focus lies in predicting and explaining infectious behavior by district.

Asta-Advances in Statistical Analysis, vol. 106, no. 3, pp. 407-426. Journal Article. PDF: https://link.springer.com/content/pdf/10.1007/s10182-021-00433-5.pdf
Citations: 11
Model-based clustering via new parsimonious mixtures of heavy-tailed distributions
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2022-01-14, DOI: 10.1007/s10182-021-00430-8
Salvatore D. Tomarchio, Luca Bagnato, Antonio Punzo

Two families of parsimonious mixture models are introduced for model-based clustering. They are based on two multivariate distributions (the shifted exponential normal and the tail-inflated normal) recently introduced in the literature as heavy-tailed generalizations of the multivariate normal. Parsimony is attained by the eigen-decomposition of the component scale matrices, as well as by the imposition of a constraint on the tailedness parameters. Identifiability conditions are also provided. Two variants of the expectation-maximization algorithm are presented for maximum likelihood parameter estimation. Parameter recovery and clustering performance are investigated via a simulation study. Comparisons with the unconstrained mixture models are obtained as a by-product. A further simulated analysis is conducted to assess how sensitive our models and some well-established parsimonious competitors are to their own generative scheme. Lastly, our models and the competing models are evaluated in terms of fitting and clustering on three real datasets.

Asta-Advances in Statistical Analysis, vol. 106, no. 2, pp. 315-347. Journal Article.
Citations: 6
Diagnostic checking of multiple imputation models
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2022-01-14, DOI: 10.1007/s10182-021-00429-1
Yang Zhao

Model checking in multiple imputation (MI, Rubin in Multiple imputation for nonresponse in surveys, Wiley, New York, 1987) becomes increasingly important with the recent developments in MI and its widespread use in statistical analysis with missing data (e.g. van Buuren et al. in J Stat Comput Simul 76(12):1049–1064, 2006; van Buuren and Groothuis-Oudshoorn in J Stat Soft 45(3):1–67, 2011; Chen et al. in Biometrics 67:799–809, 2011; Nguyen et al. in Emerg Themes Epidemiol 14(8):1–12, 2017). The currently recommended posterior predictive checking method (He and Zaslavsky in Stat Med 31:1–18, 2012; Nguyen et al. in Biom J 4:676–694, 2015) is less effective when the proportion of missing values increases and its produced posterior predictive p value is not supported by a null distribution as a standard p value (Meng in Annu Stat 22:1142–1160, 1994). This research develops a new diagnostic method for checking MI models and proposes a test statistic with a standard p value. The new diagnostic checking method is effective and flexible. It does not depend on the proportion of missing values and can deal with data sets with arbitrary nonmonotone missing data patterns. We examine the performance of the proposed method in a simulation study and illustrate the method in a study of coronary disease and associated factors.

Asta-Advances in Statistical Analysis, vol. 106, no. 2, pp. 271-286. Journal Article. PDF: https://link.springer.com/content/pdf/10.1007/s10182-021-00429-1.pdf
Citations: 2
A spatial randomness test based on the box-counting dimension
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2022-01-05, DOI: 10.1007/s10182-021-00434-4
Yolanda Caballero, Ramón Giraldo, Jorge Mateu

Statistical modelling of a spatial point pattern often begins by testing the hypothesis of spatial randomness. Classical tests are based on quadrat counts and distance-based methods. Alternatively, we propose a new statistical test of spatial randomness based on the fractal dimension, calculated through the box-counting method, providing an inferential perspective in contrast to the more common descriptive use of this method. We also develop a graphical test based on the log–log plot to calculate the box-counting dimension. We evaluate the performance of our methodology by conducting a simulation study and analysing a COVID-19 dataset. The results reinforce the good performance of the method that arises as an alternative to the more classical distance-based strategies.
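
To make the box-counting idea concrete, here is a minimal Python sketch. It is not the authors' test: the function names, the restriction to points in the unit square, the choice of box sizes, and the one-sided Monte Carlo comparison against uniformly scattered points are all assumptions made for illustration. The dimension estimate is the slope of the log–log plot of occupied-box counts against inverse box size.

```python
import numpy as np

def box_counting_dimension(points, box_sizes):
    """Estimate the box-counting dimension of 2-D points in the unit square."""
    points = np.asarray(points, dtype=float)
    counts = []
    for eps in box_sizes:
        # Grid index of the box containing each point; unique rows = occupied boxes.
        idx = np.floor(points / eps).astype(int)
        counts.append(len(np.unique(idx, axis=0)))
    # Slope of log N(eps) against log(1/eps) in the log-log plot.
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(box_sizes)), np.log(counts), 1)
    return slope

def csr_monte_carlo_pvalue(points, box_sizes, n_sim=199, seed=0):
    """Hypothetical one-sided Monte Carlo test (an assumption, not the paper's
    procedure): a dimension well below that of uniform patterns with the same
    number of points is taken as evidence of clustering."""
    rng = np.random.default_rng(seed)
    n = len(points)
    observed = box_counting_dimension(points, box_sizes)
    sims = np.array([
        box_counting_dimension(rng.random((n, 2)), box_sizes)
        for _ in range(n_sim)
    ])
    return (1 + np.sum(sims <= observed)) / (n_sim + 1)
```

A regular grid that fills the square gives a slope close to 2, while points along a line give a slope close to 1; a tightly clustered pattern yields a small Monte Carlo p-value under these assumptions.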

Asta-Advances in Statistical Analysis, vol. 106, no. 3, pp. 499-524. Journal Article. PDF: https://link.springer.com/content/pdf/10.1007/s10182-021-00434-4.pdf
Citations: 1
An integrated local depth measure
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2022-01-03, DOI: 10.1007/s10182-021-00424-6
Lucas Fernandez-Piana, Marcela Svarc

We introduce the Integrated Dual Local Depth, which is a local depth measure for data in a Banach space based on the use of one-dimensional projections. The properties of a depth measure are analyzed under this setting and a proper definition of local symmetry is given. Moreover, strong consistency results are attained for the local depth and also for local depth regions. Finally, applications to descriptive data analysis and classification are analyzed, with a special focus on multivariate functional data, where we obtain very promising results.

Asta-Advances in Statistical Analysis, vol. 106, no. 2, pp. 175-197. Journal Article. PDF: https://link.springer.com/content/pdf/10.1007/s10182-021-00424-6.pdf
Citations: 0
Improving the causal treatment effect estimation with propensity scores by the bootstrap
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2021-12-21, DOI: 10.1007/s10182-021-00427-3
Maeregu W. Arisido, Fulvia Mecatti, Paola Rebora

When observational studies are used to establish the causal effects of treatments, the estimated effect is affected by treatment selection bias. The inverse propensity score weight (IPSW) is often used to deal with such bias. However, IPSW requires strong assumptions whose misspecifications, and strategies to correct them, have rarely been studied. We present a bootstrap bias correction of IPSW (BC-IPSW) to improve the performance of the propensity score in dealing with treatment selection bias when the ignorability and overlap assumptions fail. The approach was motivated by a real observational study to explore the potential of anticoagulant treatment for reducing mortality in patients with end-stage renal disease. The benefit of the treatment to enhance survival was demonstrated; the suggested BC-IPSW method indicated a statistically significant reduction in mortality for patients receiving the treatment. Using extensive simulations, we show that BC-IPSW substantially reduced the bias due to the misspecification of the ignorability and overlap assumptions. Further, we showed that IPSW is still useful to account for the lack of treatment randomization, but its advantages are stringently linked to the satisfaction of ignorability, indicating that the existence of relevant though unmeasured or unused covariates can worsen the selection bias.
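
The general idea of bootstrap bias correction applied to an inverse-propensity-weighted estimate can be sketched as follows. This is an illustrative assumption, not the BC-IPSW estimator as defined in the paper: it uses a plain logistic propensity model fitted by Newton-Raphson, normalized weights, clipped propensities, and the textbook correction (twice the original estimate minus the bootstrap mean). All function names are hypothetical.

```python
import numpy as np

def fit_logistic(X, t, n_iter=25):
    """Newton-Raphson logistic regression with intercept; returns coefficients."""
    Xd = np.column_stack([np.ones(len(t)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        w = p * (1.0 - p)
        # Tiny ridge term keeps the Hessian invertible in degenerate resamples.
        hessian = Xd.T @ (Xd * w[:, None]) + 1e-8 * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hessian, Xd.T @ (t - p))
    return beta

def ipsw_ate(X, t, y):
    """Average treatment effect with normalized inverse propensity weights."""
    Xd = np.column_stack([np.ones(len(t)), X])
    ps = 1.0 / (1.0 + np.exp(-Xd @ fit_logistic(X, t)))
    ps = np.clip(ps, 1e-3, 1.0 - 1e-3)  # assumed guard against extreme weights
    mu1 = np.sum(t * y / ps) / np.sum(t / ps)
    mu0 = np.sum((1 - t) * y / (1 - ps)) / np.sum((1 - t) / (1 - ps))
    return mu1 - mu0

def bootstrap_bias_corrected_ate(X, t, y, n_boot=200, seed=0):
    """Return (bias-corrected estimate, plain IPSW estimate)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    theta_hat = ipsw_ate(X, t, y)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample units with replacement
        boot[b] = ipsw_ate(X[idx], t[idx], y[idx])
    bias = boot.mean() - theta_hat        # bootstrap estimate of the bias
    return theta_hat - bias, theta_hat
```

Here `X` is a covariate array, `t` a 0/1 treatment indicator stored as floats, and `y` the outcome. The correction targets estimator bias only; it cannot by itself recover a confounder that was never measured.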

Asta-Advances in Statistical Analysis, vol. 106, no. 3, pp. 455-471. Journal Article. PDF: https://link.springer.com/content/pdf/10.1007/s10182-021-00427-3.pdf
Citations: 1
Prediction of sports injuries in football: a recurrent time-to-event approach using regularized Cox models
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2021-11-20, DOI: 10.1007/s10182-021-00428-2
Lore Zumeta-Olaskoaga, Maximilian Weigert, Jon Larruskain, Eder Bikandi, Igor Setuain, Josean Lekue, Helmut Küchenhoff, Dae-Jin Lee

Data-based methods and statistical models are receiving special attention in the study of sports injuries, with the aim of gaining an in-depth understanding of their risk factors and mechanisms. The objective of this work is to evaluate the use of shared frailty Cox models for the prediction of sports injuries, and to compare their performance with different sets of variables selected by several regularized variable selection approaches. The study is motivated by specific characteristics commonly found in sports injury data, which usually include a reduced sample size and an even smaller number of injuries, coupled with a large number of potentially influential variables. Hence, we conduct a simulation study to address these statistical challenges and to explore regularized Cox model strategies together with shared frailty models in different controlled situations. We show that predictive performance greatly improves as more player observations are available. Methods that result in sparse models and favour interpretability, e.g. Best Subset Selection and Boosting, are preferred when the sample size is small. We include a real case study of injuries of female football players of a Spanish football club.

Asta-Advances in Statistical Analysis, vol. 107, no. 1-2, pp. 101-126. Journal Article. PDF: https://link.springer.com/content/pdf/10.1007/s10182-021-00428-2.pdf
Citations: 2
Small area estimation of socioeconomic indicators for sampled and unsampled domains
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2021-11-19, DOI: 10.1007/s10182-021-00426-4
Jan Pablo Burgard, Domingo Morales, Anna-Lena Wölwer

Socioeconomic indicators play a crucial role in monitoring political actions over time and across regions. Income-based indicators such as the median income of sub-populations can provide information on the impact of measures, e.g., on poverty reduction. Regional information is usually published on an aggregated level. Due to small sample sizes, these regional aggregates are often associated with large standard errors or are missing if the region is unsampled or the estimate is simply not published. For example, if the median income of Hispanic or Latino Americans from the American Community Survey is of interest, some county-year combinations are not available. Therefore, a comparison of different counties or time-points is partly not possible. We propose a new predictor based on small area estimation techniques for aggregated data and bivariate modeling. This predictor provides empirical best predictions for the partially unavailable county-year combinations. We provide an analytical approximation to the mean squared error. The theoretical findings are backed up by a large-scale simulation study. Finally, we return to the problem of estimating the county-year estimates for the median income of Hispanic or Latino Americans and externally validate the estimates.

Asta-Advances in Statistical Analysis, vol. 106, no. 2, pp. 287-314. Journal Article. PDF: https://link.springer.com/content/pdf/10.1007/s10182-021-00426-4.pdf
Citations: 1