Computational Statistics: Latest Articles

Site-specific nitrogen recommendation: fast, accurate, and feasible Bayesian kriging
IF 1.3, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2024-07-18, DOI: 10.1007/s00180-024-01527-9
Davood Poursina, B. Wade Brorsen

Bayesian Kriging (BK) provides a way to estimate regression models where the parameters are smoothed across space. Such estimates could help guide site-specific fertilizer recommendations. One advantage of BK is that it can readily fill in the missing values that are common in yield monitor data. The problem is that previous methods are too computationally intensive to be commercially feasible when estimating a nonlinear production function. This paper sought to increase computational speed by imposing restrictions on the spatial covariance matrix. Previous research used an exponential function for the spatial covariance matrix. The two alternatives considered are the conditional autoregressive and simultaneous autoregressive models. In addition, a new analytical solution is provided for finding the optimal value of nitrogen with a stochastic linear plateau model. A comparison among models in the accuracy and computational burden shows that the restrictions significantly reduced the computational burden, although they did sacrifice some accuracy in the dataset considered.
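
To illustrate why restricting the spatial covariance structure can cut computation, the sketch below contrasts a dense exponential covariance matrix with a sparse conditional autoregressive (CAR) precision matrix on a small grid. The parameter names (`sigma2`, `range_`, `tau`, `rho`, `cutoff`), the grid, and the specific CAR form Q = tau * (D - rho * W) are assumptions chosen for illustration; the abstract does not state the paper's exact parameterization.

```python
import math

def exponential_covariance(coords, sigma2=1.0, range_=2.0):
    """Dense covariance: C[i][j] = sigma2 * exp(-distance_ij / range_)."""
    n = len(coords)
    return [[sigma2 * math.exp(-math.dist(coords[i], coords[j]) / range_)
             for j in range(n)] for i in range(n)]

def car_precision(coords, tau=1.0, rho=0.9, cutoff=1.0):
    """Sparse CAR precision: Q = tau * (D - rho * W).

    W is a 0/1 adjacency matrix (sites within `cutoff` are neighbours),
    D is the diagonal matrix of neighbour counts.
    """
    n = len(coords)
    W = [[1.0 if i != j and math.dist(coords[i], coords[j]) <= cutoff + 1e-9 else 0.0
          for j in range(n)] for i in range(n)]
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        degree = sum(W[i])
        for j in range(n):
            diagonal = degree if i == j else 0.0
            Q[i][j] = tau * (diagonal - rho * W[i][j])
    return Q

def nonzero_count(matrix):
    """Number of entries that are not exactly zero."""
    return sum(1 for row in matrix for value in row if value != 0.0)

# Hypothetical 5 x 5 grid of plot centroids with unit spacing.
grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
C = exponential_covariance(grid)
Q = car_precision(grid)
print(nonzero_count(C), nonzero_count(Q))
```

Every pair of sites is correlated under the exponential function, so the matrix is fully dense, whereas the CAR precision matrix only links neighbouring sites. Sparse precision matrices are what typically make autoregressive spatial models cheaper to work with.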

Citations: 0
Bayesian diagnostics in a partially linear model with first-order autoregressive skew-normal errors
IF 1.3, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2024-07-11, DOI: 10.1007/s00180-024-01504-2
Yonghui Liu, Jiawei Lu, Gilberto A. Paula, Shuangzhe Liu

This paper studies a Bayesian local influence method to detect influential observations in a partially linear model with first-order autoregressive skew-normal errors. This method appears suitable for small or moderate-sized data sets ($n = 200 \sim 400$) and overcomes some theoretical limitations, bridging the diagnostic gap for small or moderate-sized data in classical methods. The MCMC algorithm is employed for parameter estimation, and Bayesian local influence analysis is made using three perturbation schemes (priors, variances, and data) and three measurement scales (Bayes factor, $\phi$-divergence, and posterior mean). Simulation studies are conducted to validate the reliability of the diagnostics. Finally, a practical application uses data on the 1976 Los Angeles ozone concentration to further demonstrate the effectiveness of the diagnostics.

Citations: 0
Empirical likelihood change point detection in quantile regression models
IF 1.3, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2024-07-10, DOI: 10.1007/s00180-024-01526-w
Suthakaran Ratnasingam, Ramadha D. Piyadi Gamage

Quantile regression is an extension of linear regression which estimates a conditional quantile of interest. In this paper, we propose an empirical likelihood-based non-parametric procedure to detect structural changes in the quantile regression models. Further, we have modified the proposed smoothed empirical likelihood-based method using adjusted smoothed empirical likelihood and transformed smoothed empirical likelihood techniques. We have shown that under the null hypothesis, the limiting distribution of the smoothed empirical likelihood ratio test statistic is identical to that of the classical parametric likelihood. Simulations are conducted to investigate the finite sample properties of the proposed methods. Finally, to demonstrate the effectiveness of the proposed method, it is applied to urinary Glycosaminoglycans (GAGs) data to detect structural changes.
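
As background for the first sentence of this abstract, the sketch below shows the check (pinball) loss that standard quantile regression minimizes, using the simplest case of an intercept-only model. It does not implement the empirical likelihood change point test; the function names and the toy data are hypothetical.

```python
def check_loss(u, tau):
    """Check (pinball) loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def best_constant(ys, tau):
    """Intercept-only quantile fit: the observed value with the smallest total check loss."""
    return min(ys, key=lambda q: sum(check_loss(y - q, tau) for y in ys))

# Toy sample with one large value.
sample = [1, 2, 3, 4, 100]
print(best_constant(sample, 0.5))  # minimizer for the median
print(best_constant(sample, 0.9))  # minimizer for the 0.9 quantile
```

With `tau = 0.5` the minimizer is the sample median, which is why the large value 100 has no pull on it. A change in the conditional quantile over time would show up as a shift in this minimizer between segments of the data.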

Citations: 0
Robust variable selection for additive coefficient models
IF 1.3, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2024-07-05, DOI: 10.1007/s00180-024-01524-y
Hang Zou, Xiaowen Huang, Yunlu Jiang

Additive coefficient models generalize linear regression models by assuming that the relationship between the response and some covariates is linear, while their regression coefficients are additive functions. Because of its advantages in dealing with the “curse of dimensionality”, additive coefficient models gain a lot of attention. The commonly used estimation methods for additive coefficient models are not robust against high leverage points. To circumvent this difficulty, we develop a robust variable selection procedure based on the exponential squared loss function and group penalty for the additive coefficient models, which can tackle outliers in the response and covariates simultaneously. Under some regularity conditions, we show that the oracle estimator is a local solution of the proposed method. Furthermore, we apply the local linear approximation and minorization-maximization algorithm for the implementation of the proposed estimator. Meanwhile, we propose a data-driven procedure to select the tuning parameters. Simulation studies and an application to a plasma beta-carotene level data set illustrate that the proposed method can offer more reliable results than other existing methods in contamination schemes.
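
The robustness claim rests on the loss being bounded. A minimal sketch follows, assuming the common form 1 - exp(-r^2 / gamma) for the exponential squared loss with a tuning parameter `gamma`; the abstract does not give the paper's exact definition, and the residual values are made up.

```python
import math

def squared_loss(residual):
    """Ordinary least-squares loss: grows without bound."""
    return residual ** 2

def exp_squared_loss(residual, gamma=1.0):
    """Exponential squared loss, assumed form 1 - exp(-r^2 / gamma): bounded above by 1."""
    return 1.0 - math.exp(-(residual ** 2) / gamma)

# Hypothetical residuals: four ordinary points and one gross outlier.
residuals = [0.1, -0.2, 0.3, -0.1, 50.0]
total_squared = sum(squared_loss(r) for r in residuals)
total_robust = sum(exp_squared_loss(r) for r in residuals)
print(total_squared, total_robust)
```

The outlier contributes 2500 to the squared-loss total but at most 1 to the exponential squared total, so it cannot dominate the fit. Smaller values of `gamma` downweight moderate residuals more aggressively.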

Citations: 0
Variable selection and structure identification for additive models with longitudinal data
IF 1.3, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2024-06-26, DOI: 10.1007/s00180-024-01521-1
Ting Wang, Liya Fu, Yanan Song

This paper proposes a polynomial structure identification (PSI) method for variable selection and model structure identification of additive models with longitudinal data. First, the backfitting algorithm and zero-order local polynomial smoothing method are used to select important variables in the additive model, and the importance of variables is determined through the inverse of the bandwidth parameter in the nonparametric partial kernel function. Second, the backfitting algorithm and Q-order local polynomial smoothing method are utilized to identify the specific structure of each selected predictor. To incorporate correlations within longitudinal data, a two-stage estimation method is proposed for estimating the regression parameters of the identified important variables: (i) Parameter estimators of the important variables are firstly obtained under an independence working model assumption; (ii) Generalized estimating equations with a working correlation matrix based on B-splines are constructed to obtain the final estimators of the parameters, which improve the efficiency of parameter estimation. Finally, simulation studies are carried out to evaluate the performance of the proposed method, followed by the presentation of two real-world examples for illustration.
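
Zero-order local polynomial smoothing is the local constant (Nadaraya-Watson) estimator. The sketch below uses a Gaussian kernel on one predictor to show how the bandwidth controls the fit; the kernel choice, function name, and data are assumptions, and the backfitting and PSI steps are not reproduced.

```python
import math

def local_constant_fit(x0, xs, ys, bandwidth):
    """Zero-order local polynomial (Nadaraya-Watson) estimate at x0 with a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

# Hypothetical data following y = x^2 on five design points.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0]

narrow = local_constant_fit(2.0, xs, ys, bandwidth=0.1)  # follows the local value
wide = local_constant_fit(2.0, xs, ys, bandwidth=1e6)    # collapses to the overall mean
print(narrow, wide)
```

As the bandwidth grows, the fitted function flattens toward a constant and the predictor stops influencing the prediction. One plausible reading of the abstract is that this is why a small inverse bandwidth can flag an unimportant variable.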

Citations: 0
Multiple imputation with competing risk outcomes
IF 1.3, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2024-06-26, DOI: 10.1007/s00180-024-01518-w
Peter C. Austin

In time-to-event analyses, a competing risk is an event whose occurrence precludes the occurrence of the event of interest. Settings with competing risks occur frequently in clinical research. Missing data, which is a common problem in research, occurs when the value of a variable is recorded for some, but not all, records in the dataset. Multiple Imputation (MI) is a popular method to address the presence of missing data. MI uses an imputation model to generate M (M > 1) values for each variable that is missing, resulting in the creation of M complete datasets. A popular algorithm for imputing missing data is multivariate imputation using chained equations (MICE). We used a complex simulation design with covariates and missing data patterns reflective of patients hospitalized with acute myocardial infarction (AMI) to compare three strategies for imputing missing predictor variables when the analysis model is a cause-specific hazard when there were three different event types. We compared two MICE-based strategies that differed according to which cause-specific cumulative hazard functions were included in the imputation models (the three cause-specific cumulative hazard functions vs. only the cause-specific cumulative hazard function for the primary outcome) with the use of the substantive model compatible fully conditional specification (SMCFCS) algorithm. While no strategy had consistently superior performance compared to the other strategies, SMCFCS may be the preferred strategy. We illustrated the application of the strategies using a case study of patients hospitalized with AMI.
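
The cause-specific cumulative hazard functions mentioned above are commonly estimated with the Nelson-Aalen estimator. The sketch below is an illustration under that assumption, not the paper's code; the function name, the coding of `causes` (0 for censored, 1 to 3 for event types), and the five-subject data are hypothetical.

```python
def cause_specific_cumulative_hazard(times, causes, cause, t):
    """Nelson-Aalen estimate of the cumulative hazard for one event type up to time t.

    At each distinct observed time s <= t, add (events of `cause` at s) / (subjects
    still at risk at s). Events of other causes and censorings only reduce the risk set.
    """
    hazard = 0.0
    for s in sorted(set(times)):
        if s > t:
            break
        at_risk = sum(1 for u in times if u >= s)
        events = sum(1 for u, c in zip(times, causes) if u == s and c == cause)
        hazard += events / at_risk
    return hazard

# Hypothetical follow-up data: cause 0 = censored, 1 = primary event, 2 = competing event.
times = [1, 2, 3, 4, 5]
causes = [1, 0, 2, 1, 0]
h1 = cause_specific_cumulative_hazard(times, causes, cause=1, t=5)
h2 = cause_specific_cumulative_hazard(times, causes, cause=2, t=5)
print(h1, h2)
```

In a MICE-style workflow, values such as `h1` and `h2` evaluated at each subject's own follow-up time would be added as predictors in the imputation models, alongside the event-type indicators. Which of them to include is the difference between the two MICE strategies compared in the abstract.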

Citations: 0
Ranking handball teams from statistical strength estimation
IF 1.3, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2024-06-24, DOI: 10.1007/s00180-024-01522-0
Florian Felice

In this work, we present a methodology to estimate the strength of handball teams. We propose the use of the Conway-Maxwell-Poisson distribution to model the number of goals scored by a team as a flexible discrete distribution which can handle situations of non equi-dispersion. From its parameters, we derive a mathematical formula to determine the strength of a team. We propose a ranking based on the estimated strengths to compare teams across different championships. Applied to female handball club data from European competitions over the 2022/2023 season, we show that our new proposed ranking can have an echo in real sports events and is linked to recent results from European competitions.
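
The Conway-Maxwell-Poisson distribution has probability mass proportional to lambda^x / (x!)^nu, with nu = 1 recovering the Poisson case. The sketch below evaluates it with a truncated normalizing sum and checks its dispersion; the truncation point `max_count`, the parameter values, and the function names are assumptions, and the paper's strength formula is not reproduced.

```python
import math

def cmp_pmf_table(lam, nu, max_count=200):
    """Conway-Maxwell-Poisson probabilities for counts 0..max_count.

    Unnormalized weight: lam^x / (x!)^nu, computed on the log scale;
    the normalizing constant is approximated by the truncated sum.
    """
    log_weights = [x * math.log(lam) - nu * math.lgamma(x + 1) for x in range(max_count + 1)]
    weights = [math.exp(lw) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]

def mean_and_variance(pmf):
    """Mean and variance of a distribution given as a list of probabilities over 0, 1, 2, ..."""
    mean = sum(x * p for x, p in enumerate(pmf))
    variance = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))
    return mean, variance

poisson_like = cmp_pmf_table(lam=25.0, nu=1.0)    # equi-dispersed
under = mean_and_variance(cmp_pmf_table(lam=25.0, nu=2.0))  # nu > 1: variance below mean
over = mean_and_variance(cmp_pmf_table(lam=5.0, nu=0.5))    # nu < 1: variance above mean
print(under, over)
```

The single extra parameter `nu` lets the same family cover under-dispersed and over-dispersed goal counts, which a plain Poisson model cannot do.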

Citations: 0
Hypothesis testing in Cox models when continuous covariates are dichotomized: bias analysis and bootstrap-based test
IF 1.3, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2024-06-23, DOI: 10.1007/s00180-024-01520-2
Hyunman Sim, Sungjeong Lee, Bo-Hyung Kim, Eun Shin, Woojoo Lee

Hypothesis testing for the regression coefficient associated with a dichotomized continuous covariate in a Cox proportional hazards model has been considered in clinical research. Although most existing testing methods do not allow covariates, except for a dichotomized continuous covariate, they have generally been applied. Through an analytic bias analysis and a numerical study, we show that the current practice is not free from an inflated type I error and a loss of power. To overcome this limitation, we develop a bootstrap-based test that allows additional covariates and dichotomizes two-dimensional covariates into a binary variable. In addition, we develop an efficient algorithm to speed up the calculation of the proposed test statistic. Our numerical study demonstrates that the proposed bootstrap-based test maintains the type I error well at the nominal level and exhibits higher power than other methods, as well as that the proposed efficient algorithm reduces computational costs.

Citations: 0
Trend of high dimensional time series estimation using low-rank matrix factorization: heuristics and numerical experiments via the TrendTM package
IF 1.3, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2024-06-20, DOI: 10.1007/s00180-024-01519-9
Emilie Lebarbier, Nicolas Marie, Amélie Rosier

This article focuses on the practical issue of a recent theoretical method proposed for trend estimation in high dimensional time series. This method falls within the scope of the low-rank matrix factorization methods in which the temporal structure is taken into account. It consists of minimizing a penalized criterion, theoretically efficient but which depends on two constants to be chosen in practice. We propose a two-step strategy to solve this question based on two different known heuristics. The performance and a comparison of the strategies are studied through an important simulation study in various scenarios. In order to make the estimation method with the best strategy available to the community, we implemented the method in an R package TrendTM which is presented and used here. Finally, we give a geometric interpretation of the results by linking it to PCA and use the results to solve a high-dimensional curve clustering problem. The package is available on CRAN.

引用次数: 0
Some aspects of nonlinear dimensionality reduction
IF 1.3 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2024-06-16 DOI: 10.1007/s00180-024-01514-0
Liwen Wang, Yongda Wang, Shifeng Xiong, Jiankui Yang

In this paper we discuss nonlinear dimensionality reduction within the framework of principal curves. We formulate dimensionality reduction as the problem of estimating principal subspaces in both the noiseless and noisy cases, and propose corresponding iterative algorithms that modify existing principal curve algorithms. An R-squared criterion is introduced to estimate the dimension of the principal subspace. In addition, we present new regression and density estimation strategies based on our dimensionality reduction algorithms. Theoretical analyses and numerical experiments show the effectiveness of the proposed methods.
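
The abstract does not spell out the R-squared criterion, so the following is only a minimal sketch of the general idea of picking a subspace dimension from explained variance. It assumes ordinary linear PCA as a stand-in for the paper's principal-curve-based subspaces. The function names, the 0.95 threshold, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def r_squared_by_dimension(X):
    """Return R^2(d) for d = 1..p: the share of total variance explained
    by the best d-dimensional linear subspace (PCA stand-in, an assumption)."""
    Xc = X - X.mean(axis=0)                      # center each column
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    explained = np.cumsum(s ** 2)                # variance captured by first d axes
    return explained / explained[-1]


def estimate_dimension(X, threshold=0.95):
    """Smallest d whose R^2 reaches the threshold (hypothetical cutoff)."""
    r2 = r_squared_by_dimension(X)
    return int(np.argmax(r2 >= threshold)) + 1


rng = np.random.default_rng(0)
# Synthetic data: a 2-dimensional signal embedded in 5 dimensions plus small noise.
latent = rng.normal(size=(500, 2))
loading = np.array([[1.0, 0.0, 1.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0, 1.0, 0.0]])
X = latent @ loading + 0.05 * rng.normal(size=(500, 5))

d_hat = estimate_dimension(X)
print(d_hat)
```

On this synthetic example the first direction explains roughly 60% of the variance and the first two explain nearly all of it, so the rule selects two dimensions. A nonlinear method would replace the PCA projection with its own fitted subspace, but the selection rule would keep the same shape.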