
Statistical Papers: Latest Articles

Confidence intervals for overall response rate difference in the sequential parallel comparison design
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-08-31 DOI: 10.1007/s00362-024-01606-5
Guogen Shan, Xinlin Lu, Yahui Zhang, Samuel S. Wu

High placebo responses could significantly reduce the treatment effect in a parallel randomized trial. To combat that challenge, several approaches have been developed, including the sequential parallel comparison design (SPCD), which has been shown to increase statistical power compared with the traditional randomized trial. A linear combination of the response rate differences from the two phases of the SPCD is commonly used to measure the overall treatment effect size. The traditional approach to calculating the confidence interval for the overall rate difference is based on the delta method using the variance–covariance matrix of all outcomes. As outcomes from a multinomial distribution are correlated, we suggest utilizing a constrained variance–covariance matrix in the delta method. Having observed anti-conservative coverage from the asymptotic intervals, we further propose using importance sampling to develop accurate intervals. Simulation studies show that the accurate intervals have better coverage probabilities than the others, with similar interval widths. Two real trials to treat major depressive disorder are used to illustrate the application of the proposed intervals.

Citations: 0
Bayesian and frequentist inference derived from the maximum entropy principle with applications to propagating uncertainty about statistical methods
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-08-27 DOI: 10.1007/s00362-024-01597-3
David R. Bickel

Using statistical methods to analyze data requires considering the data set to be randomly generated from a probability distribution that is unknown but idealized according to a mathematical model consisting of constraints, assumptions about the distribution. Since the choice of such a model is up to the scientist, there is an understandable bias toward choosing models that make scientific conclusions appear more certain than they really are. There is a similar bias in the scientist’s choice of whether to use Bayesian or frequentist methods. This article provides tools to mitigate both of those biases on the basis of a principle of information theory. It is found that the same principle unifies Bayesianism with the fiducial version of frequentism. The principle arguably overcomes not only the main objections against fiducial inference but also the main Bayesian objection against the use of confidence intervals.

Citations: 0
Reduced bias estimation of the log odds ratio
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-08-26 DOI: 10.1007/s00362-024-01593-7
Asma Saleh

Analysis of binary matched pairs data is problematic due to infinite maximum likelihood estimates of the log odds ratio and potentially biased estimates, especially for small samples. We propose a penalised version of the log-likelihood function based on adjusted responses which always results in a finite estimator of the log odds ratio. The probability limit of the adjusted log-likelihood estimator is derived and it is shown that in certain settings the maximum likelihood, conditional and modified profile log-likelihood estimators drop out as special cases of the former estimator. We apply indirect inference to the adjusted log-likelihood estimator. It is shown, through a complete enumeration study, that the indirect inference estimator is competitive in terms of bias and variance in comparison to the maximum likelihood, conditional, modified profile log-likelihood and Firth’s penalised log-likelihood estimators.
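
To make the infinite-estimate problem in the abstract above concrete, here is a minimal Python sketch. It assumes the standard conditional estimator for matched pairs, log(n10 / n01), where n10 and n01 are the two discordant-pair counts. The function names and the additive 0.5 continuity correction are illustrative stand-ins only; the correction is a common ad hoc fix and is not the penalised or indirect inference estimator proposed in the paper.

```python
import math

def conditional_log_or(n10, n01):
    """Conditional ML estimate of the log odds ratio from discordant pair counts.

    n10 and n01 are the counts of the two kinds of discordant pairs.
    The estimate is infinite whenever exactly one of the counts is zero.
    """
    if n10 == 0 and n01 == 0:
        return math.nan          # no discordant pairs: estimate undefined
    if n01 == 0:
        return math.inf
    if n10 == 0:
        return -math.inf
    return math.log(n10 / n01)

def corrected_log_or(n10, n01, c=0.5):
    """Ad hoc finite alternative: add a constant c to both counts.

    Assumption: this is a generic continuity correction used here only
    for illustration, not the estimator studied in the paper.
    """
    return math.log((n10 + c) / (n01 + c))

if __name__ == "__main__":
    print(conditional_log_or(5, 0))   # infinite estimate
    print(corrected_log_or(5, 0))     # finite, but biased in general
```

With small samples a zero discordant count is not rare, which is why estimators that are finite by construction are of interest.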

Citations: 0
A critical note on the exponentiated EWMA chart
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-08-23 DOI: 10.1007/s00362-024-01601-w
Abdul Haq, William H. Woodall

In this short note, we reevaluate the run-length performance of the EWMA and exponentiated EWMA (Exp-EWMA) charts using the conditional expected delay metric. It is found that the enhancements offered by the Exp-EWMA chart over the EWMA chart in the zero-state setup are marginal. Given its simplicity in implementation and its ability to encompass the functionality of the Exp-EWMA chart in detecting delayed shifts in the process mean, the EWMA chart remains the preferred choice over the Exp-EWMA chart.
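
For readers unfamiliar with the baseline chart discussed in the note above, the following Python sketch shows a textbook EWMA chart for the process mean with time-varying control limits. The function name, the default smoothing constant `lam`, and the limit multiplier `L` are assumptions chosen for illustration; the sketch covers neither the Exp-EWMA chart nor the conditional expected delay metric from the paper.

```python
import math

def ewma_chart(xs, mu0, sigma, lam=0.2, L=3.0):
    """Run a standard EWMA chart over observations xs.

    z_t = lam * x_t + (1 - lam) * z_{t-1}, with z_0 = mu0.
    Limits: mu0 +/- L * sigma * sqrt(lam / (2 - lam) * (1 - (1 - lam)**(2 t))).
    Returns a list of (t, z_t, lcl, ucl, signal) tuples.
    """
    z = mu0
    rows = []
    for t, x in enumerate(xs, start=1):
        z = lam * x + (1 - lam) * z
        half_width = L * sigma * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        )
        rows.append((t, z, mu0 - half_width, mu0 + half_width,
                     abs(z - mu0) > half_width))
    return rows

def first_signal(rows):
    """Index of the first out-of-control signal, or None if there is none."""
    for t, _, _, _, signal in rows:
        if signal:
            return t
    return None

if __name__ == "__main__":
    # Hypothetical data: the mean shifts from 0 to 5 at the fourth observation.
    rows = ewma_chart([0, 0, 0, 5, 5, 5], mu0=0.0, sigma=1.0)
    print(first_signal(rows))
```

Smaller values of `lam` give more weight to past observations, which is the usual trade-off between sensitivity to small shifts and speed of reaction to large ones.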

Citations: 0
Hadamard matrices, quaternions, and the Pearson chi-square statistic
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-08-21 DOI: 10.1007/s00362-024-01602-9
Abbas Alhakim

The symbolic partitioning of the Pearson chi-square statistic with unequal cell probabilities into asymptotically independent component tests is revisited. We introduce Hadamard-like matrices whose resulting component tests compare the full vector of cell counts. This contributes to making these component tests intuitively interpretable. We present a simple way to construct the Hadamard-like matrices when the number of cell counts is 2, 4 or 8 without assuming any relations between cell probabilities. For higher powers of 2, the theory of orthogonal designs is used to set a priori relations between cell probabilities, in order to establish the construction. Simulations are given to illustrate the sensitivity of various components to changes in location, scale, skewness and tail probability, as well as to illustrate the potential improvement in power when the cell probabilities are changed.
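
As a rough illustration of the idea in the abstract above, the Python sketch below partitions the Pearson statistic for four equiprobable cells using an ordinary 4x4 Hadamard matrix. This is only the classical equal-probability special case, chosen because it can be written down with certainty; the paper's Hadamard-like matrices for unequal cell probabilities are not reproduced here, and all names in the code are hypothetical.

```python
# Sylvester-type 4x4 Hadamard matrix: row 0 is all ones,
# rows 1-3 are mutually orthogonal +/-1 contrasts over all four cells.
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]

def pearson_chi_square(counts):
    """Pearson chi-square statistic for equiprobable cells."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def hadamard_components(counts):
    """Three components whose squares sum to the Pearson statistic.

    Each component is a contrast row of H4 applied to the full vector
    of cell counts, scaled by 1 / sqrt(n). Valid for 4 equiprobable cells.
    """
    if len(counts) != 4:
        raise ValueError("this sketch handles exactly four cells")
    n = sum(counts)
    return [sum(h * c for h, c in zip(row, counts)) / n ** 0.5
            for row in H4[1:]]

if __name__ == "__main__":
    counts = [10, 20, 30, 40]          # hypothetical cell counts
    comps = hadamard_components(counts)
    print(comps, sum(v * v for v in comps), pearson_chi_square(counts))
```

Because every contrast row has entries of +1 or -1 only, each component compares all four cell counts at once, which is the interpretability property the abstract refers to.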

Citations: 0
Exceedance statistics based on bottom-$k$-lists
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-07-25 DOI: 10.1007/s00362-024-01581-x
Agah Kozan, Burak Uyar, Halil Tanil

Similar to usual lower records, bottom-$k$-lists (Kozan and Tanil, İstatistik J Turk Stat Assoc 13:73–79, 2013) have a wide range of practical applications in meteorology, hydrology, sports, etc. Also, exceedance statistics can be viewed as a close relative to tolerance limits, an important field of statistical science. In this study, an idea of combining these two important subjects together is studied and an exceedance statistic is defined based on bottom-$k$-lists in an independent and identically distributed (iid) continuous random sequence. Probability mass function (pmf) of a selected exceedance statistic is obtained. Also, an illustrative application of the exceedance statistic is given.

Citations: 0
General classes of bivariate distributions for modeling data with common observations
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-07-20 DOI: 10.1007/s00362-024-01589-3
Na Young Yoo, Ji Hwan Cha

In analyzing bivariate data sets, data with common observations are frequently encountered and, in this case, existing absolutely continuous bivariate distributions are not applicable. Only a few models, such as the bivariate distribution proposed by Marshall and Olkin (J Am Stat Assoc 62(317):30–44, 1967), have been developed to model such data sets and the choice of models to fit data sets having common observations is very limited. In this paper, three general classes of bivariate distributions for modeling data with common observations are developed. To develop the bivariate distributions, we employ a probability model in reliability. Considering a system with two components, it is assumed that, when the first failure of the components occurs, with some probability, it immediately causes the failure of the remaining component, and, with complementary probability, the residual lifetime of the remaining component is shortened according to some stochastic order. It will be shown that, by specifying the underlying distributions contained in the joint distribution, numerous families of bivariate distributions can be generated. Therefore, this work provides substantially increased flexibility in modeling data sets with common observations. The developed models are fitted to two real-life data sets and it is shown that these models outperform the existing models in terms of fitting performance and their performances are satisfactory.

Citations: 0
Hotelling $T^2$ test in high dimensions with application to Wilks outlier method
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-07-19 DOI: 10.1007/s00362-024-01587-5
Reza Modarres

We consider the Hotelling $T^2$ test in the low-sample-size, high-dimensional setting. We partition the $p$ variables into $b>1$ blocks of $p/b$ variables and use the union-intersection principle to propose a testing procedure that computes the $T^2$ test in each block. We show that the proposed method is more powerful than the Hotelling $T^2$ test. We also consider the Wilks method of outlier detection and use the union-intersection principle to search for outliers in blocks of variables. The significance level and the power function of the new test are investigated. We show that the new outlier detection method produces more power compared to the Wilks test.

Citations: 0
On the functional regression model and its finite-dimensional approximations
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-07-10 DOI: 10.1007/s00362-024-01567-9
José R. Berrendero, Alejandro Cholaquidis, Antonio Cuevas

The problem of linearly predicting a scalar response $Y$ from a functional (random) explanatory variable $X = X(t)$, $t \in I$, is considered. It is argued that the term “linearly” can be interpreted in several meaningful ways. Thus, one could interpret that (up to a random noise) $Y$ could be expressed as a linear combination of a finite family of marginals $X(t_i)$ of the process $X$, or a limit of a sequence of such linear combinations. This simple point of view (which has some precedents in the literature) leads to a formulation of the linear model in terms of the RKHS space generated by the covariance function of the process $X(t)$. It turns out that such RKHS-based formulation includes the standard functional linear model, based on the inner product in the space $L^2[0,1]$, as a particular case. It includes as well all models in which $Y$ is assumed to be (up to an additive noise) a linear combination of a finite number of linear projections of $X$. Some consistency results are proved which, in particular, lead to an asymptotic approximation of the predictions derived from the general (functional) linear model in terms of finite-dimensional models based on a finite family of marginals $X(t_i)$, for an increasing grid of points $t_j$ in $I$. We also include a discussion on the crucial notion of coefficient of determination (aimed at assessing the fit of the model) in this setting. A few experimental results are given.

Citations: 0
Additive partial linear models with autoregressive symmetric errors and its application to the hospitalizations for respiratory diseases
IF 1.3 Zone 3 (Mathematics) Q2 STATISTICS & PROBABILITY Pub Date: 2024-07-09 DOI: 10.1007/s00362-024-01590-w
Shu Wei Chou-Chen, Rodrigo A. Oliveira, Irina Raicher, Gilberto A. Paula

Additive partial linear models with symmetric autoregressive errors of order p are proposed in this paper for modeling time series data. Specifically, we apply this model class to explain the weekly hospitalization for respiratory diseases in Sorocaba, São Paulo, Brazil, by incorporating climate and pollution as covariates, trend and seasonality. The main feature of this model class is its capability of considering a set of explanatory variables with linear and nonlinear structures, which allows, for example, to model jointly trend and seasonality of a time series with additive functions for the nonlinear explanatory variables and a predictor to accommodate discrete and linear explanatory variables. Additionally, the conditional symmetric errors allow the possibility of fitting data with high correlation order, as well as error distributions with heavier or lighter tails than the normal ones. We present the model class and a novel iterative process is derived by combining a P-GAM type algorithm with a quasi-Newton procedure for the parameter estimation. The inferential results, diagnostic procedures, including conditional quantile residual analysis and local influence analysis for sensitivity, are discussed. Simulation studies are performed to assess finite sample properties of parametric and nonparametric estimators. Finally, the data set analysis and concluding remarks are given.
