
Scandinavian Journal of Statistics: latest articles

Consistent covariances estimation for stratum imbalances under minimization method for covariate-adaptive randomization
IF 1, Region 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2023-12-26, DOI: 10.1111/sjos.12703
Zixuan Zhao, Yanglei Song, Wenyu Jiang, Dongsheng Tu
Pocock and Simon's minimization method is a popular approach for covariate-adaptive randomization in clinical trials. Valid statistical inference with data collected under the minimization method requires knowledge of the limiting covariance matrix of within-stratum imbalances, whose existence was established only recently. In this work, we propose a bootstrap-based estimator for this limit and establish its consistency, in particular by Le Cam's third lemma. As an application, we consider in simulation studies adjustments to existing robust tests for treatment effects with survival data by the proposed estimator. The simulations show that the adjusted tests achieve a size close to the nominal level and that, unlike under other designs, the robust tests without adjustment may suffer from asymptotic size inflation under the minimization method.
Citations: 0
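To make the notion of within-stratum imbalance concrete, here is a minimal sketch of a minimization-style assignment rule for two arms. It is not the authors' code: the equal covariate weights, the absolute-difference imbalance measure, the biased-coin probability `p_best`, and the function names are all assumptions made for illustration.

```python
import random


def minimization_assign(new_levels, history, p_best=0.8, rng=random):
    """Assign one patient to arm 0 or 1 by a Pocock-Simon-style minimization rule.

    new_levels: tuple of covariate levels of the incoming patient.
    history:    list of (levels, arm) pairs for patients already randomized.
    p_best:     probability of picking the imbalance-minimizing arm (assumed value).
    """
    imbalance = []
    for arm in (0, 1):
        total = 0
        for k, level in enumerate(new_levels):
            # Marginal arm counts among earlier patients sharing this covariate level.
            counts = [0, 0]
            for levels, a in history:
                if levels[k] == level:
                    counts[a] += 1
            counts[arm] += 1  # hypothetically place the new patient in `arm`
            total += abs(counts[0] - counts[1])
        imbalance.append(total)
    if imbalance[0] == imbalance[1]:
        return rng.choice((0, 1))  # tie: fair coin
    best = 0 if imbalance[0] < imbalance[1] else 1
    return best if rng.random() < p_best else 1 - best


def within_stratum_imbalances(history):
    """Return {stratum: (#patients in arm 0) - (#patients in arm 1)}."""
    out = {}
    for levels, arm in history:
        out[levels] = out.get(levels, 0) + (1 if arm == 0 else -1)
    return out
```

Running the rule over a stream of simulated patients and collecting `within_stratum_imbalances` gives the vector whose limiting covariance matrix the abstract refers to; the bootstrap estimator itself is not reproduced here.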
Confidence Bands for Survival Curves from Outcome-Dependent Stratified Samples
IF 1, Region 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2023-12-21, DOI: 10.1111/sjos.12700
Takumi Saegusa, Peter Nandori
We consider the construction of confidence bands for survival curves under outcome-dependent stratified sampling. A main challenge of this design is that the data are a biased, dependent sample due to stratification and sampling without replacement. Most literature on regression approximates this design by Bernoulli sampling, but the variance is then generally overestimated. Even with this approximation, the limiting distribution of the inverse probability weighted Kaplan-Meier estimator involves a general Gaussian process, and hence quantiles of its supremum are not analytically available. In this paper, we provide a rigorous asymptotic theory for the weighted Kaplan-Meier estimator accounting for dependence in the sample. We propose a novel hybrid method that both simulates and bootstraps parts of the limiting process to compute confidence bands with asymptotically correct coverage probability. A simulation study indicates that the proposed bands are appropriate for practical use. A Wilms tumor example is presented.
Citations: 0
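As a rough illustration of the point estimator mentioned in this abstract, the following sketch computes an inverse probability weighted Kaplan-Meier curve. It assumes each subject's weight is the reciprocal of a known sampling probability; the function name and input layout are hypothetical, and the confidence bands from the paper are not implemented.

```python
def weighted_kaplan_meier(times, events, weights):
    """Inverse-probability-weighted Kaplan-Meier estimate.

    times:   observed follow-up times.
    events:  1 if the event was observed, 0 if censored.
    weights: sampling weights, assumed to be 1 / (inclusion probability).
    Returns a list of (t, S(t)) pairs at each distinct event time.
    """
    data = sorted(zip(times, events, weights))
    at_risk = sum(w for _, _, w in data)  # weighted size of the risk set
    surv = 1.0
    curve = []
    i, n = 0, len(data)
    while i < n:
        t = data[i][0]
        d = 0.0        # weighted number of events at time t
        removed = 0.0  # weighted number leaving the risk set at time t
        while i < n and data[i][0] == t:
            _, e, w = data[i]
            if e:
                d += w
            removed += w
            i += 1
        if d > 0:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve
```

With all weights equal to one this reduces to the ordinary Kaplan-Meier estimator, which is a convenient sanity check.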
G-optimal grid designs for kriging models
IF 1, Region 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2023-12-11, DOI: 10.1111/sjos.12699
Subhadra Dasgupta, Siuli Mukhopadhyay, Jonathan Keith
This work is focused on finding G-optimal designs theoretically for kriging models with two-dimensional inputs and separable exponential covariance structures. For design comparison, the notion of evenness of two-dimensional grid designs is developed. The mathematical relationship between the design and the supremum of the mean squared prediction error (SMSPE) function is studied, and optimal designs are then explored for both prospective and retrospective design scenarios. In the case of prospective designs, the new design is developed before the experiment is conducted, and the regularly spaced grid is shown to be the G-optimal design. Retrospective designs are constructed by adding points to or deleting points from an already existing design. Deterministic algorithms are developed to find the best possible retrospective designs (those that minimize the SMSPE). It is found that a more evenly spread design under the G-optimality criterion leads to the best possible retrospective design. For all the cases of finding the optimal prospective designs and the best possible retrospective designs, both frequentist and Bayesian frameworks have been considered. The proposed methodology for finding retrospective designs is illustrated with a spatio-temporal river water quality monitoring experiment.
Citations: 0
Nonparametric conditional mean testing via an extreme-type statistic in high dimension
IF 1, Region 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2023-12-02, DOI: 10.1111/sjos.12697
Yiming Liu, Guangming Pan, Guangren Yang, Wang Zhou
We propose a new test to investigate the conditional mean dependence between a response variable and the corresponding covariates in high-dimensional regimes. The test statistic is an extreme-type one built on a nonparametric method. The limiting null distribution of the proposed extreme-type statistic under a mild mixing condition is established. Moreover, to make the test more powerful in general structures, we propose a more general test statistic and develop its asymptotic properties. The power analysis of both methods is also considered. In real data analysis, we also propose a new way to conduct feature screening based on our results. To evaluate the performance of our estimators and other methods, extensive simulations are conducted.
Citations: 0
Modelling multivariate extreme value distributions via Markov trees*
IF 1, Region 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2023-11-30, DOI: 10.1111/sjos.12698
Shuang Hu, Zuoxiang Peng, Johan Segers
Multivariate extreme value distributions are a common choice for modelling multivariate extremes. In high dimensions, however, the construction of flexible and parsimonious models is challenging. We propose to combine bivariate max-stable distributions into a Markov random field with respect to a tree. Although in general not max-stable itself, this Markov tree is attracted by a multivariate max-stable distribution. The latter serves as a tree-based approximation to an unknown max-stable distribution with the given bivariate distributions as margins. Given data, we learn an appropriate tree structure by Prim's algorithm with estimated pairwise upper tail dependence coefficients as edge weights. The distributions of pairs of connected variables can be fitted in various ways. The resulting tree-structured max-stable distribution allows for inference on rare event probabilities, as illustrated on river discharge data from the upper Danube basin.
Citations: 4
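The tree-learning step described in this abstract can be sketched as follows. The sketch assumes a simple rank-based estimator of the upper tail dependence coefficient using the `k` largest observations, and it assumes that the tree should maximize the total edge weight; both choices, along with the function names, are illustrative and may differ from what the paper actually uses.

```python
def ranks(x):
    """Ranks 1..n of the entries of x (ties broken by position)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def tail_dependence(x, y, k):
    """Empirical upper tail dependence: share of the k largest x that are also among the k largest y."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    joint = sum(1 for a, b in zip(rx, ry) if a > n - k and b > n - k)
    return joint / k


def prim_max_spanning_tree(weights):
    """Prim's algorithm on a symmetric d x d weight matrix, maximizing total weight.

    Returns the tree as a list of edges (i, j), grown from node 0.
    """
    d = len(weights)
    in_tree = {0}
    edges = []
    while len(in_tree) < d:
        best = None
        for i in in_tree:
            for j in range(d):
                if j not in in_tree and (best is None or weights[i][j] > best[0]):
                    best = (weights[i][j], i, j)
        _, i, j = best
        edges.append((i, j))
        in_tree.add(j)
    return edges
```

In use, one would fill `weights[i][j]` with `tail_dependence` computed for every pair of variables and then fit a bivariate max-stable model on each returned edge.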
Accurate bias estimation with applications to focused model selection
IF 1, Region 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2023-11-14, DOI: 10.1111/sjos.12696
Ingrid Dæhlen, Nils Lid Hjort, Ingrid Hobæk Haff
We derive approximations to the bias and squared bias with errors of order o(1/n), where n is the sample size. Our results hold for a large class of estimators, including quantiles, transformations of unbiased estimators, maximum likelihood estimators in (possibly) incorrectly specified models, and functions thereof. Furthermore, we use the approximations to derive estimators of the mean squared error (MSE) which are correct to order o(1/n). Since the variance of many estimators is of order O(1/n), this level of precision is needed for the MSE estimator to properly take the variance into account. We also formulate a new focused information criterion (FIC) for model selection based on the estimators of the squared bias. Lastly, we illustrate the methods on data containing the number of battle deaths in all major inter-state wars between 1823 and the present day. The application illustrates the potentially large impact of using a less-accurate estimator of the squared bias.
Citations: 0
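Why this order of accuracy matters can be written out with the standard decomposition of the mean squared error. The symbols used here (an estimator \hat\theta of \theta, its bias b_n, and an estimate \hat b_n^2 of the squared bias) are notation chosen for illustration, not taken from the paper.

```latex
\operatorname{MSE}(\hat\theta) = \operatorname{Var}(\hat\theta) + b_n^2,
\qquad b_n = \mathbb{E}[\hat\theta] - \theta,
\qquad \operatorname{Var}(\hat\theta) = O(1/n).
% If the squared bias is estimated with error o(1/n), i.e. \hat b_n^2 = b_n^2 + o(1/n),
% that error is asymptotically negligible next to the O(1/n) variance term.
% An error that is only O(1/n) would be of the same order as the variance itself.
```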
A New Paradigm for High-dimensional Data: Distance-Based Semiparametric Feature Aggregation Framework via Between-Subject Attributes
Region 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2023-11-08, DOI: 10.1111/sjos.12695
Jinyuan Liu, Xinlian Zhang, Tuo Lin, Ruohui Chen, Yuan Zhong, Tian Chen, Tsungchin Wu, Chenyu Liu, Anna Huang, Tanya T. Nguyen, Ellen E. Lee, Dilip V. Jeste, Xin M. Tu
This article proposes a distance-based framework motivated by the paradigm shift towards feature aggregation for high-dimensional data, which does not rely on the sparse-feature assumption or on permutation-based inference. Focusing on distance-based outcomes that preserve information without truncating any features, a class of semiparametric regression models is developed, which encapsulates multiple sources of high-dimensional variables using pairwise outcomes of between-subject attributes. Further, we propose a strategy to address the interlocking correlations among pairs via U-statistics-based estimating equations (UGEE), which correspond to their unique efficient influence function (EIF). Hence, the resulting semiparametric estimators are robust to distributional misspecification while enjoying root-n consistency and asymptotic optimality to facilitate inference. In essence, the proposed approach not only circumvents information loss due to feature selection but also improves the model's interpretability and computational feasibility. Simulation studies and applications to human microbiome and wearables data are provided, where the feature dimensions are in the tens of thousands.
Citations: 0
Maximum likelihood estimator for skew Brownian motion: the convergence rate
Region 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2023-11-02, DOI: 10.1111/sjos.12694
Antoine Lejay, Sara Mazzonetto
We give a thorough description of the asymptotic property of the maximum likelihood estimator (MLE) of the skewness parameter of a Skew Brownian Motion (SBM). Thanks to recent results on the Central Limit Theorem of the rate of convergence of estimators for the SBM, we prove a conjecture left open that the MLE has asymptotically a mixed normal distribution involving the local time with a rate of convergence of order . We also give a series expansion of the MLE and study the asymptotic behavior of the score and its derivatives, as well as their variation with the skewness parameter. In particular, we exhibit a specific behavior when the SBM is actually a Brownian motion, and quantify the explosion of the coefficients of the expansion when the skewness parameter is close to or 1.
Citations: 0
Estimation of the Adjusted Standard-deviatile for Extreme Risks
Region 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2023-10-22, DOI: 10.1111/sjos.12693
Haoyu Chen, Tiantian Mao, Fan Yang
In this paper, we modify the Bayes risk for the expectile, the so-called variantile risk measure, to better capture extreme risks. The modified risk measure is called the adjusted standard-deviatile. First, we derive the asymptotic expansions of the adjusted standard-deviatile. Next, based on the first-order asymptotic expansion, we propose two efficient estimation methods for the adjusted standard-deviatile at intermediate and extreme levels. By using techniques from extreme value theory, the asymptotic normality is proved for both estimators for independent and identically distributed observations and for -mixing time series, respectively. Simulations and real data applications are conducted to examine the performance of the proposed estimators.
Citations: 0
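For readers unfamiliar with expectiles, the sketch below computes a sample expectile as the minimizer of an asymmetric squared loss and evaluates the minimized loss, which is a Bayes-risk-type quantity. The abstract does not give the exact normalization of the variantile or the definition of the adjusted standard-deviatile, so neither is implemented; the function names and the fixed-point scheme are assumptions for illustration.

```python
def sample_expectile(xs, tau, tol=1e-12, max_iter=1000):
    """Sample expectile at level tau via iteratively reweighted means.

    The expectile e solves tau * E[(X - e)+] = (1 - tau) * E[(e - X)+],
    which is equivalent to e being a weighted mean with weight tau above e
    and weight 1 - tau at or below e.
    """
    e = sum(xs) / len(xs)  # tau = 0.5 gives the mean, so start there
    for _ in range(max_iter):
        w = [tau if x > e else 1.0 - tau for x in xs]
        new_e = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
        if abs(new_e - e) < tol:
            return new_e
        e = new_e
    return e


def asymmetric_squared_loss(xs, e, tau):
    """Average asymmetric squared loss of the point e at level tau."""
    return sum((tau if x > e else 1.0 - tau) * (x - e) ** 2 for x in xs) / len(xs)
```

At `tau = 0.5` the expectile is the sample mean and the minimized loss is half the (population-style) sample variance, which gives a quick check of the code.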
Nearly Unstable Integer-Valued ARCH Process and Unit Root Testing
Region 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2023-10-19, DOI: 10.1111/sjos.12689
Wagner Barreto-Souza, Ngai Hang Chan
This paper introduces a Nearly Unstable INteger-valued AutoRegressive Conditional Heteroscedastic (NU-INARCH) process for dealing with count time series data. It is proved that a proper normalization of the NU-INARCH process weakly converges to a Cox–Ingersoll–Ross diffusion in the Skorohod topology. The asymptotic distribution of the conditional least squares estimator of the correlation parameter is established as a functional of certain stochastic integrals. Numerical experiments based on Monte Carlo simulations are provided to verify the behavior of the asymptotic distribution under finite samples. These simulations reveal that the nearly unstable approach provides satisfactory and better results than those based on the stationarity assumption even when the true process is not that close to nonstationarity. A unit root test is proposed and its Type-I error and power are examined via Monte Carlo simulations. As an illustration, the proposed methodology is applied to the daily number of deaths due to COVID-19 in the United Kingdom.
Citations: 0