Annals of the Institute of Statistical Mathematics: Latest Articles

Model averaging for semiparametric varying coefficient quantile regression models
IF 1 Zone 4 (Mathematics) Q2 Mathematics Pub Date: 2022-12-22 DOI: 10.1007/s10463-022-00857-z
Zishu Zhan, Yang Li, Yuhong Yang, Cunjie Lin

In this study, we propose a model averaging approach to estimating conditional quantiles based on a set of semiparametric varying coefficient models. Unlike the existing literature on the subject, we consider a particular form for all candidates, in which each sub-model has only one varying coefficient, and all the candidates under investigation may be misspecified. We propose a weight choice criterion based on a leave-more-out cross-validation objective function. Moreover, the resulting averaging estimator is more robust against model misspecification because the weighted coefficients adjust the relative importance of the varying and constant coefficients for the same predictors. We establish statistical properties for each sub-model and the asymptotic optimality of the weight selection method. Simulation studies show that the proposed procedure has satisfactory prediction accuracy. An analysis of skin cutaneous melanoma data further supports the merits of the proposed approach.
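To make the idea of choosing averaging weights by a quantile loss concrete, here is a minimal sketch in Python. It is not the authors' leave-more-out criterion: it assumes only two hypothetical candidate predictors, a single held-out sample, and a simple grid search over one weight. The names `check_loss` and `choose_weight` are illustrative.

```python
def check_loss(residual, tau):
    """Quantile (check) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def choose_weight(y, pred_a, pred_b, tau, grid_size=101):
    """Grid-search the weight w in [0, 1] on candidate A (1 - w on candidate B)
    that minimizes the mean check loss on held-out responses y."""
    best_w, best_loss = 0.0, float("inf")
    for i in range(grid_size):
        w = i / (grid_size - 1)
        loss = sum(
            check_loss(yi - (w * a + (1.0 - w) * b), tau)
            for yi, a, b in zip(y, pred_a, pred_b)
        ) / len(y)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss


# Hypothetical held-out data: candidate A predicts perfectly, candidate B does not.
y_holdout = [1.0, 2.0, 3.0, 4.0]
best_w, best_loss = choose_weight(y_holdout, [1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0], tau=0.5)
```

In this toy case all weight goes to the candidate that reproduces the held-out responses. With more than two candidates the grid search would typically be replaced by a constrained optimization over the weight simplex.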

Citations: 1
Slash distributions, generalized convolutions, and extremes
IF 1 Zone 4 (Mathematics) Q2 Mathematics Pub Date: 2022-12-20 DOI: 10.1007/s10463-022-00858-y
M. Arendarczyk, T. J. Kozubowski, A. K. Panorska

An \(\alpha\)-slash distribution built upon a random variable X is a heavy-tailed distribution corresponding to \(Y = X/U^{1/\alpha}\), where U is a standard uniform random variable independent of X. We point out and explore a connection between \(\alpha\)-slash distributions, which are gaining popularity in statistical practice, and generalized convolutions, which arise in probability theory as generalizations of the standard convolution of probability measures and allow the operation between the measures to be random itself. The stochastic interpretation of the Kendall convolution discussed in this work brings this theoretical concept closer to statistical practice and leads to new results for \(\alpha\)-slash distributions connected with extremes. In particular, we show that the maximum of independent random variables with \(\alpha\)-slash distributions is also a random variable with an \(\alpha\)-slash distribution. Our theoretical results are illustrated by several examples involving standard and novel probability distributions and extremes.
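The construction Y = X / U^(1/alpha) is easy to simulate. The sketch below, which uses only the Python standard library, draws pairs (X, Y) under the assumption that X is standard normal. The function name `sample_slash` and the choice of base distribution are illustrative and do not come from the paper.

```python
import random


def sample_slash(draw_x, alpha, n, seed=0):
    """Return n pairs (x, y) with y = x / u**(1/alpha), where u is uniform
    on (0, 1] and drawn independently of x."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = draw_x(rng)
        u = 1.0 - rng.random()  # lies in (0, 1], so the division is always defined
        pairs.append((x, x / u ** (1.0 / alpha)))
    return pairs


# Assumed base variable: standard normal X, with alpha = 2.
pairs = sample_slash(lambda rng: rng.gauss(0.0, 1.0), alpha=2.0, n=1000)
largest = max(abs(y) for _, y in pairs)
```

Because U^(1/alpha) never exceeds 1, each |Y| is at least as large as the corresponding |X|, which is the mechanism that produces the heavier tails.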

Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10463-022-00858-y.pdf
Citations: 4
A unified precision matrix estimation framework via sparse column-wise inverse operator under weak sparsity
IF 1 Zone 4 (Mathematics) Q2 Mathematics Pub Date: 2022-12-08 DOI: 10.1007/s10463-022-00856-0
Zeyu Wu, Cheng Wang, Weidong Liu

In this paper, we estimate the high-dimensional precision matrix under the weak sparsity condition, where many entries are nearly zero. We revisit the sparse column-wise inverse operator estimator and derive its general error bounds under the weak sparsity condition. A unified framework is established to deal with various cases, including heavy-tailed data, non-paranormal data, and matrix-variate data. These new methods achieve the same convergence rates as the existing methods and can be implemented efficiently.

Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10463-022-00856-0.pdf
Citations: 0
Data-driven model selection for same-realization predictions in autoregressive processes
IF 1 Zone 4 (Mathematics) Q2 Mathematics Pub Date: 2022-11-27 DOI: 10.1007/s10463-022-00855-1
Kare Kamila

This paper concerns one-step-ahead prediction of future values of observations drawn from an infinite-order autoregressive AR(\(\infty\)) process. It aims to design fully data-driven penalties ensuring that the selected model satisfies the efficiency property in a non-asymptotic framework. We show that the excess risk of the selected estimator enjoys the best bias-variance trade-off over the considered collection. To achieve these results, we overcome the difficulties caused by dependence by following a classical approach, which consists in restricting attention to a set where the empirical covariance matrix is equivalent to the theoretical one. We show that this event happens with probability larger than \(1 - c_0/n^2\) with \(c_0 > 0\). The proposed data-driven criteria are based on the minimization of a penalized criterion akin to Mallows's \(C_p\).
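For orientation, the following sketch shows what a classical C_p-style order selection for autoregressive models can look like. It is not the data-driven penalty proposed in the paper. It assumes Yule-Walker fits via the Levinson-Durbin recursion, a fixed penalty of 2 * p * sigma2 / n, and a simulated AR(1) series. All function names are illustrative.

```python
import random


def autocovariances(x, max_lag):
    """Biased sample autocovariances r[0..max_lag] (divided by n)."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[t] * c[t - k] for t in range(k, n)) / n for k in range(max_lag + 1)]


def innovation_variances(r, max_order):
    """Levinson-Durbin recursion: one-step prediction error variance
    of the Yule-Walker AR(p) fit for p = 0..max_order."""
    coeffs, err = [], r[0]
    errs = [err]
    for k in range(1, max_order + 1):
        kappa = (r[k] - sum(coeffs[j] * r[k - 1 - j] for j in range(k - 1))) / err
        coeffs = [coeffs[j] - kappa * coeffs[k - 2 - j] for j in range(k - 1)] + [kappa]
        err *= 1.0 - kappa * kappa
        errs.append(err)
    return errs


def select_order(x, max_order):
    """Pick the order minimizing err_p + 2 * p * sigma2 / n (C_p-style penalty)."""
    n = len(x)
    errs = innovation_variances(autocovariances(x, max_order), max_order)
    sigma2 = errs[-1]  # noise variance estimated from the largest candidate model
    scores = [errs[p] + 2.0 * p * sigma2 / n for p in range(max_order + 1)]
    return min(range(max_order + 1), key=scores.__getitem__), scores


# Hypothetical data: an AR(1) series with coefficient 0.6 and unit-variance noise.
rng = random.Random(0)
series, prev = [], 0.0
for _ in range(2000):
    prev = 0.6 * prev + rng.gauss(0.0, 1.0)
    series.append(prev)

best_order, scores = select_order(series, max_order=8)
```

The first term shrinks as the order grows (less bias) while the penalty grows linearly in p (more variance), so the minimizer balances the two.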

Citations: 0
Bootstrap method for misspecified ergodic Lévy driven stochastic differential equation models
IF 1 Zone 4 (Mathematics) Q2 Mathematics Pub Date: 2022-11-10 DOI: 10.1007/s10463-022-00854-2
Yuma Uehara

In this paper, we consider possibly misspecified stochastic differential equation models driven by Lévy processes. Regardless of whether the driving noise is Gaussian, the Gaussian quasi-likelihood estimator can estimate unknown parameters in the drift and scale coefficients. However, in the misspecified case, the asymptotic distribution of the estimator changes because of the correction for the misspecification bias, and consistent estimators of the asymptotic variance proposed for the correctly specified case may lose theoretical validity. As one solution, we propose a bootstrap method for approximating the asymptotic distribution. We show that our bootstrap method is theoretically valid in both the correctly specified and misspecified cases without assuming the precise distribution of the driving noise.

Citations: 0
Tests for the existence of group effects and interactions for two-way models with dependent errors
IF 1 Zone 4 (Mathematics) Q2 Mathematics Pub Date: 2022-10-31 DOI: 10.1007/s10463-022-00853-3
Yuichi Goto, Kotone Suzuki, Xiaofei Xu, Masanobu Taniguchi

In this paper, we propose tests for the existence of random effects and interactions for two-way models with dependent errors. We prove that the proposed tests are asymptotically distribution-free, have asymptotic size \(\tau\), and are consistent. We elucidate the nontrivial power under the local alternative when the sample size tends to infinity and the number of groups is fixed. A simulation study is performed to investigate the finite-sample performance of the proposed tests. In the real data analysis, we apply our tests to the daily log-returns of 24 stock prices from six countries and four sectors. We find no strong evidence to support the existence of substantial differences in the log-returns across countries, or the existence of interactions between countries and sectors. However, there exist random-effect differences in the daily log-return series across different sectors.

Citations: 0
Robust estimation for nonrandomly distributed data
IF 1 Zone 4 (Mathematics) Q2 Mathematics Pub Date: 2022-10-12 DOI: 10.1007/s10463-022-00852-4
Shaomin Li, Kangning Wang, Yong Xu

In recent years, many methodologies for distributed data have been developed. However, there are two problems. First, most of these methods require the data to be randomly and uniformly distributed across different machines. Second, most of these methods are not robust. To solve these problems, we propose a distributed pilot modal regression (MR) estimator, which achieves robustness and can adapt when the data are stored nonrandomly. First, we collect a random pilot sample from different machines; then, we approximate the global MR objective function by a communication-efficient surrogate that can be efficiently evaluated using the pilot sample and the local gradients. The final estimator is obtained by minimizing the surrogate function on the master machine, while the other machines only need to calculate their gradients. Theoretical results show that the new estimator is asymptotically as efficient as the global MR estimator. Simulation studies illustrate the utility of the proposed approach.

Citations: 0
Matrix completion under complex survey sampling
IF 1 Zone 4 (Mathematics) Q2 Mathematics Pub Date: 2022-09-19 DOI: 10.1007/s10463-022-00851-5
Xiaojun Mao, Zhonglei Wang, Shu Yang

Multivariate nonresponse is often encountered in complex survey sampling, and simply ignoring it leads to erroneous inference. In this paper, we propose a new matrix completion method for complex survey sampling. Unlike existing works that conduct either row-wise or column-wise imputation, the proposed method treats the data matrix as a whole, which allows both row and column patterns to be exploited simultaneously. A column-space-decomposition model is adopted, incorporating a low-rank structured matrix for the finite population with easy-to-obtain demographic information as covariates. In addition, we propose a computationally efficient projection strategy to identify the model parameters under complex survey sampling. Then, an augmented inverse probability weighting estimator is used to estimate the parameter of interest, and the corresponding asymptotic upper bound of the estimation error is derived. Simulation studies show that the proposed estimator has a smaller mean squared error than its competitors, and the corresponding variance estimator performs well. The proposed method is applied to assess the health status of the U.S. population.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10465119/pdf/nihms-1875523.pdf
Citations: 1
Inhomogeneous hidden semi-Markov models for incompletely observed point processes
IF 1 Zone 4 (Mathematics) Q2 Mathematics Pub Date: 2022-09-18 DOI: 10.1007/s10463-022-00843-5
Amina Shahzadi, Ting Wang, Mark Bebbington, Matthew Parry

A general class of inhomogeneous hidden semi-Markov models (IHSMMs) is proposed for modelling partially observed processes that do not necessarily behave in a stationary and memoryless manner. The key feature of the proposed model is that the sojourn times of the states in the semi-Markov chain are time-dependent, making it an inhomogeneous semi-Markov chain. The conjectured consistency of the parameter estimators is checked by a simulation study using direct numerical optimization of the log-likelihood function. The proposed models are applied to a global volcanic eruption catalogue to investigate the time-dependent incompleteness of the record by introducing a particular case of IHSMMs with time-dependent shifted Poisson state durations and a renewal process as the observed process. The Akaike Information Criterion and residual analysis are used to choose the best model. The selected IHSMM provides useful insights into the completeness of the global record of volcanic eruptions, demonstrating the effectiveness of this method.

Citations: 1
Regression analysis for exponential family data in a finite population setup using two-stage cluster sample
IF 1 Zone 4 (Mathematics) Q2 Mathematics Pub Date: 2022-09-14 DOI: 10.1007/s10463-022-00850-6
Brajendra C. Sutradhar

Over the last four decades, cluster regression analysis in a finite population (FP) setup for exponential family data, such as linear or binary data, has been carried out using a two-stage cluster sample chosen from the FP while treating that sample as though it were a single-stage cluster sample from a super-population (SP) that contains the FP as a hypothetical sample. Because the responses within a cluster in the FP are correlated, this sample mis-specification makes the sample-based so-called GLS (generalized least squares) estimators design-biased and inconsistent. In this paper, we demonstrate for exponential family data how to avoid the sampling mis-specification and accommodate the cluster correlations to obtain unbiased and consistent estimates of the FP parameters. The asymptotic normality of the regression estimators is also given for the construction of confidence intervals when needed.

Citations: 0