Annals of the Institute of Statistical Mathematics: Latest Articles

Author’s rejoinder to the discussion of the Akaike Memorial Lecture 2020

IF 1 · Zone 4 (Mathematics) · Q3 STATISTICS & PROBABILITY

Annals of the Institute of Statistical Mathematics

Pub Date : 2022-05-19 DOI: 10.1007/s10463-022-00830-w

John Copas

Citations: 0

Inference of random effects for linear mixed-effects models with a fixed number of clusters

Pub Date : 2022-05-14 DOI: 10.1007/s10463-022-00825-7

Chih-Hao Chang, Hsin-Cheng Huang, Ching-Kang Ing

We consider a linear mixed-effects model with a clustered structure, where the parameters are estimated using maximum likelihood (ML) based on possibly unbalanced data. Inference with this model is typically done based on asymptotic theory, assuming that the number of clusters tends to infinity with the sample size. However, when the number of clusters is fixed, classical asymptotic theory developed under a divergent number of clusters is no longer valid and can lead to erroneous conclusions. In this paper, we establish the asymptotic properties of the ML estimators of random-effects parameters under a general setting, which can be applied to conduct valid statistical inference with fixed numbers of clusters. Our asymptotic theorems allow both fixed effects and random effects to be misspecified, and the dimensions of both effects to go to infinity with the sample size.

Citations: 1

On comparing competing risks using the ratio of their cumulative incidence functions

Pub Date : 2022-05-14 DOI: 10.1007/s10463-022-00823-9

Hammou El Barmi

For \(1 \le i \le r\), let \(F_i\) be the cumulative incidence function (CIF) corresponding to the \(i\)th risk in an \(r\)-competing risks model. We assume a discrete or a grouped time framework and obtain the maximum likelihood estimators (m.l.e.) of these CIFs under the restriction that \(F_i(t)/F_{i+1}(t)\) is nondecreasing, \(1 \le i \le r-1\). We also derive the likelihood ratio tests for testing for and against this restriction and obtain their asymptotic distributions. The theory developed here can also be used to investigate the association between a failure time and a discretized or ordinal mark variable that is observed only at the time of failure. To illustrate the applicability of our results, we give examples in the competing risks and the mark variable settings.
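To make the ordering restriction concrete, the following is a minimal sketch, not the restricted maximum likelihood estimator described in the abstract. It assumes uncensored data on a discrete time grid, computes the plain empirical CIFs \(F_i(t) = P(T \le t, \text{cause} = i)\), and checks whether the ratio of two of them is nondecreasing. The function names `empirical_cifs` and `ratio_is_nondecreasing` and the toy data are hypothetical.

```python
def empirical_cifs(times, causes, grid, r):
    """Empirical CIFs on a discrete grid, assuming no censoring.

    Returns a list of r lists; entry [i-1][k] estimates
    F_i(grid[k]) = P(T <= grid[k], cause = i).
    """
    n = len(times)
    cifs = []
    for i in range(1, r + 1):
        row = [
            sum(1 for t, c in zip(times, causes) if c == i and t <= g) / n
            for g in grid
        ]
        cifs.append(row)
    return cifs


def ratio_is_nondecreasing(f_num, f_den, tol=1e-12):
    """Check whether f_num(t) / f_den(t) is nondecreasing over the grid.

    Grid points where the denominator is zero are skipped.
    """
    ratios = [a / b for a, b in zip(f_num, f_den) if b > 0]
    return all(later >= earlier - tol for earlier, later in zip(ratios, ratios[1:]))


# Toy data (hypothetical): six failures, two competing causes.
times = [1, 1, 2, 2, 3, 3]
causes = [2, 2, 1, 2, 1, 1]
cifs = empirical_cifs(times, causes, grid=[1, 2, 3], r=2)
print(ratio_is_nondecreasing(cifs[0], cifs[1]))  # is F_1 / F_2 nondecreasing?
```

A check like this only describes the unrestricted estimates; the paper's estimators enforce the restriction during estimation, which this sketch does not attempt.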

Citations: 0

Asymptotic equivalence for nonparametric regression with dependent errors: Gauss–Markov processes

Pub Date : 2022-05-09 DOI: 10.1007/s10463-022-00826-6

Holger Dette, Martin Kroll

For the class of Gauss–Markov processes we study the problem of asymptotic equivalence of the nonparametric regression model with errors given by the increments of the process and the continuous time model, where a whole path of a sum of a deterministic signal and the Gauss–Markov process can be observed. We derive sufficient conditions which imply asymptotic equivalence of the two models. We verify these conditions for the special cases of Sobolev ellipsoids and Hölder classes with smoothness index \(>1/2\) under mild assumptions on the Gauss–Markov process. To give a counterexample, we show that asymptotic equivalence fails to hold for the special case of Brownian bridge. Our findings demonstrate that the well-known asymptotic equivalence of the Gaussian white noise model and the nonparametric regression model with i.i.d. standard normal errors (see Brown and Low (Ann Stat 24:2384–2398, 1996)) can be extended to a setup with general Gauss–Markov noises.

Citations: 0

A sequential feature selection procedure for high-dimensional Cox proportional hazards model

Pub Date : 2022-05-07 DOI: 10.1007/s10463-022-00824-8

Ke Yu, Shan Luo

Feature selection for the high-dimensional Cox proportional hazards model (Cox model) is very important in many microarray genetic studies. In this paper, we propose a sequential feature selection procedure for this model. We define a novel partial profile score to assess the impact of unselected features conditional on the current model; significant features are thereby added to the model sequentially, and the Extended Bayesian Information Criterion (EBIC) is adopted as a stopping rule. Under mild conditions, we show that this procedure is selection consistent. Extensive simulation studies and two real data applications are conducted to demonstrate the advantage of our proposed procedure over several representative approaches.
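For orientation, the EBIC mentioned above is commonly written in the generic form below. The abstract does not state the exact variant used for the Cox model, so this is an assumption: \(s\) denotes a candidate feature set of size \(|s|\), \(P\) the total number of features, \(n\) the sample size, \(\ell(\hat{\beta}_s)\) the maximized log (partial) likelihood of the model restricted to \(s\), and \(\gamma\) a tuning constant.

```latex
\[
\mathrm{EBIC}_{\gamma}(s) \;=\; -2\,\ell\bigl(\hat{\beta}_{s}\bigr) \;+\; |s|\,\log n \;+\; 2\gamma\,\log\binom{P}{|s|},
\qquad 0 \le \gamma \le 1 .
\]
```

Under this reading, a sequential procedure would plausibly stop adding features once the EBIC of the enlarged model no longer decreases; the paper's precise stopping rule may differ.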

Citations: 0

A blockwise network autoregressive model with application for fraud detection

Pub Date : 2022-05-04 DOI: 10.1007/s10463-022-00822-w

Bofei Xiao, Bo Lei, Wei Lan, Bin Guo

This paper proposes a blockwise network autoregressive (BWNAR) model that groups the nodes of a network into nonoverlapping blocks to accommodate networks with blockwise structures. Before modeling, we employ the pseudo likelihood ratio criterion (pseudo-LR), together with the standard spectral clustering approach and a binary segmentation method developed by Ma et al. (Journal of Machine Learning Research, 22, 1–63, 2021), to estimate the number of blocks and their memberships, respectively. Then, we establish the consistency and asymptotic normality of the estimator of the influence parameters obtained by quasi-maximum likelihood estimation, without imposing any distributional assumptions. In addition, a novel likelihood ratio test statistic is proposed to test for heterogeneity of the influence parameters. The performance and usefulness of the model are assessed through simulations and an empirical example on detecting fraud in financial transactions, respectively.

Citations: 1

Outcome regression-based estimation of conditional average treatment effect

Pub Date : 2022-04-29 DOI: 10.1007/s10463-022-00821-x

Lu Li, Niwen Zhou, Lixing Zhu

This paper systematically investigates the following issues. First, we construct different outcome regression-based estimators of the conditional average treatment effect under, respectively, true, parametric, nonparametric and semiparametric dimension reduction structures. Second, based on the corresponding asymptotic variance functions when the models are correctly specified, we answer the following questions: What is the asymptotic efficiency ranking of the four estimators in general? How is the efficiency related to the affiliation of the given covariates in the set of arguments of the regression functions? What roles do bandwidth and kernel function selection play in the estimation efficiency? And in which scenarios should the estimator under the semiparametric dimension reduction regression structure be used in practice? Meanwhile, the results show that any outcome regression-based estimation should be asymptotically more efficient than any inverse probability weighting-based estimation. Several simulation studies are conducted to examine the finite-sample performance of these estimators, and a real dataset is analyzed for illustration.

Volume 74, issue 5, pages 987–1041. PDF: https://link.springer.com/content/pdf/10.1007/s10463-022-00821-x.pdf

Citations: 3

On the rate of convergence of image classifiers based on convolutional neural networks

Pub Date : 2022-04-27 DOI: 10.1007/s10463-022-00828-4

Michael Kohler, Adam Krzyżak, Benjamin Walter

Image classifiers based on convolutional neural networks are defined, and the rate of convergence of the misclassification risk of the estimates towards the optimal misclassification risk is analyzed. Under suitable assumptions on the smoothness and structure of the a posteriori probability, a rate of convergence is shown that is independent of the dimension of the image. This proves that, in image classification, convolutional neural networks can circumvent the curse of dimensionality. Furthermore, the result indicates why convolutional neural networks are able to outperform standard feedforward neural networks in image classification. Our classifiers are compared with various other classification methods using simulated data, and the performance of our estimates is also tested on real images.

Citations: 11

Directed hybrid random networks mixing preferential attachment with uniform attachment mechanisms

Pub Date : 2022-04-23 DOI: 10.1007/s10463-022-00827-5

Tiandong Wang, Panpan Zhang

Motivated by the complexity of network data, we propose a directed hybrid random network that mixes preferential attachment (PA) rules with uniform attachment rules. When a new edge is created, with probability \(p\in (0,1)\), it follows the PA rule. Otherwise, this new edge is added between two uniformly chosen nodes. Such a mixture makes the in- and out-degrees of a fixed node grow at a slower rate, compared to the pure PA case, thus leading to lighter distributional tails. For estimation and inference, we develop two numerical methods which are applied to both synthetic and real network data. We see that with extra flexibility given by the parameter \(p\), the hybrid random network provides a better fit to real-world scenarios, where lighter tails from in- and out-degrees are observed.
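The mixing mechanism can be illustrated with a small simulation. This is a toy sketch under stated assumptions, not the authors' model: the node set is fixed, each step adds one directed edge, and the PA step is assumed to pick the source with probability proportional to out-degree plus an offset and the target proportional to in-degree plus the same offset. The function name `simulate_hybrid_network` and the offset parameter `delta` are hypothetical.

```python
import random


def simulate_hybrid_network(n_nodes, n_edges, p, delta=1.0, seed=0):
    """Grow a directed multigraph on a fixed node set, one edge per step.

    With probability p the new edge follows a PA rule (assumed form:
    source weighted by out-degree + delta, target by in-degree + delta);
    otherwise both endpoints are chosen uniformly at random.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    rng = random.Random(seed)
    nodes = list(range(n_nodes))
    in_deg = [0] * n_nodes
    out_deg = [0] * n_nodes
    edges = []
    for _ in range(n_edges):
        if rng.random() < p:
            # Preferential attachment step: degree-weighted endpoint choice.
            src = rng.choices(nodes, weights=[d + delta for d in out_deg])[0]
            dst = rng.choices(nodes, weights=[d + delta for d in in_deg])[0]
        else:
            # Uniform attachment step: both endpoints chosen uniformly.
            src = rng.randrange(n_nodes)
            dst = rng.randrange(n_nodes)
        out_deg[src] += 1
        in_deg[dst] += 1
        edges.append((src, dst))
    return edges, in_deg, out_deg


edges, in_deg, out_deg = simulate_hybrid_network(n_nodes=50, n_edges=2000, p=0.7)
print("largest in-degree:", max(in_deg), "largest out-degree:", max(out_deg))
```

Running the sketch with several values of `p` and comparing the largest degrees gives a rough feel for how the uniform component tempers degree growth, though it says nothing rigorous about tail behavior.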

Citations: 4

Adaptive efficient estimation for generalized semi-Markov big data models

Pub Date : 2022-03-05 DOI: 10.1007/s10463-022-00820-y

Vlad Stefan Barbu, Slim Beltaief, Serguei Pergamenchtchikov

In this paper we study generalized semi-Markov high-dimensional regression models in continuous time, observed at fixed discrete time moments. The generalized semi-Markov process has dependent jumps and is therefore an extension of the semi-Markov regression introduced in Barbu et al. (Stat Inference Stoch Process 22:187–231, 2019a). For such models we consider estimation problems in a nonparametric setting. To this end, we develop model selection procedures for which sharp non-asymptotic oracle inequalities for the robust risks are obtained. Moreover, we give constructive sufficient conditions which, through the obtained oracle inequalities, yield the adaptive robust efficiency property in the minimax sense. It should also be noted that, for these results, we use neither sparsity conditions nor the parameter dimension in the model. As examples, regression models constructed through spherically symmetric noise impulses and truncated fractional Poisson processes are considered. Numerical Monte Carlo simulations confirming the theoretical results are given in the supplementary materials.

Volume 74, issue 5, pages 925–955.

Citations: 1
