Pub Date: 2022-05-19 DOI: 10.1007/s10463-022-00830-w
John Copas
{"title":"Author’s rejoinder to the discussion of the Akaike Memorial Lecture 2020","authors":"John Copas","doi":"10.1007/s10463-022-00830-w","DOIUrl":"10.1007/s10463-022-00830-w","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics"},"PeriodicalIF":1.0,"publicationDate":"2022-05-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"46722236","RegionNum":4,"RegionCategory":"Mathematics","Total":0}
Pub Date: 2022-05-14 DOI: 10.1007/s10463-022-00825-7
Chih-Hao Chang, Hsin-Cheng Huang, Ching-Kang Ing
We consider a linear mixed-effects model with a clustered structure, where the parameters are estimated using maximum likelihood (ML) based on possibly unbalanced data. Inference with this model is typically done based on asymptotic theory, assuming that the number of clusters tends to infinity with the sample size. However, when the number of clusters is fixed, classical asymptotic theory developed under a divergent number of clusters is no longer valid and can lead to erroneous conclusions. In this paper, we establish the asymptotic properties of the ML estimators of random-effects parameters under a general setting, which can be applied to conduct valid statistical inference with fixed numbers of clusters. Our asymptotic theorems allow both fixed effects and random effects to be misspecified, and the dimensions of both effects to go to infinity with the sample size.
{"title":"Inference of random effects for linear mixed-effects models with a fixed number of clusters","authors":"Chih-Hao Chang, Hsin-Cheng Huang, Ching-Kang Ing","doi":"10.1007/s10463-022-00825-7","DOIUrl":"10.1007/s10463-022-00825-7","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics"},"PeriodicalIF":1.0,"publicationDate":"2022-05-14","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10463-022-00825-7.pdf","platform":"Semanticscholar","paperid":"41937562","RegionNum":4,"RegionCategory":"Mathematics","PMCID":"OA","Total":0}
Pub Date: 2022-05-14 DOI: 10.1007/s10463-022-00823-9
Hammou El Barmi
For \(1 \le i \le r\), let \(F_i\) be the cumulative incidence function (CIF) corresponding to the ith risk in an r-competing risks model. We assume a discrete or a grouped time framework and obtain the maximum likelihood estimators (m.l.e.) of these CIFs under the restriction that \(F_i(t)/F_{i+1}(t)\) is nondecreasing, \(1 \le i \le r-1\). We also derive the likelihood ratio tests for testing for and against this restriction and obtain their asymptotic distributions. The theory developed here can also be used to investigate the association between a failure time and a discretized or ordinal mark variable that is observed only at the time of failure. To illustrate the applicability of our results, we give examples in the competing risks and the mark variable settings.
{"title":"On comparing competing risks using the ratio of their cumulative incidence functions","authors":"Hammou El Barmi","doi":"10.1007/s10463-022-00823-9","DOIUrl":"10.1007/s10463-022-00823-9","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics"},"PeriodicalIF":1.0,"publicationDate":"2022-05-14","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10463-022-00823-9.pdf","platform":"Semanticscholar","paperid":"41934200","RegionNum":4,"RegionCategory":"Mathematics","PMCID":"OA","Total":0}
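As a reading aid for the El Barmi abstract above, the cumulative incidence functions and the ordering restriction can be written out as follows. The symbols \(T\) (failure time), \(\delta\) (cause of failure) and the grid \(t_1 < \dots < t_k\) are notation assumed here for illustration and may differ from the paper's own.

```latex
% Assumed notation: T is the failure time, \delta \in \{1,\dots,r\} the cause.
\[
  F_i(t) = P(T \le t,\ \delta = i), \qquad 1 \le i \le r .
\]
% Ordering restriction on a discrete or grouped time grid t_1 < \dots < t_k:
\[
  \frac{F_i(t_j)}{F_{i+1}(t_j)} \;\le\; \frac{F_i(t_{j+1})}{F_{i+1}(t_{j+1})},
  \qquad 1 \le j \le k-1,\quad 1 \le i \le r-1 .
\]
```

The second display simply restates "\(F_i(t)/F_{i+1}(t)\) is nondecreasing" on the grid, assuming the denominators are positive.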
Pub Date: 2022-05-09 DOI: 10.1007/s10463-022-00826-6
Holger Dette, Martin Kroll
For the class of Gauss–Markov processes we study the problem of asymptotic equivalence of the nonparametric regression model with errors given by the increments of the process and the continuous time model, where a whole path of a sum of a deterministic signal and the Gauss–Markov process can be observed. We derive sufficient conditions which imply asymptotic equivalence of the two models. We verify these conditions for the special cases of Sobolev ellipsoids and Hölder classes with smoothness index \(>1/2\) under mild assumptions on the Gauss–Markov process. To give a counterexample, we show that asymptotic equivalence fails to hold for the special case of Brownian bridge. Our findings demonstrate that the well-known asymptotic equivalence of the Gaussian white noise model and the nonparametric regression model with i.i.d. standard normal errors (see Brown and Low (Ann Stat 24:2384–2398, 1996)) can be extended to a setup with general Gauss–Markov noises.
{"title":"Asymptotic equivalence for nonparametric regression with dependent errors: Gauss–Markov processes","authors":"Holger Dette, Martin Kroll","doi":"10.1007/s10463-022-00826-6","DOIUrl":"10.1007/s10463-022-00826-6","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics"},"PeriodicalIF":1.0,"publicationDate":"2022-05-09","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"44200829","RegionNum":4,"RegionCategory":"Mathematics","Total":0}
Pub Date: 2022-05-07 DOI: 10.1007/s10463-022-00824-8
Ke Yu, Shan Luo
Feature selection for the high-dimensional Cox proportional hazards model (Cox model) is very important in many microarray genetic studies. In this paper, we propose a sequential feature selection procedure for this model. We define a novel partial profile score to assess the impact of unselected features conditional on the current model; significant features are thereby added to the model sequentially, and the extended Bayesian information criterion (EBIC) is adopted as a stopping rule. Under mild conditions, we show that this procedure is selection consistent. Extensive simulation studies and two real data applications are conducted to demonstrate the advantage of our proposed procedure over several representative approaches.
{"title":"A sequential feature selection procedure for high-dimensional Cox proportional hazards model","authors":"Ke Yu, Shan Luo","doi":"10.1007/s10463-022-00824-8","DOIUrl":"10.1007/s10463-022-00824-8","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics"},"PeriodicalIF":1.0,"publicationDate":"2022-05-07","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10463-022-00824-8.pdf","platform":"Semanticscholar","paperid":"48850896","RegionNum":4,"RegionCategory":"Mathematics","PMCID":"OA","Total":0}
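For readers unfamiliar with the EBIC stopping rule mentioned in the Yu and Luo abstract above, its commonly used general form is sketched below. This is the generic criterion, not necessarily the exact variant used in the paper: the symbols \(s\) (candidate feature set), \(p\) (total number of features), \(n\) (sample size) and \(\gamma\) are assumed notation, and for a Cox model the likelihood would typically be the partial likelihood.

```latex
% Generic EBIC for a candidate feature set s (assumed notation):
% L_n is the (partial) likelihood, \hat\beta(s) its maximizer over s,
% |s| the number of selected features, p the total number of features.
\[
  \mathrm{EBIC}_\gamma(s)
  = -2 \log L_n\{\hat\beta(s)\} + |s| \log n + 2\gamma \log \binom{p}{|s|},
  \qquad 0 \le \gamma \le 1 .
\]
```

With \(\gamma = 0\) this reduces to the ordinary BIC. A sequential procedure of the kind described would plausibly stop adding features once the criterion no longer decreases.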
Pub Date: 2022-05-04 DOI: 10.1007/s10463-022-00822-w
Bofei Xiao, Bo Lei, Wei Lan, Bin Guo
This paper proposes a blockwise network autoregressive (BWNAR) model that groups the nodes of a network into nonoverlapping blocks in order to accommodate networks with blockwise structures. Before modeling, we employ the pseudo likelihood ratio criterion (pseudo-LR) together with the standard spectral clustering approach and a binary segmentation method developed by Ma et al. (Journal of Machine Learning Research, 22, 1–63, 2021) to estimate the number of blocks and their memberships, respectively. Then, we establish the consistency and asymptotic normality of the estimator of the influence parameters obtained by quasi-maximum likelihood estimation, without imposing any distributional assumptions. In addition, a novel likelihood ratio test statistic is proposed to test for heterogeneity of the influence parameters. The performance and usefulness of the model are assessed through simulations and an empirical example of the detection of fraud in financial transactions, respectively.
{"title":"A blockwise network autoregressive model with application for fraud detection","authors":"Bofei Xiao, Bo Lei, Wei Lan, Bin Guo","doi":"10.1007/s10463-022-00822-w","DOIUrl":"10.1007/s10463-022-00822-w","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics"},"PeriodicalIF":1.0,"publicationDate":"2022-05-04","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"49330432","RegionNum":4,"RegionCategory":"Mathematics","Total":0}
Pub Date: 2022-04-29 DOI: 10.1007/s10463-022-00821-x
Lu Li, Niwen Zhou, Lixing Zhu
This paper systematically investigates the following issues. First, we construct different outcome regression-based estimators for the conditional average treatment effect under, respectively, true, parametric, nonparametric and semiparametric dimension reduction structures. Second, based on the corresponding asymptotic variance functions when the models are correctly specified, we answer the following questions: What is the asymptotic efficiency ranking of the four estimators in general? How is the efficiency related to the affiliation of the given covariates in the set of arguments of the regression functions? What roles do the bandwidth and kernel function selections play in the estimation efficiency? In which scenarios should the estimator under the semiparametric dimension reduction regression structure be used in practice? Meanwhile, the results show that any outcome regression-based estimation should be asymptotically more efficient than any inverse probability weighting-based estimation. Several simulation studies are conducted to examine the finite sample performances of these estimators, and a real dataset is analyzed for illustration.
{"title":"Outcome regression-based estimation of conditional average treatment effect","authors":"Lu Li, Niwen Zhou, Lixing Zhu","doi":"10.1007/s10463-022-00821-x","DOIUrl":"10.1007/s10463-022-00821-x","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics"},"PeriodicalIF":1.0,"publicationDate":"2022-04-29","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10463-022-00821-x.pdf","platform":"Semanticscholar","paperid":"44135447","RegionNum":4,"RegionCategory":"Mathematics","PMCID":"OA","Total":0}
Pub Date: 2022-04-27 DOI: 10.1007/s10463-022-00828-4
Michael Kohler, Adam Krzyżak, Benjamin Walter
Image classifiers based on convolutional neural networks are defined, and the rate of convergence of the misclassification risk of the estimates towards the optimal misclassification risk is analyzed. Under suitable assumptions on the smoothness and structure of the a posteriori probability, a rate of convergence is derived that is independent of the dimension of the image. This proves that in image classification, it is possible to circumvent the curse of dimensionality by convolutional neural networks. Furthermore, the obtained result gives an indication of why convolutional neural networks are able to outperform standard feedforward neural networks in image classification. Our classifiers are compared with various other classification methods using simulated data. Furthermore, the performance of our estimates is also tested on real images.
{"title":"On the rate of convergence of image classifiers based on convolutional neural networks","authors":"Michael Kohler, Adam Krzyżak, Benjamin Walter","doi":"10.1007/s10463-022-00828-4","DOIUrl":"10.1007/s10463-022-00828-4","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics"},"PeriodicalIF":1.0,"publicationDate":"2022-04-27","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10463-022-00828-4.pdf","platform":"Semanticscholar","paperid":"47148477","RegionNum":4,"RegionCategory":"Mathematics","PMCID":"OA","Total":0}
Pub Date: 2022-04-23 DOI: 10.1007/s10463-022-00827-5
Tiandong Wang, Panpan Zhang
Motivated by the complexity of network data, we propose a directed hybrid random network that mixes preferential attachment (PA) rules with uniform attachment rules. When a new edge is created, with probability \(p \in (0,1)\), it follows the PA rule. Otherwise, this new edge is added between two uniformly chosen nodes. Such a mixture makes the in- and out-degrees of a fixed node grow at a slower rate than in the pure PA case, thus leading to lighter distributional tails. For estimation and inference, we develop two numerical methods which are applied to both synthetic and real network data. We see that with the extra flexibility given by the parameter p, the hybrid random network provides a better fit to real-world scenarios, where lighter tails in the in- and out-degree distributions are observed.
{"title":"Directed hybrid random networks mixing preferential attachment with uniform attachment mechanisms","authors":"Tiandong Wang, Panpan Zhang","doi":"10.1007/s10463-022-00827-5","DOIUrl":"10.1007/s10463-022-00827-5","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics"},"PeriodicalIF":1.0,"publicationDate":"2022-04-23","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10463-022-00827-5.pdf","platform":"Semanticscholar","paperid":"48256893","RegionNum":4,"RegionCategory":"Mathematics","PMCID":"OA","Total":0}
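The edge-creation rule described in the Wang and Zhang abstract above (PA with probability p, uniform attachment otherwise) can be illustrated with a small simulation. This is only a minimal sketch under simplifying assumptions that are not taken from the paper: every step appends one new node before the edge is drawn, the PA weights are degree plus a hypothetical offset `delta`, self-loops are allowed, and the function name `simulate_hybrid_network` is made up for this example.

```python
import random


def simulate_hybrid_network(n_steps, p, delta=1.0, seed=None):
    """Grow a directed network, adding one edge per step (illustrative sketch).

    With probability p the new edge follows a preferential attachment rule;
    otherwise both endpoints are chosen uniformly at random.
    """
    rng = random.Random(seed)
    # Seed graph (assumption): two nodes joined by the single edge 0 -> 1.
    in_deg = [0, 1]
    out_deg = [1, 0]
    edges = [(0, 1)]
    for _ in range(n_steps):
        # Assumption: one new isolated node enters at every step.
        in_deg.append(0)
        out_deg.append(0)
        nodes = range(len(in_deg))
        if rng.random() < p:
            # PA rule: source weighted by out-degree, target by in-degree,
            # each shifted by delta so that new nodes can be selected.
            src = rng.choices(nodes, weights=[d + delta for d in out_deg])[0]
            dst = rng.choices(nodes, weights=[d + delta for d in in_deg])[0]
        else:
            # Uniform rule: both endpoints drawn uniformly (self-loops allowed).
            src = rng.randrange(len(in_deg))
            dst = rng.randrange(len(in_deg))
        out_deg[src] += 1
        in_deg[dst] += 1
        edges.append((src, dst))
    return edges, in_deg, out_deg


if __name__ == "__main__":
    edges, in_deg, out_deg = simulate_hybrid_network(2000, p=0.5, seed=0)
    print("largest in-degree:", max(in_deg), "largest out-degree:", max(out_deg))
```

Running the sketch for several values of `p` and comparing the largest degrees gives an informal feel for how the uniform component tempers degree growth. The weights are recomputed at every step, so the cost is quadratic in the number of steps, which is adequate only for small illustrations.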
Pub Date: 2022-03-05 DOI: 10.1007/s10463-022-00820-y
Vlad Stefan Barbu, Slim Beltaief, Serguei Pergamenchtchikov
In this paper we study generalized semi-Markov high-dimensional regression models in continuous time, observed at fixed discrete time moments. The generalized semi-Markov process has dependent jumps and, therefore, it is an extension of the semi-Markov regression introduced in Barbu et al. (Stat Inference Stoch Process 22:187–231, 2019a). For such models we consider estimation problems in a nonparametric setting. To this end, we develop model selection procedures for which sharp non-asymptotic oracle inequalities for the robust risks are obtained. Moreover, we give constructive sufficient conditions which, through the obtained oracle inequalities, provide the adaptive robust efficiency property in the minimax sense. It should also be noted that, for these results, we use neither sparsity conditions nor the parameter dimension in the model. As examples, regression models constructed through spherically symmetric noise impulses and truncated fractional Poisson processes are considered. Numerical Monte Carlo simulations confirming the theoretical results are given in the supplementary materials.
{"title":"Adaptive efficient estimation for generalized semi-Markov big data models","authors":"Vlad Stefan Barbu, Slim Beltaief, Serguei Pergamenchtchikov","doi":"10.1007/s10463-022-00820-y","DOIUrl":"10.1007/s10463-022-00820-y","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics"},"PeriodicalIF":1.0,"publicationDate":"2022-03-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"48866441","RegionNum":4,"RegionCategory":"Mathematics","Total":0}