Pub Date: 2023-12-09 DOI: 10.1007/s00362-023-01512-2
Joni Virta, Niko Lietzén, Henri Nyberg
The estimation of signal dimension under heavy-tailed latent variable models is studied. As a primary contribution, robust extensions of an earlier estimator based on Gaussian Stein’s unbiased risk estimation are proposed. These novel extensions are based on the framework of elliptical distributions and robust scatter matrices. Extensive simulation studies are conducted in order to compare the novel methods with several well-known competitors in both estimation accuracy and computational speed. The novel methods are applied to a financial asset return data set.
Title: Robust signal dimension estimation via SURE. Journal: Statistical Papers. URL: https://doi.org/10.1007/s00362-023-01512-2
Pub Date: 2023-12-09 DOI: 10.1007/s00362-023-01472-7
Muhammad Jaffri Mohd Nasir, Ramzan Nazim Khan, Gopalan Nair, Darfiana Nur
The group LASSO (gLASSO) estimator has recently been proposed for estimating thresholds in the self-exciting threshold autoregressive model, and a group least angle regression (gLAR) algorithm has been applied to obtain an approximate solution to the optimization problem. Although the gLAR algorithm is computationally fast, it has been reported to select too many irrelevant thresholds along with the relevant ones. This paper develops an active-set based block coordinate descent (aBCD) algorithm as an exact optimization method for gLASSO to improve the estimation of relevant thresholds. Methods and strategies for choosing appropriate values of the gLASSO shrinkage parameter are also discussed. To consistently estimate relevant thresholds from the threshold set obtained by the gLASSO, the backward elimination algorithm (BEA) is used. We evaluate the numerical efficiency of the proposed algorithms, along with the Single-Line-Search (SLS) and gLAR algorithms, on simulated and real data sets. Simulation studies show that the SLS and aBCD algorithms perform similarly in estimating thresholds, although the latter is much faster. In addition, aBCD-BEA can sometimes outperform gLAR-BEA in estimating the correct number of thresholds under certain conditions. The case studies also show that aBCD-BEA performs better in identifying important thresholds.
Title: Active-set based block coordinate descent algorithm in group LASSO for self-exciting threshold autoregressive model. Journal: Statistical Papers. URL: https://doi.org/10.1007/s00362-023-01472-7
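To make the idea of active-set block coordinate descent for a group LASSO problem concrete, here is a minimal sketch for the generic objective 0.5·||y − Xβ||² + λ·Σ_g ||β_g||₂. It is not the authors' aBCD algorithm and does not build the threshold design matrix of the self-exciting threshold autoregressive model. The function names, the block proximal-gradient update, and the stopping rule are all assumptions chosen for illustration.

```python
import numpy as np

def group_soft_threshold(z, t):
    """Shrink the vector z toward zero by t in Euclidean norm (group-lasso prox)."""
    norm = np.linalg.norm(z)
    if norm <= t:
        return np.zeros_like(z)
    return (1.0 - t / norm) * z

def sweep(X, y, beta, groups, lam, lips, which):
    """One pass of block proximal-gradient updates over the groups in `which`.

    Updates beta in place and returns the largest coefficient change.
    """
    max_change = 0.0
    for g in which:
        idx = groups[g]
        r = y - X @ beta                           # current residual
        z = beta[idx] + X[:, idx].T @ r / lips[g]  # gradient step on block g
        new = group_soft_threshold(z, lam / lips[g])
        max_change = max(max_change, float(np.max(np.abs(new - beta[idx]))))
        beta[idx] = new
    return max_change

def group_lasso_active_set(X, y, groups, lam, tol=1e-8, max_iter=1000):
    """Hypothetical helper: group lasso via active-set block coordinate descent.

    `groups` is a list of integer index arrays, one per coefficient block.
    Assumes every block of X has at least one nonzero column.
    """
    beta = np.zeros(X.shape[1])
    # Block Lipschitz constants: squared spectral norm of each X_g.
    lips = [np.linalg.norm(X[:, idx], 2) ** 2 for idx in groups]
    all_groups = range(len(groups))
    for _ in range(max_iter):
        # A full sweep over all groups; stop once it no longer changes beta.
        if sweep(X, y, beta, groups, lam, lips, all_groups) < tol:
            break
        # Restrict attention to the currently nonzero (active) groups.
        active = [g for g in all_groups if np.any(beta[groups[g]] != 0)]
        for _ in range(max_iter):
            if sweep(X, y, beta, groups, lam, lips, active) < tol:
                break
    return beta
```

The inner loop iterates only over groups that are currently nonzero, which is where an active-set strategy saves work when most groups are irrelevant. The outer full sweep guards against a group being wrongly left out of the active set.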
Pub Date: 2023-12-01 DOI: 10.1007/s00362-023-01510-4
Petr Čoupek, Viktor Dolník, Zdeněk Hlávka, Daniel Hlubinka
A new goodness-of-fit (GoF) test is proposed and investigated for the Gaussianity of the observed functional data. The test statistic is the Cramér-von Mises distance between the observed empirical characteristic functional (CF) and the theoretical CF corresponding to the null hypothesis stating that the functional observations (process paths) were generated from a specific parametric family of Gaussian processes, possibly with unknown parameters. The asymptotic null distribution of the proposed test statistic is derived also in the presence of these nuisance parameters, the consistency of the classical parametric bootstrap is established, and some particular choices of the necessary tuning parameters are discussed. The empirical level and power are investigated in a simulation study involving GoF tests of an Ornstein–Uhlenbeck process, Vašíček model, or a (fractional) Brownian motion, both with and without nuisance parameters, with suitable Gaussian and non-Gaussian alternatives.
Title: Fourier approach to goodness-of-fit tests for Gaussian random processes. Journal: Statistical Papers. URL: https://doi.org/10.1007/s00362-023-01510-4
In biomedical studies, panel count data have been extensively investigated. Such data occur when study subjects are monitored or observed only at discrete time points during observation periods. In addition, these data may be collected from multiple centers, and study subjects from the same center might be correlated. Limited literature exists for clustered panel count data. Ignoring such cluster effects could result in biased variance estimation. In this paper, two semiparametric additive mean models are proposed for clustered panel count data. The first model contains a common baseline function across all clusters, while the second model features cluster-specific baseline functions. Estimating equations are derived for the regression parameters of interest in the two proposed models. For the common baseline model, the baseline function is also estimated. Under some regularity conditions, the resulting estimators are shown to be consistent and asymptotically normal. Extensive simulation studies indicate that the proposed approaches perform well in finite samples. An application to the China Health and Nutrition Study is also provided for illustration.
Title: Regression analysis of clustered panel count data with additive mean models. Authors: Weiwei Wang, Zhiyang Cui, Ruijie Chen, Yijun Wang, Xiaobing Zhao. Journal: Statistical Papers. Pub Date: 2023-11-28. URL: https://doi.org/10.1007/s00362-023-01511-3
Pub Date: 2023-11-16 DOI: 10.1007/s00362-023-01495-0
Stergios B. Fotopoulos, Abhishek Kaul, Vasileios Pavlopoulos, Venkata K. Jandhyala
The article offers a method for estimating the volatility covariance matrix of vectors of financial time series data using a change point approach. The proposed method replaces general varying-coefficient parametric models, such as GARCH, whose coefficients may vary with time, with a change point model. In this study, an adaptive pointwise selection of homogeneous segments with a given right-end point by a local change point analysis is introduced. Sufficient conditions are obtained under which the maximum likelihood process is adaptive against the covariance estimate to yield an optimal rate of convergence with respect to the change size. This rate is preserved while allowing the jump size to diminish. Under these circumstances, argmax results of a two-sided negative Brownian motion or a two-sided negative drift random walk under vanishing and non-vanishing jump size regimes, respectively, provide inference for the change point parameter. The theoretical results are supported by a Monte Carlo simulation study. Bivariate data on daily log returns of two US stock market indices, as well as trivariate data on daily log returns of three banks, are analyzed by constructing confidence interval estimates for multiple change points that have been identified previously for each of the two data sets.
Title: Adaptive parametric change point inference under covariance structure changes. Journal: Statistical Papers. URL: https://doi.org/10.1007/s00362-023-01495-0
This paper studies factor modeling for a vector of time series with long-memory properties to investigate how outliers affect the identification of the number of factors, and it proposes a robust method to reduce their impact. The number of factors is estimated using an eigenvalue analysis for a non-negative definite matrix introduced by Lam et al. (2011). Two estimators are proposed: the first is based on the classical sample covariance function, and the second uses a robust covariance function estimate. In both cases, it is shown that the eigenvalue estimates have similar convergence rates. Empirical simulations support both estimators for multivariate stationary long-memory time series and show that the robust method is preferable when the data are contaminated with additive outliers. Time series of daily log returns are used as an example of application. In addition to abrupt observations, exchange rates exhibit non-stationarity properties with long-memory parameters greater than one. We then use semi-parametric long-memory estimators to estimate the fractional parameters of the series. The number of factors was estimated using the classical and robust approaches. Due to the influence of the abrupt observations, the two approaches suggested different numbers of factors to model the data. The robust method suggested two factors, while the classical approach indicated only one factor.
Title: A dimension reduction factor approach for multivariate time series with long-memory: a robust alternative method. Authors: Valdério Anselmo Reisen, Céline Lévy-Leduc, Edson Zambon Monte, Pascal Bondon. Journal: Statistical Papers. Pub Date: 2023-11-15. URL: https://doi.org/10.1007/s00362-023-01504-2
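As a rough illustration of estimating the number of factors by eigenvalue analysis, the sketch below accumulates products of lagged sample autocovariance matrices into a non-negative definite matrix and picks the number of factors where the ratio of consecutive eigenvalues is smallest. The abstract does not spell out the exact rule, so the eigenvalue-ratio criterion is an assumption. The function names and the defaults `k0` and `r_max` are also assumptions, and only the classical sample autocovariance is used. A robust variant would swap in a robust autocovariance estimate inside `lag_autocov`.

```python
import numpy as np

def lag_autocov(y, k):
    """Classical sample lag-k autocovariance matrix of an (n, p) series y."""
    n = y.shape[0]
    yc = y - y.mean(axis=0)
    return yc[k:].T @ yc[:n - k] / n

def estimate_num_factors(y, k0=2, r_max=None):
    """Hypothetical helper: eigenvalue-ratio estimate of the number of factors.

    Builds M = sum_{k=1}^{k0} S(k) S(k)^T from lagged autocovariances S(k),
    then returns the index r minimizing lambda_{r+1} / lambda_r over r <= r_max.
    """
    p = y.shape[1]
    M = np.zeros((p, p))
    for k in range(1, k0 + 1):
        S = lag_autocov(y, k)
        M += S @ S.T                      # non-negative definite by construction
    eigvals = np.sort(np.linalg.eigvalsh(M))[::-1]  # descending order
    if r_max is None:
        r_max = p // 2                    # assumed search range
    ratios = eigvals[1:r_max + 1] / eigvals[:r_max]
    return int(np.argmin(ratios)) + 1
```

Because a few large additive outliers can distort the sample autocovariances, the classical and robust versions of this rule can disagree, which is the situation the abstract describes for the exchange-rate data.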
Title: Two-piece distribution based semi-parametric quantile regression for right censored data. Authors: Worku Biyadgie Ewnetu, Irène Gijbels, Anneleen Verhasselt. Journal: Statistical Papers. Pub Date: 2023-11-10. URL: https://doi.org/10.1007/s00362-023-01475-4
Pub Date: 2023-11-03 DOI: 10.1007/s00362-023-01503-3
Yang Liu, Rong Kuang, Guanfu Liu
Title: Penalized likelihood inference for the finite mixture of Poisson distributions from capture-recapture data. Journal: Statistical Papers. URL: https://doi.org/10.1007/s00362-023-01503-3