Pub Date: 2024-02-14 DOI: 10.1007/s11749-024-00921-1
Yang Liu, Yukun Liu, Pengfei Li, Riquan Zhang
Missing covariates are not uncommon in capture–recapture studies. When covariate information is missing at random in capture–recapture data, an empirical full likelihood method has been demonstrated to outperform conditional-likelihood-based methods in abundance estimation. However, the fully observed covariates must be discrete, and the method is not directly applicable to continuous-time capture–recapture data. Based on the Binomial and Poisson regression models, we propose a two-step semiparametric empirical likelihood approach for abundance estimation in the presence of missing covariates, regardless of whether the fully observed covariates are discrete or continuous. We show that the maximum semiparametric empirical likelihood estimators for the underlying parameters and the abundance are asymptotically normal, and more efficient than the counterpart for a completely known non-missingness probability. After scaling, the empirical likelihood ratio test statistic for abundance follows a limiting chi-square distribution with one degree of freedom. The proposed approach is further extended to one-inflated count regression models, and a score-like test is constructed to assess whether one-inflation exists among the number of captures. Our simulation shows that, compared with the previous method, the proposed method not only performs better in correcting bias, but also has more accurate coverage in the presence of fully observed continuous covariates, although there may be a slight efficiency loss when the fully observed covariates are only discrete. The performance of the new method is illustrated by analyses of the yellow-bellied prinia data and the Rana pretiosa data.
{"title":"Two-step semiparametric empirical likelihood inference from capture–recapture data with missing covariates","authors":"Yang Liu, Yukun Liu, Pengfei Li, Riquan Zhang","doi":"10.1007/s11749-024-00921-1","DOIUrl":"https://doi.org/10.1007/s11749-024-00921-1","journal":{"name":"Test","volume":"73 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-02-14","publicationTypes":"Journal Article"}
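For orientation, the conditional-likelihood-based baseline that the abstract above compares against is usually paired with a Horvitz–Thompson-type abundance estimate. The sketch below shows only that baseline idea under a Binomial capture model. It is not the authors' two-step empirical likelihood method. The function names are hypothetical, and the per-occasion capture probabilities are assumed to have been estimated already (for example from a fitted regression on covariates).

```python
from typing import Sequence


def capture_probability(p: float, occasions: int) -> float:
    """Probability of at least one capture over `occasions` independent
    occasions, each with per-occasion capture probability `p`."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if occasions < 1:
        raise ValueError("occasions must be at least 1")
    return 1.0 - (1.0 - p) ** occasions


def horvitz_thompson_abundance(probs: Sequence[float], occasions: int) -> float:
    """Horvitz-Thompson-type abundance estimate.

    `probs` holds one (assumed, already estimated) per-occasion capture
    probability for each individual that was captured at least once.
    Each captured individual is weighted by the inverse of its
    probability of ever being captured.
    """
    return sum(1.0 / capture_probability(p, occasions) for p in probs)


if __name__ == "__main__":
    # Three captured individuals, five capture occasions (illustrative values).
    print(horvitz_thompson_abundance([0.2, 0.3, 0.1], occasions=5))
```

Individuals with a low chance of ever being captured receive large weights, which is one reason missing or mis-modelled covariates can distort the abundance estimate.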
Pub Date: 2024-02-07 DOI: 10.1007/s11749-024-00919-9
Alfonso Russo, Alessio Farcomeni
We specify a general formulation for multivariate latent Markov models for panel data, where outcomes are possibly of mixed type (categorical, discrete, continuous). Conditionally on a time-varying discrete latent variable and covariates, the joint distribution of simultaneously observed outcomes is expressed through a parametric copula. We therefore make no conditional independence assumption. The observed likelihood is maximized by means of an expectation–maximization algorithm. In a simulation study, we argue that modeling the residual contemporaneous dependence may be crucial to avoiding bias in the parameter estimates. We illustrate the approach through an original application to the assessment of poverty through direct and indirect indicators in a cohort of Italian households.
{"title":"A copula formulation for multivariate latent Markov models","authors":"Alfonso Russo, Alessio Farcomeni","doi":"10.1007/s11749-024-00919-9","DOIUrl":"https://doi.org/10.1007/s11749-024-00919-9","journal":{"name":"Test","volume":"36 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-02-07","publicationTypes":"Journal Article"}
Pub Date: 2024-02-06 DOI: 10.1007/s11749-023-00915-5
Mai Ghannam, Sévérien Nkurunziza
In this paper, we consider an inference problem in a tensor regression model with one change-point. Specifically, we consider a general hypothesis testing problem on a tensor parameter, which includes as a special case the problem of testing for the absence of a change-point. To this end, we derive the unrestricted estimator (UE) and the restricted estimator (RE) as well as the joint asymptotic normality of the UE and RE. Using the established asymptotic normality, we derive a test for the hypothesized restriction. We also derive the asymptotic power of the proposed test and prove that the test is consistent. Beyond the complexity of the testing problem in the tensor model, we consider a very general case where the tensor error term and the regressors do not need to be independent and the dependence structure of the outer product of the tensor error term and regressors is as weak as that of an \(\mathcal{L}^2\)-mixingale. Further, to study the performance of the proposed methods in small and moderate sample sizes, we present some simulation results that corroborate the theoretical results. Finally, to illustrate the application of the proposed methods, we test the non-existence of a change-point in some fMRI neuro-imaging data.
{"title":"Change-point detection in a tensor regression model","authors":"Mai Ghannam, Sévérien Nkurunziza","doi":"10.1007/s11749-023-00915-5","DOIUrl":"https://doi.org/10.1007/s11749-023-00915-5","journal":{"name":"Test","volume":"29 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-02-06","publicationTypes":"Journal Article"}
Pub Date: 2024-01-23 DOI: 10.1007/s11749-023-00916-4
Konstantinos Bourazas, Guido Consonni, Laura Deldossi
An ongoing “replication crisis” calls into question scientific discoveries across a variety of disciplines ranging from life to social sciences. Replication studies aim to investigate the validity of findings in published research, and try to assess whether the latter are statistically consistent with those in the replications. While the majority of replication projects are based on a single experiment, multiple independent replications of the same experiment conducted simultaneously at different sites are becoming more frequent. In connection with these types of projects, we deal with testing heterogeneity among sites; specifically, we focus on sample size determination suitable to deliver compelling evidence once the experimental data are gathered.
{"title":"Bayesian sample size determination for detecting heterogeneity in multi-site replication studies","authors":"Konstantinos Bourazas, Guido Consonni, Laura Deldossi","doi":"10.1007/s11749-023-00916-4","DOIUrl":"https://doi.org/10.1007/s11749-023-00916-4","journal":{"name":"Test","volume":"7 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-01-23","publicationTypes":"Journal Article"}
Pub Date: 2024-01-18 DOI: 10.1007/s11749-023-00914-6
J. E. Borgert, J. S. Marron
This discussion paper applauds the authors for their impactful contribution to functional data analysis (FDA). Their primary insight lies in a formal mathematical definition of the “shape” of a curve, which they connect to familiar intuitive notions through a number of examples. Notably, the paper highlights the pitfalls of less well-thought-out curve registration approaches. The authors’ application to COVID-19 data enriches the discussion and underlines the work’s practical relevance. We discuss connections of this work with object-oriented data analysis and propose enhancements to the authors’ shape-based functional principal component analysis. Additionally, we illustrate the practical significance of adaptive alignment with an example from our own research.
{"title":"Comments on: Shape-based functional data analysis","authors":"J. E. Borgert, J. S. Marron","doi":"10.1007/s11749-023-00914-6","DOIUrl":"https://doi.org/10.1007/s11749-023-00914-6","journal":{"name":"Test","volume":"11 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-01-18","publicationTypes":"Journal Article"}
Pub Date: 2024-01-12 DOI: 10.1007/s11749-023-00913-7
Diana P. Ovalle–Muñoz, M. Dolores Ruiz–Medina
This paper addresses the estimation of the second-order structure of a manifold cross-time random field (RF) displaying spatially varying Long Range Dependence (LRD), adopting the functional time series framework introduced in Ruiz-Medina (Fract Calc Appl Anal 25:1426–1458, 2022). Conditions for the asymptotic unbiasedness of the integrated periodogram operator in the Hilbert–Schmidt operator norm are derived beyond structural assumptions. Weakly consistent estimation of the long-memory operator is achieved under a semiparametric functional spectral framework in the Gaussian context. The case where the projected manifold process can display Short Range Dependence (SRD) and LRD at different manifold scales is also analyzed. The performance of both estimation procedures is illustrated in a simulation study, in the context of multifractionally integrated spherical functional autoregressive–moving average (SPHARMA(p,q)) processes.
{"title":"LRD spectral analysis of multifractional functional time series on manifolds","authors":"Diana P. Ovalle–Muñoz, M. Dolores Ruiz–Medina","doi":"10.1007/s11749-023-00913-7","DOIUrl":"https://doi.org/10.1007/s11749-023-00913-7","journal":{"name":"Test","volume":"13 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-01-12","publicationTypes":"Journal Article"}
Pub Date: 2024-01-12 DOI: 10.1007/s11749-023-00917-3
Majid Asadi, Maxim Finkelstein
In this short communication, we discuss the remaining lifetime and the mean remaining lifetime (MRL) of an item with a random age. We show that the MRL at random age is closely related to some well-known variability measures. First, we provide a decomposition result showing that the MRL at random age, like other variability measures, has a covariance representation. Under the proportional hazards (PH) model, we show that the MRL, depending on the parameter of proportionality, subsumes Gini’s mean difference and the cumulative residual entropy as special cases. It is also shown that, under the PH model, the MRL can be expressed via the equilibrium distribution and the mean number of events in the generalized Pólya process.
{"title":"On variability of the mean remaining lifetime at random age","authors":"Majid Asadi, Maxim Finkelstein","doi":"10.1007/s11749-023-00917-3","DOIUrl":"https://doi.org/10.1007/s11749-023-00917-3","journal":{"name":"Test","volume":"6 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-01-12","publicationTypes":"Journal Article"}
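As background for the abstract above, the following worked identity shows one elementary way a mean remaining lifetime at random age can coincide with Gini's mean difference. It assumes a continuous lifetime X with survival function \bar F and a random age T that is an independent copy of X. The paper's own definitions and its proportional hazards setting may differ and are not reproduced here.

```latex
% Mean remaining lifetime at a fixed age t (standard definition):
m(t) = \mathbb{E}[X - t \mid X > t]
     = \frac{1}{\bar F(t)} \int_t^{\infty} \bar F(x)\,dx .

% Assumption: the age T is an independent copy of the continuous lifetime X,
% so P(X > T) = 1/2 and, by symmetry, E|X - T| = 2 E[(X - T)^+].
\mathbb{E}[X - T \mid X > T]
  = \frac{\mathbb{E}\big[(X - T)^{+}\big]}{\mathbb{P}(X > T)}
  = \frac{\tfrac{1}{2}\,\mathbb{E}|X - T|}{\tfrac{1}{2}}
  = \mathbb{E}|X - T| ,
% which is Gini's mean difference of X.
```

For an exponential lifetime with rate \lambda, both sides equal 1/\lambda, consistent with the memoryless property.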
Pub Date: 2023-12-19 DOI: 10.1007/s11749-023-00912-8
Generalized linear models (GLMs) are very widely used, but formal goodness-of-fit (GOF) tests for the overall fit of the model seem to be in wide use only for certain classes of GLMs. We develop and apply a new goodness-of-fit test, similar to the well-known and commonly used Hosmer–Lemeshow (HL) test, that can be used with a wide variety of GLMs. The test statistic is a variant of the HL statistic, but we rigorously derive an asymptotically correct sampling distribution using methods of Stute and Zhu (Scand J Stat 29(3):535–545, 2002) and demonstrate its consistency. We compare the performance of our new test with other GOF tests for GLMs, including a naive direct application of the HL test to the Poisson problem. Our test provides competitive or comparable power in various simulation settings, and we identify a situation where a naive version of the test fails to hold its size. Our generalized HL test is straightforward to implement and interpret, and an R package is publicly available.
{"title":"A generalized Hosmer–Lemeshow goodness-of-fit test for a family of generalized linear models","authors":"","doi":"10.1007/s11749-023-00912-8","DOIUrl":"https://doi.org/10.1007/s11749-023-00912-8","journal":{"name":"Test","volume":"887 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2023-12-19","publicationTypes":"Journal Article"}
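To make the starting point of the abstract above concrete, here is a minimal sketch of the classical Hosmer–Lemeshow statistic for a binary outcome. It is not the generalized test proposed in the paper and does not mirror the authors' R package. The function name and the equal-size grouping rule are assumptions chosen for illustration.

```python
from typing import Sequence, Tuple


def hosmer_lemeshow_statistic(
    y: Sequence[int], p: Sequence[float], groups: int = 10
) -> Tuple[float, int]:
    """Classical Hosmer-Lemeshow statistic for binary outcomes.

    y: observed 0/1 outcomes; p: fitted probabilities from some model.
    Observations are sorted by fitted probability and split into `groups`
    bins of (nearly) equal size. Returns (statistic, degrees_of_freedom),
    where the classical reference distribution is chi-square with
    groups - 2 degrees of freedom.
    Note: a bin whose mean fitted probability is exactly 0 or 1 would
    cause a division by zero; this sketch does not guard against that.
    """
    if len(y) != len(p):
        raise ValueError("y and p must have the same length")
    n = len(y)
    if groups < 3 or n < groups:
        raise ValueError("need at least 3 groups and at least one point per group")

    order = sorted(range(n), key=lambda i: p[i])  # indices by fitted probability
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        n_g = len(idx)
        observed = sum(y[i] for i in idx)   # observed number of events in bin
        expected = sum(p[i] for i in idx)   # expected number of events in bin
        mean_p = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * mean_p * (1.0 - mean_p))
    return stat, groups - 2


if __name__ == "__main__":
    y = [0, 0, 1, 0, 1, 1, 0, 1, 1]
    p = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    print(hosmer_lemeshow_statistic(y, p, groups=3))
```

The variance term n_g * mean_p * (1 - mean_p) is specific to binary data, which is one reason a direct transfer of the statistic to other GLM families, such as Poisson regression, needs the kind of care the abstract describes.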
Pub Date: 2023-12-18 DOI: 10.1007/s11749-023-00911-9
T. Tony Cai, Zijian Guo, Yin Xia
{"title":"Rejoinder on: statistical inference and large-scale multiple testing for high-dimensional regression models","authors":"T. Tony Cai, Zijian Guo, Yin Xia","doi":"10.1007/s11749-023-00911-9","DOIUrl":"https://doi.org/10.1007/s11749-023-00911-9","journal":{"name":"Test","volume":"74 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2023-12-18","publicationTypes":"Journal Article"}
Pub Date: 2023-12-15 DOI: 10.1007/s11749-023-00909-3
Simos G. Meintanis, John P. Nolan, Charl Pretorius
We consider goodness-of-fit methods for multivariate symmetric and asymmetric stable Paretian random vectors in arbitrary dimension. The methods are based on the empirical characteristic function and are implemented both in the i.i.d. context and for innovations in GARCH models. Asymptotic properties of the proposed procedures are discussed, while the finite-sample properties are illustrated by means of an extensive Monte Carlo study. The procedures are also applied to real data from the financial markets.
{"title":"Specification procedures for multivariate stable-Paretian laws for independent and for conditionally heteroskedastic data","authors":"Simos G. Meintanis, John P. Nolan, Charl Pretorius","doi":"10.1007/s11749-023-00909-3","DOIUrl":"https://doi.org/10.1007/s11749-023-00909-3","journal":{"name":"Test","volume":"38 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2023-12-15","publicationTypes":"Journal Article"}
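The building block named in the abstract above, the empirical characteristic function, is simple to compute. The sketch below evaluates it for a multivariate sample at a single argument t. The function name is hypothetical, and the code illustrates only this ingredient, not the authors' test statistics or their GARCH implementation.

```python
import cmath
from typing import Sequence


def empirical_cf(sample: Sequence[Sequence[float]], t: Sequence[float]) -> complex:
    """Empirical characteristic function of a d-dimensional sample at t.

    Computes (1/n) * sum_j exp(i * <t, x_j>), where <.,.> is the usual
    inner product and each x_j is one observation of dimension len(t).
    """
    if not sample:
        raise ValueError("sample must be non-empty")
    total = 0j
    for x in sample:
        if len(x) != len(t):
            raise ValueError("each observation must have the same dimension as t")
        inner = sum(ti * xi for ti, xi in zip(t, x))
        total += cmath.exp(1j * inner)
    return total / len(sample)


if __name__ == "__main__":
    data = [[0.3, -1.2], [1.1, 0.4], [-0.7, 0.9]]
    print(empirical_cf(data, [0.5, -0.5]))
```

Goodness-of-fit procedures of this type typically compare such an estimate with the characteristic function of the hypothesized family over a range of t, which is convenient for stable laws because their densities generally have no closed form.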