This paper considers linear regression models in which neither the response variable nor the covariates can be observed directly; instead, they are measured with multiplicative distortion measurement errors. We propose new identifiability conditions for the distortion functions via varying coefficient models, and then construct moment‐based estimators of the model parameters from the estimated varying coefficient functions. This method does not require independence between the confounding variables and the unobserved response and covariates. We establish the connections among the varying coefficient based estimators, conditional mean calibration, and conditional absolute mean calibration. We study the asymptotic properties of the proposed estimators and discuss their asymptotic efficiencies. Lastly, we compare the proposed estimators through simulation and apply the methods to a real dataset for illustration.
{"title":"Linear Regression Models with Multiplicative Distortions under New Identifiability Conditions","authors":"Jun Zhang, Bingqing Lin, Yan Zhou","doi":"10.1111/stan.12304","DOIUrl":"https://doi.org/10.1111/stan.12304","url":null,"abstract":"This paper considers linear regression models when neither the response variable nor the covariates can be directly observed, but are measured with multiplicative distortion measurement errors. We propose new identifiability conditions for the distortion functions via the varying coefficient models, then moment‐based estimators of parameters in the model are proposed by using the estimated varying coefficient functions. This method does not require the independence condition between the confounding variables and the unobserved response and variables. We establish the connections among the varying coefficient based estimators, the conditional mean calibration and the conditional absolute mean calibration. We study the asymptotic results of these proposed estimators, and discuss their asymptotic efficiencies. Lastly, we make some comparisons among the proposed estimators through the simulation. These methods are applied to analyze a real dataset for an illustration.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75687111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joris Pries, Etienne van de Bijl, Jan Klein, Sandjai Bhulai, Rob van der Mei
Before any binary classification model is put into practice, it is important to validate its performance on a proper test set. Without a frame of reference given by a baseline method, it is impossible to determine whether a score is “good” or “bad.” The goal of this paper is to examine all baseline methods that are independent of feature values and determine which model is the “best” and why. Identifying which baseline models are optimal simplifies a crucial selection decision in the evaluation process. We prove that the recently proposed Dutch Draw baseline is the best input‐independent classifier (independent of feature values) for all order‐invariant measures (independent of sequence order), assuming that the samples are randomly shuffled. This means that the Dutch Draw baseline is the optimal baseline under these intuitive requirements and should therefore be used in practice.
{"title":"The optimal input‐independent baseline for binary classification: The Dutch Draw","authors":"Joris Pries, Etienne van de Bijl, Jan Klein, Sandjai Bhulai, Rob van der Mei","doi":"10.1111/stan.12297","DOIUrl":"https://doi.org/10.1111/stan.12297","url":null,"abstract":"Before any binary classification model is taken into practice, it is important to validate its performance on a proper test set. Without a frame of reference given by a baseline method, it is impossible to determine if a score is “good” or “bad.” The goal of this paper is to examine all baseline methods that are independent of feature values and determine which model is the “best” and why. By identifying which baseline models are optimal, a crucial selection decision in the evaluation process is simplified. We prove that the recently proposed Dutch Draw baseline is the best input‐independent classifier (independent of feature values) for all order‐invariant measures (independent of sequence order) assuming that the samples are randomly shuffled. This means that the Dutch Draw baseline is the optimal baseline under these intuitive requirements and should therefore be used in practice.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136012239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
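As a rough illustration of what an input‐independent baseline looks like, the sketch below assumes (this detail is not stated in the abstract above) that the baseline labels a uniformly random subset of `k` out of `M` test samples as positive and then picks the `k` with the best expected score. The function names and the two measures shown, accuracy and F1, are illustrative choices only. With `P` true positives in the test set, the number of true positives among the `k` drawn samples has mean `k * P / M`, which gives closed‐form expected scores.

```python
def expected_scores(num_samples, num_positives, k):
    """Expected accuracy and F1 when k of num_samples are labeled positive at random.

    Assumption: the k samples are drawn uniformly without replacement, so the
    true-positive count is hypergeometric with mean k * P / M.
    """
    exp_tp = k * num_positives / num_samples
    exp_tn = (num_samples - num_positives) - (k - exp_tp)
    accuracy = (exp_tp + exp_tn) / num_samples
    # F1 = 2*TP / (P + k); the denominator is fixed once k is chosen.
    denom = num_positives + k
    f1 = 2.0 * exp_tp / denom if denom > 0 else 0.0
    return accuracy, f1


def best_baseline(num_samples, num_positives, metric_index):
    """Return (k, expected score) maximizing the chosen measure (0 = accuracy, 1 = F1)."""
    scores = [
        expected_scores(num_samples, num_positives, k)[metric_index]
        for k in range(num_samples + 1)
    ]
    best_k = max(range(num_samples + 1), key=scores.__getitem__)
    return best_k, scores[best_k]


if __name__ == "__main__":
    # Hypothetical test set: 10 samples, 3 of them positive.
    print(best_baseline(10, 3, 0))  # accuracy: predict nothing positive
    print(best_baseline(10, 3, 1))  # F1: predict everything positive
```

Under this assumption the accuracy baseline reduces to always predicting the majority class, while the F1 baseline predicts every sample positive; any trained model should beat the corresponding number to be considered informative.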
Yu-Hyeong Jang, Jun Zhao, Hyoung-Moon Kim, Kyusang Yu, Sunghoon Kwon, Sunghwan Kim
Maximum likelihood estimation is used widely in classical statistics. However, except in a few cases, it does not have a closed form. Furthermore, computing the maximum likelihood estimator (MLE) takes time owing to the use of iterative methods such as Newton–Raphson. Nonetheless, this estimation method has several advantages, chief among them being the invariance property and asymptotic normality. Based on the first approximation to the solution of the likelihood equation, we obtain an estimator that has the same asymptotic behavior as the MLE for the multivariate gamma distribution. The newly proposed estimator, denoted $\mathrm{MLE}_{\mathrm{CE}}$, is also in closed form as long as the $\sqrt{n}$‐consistent initial estimator is in closed form. Hence, we develop some closed‐form $\sqrt{n}$‐consistent estimators for the multivariate gamma distribution to improve the small‐sample property. $\mathrm{MLE}_{\mathrm{CE}}$ is an alternative to the MLE and performs better than the MLE in terms of computation time, especially for large datasets, and stability. For the bivariate gamma distribution, $\mathrm{MLE}_{\mathrm{CE}}$ is over 130 times faster than the MLE, and as the sample size increases, $\mathrm{MLE}_{\mathrm{CE}}$ is over 200 times faster than the MLE. Owing to the instant calculation of the proposed estimator, it can be used in state–space modeling or real‐time processing models.
{"title":"New closed‐form efficient estimator for multivariate gamma distribution","authors":"Yu-Hyeong Jang, Jun Zhao, Hyoung-Moon Kim, Kyusang Yu, Sunghoon Kwon, Sunghwan Kim","doi":"10.1111/stan.12299","DOIUrl":"https://doi.org/10.1111/stan.12299","url":null,"abstract":"Maximum likelihood estimation is used widely in classical statistics. However, except in a few cases, it does not have a closed form. Furthermore, it takes time to derive the maximum likelihood estimator (MLE) owing to the use of iterative methods such as Newton–Raphson. Nonetheless, this estimation method has several advantages, chief among them being the invariance property and asymptotic normality. Based on the first approximation to the solution of the likelihood equation, we obtain an estimator that has the same asymptotic behavior as the MLE for multivariate gamma distribution. The newly proposed estimator, denoted as MLECE$$ {mathrm{MLE}}_{mathrm{CE}} $$ , is also in closed form as long as the n$$ sqrt{n} $$ ‐consistent initial estimator is in the closed form. Hence, we develop some closed‐form n$$ sqrt{n} $$ ‐consistent estimators for multivariate gamma distribution to improve the small‐sample property. MLECE$$ {mathrm{MLE}}_{mathrm{CE}} $$ is an alternative to MLE and performs better compared to MLE in terms of computation time, especially for large datasets, and stability. For the bivariate gamma distribution, the MLECE$$ {mathrm{MLE}}_{mathrm{CE}} $$ is over 130 times faster than the MLE, and as the sample size increasing, the MLECE$$ {mathrm{MLE}}_{mathrm{CE}} $$ is over 200 times faster than the MLE. 
Owing to the instant calculation of the proposed estimator, it can be used in state–space modeling or real‐time processing models.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73572282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
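The idea of taking a single Newton–Raphson step from a consistent starting value can be sketched in a much simpler setting than the one treated in the paper. The example below is an assumption‐laden illustration, not the authors' estimator: it uses the univariate gamma distribution, a method‐of‐moments starting value, and one Newton step on the profile score for the shape parameter. The function names and the series approximations for the digamma and trigamma functions are choices made here so that the block needs only the standard library.

```python
import math
import random


def digamma(x):
    """Digamma via the recurrence psi(x) = psi(x + 1) - 1/x and an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return result + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))


def trigamma(x):
    """Trigamma via the recurrence psi'(x) = psi'(x + 1) + 1/x^2 and an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return result + inv + 0.5 * inv2 + inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))


def one_step_gamma(sample):
    """Closed-form one-step estimate for a univariate gamma(shape, scale) sample.

    Returns (method-of-moments shape, one-step shape, one-step scale).
    """
    n = len(sample)
    mean = sum(sample) / n
    var = sum((v - mean) ** 2 for v in sample) / n
    mean_log = sum(math.log(v) for v in sample) / n

    shape0 = mean * mean / var  # closed-form starting value
    # Profile score (per observation) for the shape, with scale = mean / shape.
    score = math.log(shape0) - digamma(shape0) - (math.log(mean) - mean_log)
    slope = 1.0 / shape0 - trigamma(shape0)  # derivative of the score
    shape1 = shape0 - score / slope          # single Newton-Raphson step
    return shape0, shape1, mean / shape1


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gammavariate(3.0, 2.0) for _ in range(5000)]
    print(one_step_gamma(data))
```

Because no iteration is involved, the cost is a single pass over the data plus a handful of arithmetic operations, which is the kind of saving the abstract refers to when comparing computation times.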
D. Karlis, Azmi Chutoo, N. Mamode Khan, V. Jowaheer
In spatial count data analysis, modeling with a multilateral lattice structure presents some important challenges. They include both the model construction and the estimation of the model parameters, since the structure accommodates the left, right, top, bottom, and diagonal site effects. Thus, the multilateral spatial process unifies all the popular spatial subclasses, including the unilateral, Rook, Bishop, and Queen models, which makes it suitable for a wide variety of applications. This paper introduces a first‐order multilateral integer‐valued spatial process, based on a binomial thinning mechanism and an innovation term, under both stationary and nonstationary conditions. The parameters are estimated by the conditional maximum likelihood (CML) approach. Simulation experiments are implemented to assess the consistency of the CML estimators in the stationary and nonstationary multilateral spatial model and its subclasses, based on different grid sizes and under both covariate and noncovariate designs. The proposed model, along with its subclasses, is applied to real datasets.
{"title":"The Multilateral Spatial Integer‐valued Process of order 1","authors":"D. Karlis, Azmi Chutoo, N. Mamode Khan, V. Jowaheer","doi":"10.1111/stan.12298","DOIUrl":"https://doi.org/10.1111/stan.12298","url":null,"abstract":"In spatial count data analysis, modeling with a multilateral lattice structure presents some important challenges. They include both the model construction and the estimation of the model parameters, since the structure accommodates the left, right, top, bottom, and diagonal site effects. Thus, the multilateral spatial process unifies all the popular spatial subclasses that include the unilateral, Rook, Bishop, and Queen models and, hence, makes it suitable for a wide variety of applications. This paper introduces a first‐order multilateral integer‐valued spatial process, based on a binomial thinning mechanism and some innovation term, under both stationary and nonstationary conditions. The estimation of parameters is handled by the conditional maximum likelihood estimation (CML) approach. Simulation experiments are implemented to assess the consistency of the CML estimators in the stationary and nonstationary multilateral spatial model and its subclasses, based on different grid sizes and under both covariate and noncovariate designs. The proposed model, along with its subclasses are applied to real datasets.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85976129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
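Binomial thinning, the mechanism named in the abstract above, replaces multiplication by a coefficient with a random count: thinning a count `x` with probability `alpha` gives the number of successes in `x` independent Bernoulli(`alpha`) trials. The sketch below shows the operator and, as a simplified stand‐in for the multilateral model, a unilateral lattice recursion in which each site depends on its left, top, and top‐left neighbours plus a Poisson innovation. The recursion, the Poisson choice, the zero boundary, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def binomial_thinning(alpha, x, rng):
    """Number of successes in x independent Bernoulli(alpha) trials."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def poisson(lam, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_unilateral_lattice(rows, cols, a_left, a_top, a_diag, lam, rng):
    """Simulate counts on a rows x cols grid with zero boundary values."""
    # One extra row and column of zeros serve as the boundary.
    grid = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            grid[i][j] = (
                binomial_thinning(a_left, grid[i][j - 1], rng)
                + binomial_thinning(a_top, grid[i - 1][j], rng)
                + binomial_thinning(a_diag, grid[i - 1][j - 1], rng)
                + poisson(lam, rng)
            )
    return [row[1:] for row in grid[1:]]


if __name__ == "__main__":
    rng = random.Random(1)
    for row in simulate_unilateral_lattice(4, 5, 0.2, 0.2, 0.1, 1.5, rng):
        print(row)
```

Thinning keeps every site value a nonnegative integer, which is why it is the usual building block for integer‐valued autoregressive processes.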
This paper investigates stochastic comparisons of the largest claim amounts of two sets of independent or interdependent portfolios in the sense of some stochastic orders. Let the random variable $X_i$ ($i=1,\dots,n$), with distribution function $F(x;\alpha_i)$, represent the claim amount for the $i$th risk of a portfolio. Here two largest claim amounts are compared under the assumption that the claim variables follow a general semiparametric family of distributions whose survival function $\overline{F}(x;\alpha)$ is increasing in $\alpha$, or is increasing and convex/concave in $\alpha$. The results obtained in this paper apply to a large class of well‐known distributions, including the family of exponentiated/generalized distributions (e.g., exponentiated exponential, Weibull, gamma and Pareto family), the Rayleigh distribution and the Marshall–Olkin family of distributions. As a direct consequence of some main theorems, we also obtain results for the scale family of distributions. Several numerical examples are provided to illustrate the results.
{"title":"Stochastic comparisons of largest claim amounts from heterogeneous portfolios","authors":"Pradip Kundu, Amarjit Kundu, Biplab Hawlader","doi":"10.1111/stan.12296","DOIUrl":"https://doi.org/10.1111/stan.12296","url":null,"abstract":"This paper investigates stochastic comparisons of largest claim amounts of two sets of independent or interdependent portfolios in the sense of some stochastic orders. Let random variable Xi$$ {X}_i $$ ( i=1,…,n$$ i=1,dots, n $$ ) with distribution function F(x;αi)$$ Fleft(x;{alpha}_iright) $$ , represents the claim amount for ith risk of a portfolio. Here two largest claim amounts are compared considering that the claim variables follow a general semiparametric family of distributions having the property that the survival function F‾(x;α)$$ overline{F}left(x;alpha right) $$ is increasing in α$$ alpha $$ or is increasing and convex/concave in α$$ alpha $$ . The results obtained in this paper apply to a large class of well‐known distributions including the family of exponentiated/generalized distributions (e.g., exponentiated exponential, Weibull, gamma and Pareto family), Rayleigh distribution and Marshall–Olkin family of distributions. As a direct consequence of some main theorems, we also obtained the results for scale family of distributions. Several numerical examples are provided to illustrate the results.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74143095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nelson Alirio Cruz, Luis Alberto López Pérez, Oscar Orlando Melo
This paper presents an experimental cross‐over design whose response variable is a count that follows a Poisson distribution. The methodology is extended to data with overdispersion or underdispersion. We present the theoretical development for the analysis of cases with few treatments and few periods. In this case, we consider the log‐linear link for estimating effects and the Delta method for the asymptotic inference of the estimators. When the number of periods and sequences increases, we propose an extension of the previous methodology using generalized linear models. In this extension, cross‐over designs for count data include treatments, sequences, time effects, covariates, and any correlation structure. The most important result of the methodology is that it allows the detection of significant factors within the cross‐over design, especially the treatment effects, when the response variable belongs to the exponential family. Finally, we present the analysis of data obtained in a student hydration study and a simulation study. We compare the usual methods of analysis with those obtained in the present work, demonstrating the advantage of the latter in situations where carry‐over effects are present.
{"title":"Analysis of cross‐over experiments with count data in the presence of carry‐over effects","authors":"Nelson Alirio Cruz, Luis Alberto López Pérez, Oscar Orlando Melo","doi":"10.1111/stan.12295","DOIUrl":"https://doi.org/10.1111/stan.12295","url":null,"abstract":"This paper presents an experimental cross‐over design whose response variable is a count that belongs to the Poisson distribution. The methodology is extended to data with overdispersion or subdispersion. We present the theoretical development for analysis of cases with few treatments and a few periods. In this case, we consider the log‐linear link for estimation effects and the Delta method for the asymptotic inference of the estimators. When the number of periods and sequences increases, we propose an extension of the previous methodology, using the generalized linear models. In this extension, cross‐over designs for count data include treatments, sequences, time effects, covariables, and any correlation structure. The most important result of the methodology is that it allows the detection of significant factors within the cross‐over design when the response variable belongs to the exponential family, especially the treatment effects. Finally, we present the analysis of data obtained in a student hydration study and a simulation study. 
We show a comparison between the usual methods of analysis and those obtained in the present work, demonstrating the advantage over the usual methods in situations with carry‐over presence.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89772271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
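For readers unfamiliar with how the Delta method combines with a log link, a minimal worked case follows. It assumes independent Poisson counts within each treatment, which is a simplification: observations in a cross‐over design are generally correlated within subjects, so this is only meant to show the mechanics, not the variance formula used in the paper. The symbols $\lambda_k$, $n_k$ and $\bar{Y}_k$ are notation introduced here.

```latex
% Delta method for a smooth function g of an estimator \hat{\theta}:
\operatorname{Var}\{g(\hat{\theta})\} \approx \{g'(\theta)\}^{2}\,\operatorname{Var}(\hat{\theta}).

% Poisson mean \lambda_k estimated by the average \bar{Y}_k of n_k independent counts:
\operatorname{Var}(\bar{Y}_k) = \frac{\lambda_k}{n_k},
\qquad g(\lambda) = \log\lambda,
\qquad g'(\lambda) = \frac{1}{\lambda},

% so that
\operatorname{Var}(\log \bar{Y}_k) \approx \frac{1}{\lambda_k^{2}} \cdot \frac{\lambda_k}{n_k}
 = \frac{1}{n_k \lambda_k}.

% Log rate ratio between two independent treatments:
\operatorname{Var}\bigl(\log \bar{Y}_1 - \log \bar{Y}_2\bigr)
 \approx \frac{1}{n_1 \lambda_1} + \frac{1}{n_2 \lambda_2}.
```

Replacing each $\lambda_k$ by $\bar{Y}_k$ gives a plug‐in standard error for the estimated log treatment effect.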
A stationary sequence of nonnegative random variables generated by autoregressive (AR) models may be used to describe the inter‐arrival times between events in counting processes. Even though several such models are available in the literature, there is no unified approach to estimating their parameters. In this paper, we propose a combined estimating function method to estimate the parameters of AR models with gamma marginals. The proposed method is compared with other estimation procedures and is illustrated by simulation and data analysis.
{"title":"Estimating function method for nonnegative autoregressive models","authors":"E. Hari, Prasad N. Balakrishna, E. H. Prasad","doi":"10.1111/stan.12294","DOIUrl":"https://doi.org/10.1111/stan.12294","url":null,"abstract":"A stationary sequence of nonnegative random variables generated by autoregressive (AR) models may be used to describe the inter‐arrival times between events in counting processes. Even though, several such models are available in the literature, there is no unified approach to estimate their parameters. In this paper, we propose a class of combined estimating function method to estimate the model parameters of AR models with gamma marginals. The proposed method is compared with other estimation procedures and are illustrated by simulation and data analysis.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86547801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Burger, Sean van der Merwe, E. Lesaffre, P. C. le Roux, Morgan J. Raath‐Krüger
There is no literature on outlier‐robust parametric mixed‐effects quantile regression models for continuous proportion data as an alternative to systematically identifying and eliminating outliers. To fill this gap, we formulate a robust method by extending the recently proposed fixed‐effects quantile regression model based on the heavy‐tailed Johnson‐$t$ distribution for continuous proportion data to the mixed‐effects modeling context, using a Bayesian approach. Our proposed method is motivated by and used to model the extreme quantiles of the vitality of cushion plants to provide insights into the ecology of the system in which the plants are dominant. We conducted a simulation study to assess the new method's performance and robustness to outliers. We show that the new model has good accuracy and confidence interval coverage properties and is remarkably robust to outliers. In contrast, our study demonstrates that the current approach in the literature for modeling the quantiles of hierarchically structured bounded data is susceptible to outliers, especially when modeling the extreme quantiles. We conclude that the proposed model is an appropriate robust alternative to the current approach for modeling the quantiles of correlated continuous proportions when outliers are present in the data.
{"title":"A robust mixed‐effects parametric quantile regression model for continuous proportions: Quantifying the constraints to vitality in cushion plants","authors":"D. Burger, Sean van der Merwe, E. Lesaffre, P. C. le Roux, Morgan J. Raath‐Krüger","doi":"10.1111/stan.12293","DOIUrl":"https://doi.org/10.1111/stan.12293","url":null,"abstract":"There is no literature on outlier‐robust parametric mixed‐effects quantile regression models for continuous proportion data as an alternative to systematically identifying and eliminating outliers. To fill this gap, we formulate a robust method by extending the recently proposed fixed‐effects quantile regression model based on the heavy‐tailed Johnson‐ t$$ t $$ distribution for continuous proportion data to the mixed‐effects modeling context, using a Bayesian approach. Our proposed method is motivated by and used to model the extreme quantiles of the vitality of cushion plants to provide insights into the ecology of the system in which the plants are dominant. We conducted a simulation study to assess the new method's performance and robustness to outliers. We show that the new model has good accuracy and confidence interval coverage properties and is remarkably robust to outliers. In contrast, our study demonstrates that the current approach in the literature for modeling hierarchically structured bounded data's quantiles is susceptible to outliers, especially when modeling the extreme quantiles. 
We conclude that the proposed model is an appropriate robust alternative to the current approach for modeling the quantiles of correlated continuous proportions when outliers are present in the data.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90397245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A variety of inferential tests are available for single‐level and multilevel mediation, but most come with notable limitations in how they trade off power against Type I error. We extend the partial posterior p value method (p3 method) to test multilevel mediation. This contemporary resampling‐based composite approach is specifically suited for complex null hypotheses. We develop the p3 method and investigate its performance within the context of two‐level cluster‐randomized multilevel mediation studies. Similar to its performance in single‐level studies, we found that the p3 method performed well relative to other mediation tests, suggesting it provides a judicious balance between Type I error rate and power. While bias‐corrected bootstrapping achieved the best overall performance, the p3 method serves as an alternative tool for researchers investigating multilevel mediation that is especially useful when conducting a priori power analyses. To encourage utilization, we provide R code for implementing the p3 method.
{"title":"A partial posterior p value test for multilevel mediation","authors":"Kyle Cox, Ben Kelcey","doi":"10.1111/stan.12291","DOIUrl":"https://doi.org/10.1111/stan.12291","url":null,"abstract":"A variety of inferential tests are available for single and multilevel mediation but most come with notable limitations that balance tradeoffs between power and Type I error. We extend the partial posterior p value method (p3 method) to test multilevel mediation. This contemporary resampling‐based composite approach is specifically suited for complex null hypotheses. We develop the p3 method and investigate its performance within the context of two‐level cluster‐randomized multilevel mediation studies. Similar to its performance in single‐level studies, we found that the p3 method performed well relative to other mediation tests suggesting it provides a judicious balance between Type I error rate and power. While bias‐corrected bootstrapping achieved the best overall performance, the p3 method serves as an alternative tool for researchers investigating multilevel mediation that is especially useful when conducting a priori power analyses. To encourage utilization, we provide R code for implementing the p3 method.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80357683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we introduce a new portmanteau test for the iid hypothesis, where the elements of the sample are allowed to take values in a general space (e.g., a function space). We study the finite sample properties of our test, evaluating its performance in terms of empirical size and power. In particular, we compare the empirical power of our test with the power of other tests in the literature designed to work in a specific data setting, such as the one‐way analysis of variance test used in experimental data analysis and three portmanteau tests used in time series analysis. In every case, we found conditions under which our test outperformed the competing test in power. Finally, to illustrate the usefulness of our test, we apply it to two real‐world applications based on function‐valued data.
{"title":"A portmanteau test for the iid hypothesis","authors":"Ricardo Bórquez","doi":"10.1111/stan.12290","DOIUrl":"https://doi.org/10.1111/stan.12290","url":null,"abstract":"In this paper, we introduce a new portmanteau test for the iid hypothesis, where the elements of the sample are allowed to take values in a general space (e.g., a function space). We study the finite sample properties of our test, evaluating its performance in terms of empirical size and power. In particular, we compare the empirical power of our test with the power of other tests in the literature designed to work in a specific data setting, as the one‐way analysis of variance test used in experimental data analysis and three portmanteau tests used in time series analysis. In every case, we found conditions where our test outperformed in power to the competing test. Finally, to illustrate the usefulness of our test, we implement it on two real‐world applications based on function‐valued data.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86661955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}