Sequential tests controlling generalized familywise error rates
Statistical Methodology, Pub Date: 2015-03-01, DOI: 10.1016/j.stamet.2014.10.001
Shyamal K. De, Michael Baron
Sequential methods are developed for conducting a large number of simultaneous tests while controlling the Type I and Type II generalized familywise error rates. Namely, for the chosen values of $\alpha$, $\beta$, $k$, and $m$, we derive simultaneous tests of $d$ individual hypotheses, based on sequentially collected data, that keep the probability of at least $k$ Type I errors not exceeding level $\alpha$ and the probability of at least $m$ Type II errors not greater than $\beta$. This generalization of the classical notions of familywise error rates allows substantial reduction of the expected sample size of the multiple testing procedure.
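The two error-rate requirements described in this abstract can be written compactly as follows. This is only a restatement of the abstract's wording in symbols; the event descriptions are spelled out in words rather than in the paper's own notation, which is not reproduced here.

```latex
\[
P(\text{at least } k \text{ Type I errors}) \le \alpha,
\qquad
P(\text{at least } m \text{ Type II errors}) \le \beta .
\]
```

Setting $k = m = 1$ recovers the classical familywise error rates, where even a single error of either type counts against the procedure.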
On the Chao and Zelterman estimators in a binomial mixture model
Statistical Methodology, Pub Date: 2015-01-01, DOI: 10.1016/j.stamet.2014.06.002
Chang Xuan Mao, Nan Yang, Jinhua Zhong
Data from a surveillance system can be used to estimate the size of a disease population. For certain surveillance systems, a binomial mixture model arises as a natural choice. The Chao estimator estimates a lower bound of the population size. The Zelterman estimator estimates a parameter that is neither a lower bound nor an upper bound. By comparing the Chao estimator and the Zelterman estimator both theoretically and numerically, we conclude that the Chao estimator is better.
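To make the two estimators concrete, here is a minimal sketch using their classical frequency-count forms, where `f1` and `f2` are the numbers of individuals observed exactly once and exactly twice. These are the textbook Poisson-type versions and are an assumption here: the abstract does not state the binomial-mixture formulas the paper actually analyzes, which may differ. The function names and the example counts are hypothetical.

```python
import math


def chao_lower_bound(n_observed, f1, f2):
    """Classical Chao-type lower bound: n + f1^2 / (2 * f2)."""
    if f2 <= 0:
        raise ValueError("f2 must be positive")
    return n_observed + f1 ** 2 / (2.0 * f2)


def zelterman_estimate(n_observed, f1, f2):
    """Classical Zelterman-type estimate: n / (1 - exp(-2 * f2 / f1))."""
    if f1 <= 0 or f2 <= 0:
        raise ValueError("f1 and f2 must be positive")
    rate = 2.0 * f2 / f1  # rate estimated from the singleton and doubleton counts only
    return n_observed / (1.0 - math.exp(-rate))


# Hypothetical counts: 50 distinct cases seen, 20 seen once, 10 seen twice.
chao = chao_lower_bound(50, 20, 10)
zelterman = zelterman_estimate(50, 20, 10)
```

Both estimators use only the lowest frequency counts, which is why they are often compared directly on the same data.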
A note on the simultaneous confidence intervals for the successive differences of exponential location parameters under heteroscedasticity
Statistical Methodology, Pub Date: 2015-01-01, DOI: 10.1016/j.stamet.2014.06.001
Mahmood Kharrati-Kopaei
In this paper, a lemma is presented and then used to construct simultaneous confidence intervals (SCIs) for the differences of location parameters of successive exponential distributions in the unbalanced case under heteroscedasticity. A simulation-based comparison of our SCIs with two recently proposed procedures, in terms of coverage probability and average volume, shows that the proposed method can be recommended for small and moderate sample sizes.
Incorporating auxiliary information for improved prediction using combination of kernel machines
Statistical Methodology, Pub Date: 2015-01-01, DOI: 10.1016/j.stamet.2014.08.001
Xiang Zhan, Debashis Ghosh
With evolving genomic technologies, it is possible to get different measures of the same underlying biological phenomenon using different technologies. The goal of this paper is to build a prediction model for an outcome variable $Y$ from covariates $X$. Besides $X$, we have surrogate covariates $W$ which are related to $X$. We want to utilize the information in $W$ to boost the prediction for $Y$ using $X$. In this paper, we propose a kernel machine-based method to improve prediction of $Y$ by $X$ by incorporating auxiliary information $W$. By combining single kernel machines, we also propose a hybrid kernel machine predictor, which can yield a smaller prediction error than its constituents. The prediction error of our kernel machine predictors is evaluated using simulations. We also apply our method to a lung cancer dataset and an Alzheimer’s disease dataset.
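The claim that a combined predictor can beat each of its constituents can be illustrated with a generic sketch. The abstract does not say how the hybrid kernel machine is formed, so the convex combination below, with a least-squares weight clipped to [0, 1], is purely an assumption for illustration, and the two "constituent" prediction vectors are made-up numbers rather than outputs of real kernel machines.

```python
def mse(y, pred):
    """Mean squared prediction error."""
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)


def hybrid_weight(y, pred_a, pred_b):
    """Weight w in [0, 1] minimizing the squared error of w*pred_a + (1-w)*pred_b."""
    num = sum((yi - b) * (a - b) for yi, a, b in zip(y, pred_a, pred_b))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if den == 0.0:
        return 0.5  # the two predictors agree everywhere, so any weight works
    return min(1.0, max(0.0, num / den))


def hybrid_predict(pred_a, pred_b, w):
    """Convex combination of two prediction vectors."""
    return [w * a + (1.0 - w) * b for a, b in zip(pred_a, pred_b)]


# Hypothetical validation data: one predictor overshoots, the other undershoots.
y = [1.0, 2.0, 3.0]
pred_a = [1.5, 2.5, 3.5]
pred_b = [0.5, 1.5, 2.5]

w = hybrid_weight(y, pred_a, pred_b)
pred_h = hybrid_predict(pred_a, pred_b, w)
```

In this toy case the two constituents err in opposite directions, so averaging cancels the errors. In practice the weight would be chosen on held-out data to avoid an optimistic error estimate.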
Maximum entropy test for GARCH models
Statistical Methodology, Pub Date: 2015-01-01, DOI: 10.1016/j.stamet.2014.05.002
Jiyeon Lee, Sangyeol Lee, Siyun Park
In this paper, we apply the maximum entropy test designed for a goodness of fit in iid samples (cf. Lee et al. (2011)) to GARCH(1,1) models. Its approximate asymptotic distribution is derived under the null hypothesis. A bootstrap version of the test is also discussed and its performance is evaluated through Monte Carlo simulations. A real data analysis is conducted for illustration.
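For readers unfamiliar with the model class, the following sketch simulates a GARCH(1,1) series using the standard recursion sigma_t^2 = omega + alpha * eps_{t-1}^2 + beta * sigma_{t-1}^2 with Gaussian innovations. It shows only the data-generating model, not the maximum entropy test itself. The parameter values are hypothetical, and the Gaussian innovation distribution is an assumption.

```python
import math
import random


def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate n observations from a GARCH(1,1) model with N(0, 1) innovations."""
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns, variances = [], []
    for _ in range(n):
        eps = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(eps)
        variances.append(var)
        # Conditional variance for the next period.
        var = omega + alpha * eps ** 2 + beta * var
    return returns, variances


# Hypothetical parameter values chosen so that alpha + beta < 1.
returns, variances = simulate_garch11(500, omega=0.1, alpha=0.1, beta=0.8, seed=42)
```

A Monte Carlo study of the kind mentioned in the abstract would repeat such simulations many times and record how often a test rejects.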
Confidence distributions: A review
Statistical Methodology, Pub Date: 2015-01-01, DOI: 10.1016/j.stamet.2014.07.002
Saralees Nadarajah, Sergey Bityukov, Nikolai Krasnikov
A review is provided of the concept of confidence distributions. Material covered includes fundamentals, extensions, applications of confidence distributions, and available computer software. We expect that this review could serve as a source of reference and encourage further research on confidence distributions.
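As a small illustration of the concept, consider the textbook case of a normal mean with known standard deviation: the confidence distribution for the mean is the distribution function of N(x_bar, sigma^2 / n), and its quantiles give confidence limits at every level. This example is a standard one chosen here for illustration and is not taken from the review itself. The function name and the numbers are hypothetical.

```python
from statistics import NormalDist


def normal_mean_cd(x_bar, sigma, n):
    """Confidence distribution for a normal mean when sigma is known."""
    return NormalDist(mu=x_bar, sigma=sigma / n ** 0.5)


# Hypothetical sample summary: mean 0.0, known sigma 1.0, sample size 4.
cd = normal_mean_cd(0.0, 1.0, 4)

# A 95% confidence interval is read off as two quantiles of the distribution.
lower, upper = cd.inv_cdf(0.025), cd.inv_cdf(0.975)

# The confidence assigned to the half-line (-inf, 0.0] is the value of the
# distribution function at 0.0.
confidence_below_zero = cd.cdf(0.0)
```

The same object therefore yields intervals of any level and one-sided p-value-like quantities, which is the appeal of working with the whole distribution rather than a single interval.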
Modified SCAD penalty for constrained variable selection problems
Statistical Methodology, Pub Date: 2014-11-01, DOI: 10.1016/j.stamet.2014.05.001
Chi Tim Ng, Chi Wai Yu
Instead of using sample information only to do variable selection, in this article we also take a priori information, in the form of linear constraints on the regression coefficients, into account. The penalized likelihood estimation method is adopted. However, under constraints, it is not guaranteed that information criteria like AIC and BIC are minimized at an oracle solution using the lasso or SCAD penalty. To overcome such difficulties, a modified SCAD penalty is proposed. Definitions of the information criteria GCV, AIC and BIC for constrained variable selection problems are also proposed. Statistically, we show that if the tuning parameter is appropriately chosen, the proposed estimators enjoy the oracle properties and satisfy the linear constraints. Additionally, they are robust to outliers if the linear model with M-estimation is used.
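For reference, the sketch below implements the standard (unmodified) SCAD penalty of Fan and Li, which is linear near zero, quadratic in a middle band, and constant for large coefficients. The abstract does not give the form of the modified penalty, so that is not attempted here. The default `a = 3.7` is the conventional choice, and the function name is hypothetical.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty evaluated at a single coefficient theta."""
    if lam <= 0 or a <= 2:
        raise ValueError("need lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Lasso-like linear part near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition between the linear and constant parts.
        return (2.0 * a * lam * t - t ** 2 - lam ** 2) / (2.0 * (a - 1.0))
    # Constant part: large coefficients are not penalized further.
    return lam ** 2 * (a + 1.0) / 2.0


small = scad_penalty(0.5, 1.0)    # inside the linear region
middle = scad_penalty(2.0, 1.0)   # inside the quadratic region
large = scad_penalty(10.0, 1.0)   # inside the constant region
```

The flat tail is what lets SCAD avoid the bias that the lasso imposes on large coefficients, which underlies the oracle properties mentioned in the abstract.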
General tests of independence based on empirical processes indexed by functions
Statistical Methodology, Pub Date: 2014-11-01, DOI: 10.1016/j.stamet.2014.03.001
Salim Bouzebda
This paper is mainly concerned with statistical tests of independence between random vectors. We develop an approach based on general empirical processes indexed by a particular class of functions. We prove two abstract approximation theorems that include some existing results as particular cases. Finally, we characterize the limiting behavior of the Möbius transformation of empirical processes indexed by functions under contiguous sequences of alternatives.
Bayesian inference in nonparametric dynamic state-space models
Statistical Methodology, Pub Date: 2014-11-01, DOI: 10.1016/j.stamet.2014.02.004
Anurag Ghosh, Soumalya Mukhopadhyay, Sandipan Roy, Sourabh Bhattacharya
We introduce state-space models where the functionals of the observational and evolutionary equations are unknown, and treated as random functions evolving with time. Thus, our model is nonparametric and generalizes the traditional parametric state-space models. This random function approach also frees us from the restrictive assumption that the functional forms, although time-dependent, are of fixed forms. The traditional approach of assuming known, parametric functional forms is questionable, particularly in state-space models, since the validation of the assumptions requires data on both the observed time series and the latent states; however, data on the latter are not available in state-space models.
We specify Gaussian processes as priors of the random functions and exploit the “look-up table approach” of Bhattacharya (2007) to efficiently handle the dynamic structure of the model. We consider both univariate and multivariate situations, using the Markov chain Monte Carlo (MCMC) approach for studying the posterior distributions of interest. We illustrate our methods with simulated data sets, in both univariate and multivariate situations. Moreover, using our Gaussian process approach we analyze a real data set, which has also been analyzed by Shumway & Stoffer (1982) and Carlin, Polson & Stoffer (1992) using the linearity assumption. Interestingly, our analyses indicate that towards the end of the time series, the linearity assumption is perhaps questionable.
Multiple crossing sequential fixed-size confidence region methodologies for a multivariate normal mean vector
Statistical Methodology, Pub Date: 2014-11-01, DOI: 10.1016/j.stamet.2014.03.003
Nitis Mukhopadhyay, Sankha Muthu Poruthotage
The asymptotically efficient and asymptotically consistent purely sequential procedure of Mukhopadhyay and Al-Mousawi (1986) is customarily used to construct a confidence region $R$ for the mean vector $\boldsymbol{\mu}$ of $N_p(\boldsymbol{\mu}, \sigma^2 \mathbf{H})$. This procedure does not have the exact consistency property. $\mathbf{H}_{p \times p}$ is assumed known and positive definite with $\sigma^2$ unknown. The maximum diameter of $R$ and the confidence coefficient are prefixed.
A purely sequential sampling strategy is proposed that continues sampling until the sample size crosses the boundary multiple times. We ascertain asymptotic efficiency and asymptotic consistency properties (Theorem 3.1). Its ability to nearly achieve the required coverage probability without significant over-sampling is demonstrated with simulations. A truncation technique and fine-tuning of the multiple crossing rule are proposed to increase practicality. Two real data illustrations are highlighted.