"Analytic Solutions for D-optimal Factorial Designs under Generalized Linear Models" by Liping Tong, H. Volkmer, Jie Yang. arXiv: Computation, 2013-06-22. doi:10.1214/14-EJS926.

We develop two analytic approaches for obtaining D-optimal approximate designs under generalized linear models. The first approach provides analytic D-optimal allocations for generalized linear models with two factors, which include as a special case the $2^2$ main-effects model considered by Yang, Mandal and Majumdar (2012). The second approach leads to explicit solutions for a class of generalized linear models with more than two factors. With the aid of the analytic solutions, we provide a necessary and sufficient condition under which a D-optimal design with two quantitative factors can be constructed from boundary points only. This bridges the gap between D-optimal factorial designs and D-optimal designs with continuous factors.

"MCMC for non-Linear State Space Models Using Ensembles of Latent Sequences" by Alexander Y. Shestopaloff, Radford M. Neal. arXiv: Computation, 2013-05-02. doi:10.14288/1.0043899.

Fitting a non-linear state space model to observed data is an inference problem with no straightforward solution. We take a Bayesian approach to inferring the unknown parameters of a non-linear state space model; this, in turn, requires efficient Markov chain Monte Carlo (MCMC) sampling methods for the latent (hidden) variables and model parameters. Using the ensemble technique of Neal (2010) and the embedded HMM technique of Neal (2003), we introduce a new MCMC method for non-linear state space models. The key idea is to perform parameter updates conditional on an enormously large ensemble of latent sequences, as opposed to the single sequence used by existing methods. We examine the performance of this ensemble method for Bayesian inference in the Ricker model of population dynamics. We show that for this problem the ensemble method is vastly more efficient than a simple Metropolis method, and 1.9 to 12.0 times more efficient than a single-sequence embedded HMM method, when all methods are tuned appropriately. We also introduce a way of speeding up the ensemble method by performing partial backward passes to discard poor proposals at low computational cost, resulting in a further gain in efficiency.

"On particle Gibbs sampling" by N. Chopin, Sumeetpal S. Singh. arXiv: Computation, 2013-04-06. doi:10.3150/14-BEJ629.

The particle Gibbs sampler is a Markov chain Monte Carlo (MCMC) algorithm to sample from the full posterior distribution of a state-space model. It does so by executing Gibbs sampling steps on an extended target distribution defined on the space of the auxiliary variables generated by an interacting particle system. This paper makes the following contributions to the theoretical study of this algorithm. Firstly, we present a coupling construction between two particle Gibbs updates from different starting points and we show that the coupling probability may be made arbitrarily close to one by increasing the number of particles. We obtain as a direct corollary that the particle Gibbs kernel is uniformly ergodic. Secondly, we show how the inclusion of an additional Gibbs sampling step that reselects the ancestors of the particle Gibbs' extended target distribution, which is a popular approach in practice to improve mixing, does indeed yield a theoretically more efficient algorithm as measured by the asymptotic variance. Thirdly, we extend particle Gibbs to work with lower variance resampling schemes. A detailed numerical study is provided to demonstrate the efficiency of particle Gibbs and the proposed variants.

"Discrepancy bounds for uniformly ergodic Markov chain quasi-Monte Carlo" by J. Dick, Daniel Rudolf, Hou-Ying Zhu. arXiv: Computation, 2013-03-11. doi:10.1214/16-AAP1173.

Markov chains can be used to generate samples whose distribution approximates a given target distribution. The quality of the samples of such Markov chains can be measured by the discrepancy between the empirical distribution of the samples and the target distribution. We prove upper bounds on this discrepancy under the assumption that the Markov chain is uniformly ergodic and the driver sequence is deterministic rather than independent $U(0,1)$ random variables. In particular, we show the existence of driver sequences for which the discrepancy of the Markov chain from the target distribution with respect to certain test sets converges with (almost) the usual Monte Carlo rate of $n^{-1/2}$.

"KernSmoothIRT: An R Package for Kernel Smoothing in Item Response Theory" by A. Mazza, A. Punzo, Brian McGuire. arXiv: Computation, 2012-11-06. doi:10.18637/JSS.V058.I06.

Item response theory (IRT) models are a class of statistical models used to describe how individuals respond to a set of items having a certain number of options. They are adopted by researchers in the social sciences, particularly for the analysis of performance or attitudinal data in psychology, education, medicine, marketing and other fields where the aim is to measure latent constructs. Most IRT analyses use parametric models that rely on assumptions that are often not satisfied. In such cases a nonparametric approach might be preferable; nevertheless, few software implementations are available. To address this gap, this paper presents the R package KernSmoothIRT. It implements kernel smoothing for the estimation of option characteristic curves, and adds several plotting and analytical tools for evaluating the whole test or questionnaire, the individual items, and the subjects. To illustrate the package's capabilities, two real datasets are analyzed, one with multiple-choice responses and the other with scaled responses.

"Laplace approximation for logistic Gaussian process density estimation and regression" by J. Riihimaki, Aki Vehtari. arXiv: Computation, 2012-11-01. doi:10.1214/14-BA872.

Logistic Gaussian process (LGP) priors provide a flexible alternative for modelling unknown densities. The smoothness properties of the density estimates can be controlled through the prior covariance structure of the LGP, but the challenge is the analytically intractable inference. In this paper, we present approximate Bayesian inference for LGP density estimation in a grid using Laplace's method to integrate over the non-Gaussian posterior distribution of latent function values and to determine the covariance function parameters with type-II maximum a posteriori (MAP) estimation. We demonstrate that Laplace's method with MAP is sufficiently fast for practical interactive visualisation of 1D and 2D densities. Our experiments with simulated and real 1D data sets show that the estimation accuracy is close to a Markov chain Monte Carlo approximation and state-of-the-art hierarchical infinite Gaussian mixture models. We also construct a reduced-rank approximation to speed up the computations for dense 2D grids, and demonstrate density regression with the proposed Laplace approach.

"Twisted particle filters" by N. Whiteley, Anthony Lee. arXiv: Computation, 2012-09-30. doi:10.1214/13-AOS1167.

We investigate sampling laws for particle algorithms and the influence of these laws on the efficiency of particle approximations of marginal likelihoods in hidden Markov models. Among a broad class of candidates we characterize the essentially unique family of particle system transition kernels which is optimal with respect to an asymptotic-in-time variance growth rate criterion. The sampling structure of the algorithm defined by these optimal transitions turns out to be only subtly different from standard algorithms and yet the fluctuation properties of the estimates it provides can be dramatically different. The structure of the optimal transition suggests a new class of algorithms, which we term "twisted" particle filters and which we validate with asymptotic analysis of a more traditional nature, in the regime where the number of particles tends to infinity.

"A measure of skewness for testing departures from normality" by S. Nakagawa, Hiroki Hashiguchi, N. Niki. arXiv: Computation, 2012-02-23. doi:10.17654/TS052010061.

We propose a new skewness test statistic for normality based on the Pearson measure of skewness. Using a computer algebra system, we obtain the first four asymptotic moments of the null distribution of this statistic and derive its normalizing transformation based on the Johnson $S_{U}$ system. Finally, the performance of the proposed statistic is assessed by comparing the powers of several skewness test statistics against a range of alternative hypotheses.

"Bayesian optimization for adaptive MCMC" by Nimalan Mahendran, Ziyun Wang, F. Hamze, Nando de Freitas. arXiv: Computation, 2011-10-29. doi:10.14288/1.0052032.

This paper proposes a new randomized strategy for adaptive MCMC using Bayesian optimization. This approach applies to non-differentiable objective functions and trades off exploration and exploitation to reduce the number of potentially costly objective function evaluations. We demonstrate the strategy in the complex setting of sampling from constrained, discrete and densely connected probabilistic graphical models where, for each variation of the problem, one needs to adjust the parameters of the proposal mechanism automatically to ensure efficient mixing of the Markov chains.