{"title":"The R Package JMbayes for Fitting Joint Models for Longitudinal and Time-to-Event Data using MCMC","authors":"D. Rizopoulos","doi":"10.18637/JSS.V072.I07","DOIUrl":"https://doi.org/10.18637/JSS.V072.I07","url":null,"abstract":"Joint models for longitudinal and time-to-event data constitute an attractive modeling framework that has received a lot of interest in recent years. This paper presents the capabilities of the R package JMbayes for fitting these models under a Bayesian approach using Markov chain Monte Carlo algorithms. JMbayes can fit a wide range of joint models, including among others joint models for continuous and categorical longitudinal responses, and provides several options for modeling the association structure between the two outcomes. In addition, this package can be used to derive dynamic predictions for both outcomes, and offers several tools to validate these predictions in terms of discrimination and calibration. All these features are illustrated using a real data example on patients with primary biliary cirrhosis.","PeriodicalId":8446,"journal":{"name":"arXiv: Computation","volume":"19 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2014-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85923276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-04-22. DOI: 10.3929/ETHZ-A-010645962
Mehdi Molkaraie
{"title":"An Importance Sampling Algorithm for the Ising Model with Strong Couplings","authors":"Mehdi Molkaraie","doi":"10.3929/ETHZ-A-010645962","DOIUrl":"https://doi.org/10.3929/ETHZ-A-010645962","url":null,"abstract":"We consider the problem of estimating the partition function of the ferromagnetic Ising model in a consistent external magnetic field. The estimation is done via importance sampling in the dual of the Forney factor graph representing the model. Emphasis is on models at low temperature (corresponding to models with strong couplings) and on models with a mixture of strong and weak coupling parameters.","PeriodicalId":8446,"journal":{"name":"arXiv: Computation","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2014-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77462512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
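The paper's dual-domain sampler is not reproduced here; the following sketch (model size, couplings, and temperature all made up) only illustrates the target quantity — an Ising partition function — by comparing brute-force enumeration on a tiny model against naive importance sampling under a uniform proposal, the baseline whose degradation at strong couplings motivates sampling in the dual factor graph.

```python
import itertools
import math
import random

# Tiny 2x2 ferromagnetic Ising model: 4 spins, 4 nearest-neighbour bonds.
EDGES = [(0, 1), (2, 3), (0, 2), (1, 3)]
J, H, BETA = 1.0, 0.2, 0.3  # coupling, external field, inverse temperature (made up)

def energy(s):
    # Ferromagnetic energy with an external magnetic field term.
    return -J * sum(s[i] * s[j] for i, j in EDGES) - H * sum(s)

def exact_Z():
    # Brute force over all 2^4 spin configurations.
    return sum(math.exp(-BETA * energy(s))
               for s in itertools.product((-1, 1), repeat=4))

def is_Z(num_samples, seed=0):
    # Naive importance sampling with a uniform proposal q(s) = 2^-4,
    # so Z = E_q[2^4 * exp(-beta * E(s))].
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        s = [rng.choice((-1, 1)) for _ in range(4)]
        total += math.exp(-BETA * energy(s))
    return (2 ** 4) * total / num_samples

Z_exact = exact_Z()
Z_est = is_Z(100_000)
```

At low temperature (large BETA) the weights in `is_Z` become highly skewed and the uniform-proposal estimator degrades — exactly the regime the paper addresses with a better proposal in the dual domain.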
{"title":"Fast Estimation of Multinomial Logit Models: R Package mnlogit","authors":"Asad Hasan, Wang Zhiyu, A. S. Mahani","doi":"10.18637/JSS.V075.I03","DOIUrl":"https://doi.org/10.18637/JSS.V075.I03","url":null,"abstract":"We present R package mnlogit for training multinomial logistic regression models, particularly those involving a large number of classes and features. Compared to existing software, mnlogit offers speedups of 10x-50x for modestly sized problems and more than 100x for larger problems. Running mnlogit in parallel mode on a multicore machine gives an additional 2x-4x speedup on up to 8 processor cores. Computational efficiency is achieved by drastically speeding up calculation of the log-likelihood function's Hessian matrix by exploiting structure in matrices that arise in intermediate calculations.","PeriodicalId":8446,"journal":{"name":"arXiv: Computation","volume":"254 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2014-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77735332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
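mnlogit itself is an R package; as a language-neutral sketch of the structural idea (all data and sizes hypothetical), the blocks of the multinomial-logit information matrix, H_ab = X^T diag(p_a(delta_ab - p_b)) X, can be formed in one vectorised contraction instead of a double loop over class pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, K = 200, 3, 4                      # observations, features, classes (made up)
X = rng.normal(size=(n, p))
B = rng.normal(size=(p, K - 1))          # coefficients; base class 0 omitted

# Probabilities of the K-1 non-base classes (softmax with base-class utility 0).
expeta = np.exp(X @ B)
P = expeta / (1.0 + expeta.sum(axis=1, keepdims=True))

def hessian_naive():
    # One (p x p) block per class pair (a, b): X^T diag(w_ab) X.
    H = np.zeros((K - 1, p, K - 1, p))
    for a in range(K - 1):
        for b in range(K - 1):
            w = P[:, a] * ((1.0 if a == b else 0.0) - P[:, b])
            H[a, :, b, :] = X.T @ (w[:, None] * X)
    return H.reshape((K - 1) * p, (K - 1) * p)

def hessian_structured():
    # All blocks at once: W[i, a, b] = P[i, a] * (delta_ab - P[i, b]),
    # then a single einsum contraction over observations.
    W = P[:, :, None] * (np.eye(K - 1)[None, :, :] - P[:, None, :])
    H = np.einsum('ij,iab,ik->ajbk', X, W, X)
    return H.reshape((K - 1) * p, (K - 1) * p)

H1, H2 = hessian_naive(), hessian_structured()
```

(Sign conventions aside, this is the negative Hessian of the log-likelihood; the package's actual speedups come from a more refined exploitation of this structure plus parallelism.)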
Pub Date: 2014-03-03. DOI: 10.4310/SII.2015.V8.N2.A4
C. Grazian, B. Liseo
{"title":"Approximate Integrated Likelihood via ABC methods","authors":"C. Grazian, B. Liseo","doi":"10.4310/SII.2015.V8.N2.A4","DOIUrl":"https://doi.org/10.4310/SII.2015.V8.N2.A4","url":null,"abstract":"We propose a novel use of a recent computational tool for Bayesian inference, namely the Approximate Bayesian Computation (ABC) methodology. ABC is a way to handle models for which the likelihood function may be intractable or even unavailable and/or too costly to evaluate; in particular, we consider the problem of eliminating the nuisance parameters from a complex statistical model in order to produce a likelihood function depending on the quantity of interest only. Given a proper prior for the entire parameter vector, we propose to approximate the integrated likelihood by the ratio of kernel estimators of the marginal posterior and prior for the quantity of interest. We present several examples.","PeriodicalId":8446,"journal":{"name":"arXiv: Computation","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2014-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83583085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
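A minimal sketch of the proposed ratio estimator under entirely made-up assumptions (normal model, mean of interest, nuisance standard deviation, rejection ABC on two summary statistics): the integrated likelihood of the quantity of interest is approximated, up to a constant, by a kernel estimate of its marginal posterior divided by a kernel estimate of its marginal prior.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)

# Observed data: mean mu is the quantity of interest, sd sigma is a nuisance.
y_obs = rng.normal(loc=1.0, scale=1.0, size=30)
s_obs = np.array([y_obs.mean(), y_obs.std()])

# Proper prior on the full parameter vector (mu, sigma).
n_sim = 40_000
mu_prior = rng.normal(0.0, 2.0, size=n_sim)
sigma_prior = rng.uniform(0.5, 2.0, size=n_sim)

# Rejection ABC: simulate, summarise, keep the draws closest to s_obs.
sims = rng.normal(mu_prior[:, None], sigma_prior[:, None], size=(n_sim, 30))
dist = np.hypot(sims.mean(axis=1) - s_obs[0], sims.std(axis=1) - s_obs[1])
keep = np.argsort(dist)[:500]

# Integrated likelihood of mu, up to a normalising constant:
# ratio of kernel density estimates of marginal posterior and marginal prior.
post_kde = gaussian_kde(mu_prior[keep])
prior_kde = gaussian_kde(mu_prior)

def integrated_lik(mu):
    return post_kde(mu) / prior_kde(mu)
```

The estimate concentrates near the observed sample mean; the paper's examples use the same construction in genuinely intractable models.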
{"title":"A sequential reduction method for inference in generalized linear mixed models","authors":"Helen E. Ogden","doi":"10.1214/15-EJS991","DOIUrl":"https://doi.org/10.1214/15-EJS991","url":null,"abstract":"The likelihood for the parameters of a generalized linear mixed model involves an integral which may be of very high dimension. Because of this intractability, many approximations to the likelihood have been proposed, but all can fail when the model is sparse, in that there is only a small amount of information available on each random effect. The sequential reduction method described in this paper exploits the dependence structure of the posterior distribution of the random effects to reduce substantially the cost of finding an accurate approximation to the likelihood in models with sparse structure.","PeriodicalId":8446,"journal":{"name":"arXiv: Computation","volume":"118 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72698539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting structural breaks in seasonal time series by regularized optimization","authors":"B. Wang, Jie Sun, A. Motter","doi":"10.1201/b16387-524","DOIUrl":"https://doi.org/10.1201/b16387-524","url":null,"abstract":"Real-world systems are often complex, dynamic, and nonlinear. Understanding the dynamics of a system from its observed time series is key to the prediction and control of the system's behavior. While most existing techniques tacitly assume some form of stationarity or continuity, abrupt changes, which are often due to external disturbances or sudden changes in the intrinsic dynamics, are common in time series. Structural breaks, which are time points at which the statistical patterns of a time series change, pose considerable challenges to data analysis. Without identification of such break points, the same dynamic rule would be applied to the whole period of observation, whereas false identification of structural breaks may lead to overfitting. In this paper, we cast the problem of decomposing a time series into its trend and seasonal components as an optimization problem. This problem is ill-posed due to the arbitrariness in the number of parameters. To overcome this difficulty, we propose the addition of a penalty function (i.e., a regularization term) that accounts for the number of parameters. Our approach simultaneously identifies seasonality and trend without the need of iterations, and allows the reliable detection of structural breaks. The method is applied to recorded data on fish populations and sea surface temperature, where it detects structural breaks that would have been neglected otherwise. 
This suggests that our method can lead to a general approach for the monitoring, prediction, and prevention of structural changes in real systems.","PeriodicalId":8446,"journal":{"name":"arXiv: Computation","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81530365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
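The penalised-optimisation idea can be shown in miniature (this is not the paper's trend-plus-seasonality formulation; the series, penalty constant, and break location are all made up): among candidate models with and without a break, choose the one minimising squared error plus a penalty proportional to the number of parameters.

```python
import numpy as np

rng = np.random.default_rng(2)

# Piecewise-constant series with one structural break at t = 60.
n, true_break = 100, 60
x = np.concatenate([rng.normal(0.0, 0.5, true_break),
                    rng.normal(3.0, 0.5, n - true_break)])

def sse(seg):
    # Squared error of a constant fit to a segment.
    return float(((seg - seg.mean()) ** 2).sum())

lam = 2.0 * np.log(n)          # penalty per parameter (BIC-like, made up)

# Candidate models: no break (one mean), or one break at each interior point
# (two means). The regularisation term guards against spurious breaks.
best_cost = sse(x) + lam * 1
best_break = None
for t in range(5, n - 5):
    cost = sse(x[:t]) + sse(x[t:]) + lam * 2
    if cost < best_cost:
        best_cost, best_break = cost, t
```

Without the penalty, adding more breaks always reduces the squared error, which is the overfitting problem the abstract describes.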
{"title":"Bayesian nonparametric inference on the Stiefel manifold","authors":"Lizhen Lin, Vinayak A. Rao, D. Dunson","doi":"10.5705/SS.202016.0017","DOIUrl":"https://doi.org/10.5705/SS.202016.0017","url":null,"abstract":"The Stiefel manifold $V_{p,d}$ is the space of all $d \times p$ orthonormal matrices, with the $d-1$ hypersphere and the space of all orthogonal matrices constituting special cases. In modeling data lying on the Stiefel manifold, parametric distributions such as the matrix Langevin distribution are often used; however, model misspecification is a concern and it is desirable to have nonparametric alternatives. Current nonparametric methods are Fréchet mean based. We take a fully generative nonparametric approach, which relies on mixing parametric kernels such as the matrix Langevin. The proposed kernel mixtures can approximate a large class of distributions on the Stiefel manifold, and we develop theory showing posterior consistency. While there exists work developing general posterior consistency results, extending these results to this particular manifold requires substantial new theory. Posterior inference is illustrated on a real-world dataset of near-Earth objects.","PeriodicalId":8446,"journal":{"name":"arXiv: Computation","volume":"56 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84523861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
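A small sketch of the ambient objects (dimensions hypothetical): a point on $V_{p,d}$ is a $d \times p$ orthonormal matrix, a uniform draw can be generated via QR decomposition of a Gaussian matrix, and the matrix Langevin kernel used in the mixtures has unnormalised density etr$(F^T X)$.

```python
import numpy as np

rng = np.random.default_rng(3)
d, p = 5, 2                        # V_{p,d} with d=5, p=2 (sizes made up)

def runif_stiefel(d, p, rng):
    # QR of a Gaussian matrix yields a uniform draw on the Stiefel manifold;
    # fixing the signs of R's diagonal makes the decomposition unique.
    A = rng.normal(size=(d, p))
    Q, R = np.linalg.qr(A)
    s = np.sign(np.diag(R))
    s[s == 0] = 1.0
    return Q * s

def langevin_kernel(X, F):
    # Unnormalised matrix Langevin (von Mises-Fisher) density etr(F^T X).
    return float(np.exp(np.trace(F.T @ X)))

X = runif_stiefel(d, p, rng)
F = rng.normal(size=(d, p))        # concentration parameter matrix
```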
{"title":"Least quantile regression via modern optimization","authors":"D. Bertsimas, R. Mazumder","doi":"10.1214/14-AOS1223","DOIUrl":"https://doi.org/10.1214/14-AOS1223","url":null,"abstract":"We address the Least Quantile of Squares (LQS) (and in particular the Least Median of Squares) regression problem using modern optimization methods. We propose a Mixed Integer Optimization (MIO) formulation of the LQS problem which allows us to find a provably global optimal solution for the LQS problem. Our MIO framework has the appealing characteristic that if we terminate the algorithm early, we obtain a solution with a guarantee on its sub-optimality. We also propose continuous optimization methods based on first-order subdifferential methods, sequential linear optimization and hybrid combinations of them to obtain near optimal solutions to the LQS problem. The MIO algorithm is found to benefit significantly from high quality solutions delivered by our continuous optimization based methods. We further show that the MIO approach leads to (a) an optimal solution for any dataset, where the data points $(y_i,\mathbf{x}_i)$ are not necessarily in general position, (b) a simple proof of the breakdown point of the LQS objective value that holds for any dataset and (c) an extension to situations where there are polyhedral constraints on the regression coefficient vector. 
We report computational results with both synthetic and real-world datasets showing that the MIO algorithm with warm starts from the continuous optimization methods solves small ($n=100$) and medium ($n=500$) size problems to provable optimality in under two hours, and outperforms all publicly available methods for large-scale ($n=10{,}000$) LQS problems.","PeriodicalId":8446,"journal":{"name":"arXiv: Computation","volume":"31 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83525446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
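For contrast with the MIO formulation, here is the classical elemental-subset heuristic for Least Median of Squares in simple regression — the kind of near-optimal continuous/combinatorial baseline the paper warm-starts from, not its MIO algorithm; the data are synthetic and chosen so that the clean points lie exactly on a line.

```python
import numpy as np

rng = np.random.default_rng(4)

# 20 clean points on y = 2x + 1, plus 8 gross outliers.
x_clean = np.linspace(0, 10, 20)
y_clean = 2.0 * x_clean + 1.0
x = np.concatenate([x_clean, rng.uniform(0, 10, 8)])
y = np.concatenate([y_clean, rng.uniform(40, 60, 8)])

def lms_elemental(x, y):
    # Fit a line through every pair of points; keep the fit minimising
    # the median of squared residuals (the LMS objective).
    best = (np.inf, 0.0, 0.0)
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            obj = float(np.median((y - a - b * x) ** 2))
            best = min(best, (obj, a, b))
    return best

obj_lms, a_lms, b_lms = lms_elemental(x, y)

# Ordinary least squares for comparison: badly corrupted by the outliers.
ols = np.polyfit(x, y, 1)
med_sq_ols = float(np.median((y - np.polyval(ols, x)) ** 2))
```

The heuristic recovers the clean line despite nearly 30% contamination, illustrating the high breakdown point of the LMS objective; unlike the MIO approach, it carries no optimality guarantee.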
{"title":"On the role of interaction in sequential Monte Carlo algorithms","authors":"N. Whiteley, Anthony Lee, K. Heine","doi":"10.3150/14-BEJ666","DOIUrl":"https://doi.org/10.3150/14-BEJ666","url":null,"abstract":"We introduce a general form of sequential Monte Carlo algorithm defined in terms of a parameterized resampling mechanism. We find that a suitably generalized notion of the Effective Sample Size (ESS), widely used to monitor algorithm degeneracy, appears naturally in a study of its convergence properties. We are then able to phrase sufficient conditions for time-uniform convergence in terms of algorithmic control of the ESS, in turn achievable by adaptively modulating the interaction between particles. This leads us to suggest novel algorithms which are, in senses to be made precise, provably stable and yet designed to avoid the degree of interaction which hinders parallelization of standard algorithms. As a byproduct, we prove time-uniform convergence of the popular adaptive resampling particle filter.","PeriodicalId":8446,"journal":{"name":"arXiv: Computation","volume":"76 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76549637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
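The ESS quantity at the centre of the analysis is easy to state concretely; a minimal sketch (function names hypothetical) of the weight-normalisation / ESS / conditional-resampling cycle used by the adaptive resampling particle filter:

```python
import numpy as np

def ess(w):
    # Effective Sample Size of (unnormalised) weights: 1 / sum(wbar_i^2),
    # ranging from 1 (degenerate) to N (uniform).
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

def maybe_resample(particles, w, rng, threshold=0.5):
    # Adaptive resampling: particles interact (multinomial resampling)
    # only when the ESS drops below a fraction of the particle count.
    N = len(particles)
    wbar = np.asarray(w, dtype=float)
    wbar = wbar / wbar.sum()
    if ess(wbar) < threshold * N:
        idx = rng.choice(N, size=N, p=wbar)
        return particles[idx], np.full(N, 1.0 / N)
    return particles, wbar

rng = np.random.default_rng(5)
N = 1000
particles = rng.normal(size=N)
uniform_w = np.ones(N)
degenerate_w = np.full(N, 1e-12)
degenerate_w[0] = 1.0
p2, w2 = maybe_resample(particles, degenerate_w, rng)
```

Triggering interaction only on low ESS is precisely the modulation the paper analyses: enough interaction for time-uniform stability, as little as possible for parallelism.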
{"title":"Polynomial chaos based uncertainty quantification in Hamiltonian, multi-time scale, and chaotic systems","authors":"J. M. Pasini, T. Sahai","doi":"10.3934/JCD.2014.1.357","DOIUrl":"https://doi.org/10.3934/JCD.2014.1.357","url":null,"abstract":"Polynomial chaos is a powerful technique for propagating uncertainty through ordinary and partial differential equations. Random variables are expanded in terms of orthogonal polynomials and differential equations are derived for the expansion coefficients. Here we study the structure and dynamics of these differential equations when the original system has Hamiltonian structure, has multiple time scales, or displays chaotic dynamics. In particular, we prove that the differential equations for the expansion coefficients in generalized polynomial chaos expansions of Hamiltonian systems retain the Hamiltonian structure relative to the ensemble average Hamiltonian. We connect this with the volume-preserving property of Hamiltonian flows to show that, for an oscillator with uncertain frequency, a finite expansion must fail at long times, regardless of the order of the expansion. Also, using a two-time scale forced nonlinear oscillator, we show that a polynomial chaos expansion of the time-averaged equations captures uncertainty in the slow evolution of the Poincaré section of the system and that, as the time scale separation increases, the computational advantage of this procedure increases. 
Finally, using the forced Duffing oscillator as an example, we demonstrate that when the original dynamical system displays chaotic dynamics, the resulting dynamical system from polynomial chaos also displays chaotic dynamics, limiting its applicability.","PeriodicalId":8446,"journal":{"name":"arXiv: Computation","volume":"31 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74265115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}