Bayesian predictive inference under a Dirichlet process with sensitivity to the normal baseline
Authors: Balgobin Nandram, Jiani Yin
DOI: 10.1016/j.stamet.2015.07.003
Journal: Statistical Methodology, Volume 28 (2016), Pages 1-17
URL: https://www.sciencedirect.com/science/article/pii/S1572312715000520
Citations: 9
Abstract
It is well known that the Dirichlet process (DP) model and the Dirichlet process mixture (DPM) model are sensitive to the specification of the baseline distribution. Given a sample from a finite population, we perform Bayesian predictive inference about a finite population quantity (e.g., the mean) using a DP model. In many applications a normal distribution is used for the baseline. Our main objective is therefore empirical: we show the extent of the sensitivity of inference about the finite population mean with respect to six baseline distributions (normal, lognormal, gamma, inverse Gaussian, a two-component normal mixture and a skewed normal). We compared the DP model under these baselines with the Polya posterior (fully nonparametric) and the Bayesian bootstrap (sampling with a Haldane prior). We used two examples, one on income data and the other on body mass index data, to compare the performance of these three procedures. These examples show some differences among the six baseline distributions, the Polya posterior and the Bayesian bootstrap, indicating that the normal baseline model cannot be used automatically. We therefore conducted a simulation study to assess this issue further, and we show how to resolve it using a leave-one-out kernel baseline. Because the leave-one-out kernel baseline cannot be easily applied to the DPM, we show theoretically how the sensitivity problem can be solved for the DPM as well.
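The kind of baseline sensitivity the abstract describes can be illustrated with a small simulation. The sketch below draws posterior samples of the population mean under a DP(α, G0) prior via truncated stick-breaking from the DP posterior, whose base measure mixes the empirical distribution (weight n/(α+n)) with G0 (weight α/(α+n)); as α → 0 this reduces to the Bayesian bootstrap mentioned in the abstract. This is a minimal sketch, not the authors' implementation: the sample, the concentration α = 10, the truncation level, and the two candidate baselines (a moment-matched normal and a log-moment-matched lognormal) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def dp_mean_draws(y, alpha, g0_sampler, n_draws=1000, trunc=200, rng=rng):
    """Posterior draws of the mean of G, where a priori G ~ DP(alpha, G0),
    via truncated stick-breaking from the DP posterior.
    As alpha -> 0 this approaches the Bayesian bootstrap (Haldane limit)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    out = np.empty(n_draws)
    for d in range(n_draws):
        # stick-breaking weights with posterior concentration alpha + n
        v = rng.beta(1.0, alpha + n, size=trunc)
        w = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
        w /= w.sum()  # renormalise to correct for truncation
        # atoms from the posterior base measure:
        # data point w.p. n/(alpha+n), draw from baseline G0 otherwise
        from_data = rng.random(trunc) < n / (alpha + n)
        atoms = np.where(from_data, rng.choice(y, size=trunc),
                         g0_sampler(trunc, rng))
        out[d] = w @ atoms  # mean of this posterior draw of G
    return out

# skewed "income-like" sample (illustrative, not the paper's data)
y = rng.lognormal(3.0, 0.5, size=50)
normal_g0 = lambda m, r: r.normal(y.mean(), y.std(), size=m)
lognorm_g0 = lambda m, r: r.lognormal(np.log(y).mean(), np.log(y).std(), size=m)

for name, g0 in [("normal", normal_g0), ("lognormal", lognorm_g0)]:
    post = dp_mean_draws(y, alpha=10.0, g0_sampler=g0)
    print(name, float(post.mean()), np.quantile(post, [0.025, 0.975]))
```

Comparing the two posterior summaries for the same sample shows how much the choice of G0 shifts the credible interval for the population mean, which is the sensitivity the paper quantifies across its six baselines.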
About the journal:
Statistical Methodology aims to publish articles of high quality reflecting the varied facets of contemporary statistical theory as well as of significant applications. In addition to helping to stimulate research, the journal intends to bring about interactions among statisticians and scientists in other disciplines broadly interested in statistical methodology. The journal focuses on traditional areas such as statistical inference, multivariate analysis, design of experiments, sampling theory, regression analysis, re-sampling methods, time series, nonparametric statistics, etc., and also gives special emphasis to established as well as emerging applied areas.