Pub Date: 2020-01-01; Epub Date: 2019-08-26; DOI: 10.1080/00031305.2019.1647876
Pierre Baldi, Babak Shahbaba
Although no universally accepted definition of causality exists, in practice one is often faced with the question of statistically assessing causal relationships in different settings. We present a uniform general approach to causality problems derived from the axiomatic foundations of the Bayesian statistical framework. In this approach, causality statements are viewed as hypotheses, or models, about the world and the fundamental object to be computed is the posterior distribution of the causal hypotheses, given the data and the background knowledge. Computation of the posterior, illustrated here in simple examples, may involve complex probabilistic modeling but this is no different than in any other Bayesian modeling situation. The main advantage of the approach is its connection to the axiomatic foundations of the Bayesian framework, and the general uniformity with which it can be applied to a variety of causality settings, ranging from specific to general cases, or from causes of effects to effects of causes.
Title: Bayesian Causality. Journal: American Statistician, vol. 74, no. 3, pp. 249-257.
Pub Date: 2019-03-20; eCollection Date: 2019-01-01; DOI: 10.1080/00031305.2018.1529623
Naomi C Brownstein, Thomas A Louis, Anthony O'Hagan, Jane Pendergast
This article resulted from our participation in the session on the "role of expert opinion and judgment in statistical inference" at the October 2017 ASA Symposium on Statistical Inference. We present a strong, unified statement on roles of expert judgment in statistics with processes for obtaining input, whether from a Bayesian or frequentist perspective. Topics include the role of subjectivity in the cycle of scientific inference and decisions, followed by a clinical trial and a greenhouse gas emissions case study that illustrate the role of judgments and the importance of basing them on objective information and a comprehensive uncertainty assessment. We close with a call for increased proactivity and involvement of statisticians in study conceptualization, design, conduct, analysis, and communication.
Title: The Role of Expert Judgment in Statistical Inference and Evidence-Based Decision-Making. Journal: American Statistician, vol. 73, no. 1, pp. 56-68. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6474725/pdf/
Pub Date: 2019-01-01; DOI: 10.1080/00031305.2018.1518266
Robert J Calin-Jageman, Geoff Cumming
The "New Statistics" emphasizes effect sizes, confidence intervals, meta-analysis, and the use of Open Science practices. We present 3 specific ways in which a New Statistics approach can help improve scientific practice: by reducing over-confidence in small samples, by reducing confirmation bias, and by fostering more cautious judgments of consistency. We illustrate these points through consideration of the literature on oxytocin and human trust, a research area that typifies some of the endemic problems that arise with poor statistical practice.
Title: The New Statistics for Better Science: Ask How Much, How Uncertain, and What Else is Known. Journal: American Statistician, vol. 73, suppl. 1, pp. 271-280.
Pub Date: 2019-01-01; Epub Date: 2019-03-20; DOI: 10.1080/00031305.2018.1518788
Valen E Johnson
This article examines the evidence contained in t statistics that are marginally significant in 5% tests. The bases for evaluating evidence are likelihood ratios and integrated likelihood ratios, computed under a variety of assumptions regarding the alternative hypotheses in null hypothesis significance tests. Likelihood ratios and integrated likelihood ratios provide a useful measure of the evidence in favor of competing hypotheses because they can be interpreted as representing the ratio of the probabilities that each hypothesis assigns to observed data. When they are either very large or very small, they suggest that one hypothesis is much better than the other in predicting observed data. If they are close to 1.0, then both hypotheses provide approximately equally valid explanations for observed data. I find that p-values that are close to 0.05 (i.e., that are "marginally significant") correspond to integrated likelihood ratios that are bounded by approximately 7 in two-sided tests, and by approximately 4 in one-sided tests. The modest magnitude of integrated likelihood ratios corresponding to p-values close to 0.05 clearly suggests that higher standards of evidence are needed to support claims of novel discoveries and new effects.
Title: Evidence from marginally significant t statistics. Journal: American Statistician, pp. 129-134.
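The scale of evidence described in the abstract above can be illustrated with a rough calculation. The sketch below is not the article's method: it assumes a normal z statistic in place of a t statistic and uses the best-case point alternative, for which the likelihood ratio against the null is at most exp(z^2 / 2). The function name `max_likelihood_ratio` is hypothetical.

```python
import math


def max_likelihood_ratio(z: float) -> float:
    """Largest likelihood ratio against H0: mu = 0 for a normal z statistic.

    The ratio N(z; mu, 1) / N(z; 0, 1) is maximized at mu = z,
    where it equals exp(z**2 / 2).
    """
    return math.exp(z * z / 2.0)


# Critical values for p = 0.05 under the standard normal (assumed, not from the article).
two_sided = max_likelihood_ratio(1.959964)  # two-sided test
one_sided = max_likelihood_ratio(1.644854)  # one-sided test
print(f"two-sided: {two_sided:.2f}, one-sided: {one_sided:.2f}")
```

Under these simplifying assumptions the values come out near 6.8 and 3.9, the same order of magnitude as the approximate bounds of 7 and 4 quoted in the abstract. The article's integrated likelihood ratios for t statistics rest on its own assumptions about the alternative hypotheses, which this sketch does not reproduce.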
Pub Date: 2019-01-01; Epub Date: 2018-06-04; DOI: 10.1080/00031305.2017.1415972
Yueh-Yun Chi, Deborah H Glueck, Keith E Muller
Despite the popularity of the general linear mixed model for data analysis, power and sample size methods and software are not generally available for commonly used test statistics and reference distributions. Statisticians resort to simulations with homegrown and uncertified programs or rough approximations which are misaligned with the data analysis. For a wide range of designs with longitudinal and clustering features, we provide accurate power and sample size approximations for inference about fixed effects in linear models we call reversible. We show that under widely applicable conditions, the general linear mixed-model Wald test has non-central distributions equivalent to well-studied multivariate tests. In turn, exact and approximate power and sample size results for the multivariate Hotelling-Lawley test provide exact and approximate power and sample size results for the mixed-model Wald test. The calculations are easily computed with a free, open-source product that requires only a web browser to use. Commercial software can be used for a smaller range of reversible models. Simple approximations allow accounting for modest amounts of missing data. A real-world example illustrates the methods. Sample size results are presented for a multicenter study on pregnancy. The proposed study, an extension of a funded project, has clustering within clinic. Exchangeability among participants allows averaging across them to remove the clustering structure. The resulting simplified design is a single level longitudinal study. Multivariate methods for power provide an approximate sample size. All proofs and inputs for the example are in the Supplementary Materials (available online).
Title: Power and Sample Size for Fixed-Effects Inference in Reversible Linear Mixed Models. Journal: American Statistician, vol. 73, no. 4, pp. 350-359.
Pub Date: 2019-01-01; Epub Date: 2019-05-30; DOI: 10.1080/00031305.2019.1607554
Chuan-Fa Tang, Dewei Wang, Hammou El Barmi, Joshua M Tebbs
We develop an empirical likelihood approach to test independence of two univariate random variables X and Y versus the alternative that X and Y are strictly positive quadrant dependent (PQD). Establishing this type of ordering between X and Y is of interest in many applications, including finance, insurance, engineering, and other areas. Adopting the framework in Einmahl and McKeague (2003, Bernoulli), we create a distribution-free test statistic that integrates a localized empirical likelihood ratio test statistic with respect to the empirical joint distribution of X and Y. When compared to well known existing tests and distance-based tests we develop by using copula functions, simulation results show the EL testing procedure performs well in a variety of scenarios when X and Y are strictly PQD. We use three data sets for illustration and provide an online R resource practitioners can use to implement the methods in this article.
Title: Testing for positive quadrant dependence. Journal: American Statistician, vol. 2019.
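For readers unfamiliar with the ordering tested in the abstract above: X and Y are positive quadrant dependent when P(X <= x, Y <= y) >= P(X <= x) P(Y <= y) for all x and y. The sketch below is only a descriptive check of that inequality on empirical distribution functions, evaluated at the observed pairs. It is not the empirical likelihood test developed in the article, and the function name `pqd_gaps` is hypothetical.

```python
from typing import List, Sequence


def pqd_gaps(xs: Sequence[float], ys: Sequence[float]) -> List[float]:
    """Return H_n(x, y) - F_n(x) * G_n(y) at each observed pair (x, y).

    H_n is the empirical joint CDF, and F_n and G_n are the empirical
    marginal CDFs. Non-negative gaps everywhere are consistent with PQD.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    gaps = []
    for x0, y0 in zip(xs, ys):
        f = sum(x <= x0 for x in xs) / n
        g = sum(y <= y0 for y in ys) / n
        h = sum(x <= x0 and y <= y0 for x, y in zip(xs, ys)) / n
        gaps.append(h - f * g)
    return gaps


increasing = pqd_gaps([1, 2, 3, 4], [1, 2, 3, 4])  # perfectly concordant pairs
decreasing = pqd_gaps([1, 2, 3, 4], [4, 3, 2, 1])  # perfectly discordant pairs
print(min(increasing), min(decreasing))
```

A negative gap signals a point where the joint distribution falls below the product of the marginals. Sampling noise alone can produce small negative gaps, which is why a formal test such as the one in the article is needed before drawing conclusions.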
Pub Date: 2019-01-01; Epub Date: 2019-03-20; DOI: 10.1080/00031305.2018.1518269
Sherri Rose, Thomas G McGuire
Stepwise regression building procedures are commonly used applied statistical tools, despite their well-known drawbacks. While many of their limitations have been widely discussed in the literature, other aspects of the use of individual statistical fit measures, especially in high-dimensional stepwise regression settings, have not. Giving primacy to individual fit, as is done with p-values and R², when group fit may be the larger concern, can lead to misguided decision making. One of the most consequential uses of stepwise regression is in health care, where these tools allocate hundreds of billions of dollars to health plans enrolling individuals with different predicted health care costs. The main goal of this "risk adjustment" system is to convey incentives to health plans such that they provide health care services fairly, a component of which is not to discriminate in access or care for persons or groups likely to be expensive. We address some specific limitations of p-values and R² for high-dimensional stepwise regression in this policy problem through an illustrated example by additionally considering a group-level fairness metric.
Title: Limitations of P-Values and R-squared for Stepwise Regression Building: A Fairness Demonstration in Health Policy Risk Adjustment. Journal: American Statistician, pp. 152-156.
Pub Date: 2019-01-01; Epub Date: 2018-05-10; DOI: 10.1080/00031305.2017.1322142
Mithat Gönen, Wesley O Johnson, Yonggang Lu, Peter H Westfall
Many Bayes factors have been proposed for comparing population means in two-sample (independent samples) studies. Recently, Wang and Liu (2015) presented an "objective" Bayes factor (BF) as an alternative to a "subjective" one presented by Gönen et al. (2005). Their report was evidently intended to show the superiority of their BF based on "undesirable behavior" of the latter. A wonderful aspect of Bayesian models is that they provide an opportunity to "lay all cards on the table." What distinguishes the various BFs in the two-sample problem is the choice of priors (cards) for the model parameters. This article discusses desiderata of BFs that have been proposed, and proposes a new criterion to compare BFs, no matter whether subjectively or objectively determined: A BF may be preferred if it correctly classifies the data as coming from the correct model most often. The criterion is based on a famous result in classification theory to minimize the total probability of misclassification. This criterion is objective, easily verified by simulation, shows clearly the effects (positive or negative) of assuming particular priors, provides new insights into the appropriateness of BFs in general, and provides a new answer to the question, "Which BF is best?"
Title: Comparing Objective and Subjective Bayes Factors for the Two-Sample Comparison: The Classification Theorem in Action. Journal: American Statistician, vol. 73, no. 1, pp. 22-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6424525/pdf/nihms-1502428.pdf
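The classification criterion described in the abstract above can be sketched by simulation. The code below is a simplified illustration, not the authors' study. It assumes two groups of equal size `n` with known unit variance, a standardized mean difference drawn from a normal prior under the alternative, equal prior odds on the two models, and the rule "choose the alternative when the Bayes factor exceeds 1". All names and default values are hypothetical.

```python
import math
import random


def bf10(z: float, k: float) -> float:
    """Bayes factor for H1 over H0 given a z statistic.

    Under H0, z ~ N(0, 1). Under H1, z ~ N(0, 1 + k), where k is the
    prior variance of the effect on the z scale.
    """
    return math.exp(0.5 * z * z * k / (1.0 + k)) / math.sqrt(1.0 + k)


def misclassification_rate(prior_sd: float, true_sd: float = 0.5,
                           n: int = 20, reps: int = 20000,
                           seed: int = 1) -> float:
    """Share of simulated data sets assigned to the wrong model."""
    rng = random.Random(seed)
    scale = math.sqrt(n / 2.0)        # converts an effect size to the z scale
    k = (prior_sd * scale) ** 2       # prior variance assumed by the analyst
    errors = 0
    for _ in range(reps):
        from_h1 = rng.random() < 0.5  # equal prior odds on the two models
        delta = rng.gauss(0.0, true_sd) if from_h1 else 0.0
        z = rng.gauss(delta * scale, 1.0)
        chose_h1 = bf10(z, k) > 1.0
        errors += (chose_h1 != from_h1)
    return errors / reps


matched = misclassification_rate(prior_sd=0.5)     # prior equals the true effect spread
mismatched = misclassification_rate(prior_sd=5.0)  # prior far too diffuse
print(f"matched: {matched:.3f}, mismatched: {mismatched:.3f}")
```

Because both calls share a seed, they classify the same simulated data sets, so the difference in error rates reflects the prior alone. In this setup the prior that matches the data-generating effect distribution should misclassify less often, which is the kind of comparison the criterion is meant to make visible.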
Pub Date: 2019-01-01; Epub Date: 2018-07-09; DOI: 10.1080/00031305.2018.1424033
Hongmei Zhang, Yubo Zou, Will Terry, Wilfried Karmaus, Hasan Arshad
Traditional clustering methods focus on grouping subjects or (dependent) variables assuming independence between the variables. Clusters formed through these approaches can potentially lack homogeneity. This article proposes a joint clustering method by which both variables and subjects are clustered. In each joint cluster (in general composed of a subset of variables and a subset of subjects), there exists a unique association between dependent variables and covariates of interest. To this end, a Bayesian method is designed, in which a semi-parametric model is used to evaluate any unknown relationships between possibly correlated variables and covariates of interest, and a Dirichlet process is utilized to cluster subjects. Compared to existing clustering techniques, the major novelty of the method exists in its ability to improve the homogeneity of clusters, along with the ability to take the correlations between variables into account. Via simulations, we examine the performance and efficiency of the proposed method. Applying the method to cluster allergens and subjects based on the association of wheal size in reaction to allergens with age, we found that a certain pattern of allergic sensitization to a set of allergens has a potential to reduce the occurrence of asthma.
Title: Joint clustering with correlated variables. Journal: American Statistician, vol. 73, no. 3, pp. 296-306.
Pub Date: 2019-01-01; Epub Date: 2019-08-05; DOI: 10.1080/00031305.2018.1537894
Thaddeus Tarpey, Eva Petkova
Hutson and Vexler (2018) demonstrate an example of aliasing with the beta and normal distributions. This letter presents another illustration of aliasing between these two distributions, constructed via an infinite mixture model and inspired by the problem of modeling placebo response.
Title: Letter to the Editor. Journal: American Statistician, vol. 73, no. 3, p. 312.