Journal of Behavioral Data Science: Latest Articles

On some known derivations and new ones for the Wishart distribution: A didactic
Pub Date : 2023-06-21 DOI: 10.35566/jbds/v3n1/ogasawara
H. Ogasawara
The proofs of the probability density function (pdf) of the Wishart distribution tend to be complicated with geometric viewpoints, tedious Jacobians and not self-contained algebra. In this paper, some known proofs and simple new ones for uncorrelated and correlated cases are provided with didactic explanations. For the new derivation of the uncorrelated case, an elementary direct derivation of the distribution of the Bartlett-decomposed matrix is provided. In the derivation of the correlated case from the uncorrelated one, simple methods including a new one are shown.
Citations: 0
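The Bartlett decomposition mentioned in the Wishart abstract above can be illustrated with a short simulation. The following is a minimal sketch, not the paper's derivation: it assumes the standard statement that a Wishart matrix with identity scale and n degrees of freedom equals T T', where T is lower triangular with independent standard normal entries below the diagonal and squared diagonal entries that are chi-square with n, n-1, ..., n-p+1 degrees of freedom. The correlated case is then obtained by pre-multiplying with a Cholesky factor of the scale matrix. All function names are hypothetical, and n is assumed to be at least p.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    p = len(a)
    low = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def bartlett_factor(p, n, rng):
    """Lower-triangular T such that T T' follows Wishart_p(n, I)."""
    t = [[0.0] * p for _ in range(p)]
    for i in range(p):
        # Squared diagonal: chi-square with (n - i) df for 0-based row i,
        # drawn as Gamma(shape=(n - i) / 2, scale=2).
        t[i][i] = math.sqrt(rng.gammavariate((n - i) / 2.0, 2.0))
        for j in range(i):
            t[i][j] = rng.gauss(0.0, 1.0)  # below-diagonal standard normals
    return t


def sample_wishart(sigma, n, rng):
    """One draw from Wishart_p(n, sigma) as (L T)(L T)' with sigma = L L'."""
    low = cholesky(sigma)
    a = matmul(low, bartlett_factor(len(sigma), n, rng))
    return matmul(a, transpose(a))


if __name__ == "__main__":
    rng = random.Random(1)
    print(sample_wishart([[1.0, 0.5], [0.5, 2.0]], 5, rng))
```

Averaging many draws should approach n times the scale matrix, which is a convenient sanity check on the construction.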
Bayesian IRT in JAGS: A Tutorial
Pub Date : 2023-03-27 DOI: 10.35566/jbds/v3n1/mccure
Kenneth McClure
Item response modeling is common throughout psychology and education in assessments of intelligence, psychopathology, and ability. The current paper provides a tutorial on estimating the two-parameter logistic and graded response models in a Bayesian framework, as well as an introduction to evaluating convergence and model fit in this framework. Example data are drawn from depression items in the 2017 wave of the National Longitudinal Survey of Youth, and example code is provided for JAGS and implemented through R using the runjags package. The aim of this paper is to provide readers with the necessary information to conduct Bayesian IRT in JAGS.
Citations: 0
Conducting Meta-analyses of Proportions in R
Pub Date : 2023-01-01 DOI: 10.35566/jbds/v3n2/wang
Naike Wang
Meta-analysis of proportions has been widely adopted across various scientific disciplines as a means to estimate the prevalence of phenomena of interest. However, there is a lack of comprehensive tutorials demonstrating the proper execution of such analyses using the R programming language. The objective of this study is to bridge this gap and provide an extensive guide to conducting a meta-analysis of proportions using R. Furthermore, we offer a thorough critical review of the methods and tests involved in conducting a meta-analysis of proportions, highlighting several common practices that may yield biased estimations and misleading inferences. We illustrate the meta-analytic process in five stages: (1) preparation of the R environment; (2) computation of effect sizes; (3) quantification of heterogeneity; (4) visualization of heterogeneity with the forest plot and the Baujat plot; and (5) explanation of heterogeneity with moderator analyses. In the last section of the tutorial, we address the misconception of assessing publication bias in the context of meta-analysis of proportions. The provided code offers readers three options to transform proportional data (e.g., the double arcsine method). The tutorial presentation is conceptually oriented and formula usage is minimal. We will use a published meta-analysis of proportions as an example to illustrate the implementation of the R code and the interpretation of the results.
Citations: 6
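The effect-size stage described in the meta-analysis abstract above relies on transforming proportions before pooling. The sketch below shows two widely used transformations, the logit and the Freeman-Tukey double arcsine, with their usual sampling variances. It is written in Python for illustration only; the paper itself provides R code, and the abstract names only the double arcsine method, so the choice of the logit as the second example is an assumption. The double arcsine here uses the convention that includes the factor 1/2, with variance 1/(4n + 2). Function names are hypothetical.

```python
import math


def logit_transform(x, n):
    """Logit of the proportion x/n and its approximate sampling variance.

    Undefined when x == 0 or x == n, which is one reason other
    transformations are sometimes preferred.
    """
    p = x / n
    y = math.log(p / (1 - p))
    v = 1 / x + 1 / (n - x)
    return y, v


def double_arcsine_transform(x, n):
    """Freeman-Tukey double arcsine (half-sum convention) and its variance."""
    y = 0.5 * (math.asin(math.sqrt(x / (n + 1)))
               + math.asin(math.sqrt((x + 1) / (n + 1))))
    v = 1 / (4 * n + 2)
    return y, v


if __name__ == "__main__":
    # Hypothetical study: 5 events among 10 participants.
    print(logit_transform(5, 10))
    print(double_arcsine_transform(5, 10))
    # The double arcsine stays defined with zero events.
    print(double_arcsine_transform(0, 10))
```

Unlike the logit, the double arcsine remains defined for studies with zero or all events, at the cost of a less direct back-transformation.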
Robust Bayesian growth curve modeling: A tutorial using JAGS
Pub Date : 2023-01-01 DOI: 10.35566/jbds/v3n2/li
Ruoxuan Li
Latent growth curve models (LGCM) are widely used in longitudinal data analysis, and robust methods can be used to model error distributions for non-normal data. This tutorial introduces how to model linear, non-linear, and quadratic growth curve models under the Bayesian framework and uses examples to illustrate how to model errors using t, exponential power, and skew-normal distributions. The JAGS model code is provided and run through the R package runjags. Model diagnostics and comparisons are briefly discussed.
Citations: 0
Relative Predictive Performance of Treatments of Ordinal Outcome Variables across Machine Learning Algorithms and Class Distributions
Pub Date : 2022-12-16 DOI: 10.35566/jbds/v2n2/suzuki
Honoka Suzuki, Oscar Gonzalez
Ordinal variables, such as those measured on a five-point Likert scale, are ubiquitous in the behavioral sciences. However, machine learning methods for modeling ordinal outcome variables (i.e., ordinal classification) are not as well developed or widely used as the classification and regression methods for modeling nominal and continuous outcomes, respectively. Consequently, ordinal outcomes are often treated “naïvely” as nominal or continuous outcomes in practice. This study builds upon previous literature that has examined the predictive performance of such naïve approaches to treating ordinal outcome variables compared to ordinal classification methods in machine learning. We conducted a Monte Carlo simulation study to systematically assess the relative predictive performance of an ordinal classification approach proposed by Frank and Hall (2001) against naïve approaches according to two key factors that have received limited attention in previous literature: (1) the machine learning algorithm being used to implement the approaches and (2) the class distribution of the ordinal outcome variable. The consideration of these important, practical factors expands our knowledge of the consequences of naïve treatments of ordinal outcomes, which are shown in this study to vary substantially according to these factors. Given the ubiquity of ordinal measures coupled with the growing presence of machine learning applications in the behavioral sciences, these are important considerations for building high-performing predictive models in the field.
Citations: 0
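The Frank and Hall (2001) approach named in the abstract above reduces a K-class ordinal problem to K-1 binary problems, each estimating the probability that the outcome exceeds a given category, and then recovers class probabilities by differencing. The sketch below covers only that combination step, assuming the K-1 binary probabilities have already been produced by some classifier. The function names and the example numbers are hypothetical and are not taken from the study.

```python
def ordinal_class_probs(p_greater):
    """Combine P(y > c_1), ..., P(y > c_{K-1}) into K class probabilities.

    p_greater[i] is the estimated probability that the outcome exceeds the
    (i + 1)-th ordered category. Because the binary models are fitted
    separately, a difference can come out slightly negative in practice;
    this sketch leaves such values as they are.
    """
    k = len(p_greater) + 1
    probs = []
    for c in range(k):
        if c == 0:
            probs.append(1.0 - p_greater[0])      # lowest category
        elif c == k - 1:
            probs.append(p_greater[-1])           # highest category
        else:
            probs.append(p_greater[c - 1] - p_greater[c])
    return probs


def predict_class(p_greater):
    """0-based index of the category with the largest combined probability."""
    probs = ordinal_class_probs(p_greater)
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    # Hypothetical binary outputs for a four-category outcome.
    binary_outputs = [0.9, 0.6, 0.2]
    print(ordinal_class_probs(binary_outputs))
    print(predict_class(binary_outputs))
```

Because only the combination step is ordinal-specific, any binary learner can be plugged in, which is what makes a comparison across machine learning algorithms possible.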
A Tutorial on Bayesian Analysis of Count Data Using JAGS
Pub Date : 2022-12-14 DOI: 10.35566/jbds/v2n2/shao
Sijing Shao
In behavioral studies, the frequency of a particular behavior or event is often collected, and the resulting data are referred to as count data. This tutorial introduces readers to Poisson regression models, which are a more appropriate approach for such data. Count data with excessive zeros also often occur in behavioral studies, and models such as zero-inflated or hurdle models can be employed to handle the zero inflation. In this tutorial, we aim to cover the necessary fundamentals of these methods and equip readers with application tools in JAGS. Examples of implementing the models in JAGS from within R are provided for demonstration purposes.
Citations: 0
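The zero-inflated model mentioned in the count-data abstract above mixes a point mass at zero with an ordinary Poisson distribution. The following is a minimal sketch of the zero-inflated Poisson probability mass function in Python, not the tutorial's JAGS code. The parameter names `lam` (Poisson rate) and `pi_zero` (probability of a structural zero) are assumptions made for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing count k with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi_zero):
    """Zero-inflated Poisson probability of observing count k.

    With probability pi_zero the observation is a structural zero;
    otherwise it is drawn from Poisson(lam).
    """
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base


if __name__ == "__main__":
    lam, pi_zero = 2.0, 0.3
    # Zeros are more likely than under a plain Poisson with the same rate.
    print(poisson_pmf(0, lam), zip_pmf(0, lam, pi_zero))
    # The mean shrinks to (1 - pi_zero) * lam.
    print(sum(k * zip_pmf(k, lam, pi_zero) for k in range(60)))
```

A hurdle model differs in that all zeros come from the binary part and the count part is truncated at zero; it is not shown here.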
Handling Ignorable and Non-ignorable Missing Data through Bayesian Methods in JAGS
Pub Date : 2022-12-13 DOI: 10.35566/jbds/v2n2/xu
Ziqian Xu
With the prevalence of missing data in social science research, it is necessary to use methods for handling missing data. One framework in which data with missing values can still be used for parameter estimation is the Bayesian framework. In this tutorial, different missing data mechanisms including Missing Completely at Random, Missing at Random, and Missing Not at Random are introduced. Methods for estimating models with missing values under the Bayesian framework for both ignorable and non-ignorable missingness are also discussed. A structural equation model on data from the Advanced Cognitive Training for Independent and Vital Elderly study is used as an illustration on how to fit missing data models in JAGS.
Citations: 2
A Tutorial on Bayesian Latent Class Analysis Using JAGS
Pub Date : 2022-12-04 DOI: 10.35566/jbds/v2n2/qiu
Meng Qiu
This tutorial introduces readers to latent class analysis (LCA) as a model-based approach to understand the unobserved heterogeneity in a population. Given the growing popularity of LCA, we aim to equip readers with theoretical fundamentals as well as computational tools. We outline some potential pitfalls of LCA and suggest related solutions. Moreover, we demonstrate how to conduct frequentist and Bayesian LCA in R with real and simulated data. To ease learning, the analysis is broken down into a series of simple steps. Beyond the simple LCA, two extensions including mixed-model LCA and growth curve LCA are provided to aid readers’ transition to more advanced models. The complete R code and data set are provided.  
Citations: 2
The Performances of Gelman-Rubin and Geweke's Convergence Diagnostics of Monte Carlo Markov Chains in Bayesian Analysis
Pub Date : 2022-11-14 DOI: 10.35566/jbds/v2n2/p3
H. Du, Zijun Ke, Ge Jiang, Sijia Huang
Bayesian statistics have been widely used given the development of Markov chain Monte Carlo sampling techniques and the growth of computational power. A major challenge of Bayesian methods that has not yet been fully addressed is how to appropriately evaluate the convergence of the random samples to the target posterior distributions. In this paper, we focus on Gelman and Rubin's diagnostic (PSRF), Brooks and Gelman's diagnostic (MPSRF), and Geweke's diagnostics, and compare the Type I error rate and Type II error rate of seven convergence criteria: MPSRF > 1.1, any upper bound of PSRF is larger than 1.1, more than 5% of the upper bounds of PSRFs are larger than 1.1, any PSRF is larger than 1.1, more than 5% of PSRFs are larger than 1.1, any Geweke test statistic is larger than 1.96 or smaller than -1.96, and more than 5% of Geweke test statistics are larger than 1.96 or smaller than -1.96. Based on the simulation results, we recommend the upper bound of PSRF if only one diagnostic can be chosen. When the number of estimated parameters is large and the choice is between the per-parameter diagnostic (i.e., PSRF) and the multivariate diagnostic (i.e., MPSRF), we recommend the upper bound of PSRF over MPSRF. Additionally, we do not suggest claiming convergence at the analysis level while allowing a small proportion of the parameters to have significant convergence diagnosis results.
Citations: 2
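The PSRF criteria compared in the abstract above can be made concrete with a small computation. The sketch below implements a simplified point estimate of the potential scale reduction factor from the within-chain and between-chain variances, together with the "any PSRF is larger than 1.1" rule. It is an assumption-laden illustration: it omits the degrees-of-freedom correction and the upper confidence bound used by standard software, so it does not reproduce the upper-bound criterion the authors recommend. Function names are hypothetical.

```python
import math


def psrf(chains):
    """Simplified potential scale reduction factor for one parameter.

    chains: list of m equal-length lists of posterior draws (m >= 2).
    """
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Between-chain variance B and mean within-chain variance W.
    b = n * sum((mu - grand) ** 2 for mu in means) / (m - 1)
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    var_hat = (n - 1) / n * w + b / n  # pooled posterior variance estimate
    return math.sqrt(var_hat / w)


def any_psrf_exceeds(chains_by_param, threshold=1.1):
    """Flag non-convergence if any parameter's PSRF exceeds the threshold."""
    return any(psrf(ch) > threshold for ch in chains_by_param.values())


if __name__ == "__main__":
    # Toy draws: 'a' has overlapping chains, 'b' has separated chains.
    draws = {
        "a": [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]],
        "b": [[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]],
    }
    for name, chains in draws.items():
        print(name, psrf(chains))
    print(any_psrf_exceeds(draws))
```

A rule that tolerates a fraction of parameters above the threshold would replace `any` with a proportion check, which is the kind of analysis-level allowance the authors advise against.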
A New Bayesian Structural Equation Modeling Approach with Priors on the Covariance Matrix Parameter
Pub Date : 2022-08-07 DOI: 10.35566/jbds/v2n2/p2
Haiyan Liu, Wen Qu, Zhiyong Zhang, Hao Wu
Bayesian inference for structural equation models (SEMs) is increasingly popular in social and psychological sciences owing to its flexibility to adapt to more complex models and the ability to include prior information if available. However, there are two major hurdles in using the traditional Bayesian SEM in practice: (1) the information nested in the prior distributions is hard to control, and (2) the MCMC iterative procedures naturally lead to Markov chains with serial dependence, and the diagnostics of their convergence are often difficult. In this study, we present an alternative procedure for Bayesian SEM aiming to address the two challenges. In the new Bayesian SEM procedure, we specify a prior distribution on the population covariance matrix parameter $\mathbf{\Sigma}$ and obtain its posterior distribution $p(\mathbf{\Sigma} \mid \text{data})$. We then construct a posterior distribution of the model parameters $\boldsymbol{\theta}$ in the hypothetical SEM model by transforming the posterior distribution of $\mathbf{\Sigma}$ to a distribution of the model parameters $\boldsymbol{\theta}$. The new procedure eases the practice of Bayesian SEM significantly and has better control over the information nested in the prior distribution. We evaluate its performance through a simulation study and demonstrate its application through an empirical example.
Citations: 1