British Journal of Mathematical & Statistical Psychology: Latest Articles (Page 7)

A Gibbs-INLA algorithm for multidimensional graded response model analysis

IF 2.6 | Region 3, Psychology | Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

British Journal of Mathematical & Statistical Psychology

Pub Date : 2023-09-29 DOI: 10.1111/bmsp.12321

Xiaofan Lin, Siliang Zhang, Yincai Tang, Xuan Li

In this paper, we propose a novel Gibbs-INLA algorithm for Bayesian inference in graded response models with ordinal responses, based on multidimensional item response theory. By combining Gibbs sampling with the integrated nested Laplace approximation (INLA), the new framework avoids the cumbersome tuning that is inevitable in classical Markov chain Monte Carlo (MCMC) algorithms, requires little memory, and is computationally efficient, needing far fewer iterations while still achieving higher estimation accuracy. It can therefore handle large amounts of multidimensional response data with different item responses. Simulation studies compare the new algorithm with the Metropolis-Hastings Robbins-Monro (MH-RM) algorithm, and an application to IPIP-NEO personality inventory data is given to assess its performance. Extensions of the proposed algorithm to more complicated models and different data types are also discussed.
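As background on the model class named in this abstract, the following sketch computes the category probabilities of a single ordinal item under a multidimensional graded response model with a logistic link. It is not the Gibbs-INLA algorithm. The function name `grm_category_probs`, the slope-threshold parameterization `a @ theta - b`, and all numeric values are illustrative assumptions that do not come from the paper.

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Category probabilities for one item under a multidimensional GRM.

    theta : (D,) latent trait vector of one respondent
    a     : (D,) item slope (discrimination) vector
    b     : (K-1,) strictly increasing thresholds for K ordered categories
    """
    theta, a, b = (np.asarray(v, dtype=float) for v in (theta, a, b))
    # Cumulative probabilities P(Y >= k) for k = 1, ..., K-1 (logistic link).
    cum = 1.0 / (1.0 + np.exp(-(a @ theta - b)))
    # Pad with P(Y >= 0) = 1 and P(Y >= K) = 0, then take differences
    # to obtain P(Y = k) for k = 0, ..., K-1.
    padded = np.concatenate(([1.0], cum, [0.0]))
    return padded[:-1] - padded[1:]

# Hypothetical two-dimensional respondent and a four-category item.
probs = grm_category_probs(theta=[0.5, -1.0], a=[1.2, 0.8], b=[-1.0, 0.0, 1.5])
print(probs, probs.sum())
```

Because the thresholds increase, the cumulative probabilities decrease, so every category probability is positive and the vector sums to one.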

Citations: 0

A Bayesian nonparametric approach for handling item and examinee heterogeneity in assessment data

Pub Date : 2023-09-20 DOI: 10.1111/bmsp.12322

Tianyu Pan, Weining Shen, Clintin P. Davis-Stober, Guanyu Hu

We propose a novel nonparametric Bayesian item response theory model that estimates clusters at the question level, while simultaneously allowing for heterogeneity at the examinee level under each question cluster, characterized by a mixture of binomial distributions. The main contribution of this work is threefold. First, we present our new model and demonstrate that it is identifiable under a set of conditions. Second, we show that our model can correctly identify question-level clusters asymptotically, and the parameters of interest that measure the proficiency of examinees in solving certain questions can be estimated at a $\sqrt{n}$ rate (up to a log term). Third, we present a tractable sampling algorithm to obtain valid posterior samples from our proposed model. Compared to the existing methods, our model manages to reveal the multi-dimensionality of the examinees' proficiency level in handling different types of questions parsimoniously by imposing a nested clustering structure. We evaluate the proposed model via a series of simulations and apply it to an English proficiency assessment data set. This data analysis example illustrates how our model can be used by test makers to distinguish different types of students and aid in the design of future tests.

British Journal of Mathematical & Statistical Psychology, vol. 77, no. 1, pp. 196-211.

Citations: 0

Replies to comments on "Which method delivers greater signal-to-noise ratio: Structural equation modelling or regression analysis with weighted composites?" by Yuan and Fang (2023)

Pub Date : 2023-09-15 DOI: 10.1111/bmsp.12323

Ke-Hai Yuan, Yongfei Fang

Citations: 0

Exploring examinees' responses to constructed response items with a supervised topic model

Pub Date : 2023-09-13 DOI: 10.1111/bmsp.12319

Seohyun Kim, Zhenqiu Lu, Allan S. Cohen

Textual data are increasingly common in test data as many assessments include constructed response (CR) items as indicators of participants' understanding. The development of techniques based on natural language processing has made it possible for researchers to rapidly analyse large sets of textual data. One family of statistical techniques for this purpose is probabilistic topic models. Topic modelling is a technique for detecting the latent topic structure in a collection of documents and has been widely used to analyse texts in a variety of areas. The detected topics can reveal primary themes in the documents, and the relative use of topics can be useful in investigating the variability of the documents. Supervised latent Dirichlet allocation (SLDA) is a popular topic model in that family that jointly models textual data and paired responses, such as participants' textual answers to CR items and their rubric-based scores. SLDA assumes a homogeneous relationship between textual data and paired responses across all documents. This assumption, while useful for some purposes, may not hold in situations in which a population has subgroups with different relationships. In this study, we introduce a new supervised topic model that incorporates finite-mixture modelling into SLDA. This new model can detect latent groups of participants that have different relationships between their textual responses and associated scores. The model is illustrated with an example from an analysis of a set of textual responses and paired scores from a middle grades assessment of science inquiry knowledge. A simulation study is presented to investigate the performance of the proposed model under practical testing conditions.

British Journal of Mathematical & Statistical Psychology, vol. 77, no. 1, pp. 130-150. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/bmsp.12319

Citations: 0

Estimation of nonlinear mixed-effects continuous-time models using the continuous-discrete extended Kalman filter

Pub Date : 2023-09-06 DOI: 10.1111/bmsp.12318

Lu Ou, Michael D. Hunter, Zhaohua Lu, Cynthia A. Stifter, Sy-Miin Chow

Many intensive longitudinal measurements are collected at irregularly spaced time intervals, and involve complex, possibly nonlinear and heterogeneous patterns of change. Effective modelling of such change processes requires continuous-time differential equation models that may be nonlinear and include mixed effects in the parameters. One approach to fitting such models is to define random effect variables as additional latent variables in a stochastic differential equation (SDE) model of choice, and to use estimation algorithms designed for fitting SDE models, such as the continuous-discrete extended Kalman filter (CDEKF) approach implemented in the dynr R package, to estimate the random effect variables as latent variables. However, this approach's efficacy and identification constraints in handling mixed-effects SDE models have not been investigated. In the current study, we analytically inspect the identification constraints of using the CDEKF approach to fit nonlinear mixed-effects SDE models; extend a published model of emotions to a nonlinear mixed-effects SDE model as an example, and fit it to a set of irregularly spaced ecological momentary assessment data; and evaluate the feasibility of the proposed approach to fit the model through a Monte Carlo simulation study. Results show that the proposed approach produces reasonable parameter and standard error estimates when some identification constraint is met. We address the effects of sample size, process noise variance, and data spacing conditions on estimation results.
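The idea of treating a random effect as an extra latent state can be sketched in a few lines. The example below is a generic continuous-discrete extended Kalman filter step, not the dynr implementation and not the emotion model from the paper. The damped-pendulum drift, the person-specific damping `b` carried as a third state with zero drift, the Euler integration with `n_sub` sub-steps, and all numbers (including the observations) are made-up assumptions for illustration.

```python
import numpy as np

def cdekf_step(s, P, y, dt, drift, jac, Q, H, R, n_sub=50):
    """Predict across a gap of length dt, then update with observation y."""
    h = dt / n_sub
    # Prediction: Euler integration of the mean and covariance ODEs,
    # ds/dt = drift(s) and dP/dt = F P + P F' + Q, with F the drift Jacobian.
    for _ in range(n_sub):
        F = jac(s)
        s = s + h * drift(s)
        P = P + h * (F @ P + P @ F.T + Q)
    # Update: standard Kalman correction for a linear measurement y = H s + e.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    s = s + K @ (y - H @ s)
    P = (np.eye(len(s)) - K @ H) @ P
    return s, P

def drift(s):
    x, v, b = s
    # b is a person-specific damping parameter treated as a latent state
    # that does not change over time (zero drift).
    return np.array([v, -np.sin(x) - b * v, 0.0])

def jac(s):
    x, v, b = s
    return np.array([[0.0, 1.0, 0.0],
                     [-np.cos(x), -b, -v],
                     [0.0, 0.0, 0.0]])

Q = np.diag([0.0, 0.05, 0.0])      # process noise enters the velocity only
H = np.array([[1.0, 0.0, 0.0]])    # only the position x is observed
R = np.array([[0.1]])              # measurement noise variance

times = np.array([0.0, 0.4, 1.7, 2.1, 3.6])   # irregular spacing (made up)
obs = np.array([0.9, 0.7, -0.2, -0.4, 0.1])   # made-up observations

s = np.array([obs[0], 0.0, 0.5])   # initial guess for (x, v, b)
P = np.eye(3)
for dt, y in zip(np.diff(times), obs[1:]):
    s, P = cdekf_step(s, P, np.array([y]), dt, drift, jac, Q, H, R)
print(s, np.diag(P))
```

Because the prediction integrates over the actual gap `dt`, unequal spacing needs no special handling; the posterior variance of `b` in `P[2, 2]` indicates how much the data have informed the person-specific parameter.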

British Journal of Mathematical & Statistical Psychology, vol. 76, no. 3, pp. 462-490.

Citations: 0

Using item scores and response times in person-fit assessment

Pub Date : 2023-09-05 DOI: 10.1111/bmsp.12320

Kylie Gorney, Sandip Sinharay, Xiang Liu

The use of joint models for item scores and response times is becoming increasingly popular in educational and psychological testing. In this paper, we propose two new person-fit statistics for such models in order to detect aberrant behaviour. The first statistic is computed by combining two existing person-fit statistics: one for the item scores, and one for the item response times. The second statistic is computed directly using the likelihood function of the joint model. Using detailed simulations, we show that the empirical null distributions of the new statistics are very close to the theoretical null distributions, and that the new statistics tend to be more powerful than several existing statistics for item scores and/or response times. A real data example is also provided using data from a licensure examination.
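The abstract does not name the existing statistics it combines. As background only, the sketch below computes one widely used score-based person-fit statistic, the standardized log-likelihood statistic usually written l_z, for dichotomous item scores. The function name `lz_statistic`, the success probabilities `p`, and the two response patterns are hypothetical; this is not the new statistic proposed in the paper.

```python
import numpy as np

def lz_statistic(u, p):
    """Standardized log-likelihood person-fit statistic for 0/1 item scores.

    u : (J,) observed item scores (0 or 1)
    p : (J,) model-implied probabilities of a correct response
    """
    u = np.asarray(u, dtype=float)
    p = np.asarray(p, dtype=float)
    # Observed log-likelihood of the response pattern.
    loglik = np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))
    # Its expectation and variance under the model.
    expected = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    variance = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (loglik - expected) / np.sqrt(variance)

p = [0.9, 0.8, 0.7, 0.3, 0.2]                  # easy items first (made up)
lz_fit = lz_statistic([1, 1, 1, 0, 0], p)       # pattern consistent with p
lz_aberrant = lz_statistic([0, 0, 0, 1, 1], p)  # reversed, aberrant pattern
print(lz_fit, lz_aberrant)
```

Large negative values flag response patterns that are unlikely under the model, which is the kind of aberrant behaviour person-fit statistics are meant to detect.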

Citations: 0

Evaluating the performance of existing and novel equivalence tests for fit indices in structural equation modelling

Pub Date : 2023-07-13 DOI: 10.1111/bmsp.12317

Nataly Beribisky, Robert A. Cribbie

It has been suggested that equivalence testing (otherwise known as negligible effect testing) should be used to evaluate model fit within structural equation modelling (SEM). In this study, we propose novel variations of equivalence tests based on two popular fit indices, the root mean squared error of approximation (RMSEA) and the comparative fit index (CFI). Using Monte Carlo simulations, we compare the performance of these novel tests to other existing equivalence testing-based fit indices in SEM, as well as to other methods commonly used to evaluate model fit. Results indicate that equivalence tests in SEM have good Type I error control and display considerable power for detecting well-fitting models in medium to large sample sizes. At small sample sizes, relative to traditional fit indices, equivalence tests limit the chance of supporting a poorly fitting model. We also present an illustrative example to demonstrate how equivalence tests may be incorporated in model fit reporting. Equivalence tests in SEM also have unique interpretational advantages compared to other methods of model fit evaluation. We recommend that equivalence tests be utilized in conjunction with descriptive fit indices to provide more evidence when evaluating model fit.

British Journal of Mathematical & Statistical Psychology, vol. 77, no. 1, pp. 103-129. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/bmsp.12317

Citations: 0

K-Plus anticlustering: An improved k-means criterion for maximizing between-group similarity

Pub Date : 2023-07-11 DOI: 10.1111/bmsp.12315

Martin Papenberg

Anticlustering refers to the process of partitioning elements into disjoint groups with the goal of obtaining high between-group similarity and high within-group heterogeneity. Anticlustering thereby reverses the logic of its better known twin—cluster analysis—and is usually approached by maximizing instead of minimizing a clustering objective function. This paper presents k-plus, an extension of the classical k-means objective of maximizing between-group similarity in anticlustering applications. K-plus represents between-group similarity as discrepancy in distribution moments (means, variance, and higher-order moments), whereas the k-means criterion only reflects group differences with regard to means. While constituting a new criterion for anticlustering, it is shown that k-plus anticlustering can be implemented by optimizing the original k-means criterion after the input data have been augmented with additional variables. A computer simulation and practical examples show that k-plus anticlustering achieves high between-group similarity with regard to multiple objectives. In particular, optimizing between-group similarity with regard to variances usually does not compromise similarity with regard to means; the k-plus extension is therefore generally preferred over classical k-means anticlustering. Examples are given on how k-plus anticlustering can be applied to real norming data using the open source R package anticlust, which is freely available via CRAN.
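The augmentation idea described in this abstract can be sketched as follows: append squared deviations from the column means as extra variables, then maximize the ordinary k-means criterion on the augmented data. This is a minimal Python illustration, not the anticlust package. The helper names, the standardization of the augmented columns, and the simple first-improvement pairwise-swap search are assumptions of this sketch, and only the second moment (variance) is added.

```python
import numpy as np

def kplus_augment(X):
    """Append squared deviations from the column means, then standardize."""
    X = np.asarray(X, dtype=float)
    Z = np.hstack([X, (X - X.mean(axis=0)) ** 2])
    return (Z - Z.mean(axis=0)) / Z.std(axis=0)

def kmeans_objective(Z, groups, k):
    """Sum of squared distances to group centroids (maximized here)."""
    return sum(((Z[groups == g] - Z[groups == g].mean(axis=0)) ** 2).sum()
               for g in range(k))

def kplus_anticlustering(X, k, seed=0):
    """Equal-sized groups that are similar in means and variances."""
    Z = kplus_augment(X)
    n = len(Z)
    rng = np.random.default_rng(seed)
    groups = rng.permutation(np.arange(n) % k)   # random equal-sized start
    best = kmeans_objective(Z, groups, k)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            for j in range(i + 1, n):
                if groups[i] == groups[j]:
                    continue
                # Try swapping the group labels of elements i and j.
                groups[i], groups[j] = groups[j], groups[i]
                obj = kmeans_objective(Z, groups, k)
                if obj > best + 1e-12:
                    best, improved = obj, True   # keep the swap
                else:
                    groups[i], groups[j] = groups[j], groups[i]  # undo
    return groups

data = np.random.default_rng(1).normal(size=(40, 2))   # synthetic data
groups = kplus_anticlustering(data, k=2)
for g in range(2):
    print(g, data[groups == g].mean(axis=0), data[groups == g].var(axis=0))
```

Equalizing the group means of the squared-deviation columns is what pushes the group variances together, while the original columns keep the group means similar. Swaps preserve group sizes, so the groups stay equal-sized throughout.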

British Journal of Mathematical & Statistical Psychology, vol. 77, no. 1, pp. 80-102. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/bmsp.12315

Citations: 0

Enhancing measurement validity in diverse populations: Modern approaches to evaluating differential item functioning

Pub Date : 2023-07-10 DOI: 10.1111/bmsp.12316

Daniel J. Bauer

When developing and evaluating psychometric measures, a key concern is to ensure that they accurately capture individual differences on the intended construct across the entire population of interest. Inaccurate assessments of individual differences can occur when responses to some items reflect not only the intended construct but also construct-irrelevant characteristics, like a person's race or sex. Unaccounted for, this item bias can lead to apparent differences on the scores that do not reflect true differences, invalidating comparisons between people with different backgrounds. Accordingly, empirically identifying which items manifest bias through the evaluation of differential item functioning (DIF) has been a longstanding focus of much psychometric research. The majority of this work has focused on evaluating DIF across two (or a few) groups. Modern conceptualizations of identity, however, emphasize its multi-determined and intersectional nature, with some aspects better represented as dimensional than categorical. Fortunately, many model-based approaches to modelling DIF now exist that allow for simultaneous evaluation of multiple background variables, including both continuous and categorical variables, and potential interactions among background variables. This paper provides a comparative, integrative review of these new approaches to modelling DIF and clarifies both the opportunities and challenges associated with their application in psychometric research.

British Journal of Mathematical & Statistical Psychology, vol. 76, no. 3, pp. 435-461. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/bmsp.12316

Citations: 0

Assessment of generalised Bayesian structural equation models for continuous and binary data

Pub Date : 2023-07-04 DOI: 10.1111/bmsp.12314

Konstantinos Vamvourellis, Konstantinos Kalogeropoulos, Irini Moustaki

The paper proposes a novel model assessment paradigm aiming to address shortcomings of posterior predictive $p$-values, which provide the default metric of fit for Bayesian structural equation modelling (BSEM). The model framework presented in the paper focuses on the approximate zero approach (Psychological Methods, 17, 2012, 313), which involves formulating certain parameters (such as factor loadings) to be approximately zero through the use of informative priors, instead of explicitly setting them to zero. The introduced model assessment procedure monitors the out-of-sample predictive performance of the fitted model, and together with a list of guidelines we provide, one can investigate whether the hypothesised model is supported by the data. We incorporate scoring rules and cross-validation to supplement existing model assessment metrics for BSEM. The proposed tools can be applied to models for both continuous and binary data. The modelling of categorical and non-normally distributed continuous data is facilitated with the introduction of an item-individual random effect. We study the performance of the proposed methodology via simulation experiments as well as real data on the 'Big-5' personality scale and the Fagerstrom test for nicotine dependence.

British Journal of Mathematical & Statistical Psychology, vol. 76, no. 3, pp. 559-584. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/bmsp.12314

Citations: 0