British Journal of Mathematical & Statistical Psychology: Latest Articles

An explanatory mixture IRT model for careless and insufficient effort responding in self-report measures
IF 2.6, Zone 3 (Psychology), Q1 Mathematics, Pub Date: 2022-06-22, DOI: 10.1111/bmsp.12272
Esther Ulitzsch, Seyma Nur Yildirim-Erbasli, Guher Gorgun, Okan Bulut

Careless and insufficient effort responding (C/IER) on self-report measures results in responses that do not reflect the trait to be measured, thereby posing a major threat to the quality of survey data. Reliable approaches for detecting C/IER aid in increasing the validity of inferences being made from survey data. First, once detected, C/IER can be taken into account in data analysis. Second, approaches for detecting C/IER support a better understanding of its occurrence, which facilitates designing surveys that curb the prevalence of C/IER. Previous approaches for detecting C/IER are limited in that they identify C/IER at the aggregate respondent or scale level, thereby hindering investigations of item characteristics evoking C/IER. We propose an explanatory mixture item response theory model that supports identifying and modelling C/IER at the respondent-by-item level, can detect a wide array of C/IER patterns, and facilitates a deeper understanding of item characteristics associated with its occurrence. As the approach only requires raw response data, it is applicable to data from paper-and-pencil and online surveys. The model shows good parameter recovery and can well handle the simultaneous occurrence of multiple types of C/IER patterns in simulated data. The approach is illustrated on a publicly available Big Five inventory data set, where we found later item positions to be associated with higher C/IER probabilities. We gathered initial supporting validity evidence for the proposed approach by investigating agreement with multiple commonly employed indicators of C/IER.

Citations: 8
A flexible approach to modelling over-, under- and equidispersed count data in IRT: The Two-Parameter Conway–Maxwell–Poisson Model
IF 2.6, Zone 3 (Psychology), Q1 Mathematics, Pub Date: 2022-06-09, DOI: 10.1111/bmsp.12273
Marie Beisemann

Several psychometric tests and self-reports generate count data (e.g., divergent thinking tasks). The most prominent count data item response theory model, the Rasch Poisson Counts Model (RPCM), is limited in applicability by two restrictive assumptions: equal item discriminations and equidispersion (conditional mean equal to conditional variance). Violations of these assumptions lead to impaired reliability and standard error estimates. Previous work generalized the RPCM but maintained some limitations. The two-parameter Poisson counts model allows for varying discriminations but retains the equidispersion assumption. The Conway–Maxwell–Poisson Counts Model allows for modelling over- and underdispersion (conditional mean less than and greater than conditional variance, respectively) but still assumes constant discriminations. The present work introduces the Two-Parameter Conway–Maxwell–Poisson (2PCMP) model which generalizes these three models to allow for varying discriminations and dispersions within one model, helping to better accommodate data from count data tests and self-reports. A marginal maximum likelihood method based on the EM algorithm is derived. An implementation of the 2PCMP model in R and C++ is provided. Two simulation studies examine the model's statistical properties and compare the 2PCMP model to established models. Data from divergent thinking tasks are reanalysed with the 2PCMP model to illustrate the model's flexibility and ability to test assumptions of special cases.
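
The dispersion behaviour described in this abstract can be illustrated with a minimal Python sketch of the Conway–Maxwell–Poisson distribution. It assumes the standard rate parameterization with weights `lam**x / (x!)**nu`, which may differ from the parameterization used in the 2PCMP paper. The function names, the truncation point `max_count`, and the example values are illustrative assumptions, not part of the published R and C++ implementation.

```python
import math

def cmp_pmf(lam, nu, max_count=200):
    """Conway-Maxwell-Poisson probabilities P(X = x) for x = 0..max_count.

    Unnormalized weight: lam**x / (x!)**nu; nu = 1 recovers the Poisson.
    The infinite normalizing sum is truncated at max_count (an assumption).
    """
    # Work on the log scale to avoid overflow in lam**x and x!.
    log_w = [x * math.log(lam) - nu * math.lgamma(x + 1) for x in range(max_count + 1)]
    shift = max(log_w)
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return [v / total for v in w]

def mean_and_variance(pmf):
    """Mean and variance of a discrete distribution on 0, 1, 2, ..."""
    mean = sum(x * p for x, p in enumerate(pmf))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))
    return mean, var

# nu < 1: variance above the mean; nu = 1: equal; nu > 1: variance below the mean.
for nu in (0.5, 1.0, 2.0):
    m, v = mean_and_variance(cmp_pmf(lam=3.0, nu=nu))
    print(f"nu={nu}: mean={m:.3f}, variance={v:.3f}")
```

A single dispersion parameter thus covers all three cases, which is the property the 2PCMP model combines with item-specific discriminations.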

Citations: 4
Score-based measurement invariance checks for Bayesian maximum-a-posteriori estimates in item response theory
IF 2.6, Zone 3 (Psychology), Q1 Mathematics, Pub Date: 2022-06-06, DOI: 10.1111/bmsp.12275
Rudolf Debelak, Samuel Pawel, Carolin Strobl, Edgar C. Merkle

A family of score-based tests has been proposed in recent years for assessing the invariance of model parameters in several models of item response theory (IRT). These tests were originally developed in a maximum likelihood framework. This study discusses analogous tests for Bayesian maximum-a-posteriori estimates and multiple-group IRT models. We propose two families of statistical tests, which are based on an approximation using a pooled variance method, or on a simulation approach based on asymptotic results. The resulting tests were evaluated by a simulation study, which investigated their sensitivity against differential item functioning with respect to a categorical or continuous person covariate in the two- and three-parameter logistic models. Whereas the method based on pooled variance was found to be useful in practice with maximum likelihood as well as maximum-a-posteriori estimates, the simulation-based approach was found to require large sample sizes to lead to satisfactory results.

Citations: 0
Tracking a multitude of abilities as they develop
IF 2.6, Zone 3 (Psychology), Q1 Mathematics, Pub Date: 2022-06-05, DOI: 10.1111/bmsp.12276
Maria Bolsinova, Matthieu J. S. Brinkhuis, Abe D. Hofman, Gunter Maris

Recently, the Urnings algorithm (Bolsinova et al., 2022, J. R. Stat. Soc. Ser. C Appl. Statistics, 71, 91) was proposed for tracking the development of learners' abilities and the difficulties of items in adaptive learning systems. It is a simple and scalable algorithm which is suited for large-scale applications in which large streams of data are coming into the system and on-the-fly updating is needed. Compared to alternatives like the Elo rating system and its extensions, the Urnings rating system allows the uncertainty of the ratings to be evaluated and accounts for adaptive item selection which, if not corrected for, may distort the ratings. In this paper we extend the Urnings algorithm to allow for both between-item and within-item multidimensionality. This allows for tracking the development of interrelated abilities both at the individual and the population level. We present formal derivations of the multidimensional Urnings algorithm, illustrate its properties in simulations, and present an application to data from an adaptive learning system for primary school mathematics called Math Garden.
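
For orientation, the Elo-style baseline that this abstract contrasts with Urnings can be sketched in a few lines of Python. This is not the Urnings algorithm itself. It is a common logistic variant of Elo used for learner and item ratings, and the function name and step size `k` are illustrative assumptions.

```python
import math

def elo_update(theta, beta, correct, k=0.4):
    """One Elo-style update after a learner (ability theta) answers an item
    (difficulty beta). `correct` is 1 for a right answer, 0 otherwise.
    The step size k is a hypothetical tuning constant.
    """
    # Expected probability of a correct answer under a logistic (Rasch-type) curve.
    expected = 1.0 / (1.0 + math.exp(-(theta - beta)))
    # Learner and item move in opposite directions by the same amount.
    theta_new = theta + k * (correct - expected)
    beta_new = beta - k * (correct - expected)
    return theta_new, beta_new

theta, beta = elo_update(theta=0.0, beta=0.0, correct=1)
print(theta, beta)
```

The update yields point ratings only, with no accompanying measure of their uncertainty, which is one of the limitations the abstract says the Urnings rating system addresses.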

Citations: 0
Theoretical considerations when simulating data from the g-and-h family of distributions
IF 2.6, Zone 3 (Psychology), Q1 Mathematics, Pub Date: 2022-05-30, DOI: 10.1111/bmsp.12274
Oscar Lorenzo Olvera Astivia, Edward Kroc

The g-and-h family of distributions is a computationally efficient, flexible option to model and simulate non-normal data. In spite of its popularity, there are several theoretical aspects of these distributions that need special consideration when they are used. In this paper some of these aspects are explored. In particular, through mathematical analysis it is shown that a popular multivariate generalization of the g-and-h distribution may result in marginal distributions which are no longer g-and-h distributed, that more than one set of (g,h) parameters can correspond to the same values of population skewness and excess kurtosis, and that multivariate generalizations of g-and-h distributions available in the literature are special cases of Gaussian copula distributions. A small-scale simulation is also used to demonstrate how simulation conclusions can change when different (g,h) parameters are used to simulate data, even if they imply the same population values of skewness and excess kurtosis.
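
As background, univariate g-and-h data are typically simulated by transforming standard normal draws. The Python sketch below uses Tukey's usual form of the transformation. The function names, seed, and parameter values are illustrative assumptions, and the multivariate generalizations discussed in the paper are not covered.

```python
import math
import random

def g_and_h(z, g, h):
    """Tukey's g-and-h transform of a standard normal value z.

    g controls skewness, h (>= 0) controls tail heaviness;
    g = 0 and h = 0 return z unchanged.
    """
    if g == 0.0:
        skew_part = z  # limit of (exp(g*z) - 1) / g as g -> 0
    else:
        skew_part = math.expm1(g * z) / g
    return skew_part * math.exp(h * z * z / 2.0)

def simulate(n, g, h, seed=0):
    """Draw n values from a univariate g-and-h distribution."""
    rng = random.Random(seed)
    return [g_and_h(rng.gauss(0.0, 1.0), g, h) for _ in range(n)]

sample = simulate(n=5, g=0.5, h=0.1)
print(sample)
```

Because skewness and excess kurtosis are functions of both g and h jointly, matching those two population moments does not by itself pin down a unique (g, h) pair, which is the point the abstract raises.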

Citations: 0
Data-driven Q-matrix learning based on Boolean matrix factorization in cognitive diagnostic assessment
IF 2.6, Zone 3 (Psychology), Q1 Mathematics, Pub Date: 2022-05-16, DOI: 10.1111/bmsp.12271
Jianhua Xiong, Zhaosheng Luo, Guanzhong Luo, Xiaofeng Yu

Attributes and the Q-matrix are the central components for cognitive diagnostic assessment, and are usually defined by domain experts. However, it is challenging and time consuming for experts to specify the attributes and Q-matrix manually. Thus, there is an urgent need for an automatic and intelligent means to address this concern. This paper presents a new data-driven approach for learning the Q-matrix from response data. By constructing a statistical index and a heuristic algorithm based on Boolean matrix factorization, the response matrix is decomposed into the Boolean product of the Q-matrix and the attribute mastery patterns. The feasibility of the proposed approach is evaluated using simulated data generated under various conditions. A real data example is also presented to demonstrate the usefulness of the proposed approach.
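
The Boolean product mentioned in this abstract can be made concrete with a small Python sketch. It shows the product itself and one common way to link it to noise-free responses under a conjunctive rule: a respondent fails an item if they lack at least one attribute the item requires. Whether the paper uses exactly this formulation is an assumption. The function names and the toy matrices are illustrative.

```python
def boolean_product(a, b):
    """Boolean matrix product: out[i][j] = OR over k of (a[i][k] AND b[k][j])."""
    inner = len(b)
    cols = len(b[0])
    return [
        [int(any(row[k] and b[k][j] for k in range(inner))) for j in range(cols)]
        for row in a
    ]

def ideal_responses(mastery, q_matrix):
    """Noise-free responses under a conjunctive rule (an assumed formulation).

    mastery:  respondents x attributes (1 = mastered)
    q_matrix: items x attributes (1 = item requires the attribute)
    """
    lacks = [[1 - m for m in row] for row in mastery]   # attributes not mastered
    q_t = [list(col) for col in zip(*q_matrix)]          # attributes x items
    fails = boolean_product(lacks, q_t)                  # 1 = some required attribute is missing
    return [[1 - f for f in row] for row in fails]

mastery = [[1, 1], [1, 0], [0, 0]]
q_matrix = [[1, 0], [0, 1], [1, 1]]
print(ideal_responses(mastery, q_matrix))
```

Learning the Q-matrix from data reverses this direction: given an observed response matrix, the task is to find Boolean factors whose product reproduces it as closely as possible.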

Citations: 1
A new person-fit method based on machine learning in CDM in education
IF 2.6, Zone 3 (Psychology), Q1 Mathematics, Pub Date: 2022-03-27, DOI: 10.1111/bmsp.12270
Zhemin Zhu, David Arthur, Hua-Hua Chang

Cognitive diagnosis models have become popular in educational assessment and are used to provide more individualized feedback about a student's specific strengths and weaknesses than traditional total scores. However, if the testing data are contaminated by certain biases or aberrant response patterns, such predictions may not be accurate. The current research objective is to develop a new person-fit method that is based on machine learning and improves the functionality of existing person-fit methods. Various simulations were designed under three aberrant conditions: cheating, sleeping and random guessing. Simulation results showed that the new method was more powerful and effective than previous methods, especially for short-length tests.

Citations: 1
Modelling multilevel nonlinear treatment-by-covariate interactions in cluster randomized controlled trials using a generalized additive mixed model
IF 2.6, Zone 3 (Psychology), Q1 Mathematics, Pub Date: 2022-03-21, DOI: 10.1111/bmsp.12265
Sun-Joo Cho, Kristopher J. Preacher, Haley E. Yaremych, Matthew Naveiras, Douglas Fuchs, Lynn S. Fuchs

A cluster randomized controlled trial (C-RCT) is common in educational intervention studies. Multilevel modelling (MLM) is a dominant analytic method to evaluate treatment effects in a C-RCT. In most MLM applications intended to detect an interaction effect, a single interaction effect (called a conflated effect) is considered instead of level-specific interaction effects in a multilevel design (called unconflated multilevel interaction effects), and the linear interaction effect is modelled. In this paper we present a generalized additive mixed model (GAMM) that allows an unconflated multilevel interaction to be estimated without assuming a prespecified form of the interaction. R code is provided to estimate the model parameters using maximum likelihood estimation and to visualize the nonlinear treatment-by-covariate interaction. The usefulness of the model is illustrated using instructional intervention data from a C-RCT. Results of simulation studies showed that the GAMM outperformed an alternative approach to recover an unconflated logistic multilevel interaction. In addition, the parameter recovery of the GAMM was relatively satisfactory in multilevel designs found in educational intervention studies, except when the number of clusters, cluster sizes, and intraclass correlations were small. When modelling a linear multilevel treatment-by-covariate interaction in the presence of a nonlinear effect, biased estimates (such as overestimated standard errors and overestimated random effect variances) and incorrect predictions of the unconflated multilevel interaction were found.

Citations: 2
Complete Q-matrices in conjunctive models on general attribute structures
IF 2.6, Zone 3 (Psychology), Q1 Mathematics, Pub Date: 2022-03-20, DOI: 10.1111/bmsp.12266
Jürgen Heller

In cognitive diagnostic assessment a property of the Q-matrix, usually referred to as completeness, warrants that the cognitive attributes underlying the observed behaviour can be uniquely assessed. Characterizations of completeness were first derived under the assumption of independent attributes, and are currently under investigation for interdependent attributes. The dominant approach considers so-called attribute hierarchies, which are conceptualized through a partial order on the set of attributes. The present paper extends previously published results on this issue obtained for conjunctive attribute hierarchy models. Drawing upon results from knowledge structure theory, it provides novel sufficient and necessary conditions for completeness of the Q-matrix, not only for conjunctive models on attribute hierarchies, but also on more general attribute structures.
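
For the simplest case named in this abstract, independent attributes under a conjunctive rule, completeness can be checked by brute force. The Python sketch below assumes every one of the 2^K attribute profiles is admissible and uses noise-free ideal responses in place of expected responses. It does not implement the paper's conditions for attribute hierarchies or more general structures. The function names are illustrative.

```python
from itertools import product

def ideal_response(profile, q_matrix):
    """Conjunctive rule: an item is solved only if every required attribute is mastered."""
    return tuple(
        int(all(a >= q for a, q in zip(profile, row)))
        for row in q_matrix
    )

def is_complete(q_matrix):
    """True if distinct attribute profiles always give distinct ideal response patterns.

    Assumes independent attributes, so all 2^K profiles are admissible.
    """
    n_attributes = len(q_matrix[0])
    profiles = list(product((0, 1), repeat=n_attributes))
    patterns = {ideal_response(p, q_matrix) for p in profiles}
    return len(patterns) == len(profiles)

print(is_complete([[1, 0], [0, 1]]))   # each attribute has a single-attribute item
print(is_complete([[1, 0], [1, 1]]))   # attribute 2 is never tested on its own
```

In the second matrix, a respondent who has mastered only the second attribute produces the same pattern as one who has mastered nothing, so the two profiles cannot be told apart.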

PDF: https://bpspsychub.onlinelibrary.wiley.com/doi/epdf/10.1111/bmsp.12266
Citations: 0
Refinement: Measuring informativeness of ratings in the absence of a gold standard
IF 2.6, Zone 3 (Psychology), Q1 Mathematics, Pub Date: 2022-03-16, DOI: 10.1111/bmsp.12268
Sheridan Grant, Marina Meilă, Elena Erosheva, Carole Lee

We propose a new metric for evaluating the informativeness of a set of ratings from a single rater on a given scale. Such evaluations are of interest when raters rate numerous comparable items on the same scale, as occurs in hiring, college admissions, and peer review. Our exposition takes the context of peer review, which involves univariate and multivariate cardinal ratings. We draw on this context to motivate an information-theoretic measure of the refinement of a set of ratings – entropic refinement – as well as two secondary measures. A mathematical analysis of the three measures reveals that only the first, which captures the information content of the ratings, possesses properties appropriate to a refinement metric. Finally, we analyse refinement in real-world grant-review data, finding evidence that overall merit scores are more refined than criterion scores.

Citations: 1