Applied Psychological Measurement: Latest Articles

Bayesian Item Response Theory Models With Flexible Generalized Logit Links.
IF 1 Zone 4 Psychology Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2022-07-01 Epub Date: 2022-05-20 DOI: 10.1177/01466216221089343
Jiwei Zhang, Ying-Ying Zhang, Jian Tao, Ming-Hui Chen

In educational and psychological research, the logit and probit links are often used to fit binary item response data. The appropriateness and importance of the choice of link within the item response theory (IRT) framework have not yet been investigated. In this paper, we present a family of IRT models with generalized logit links, which includes the traditional logistic and normal ogive models as special cases. This family of models is flexible enough not only to adjust the tail probability of the item characteristic curve through two shape parameters but also to allow the same link or different links to be fitted to different items within the IRT model framework. In addition, the proposed models are implemented in Stan to sample from the posterior distributions. Using readily available Stan outputs, four Bayesian model selection criteria are computed to guide the choice of links within the IRT model framework. Extensive simulation studies are conducted to examine the empirical performance of the proposed models and the model fits in terms of "in-sample" and "out-of-sample" predictions based on the deviance. Finally, a detailed analysis of real reading assessment data is carried out to illustrate the proposed methodology.
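
For orientation, the two special cases named in the abstract can be written in their standard two-parameter forms. The notation here is an assumption made for illustration (examinee ability \theta_i, item discrimination a_j, item difficulty b_j, standard normal CDF \Phi); the paper's generalized logit link and its two shape parameters are not reproduced.

```latex
% Logistic (logit link) model
P(Y_{ij} = 1 \mid \theta_i) = \frac{1}{1 + \exp\{-a_j(\theta_i - b_j)\}}

% Normal ogive (probit link) model
P(Y_{ij} = 1 \mid \theta_i) = \Phi\{a_j(\theta_i - b_j)\}
```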

Applied Psychological Measurement, vol. 46, no. 5, pp. 382-405. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9265488/pdf/10.1177_01466216221089343.pdf
Citations: 0
Dual-Objective Item Selection Methods in Computerized Adaptive Test Using the Higher-Order Cognitive Diagnostic Models.
IF 1.2 Zone 4 Psychology Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2022-07-01 DOI: 10.1177/01466216221089342
Chongqin Xi, Dongbo Tu, Yan Cai

To efficiently obtain information about both the general abilities and the detailed cognitive profiles of examinees from a single model that uses a single calibration process, higher-order cognitive diagnostic computerized adaptive testing (CD-CAT), which employs higher-order cognitive diagnostic models, has been developed. However, the item selection methods currently used in higher-order CD-CAT select items adaptively according to the attribute profiles only, which might lead to low precision for general abilities; hence, this study proposes an appropriate method for this CAT system. Under the framework of higher-order models, responses are affected by attribute profiles, which are in turn governed by general abilities. It is therefore reasonable to hold that item responses are affected by a combination of general abilities and attribute profiles. Based on the logic of Shannon entropy and the generalized deterministic inputs, noisy "and" gate (G-DINA) model discrimination index (GDI), two new item selection methods that take this combination into account were proposed for higher-order CD-CAT. The simulation results demonstrated that the new methods achieved more accurate estimates of both general abilities and cognitive profiles than the existing methods and maintained distinct advantages in terms of item pool usage.
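
As a rough illustration of the Shannon entropy logic mentioned above, the sketch below scores candidate items by the expected entropy of the posterior over attribute profiles after one more response. It is a baseline sketch only, not the two methods proposed in the article: it assumes the simpler DINA model instead of G-DINA, ignores the higher-order general ability, and every name and parameter value (`dina_p_correct`, `expected_entropy`, the two toy items) is hypothetical.

```python
import math
from itertools import product


def dina_p_correct(profile, q_vector, slip, guess):
    """DINA: P(correct) is 1 - slip if all required attributes are mastered, else guess."""
    has_all = all(a >= q for a, q in zip(profile, q_vector))
    return 1.0 - slip if has_all else guess


def shannon_entropy(probs):
    """Entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def expected_entropy(posterior, p_correct):
    """Expected posterior entropy after observing a 0/1 response to one item."""
    total = 0.0
    for outcome in (1, 0):
        lik = [p if outcome == 1 else 1.0 - p for p in p_correct]
        marginal = sum(w * l for w, l in zip(posterior, lik))
        if marginal > 0:
            updated = [w * l / marginal for w, l in zip(posterior, lik)]
            total += marginal * shannon_entropy(updated)
    return total


# Two attributes give four possible profiles; start from a uniform posterior.
profiles = list(product([0, 1], repeat=2))
posterior = [0.25] * len(profiles)

# Hypothetical item bank: name -> (q-vector, slip, guess).
items = {
    "sharp": ((1, 0), 0.1, 0.1),
    "noisy": ((1, 0), 0.4, 0.4),
}

scores = {
    name: expected_entropy(
        posterior, [dina_p_correct(pr, q, s, g) for pr in profiles]
    )
    for name, (q, s, g) in items.items()
}
best_item = min(scores, key=scores.get)  # lowest expected entropy is selected
```

In this toy bank the item with small slip and guess parameters is selected, because its response is expected to concentrate the posterior more.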

Applied Psychological Measurement, vol. 46, no. 5, pp. 422-438. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9265487/pdf/10.1177_01466216221089342.pdf
Citations: 0
Evaluation of the Linear Composite Conjecture for Unidimensional IRT Scale for Multidimensional Responses.
IF 1.2 Zone 4 Psychology Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2022-07-01 DOI: 10.1177/01466216221084218
Tyler Strachan, Uk Hyun Cho, Terry Ackerman, Shyh-Huei Chen, Jimmy de la Torre, Edward H Ip

The linear composite direction represents, theoretically, where the unidimensional scale would lie within a multidimensional latent space. Using compensatory multidimensional IRT, the linear composite can be derived from the structure of the items and the latent distribution. The purpose of this study was to evaluate the validity of the linear composite conjecture and examine how well a fitted unidimensional IRT model approximates the linear composite direction in a multidimensional latent space. Overall, the simulation results show that the fitted unidimensional IRT model approximates the linear composite direction sufficiently well when the correlation between the bivariate latent variables is positive. When that correlation is negative, instability occurs if the fitted unidimensional IRT model is used to approximate the linear composite direction. A real data experiment was also conducted using 20 items from a multiple-choice mathematics test from American College Testing.

Applied Psychological Measurement, vol. 46, no. 5, pp. 347-360. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9265490/pdf/10.1177_01466216221084218.pdf
Citations: 0
Computerized Adaptive Testing for Ipsative Tests with Multidimensional Pairwise-Comparison Items: Algorithm Development and Applications.
IF 1.2 Zone 4 Psychology Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2022-06-01 DOI: 10.1177/01466216221084209
Xue-Lan Qiu, Jimmy de la Torre, Sage Ro, Wen-Chung Wang

Computerized adaptive testing (CAT) solutions for tests with multidimensional pairwise-comparison (MPC) items, which aim to measure career interest, value, and personality, are rare. This paper proposes new item selection and exposure control methods for CAT with dichotomous and polytomous MPC items and presents simulation study results. The results show that the procedures are effective in selecting items and controlling within-person statement exposure with no loss of efficiency. Implications are discussed in two applications of the proposed CAT procedures: a work attitude test with dichotomous MPC items and a career interest assessment with polytomous MPC items.

Applied Psychological Measurement, vol. 46, no. 4, pp. 255-272. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9118927/pdf/10.1177_01466216221084209.pdf
Citations: 0
Combining Cognitive Diagnostic Computerized Adaptive Testing With Multidimensional Item Response Theory.
IF 1.2 Zone 4 Psychology Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2022-06-01 DOI: 10.1177/01466216221084214
Hao Luo, Daxun Wang, Zhiming Guo, Yan Cai, Dongbo Tu

The new generation of tests focuses not only on general ability but also on the process of finer-grained skills. Guided by this idea, researchers have developed a dual-purpose CD-CAT (Dual-CAT). In the existing Dual-CAT, the models used for overall ability estimation are unidimensional IRT models, which cannot be applied to multidimensional tests. This article develops a multidimensional Dual-CAT to improve its applicability. To this end, it first proposes several item selection methods for the multidimensional Dual-CAT and then verifies the estimation accuracy and exposure rate of these methods through both a simulation study and a real item bank study. The results show that the established multidimensional Dual-CAT is effective and that the newly proposed methods outperform the traditional methods. Finally, this article discusses future directions for the Dual-CAT.

Applied Psychological Measurement, vol. 46, no. 4, pp. 288-302. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9118931/pdf/10.1177_01466216221084214.pdf
Citations: 0
Detecting Examinees With Item Preknowledge on Real Data.
IF 1.2 Zone 4 Psychology Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2022-06-01 DOI: 10.1177/01466216221084202
Dmitry I Belov, Sarah L Toton

Recently, Belov & Wollack (2021) developed a method for detecting groups of colluding examinees as cliques in a graph. The objective of this article is to study how the performance of their method on real data with item preknowledge (IP) depends on the mechanism of edge formation governed by a response similarity index (RSI). This study resulted in the development of three new RSIs and demonstrated a remarkable advantage of combining responses and response times for detecting examinees with IP. Possible extensions of this study and recommendations for practitioners were formulated.
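
The general idea of edge formation followed by clique search can be sketched as follows. This is a toy sketch under stated assumptions, not the method of Belov & Wollack or any of the RSIs studied in the article: the similarity index is a plain count of matching responses, the threshold and data are invented, and maximal cliques are listed with the textbook Bron-Kerbosch recursion. The names `matching_count`, `build_graph`, and `maximal_cliques` are hypothetical.

```python
def matching_count(u, v):
    """Toy similarity index: number of items answered identically."""
    return sum(1 for a, b in zip(u, v) if a == b)


def build_graph(responses, threshold):
    """Connect two examinees when their similarity reaches the threshold."""
    names = sorted(responses)
    adj = {name: set() for name in names}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if matching_count(responses[u], responses[v]) >= threshold:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(sorted(r))
            return
        for v in sorted(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


# Invented scored responses (1 = correct) for five examinees on six items.
responses = {
    "A": [1, 1, 0, 1, 0, 1],
    "B": [1, 1, 0, 1, 0, 1],
    "C": [1, 1, 0, 1, 0, 0],
    "D": [0, 0, 1, 0, 1, 0],
    "E": [0, 1, 1, 0, 0, 0],
}

graph = build_graph(responses, threshold=5)
# Isolated examinees are trivially maximal cliques, so keep groups of two or more.
groups = [c for c in maximal_cliques(graph) if len(c) >= 2]
```

Changing the similarity index or its threshold changes which edges exist and therefore which groups are flagged, which is the dependence the article examines on real data.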

Applied Psychological Measurement, vol. 46, no. 4, pp. 273-287. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9118928/pdf/10.1177_01466216221084202.pdf
Citations: 2
The Potential for Interpretational Confounding in Cognitive Diagnosis Models.
IF 1 Zone 4 Psychology Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2022-06-01 Epub Date: 2022-04-15 DOI: 10.1177/01466216221084207
Qi Helen Huang, Daniel M Bolt

Binary examinee mastery/nonmastery classifications in cognitive diagnosis models may often be an approximation to proficiencies that are better regarded as continuous. Such misspecification can lead to inconsistencies in the operational definition of "mastery" when binary skills models are assumed. In this paper we demonstrate the potential for an interpretational confounding of the latent skills when truly continuous skills are treated as binary. Using the DINA model as an example, we show how such forms of confounding can be observed through item and/or examinee parameter change when (1) different collections of items (such as representing different test forms) previously calibrated separately are subsequently calibrated together; and (2) when structural restrictions are placed on the relationships among skill attributes (such as the assumption of strictly nonnegative growth over time), among other possibilities. We examine these occurrences in both simulation and real data studies. It is suggested that researchers should regularly attend to the potential for interpretational confounding by studying differences in attribute mastery proportions and/or changes in item parameter (e.g., slip and guess) estimates attributable to skill continuity when the same samples of examinees are administered different test forms, or the same test forms are involved in different calibrations.

Applied Psychological Measurement, vol. 46, no. 4, pp. 303-320. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9118932/pdf/10.1177_01466216221084207.pdf
Citations: 0
A Comparison of Modern and Popular Approaches to Calculating Reliability for Dichotomously Scored Items.
IF 1 Zone 4 Psychology Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2022-06-01 Epub Date: 2022-04-14 DOI: 10.1177/01466216221084210
Sébastien Béland, Carl F Falk

Recent work on reliability coefficients has largely focused on continuous items, including critiques of Cronbach's alpha. Although two new model-based reliability coefficients have been proposed for dichotomous items (Dimitrov, 2003a,b; Green & Yang, 2009a), these approaches have yet to be compared to each other or other popular estimates of reliability such as omega, alpha, and the greatest lower bound. We seek computational improvements to one of these model-based reliability coefficients and, in addition, conduct initial Monte Carlo simulations to compare coefficients using dichotomous data. Our results suggest that such improvements to the model-based approach are warranted, while model-based approaches were generally superior.
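
For reference, the most familiar of the coefficients being compared, Cronbach's alpha, can be computed directly from a persons-by-items score matrix. The sketch below is a minimal illustration with invented data; it does not implement the model-based coefficients, omega, or the greatest lower bound, and the function name `cronbach_alpha` is hypothetical. For 0/1 items, alpha coincides with KR-20 when the same variance denominator is used throughout.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (persons) of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented dichotomous data: three items that always agree with one another.
consistent = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]

# Invented dichotomous data: two items that are uncorrelated across persons.
unrelated = [[1, 1], [1, 0], [0, 1], [0, 0]]

alpha_consistent = cronbach_alpha(consistent)
alpha_unrelated = cronbach_alpha(unrelated)
```

The two toy matrices mark the extremes: items that agree perfectly give an alpha of one, and uncorrelated items give an alpha of zero.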

Applied Psychological Measurement, vol. 46, no. 1, pp. 321-337. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9118929/pdf/
Citations: 0
Assessing Dimensionality in Dichotomous Items When Many Subjects Have All-Zero Responses: An Example From Psychiatry and a Solution Using Mixture Models.
IF 1.2 Zone 4 Psychology Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2022-05-01 DOI: 10.1177/01466216211066602
William F Christensen, Melanie M Wall, Irini Moustaki

Common methods for determining the number of latent dimensions underlying an item set include eigenvalue analysis and examination of fit statistics for factor analysis models with varying number of factors. Given a set of dichotomous items, the authors demonstrate that these empirical assessments of dimensionality often incorrectly estimate the number of dimensions when there is a preponderance of individuals in the sample with all-zeros as their responses, for example, not endorsing any symptoms on a health battery. Simulated data experiments are conducted to demonstrate when each of several common diagnostics of dimensionality can be expected to under- or over-estimate the true dimensionality of the underlying latent variable. An example is shown from psychiatry assessing the dimensionality of a social anxiety disorder battery where 1, 2, 3, or more factors are identified, depending on the method of dimensionality assessment. An all-zero inflated exploratory factor analysis model (AZ-EFA) is introduced for assessing the dimensionality of the underlying subgroup corresponding to those possessing the measurable trait. The AZ-EFA approach is demonstrated using simulation experiments and an example measuring social anxiety disorder from a large nationally representative survey. Implications of the findings are discussed, in particular, regarding the potential for different findings in community versus patient populations.

确定项目集潜在维度数的常用方法包括特征值分析和具有不同数量因素的因素分析模型的拟合统计检查。给定一组二分类项目,作者证明,当样本中大多数人的反应都是零时,这些对维度的经验评估往往会错误地估计维度的数量,例如,不赞同健康电池上的任何症状。模拟数据实验进行,以证明当几个常见的诊断维度的每一个可以预期低估或高估潜在变量的真实维度。一个来自精神病学评估社交焦虑障碍的维度的例子,根据维度评估的方法,可以确定1、2、3或更多的因素。引入了一种全零膨胀探索性因子分析模型(AZ-EFA),用于评估具有可测量特征的潜在子群所对应的维度。AZ-EFA方法通过模拟实验和一个来自全国代表性调查的测量社交焦虑障碍的例子来证明。讨论了研究结果的含义,特别是关于社区与患者群体中不同研究结果的可能性。
William F Christensen, Melanie M Wall, Irini Moustaki. "Assessing Dimensionality in Dichotomous Items When Many Subjects Have All-Zero Responses: An Example From Psychiatry and a Solution Using Mixture Models." Applied Psychological Measurement, vol. 46, no. 3, pp. 167-184. Published 2022-05-01. DOI: https://doi.org/10.1177/01466216211066602. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9073639/pdf/10.1177_01466216211066602.pdf
Citations: 1
Reducing the Misclassification Costs of Cognitive Diagnosis Computerized Adaptive Testing: Item Selection With Minimum Expected Risk.
IF 1.2 Zone 4 Psychology Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2022-05-01 DOI: 10.1177/01466216211066610
Chia-Ling Hsu, Wen-Chung Wang

Cognitive diagnosis computerized adaptive testing (CD-CAT) aims to identify each examinee's strengths and weaknesses on latent attributes for appropriate classification into an attribute profile. Because the cost of a CD-CAT misclassification differs across user needs (e.g., remedial program vs. scholarship eligibility), item selection can incorporate such costs to improve measurement efficiency. This study proposes such a method, minimum expected risk (MER), based on Bayesian decision theory. In simulations, using MER to identify examinees with no mastery (MER-U0) or full mastery (MER-U1) showed greater classification accuracy and efficiency than other methods for these attribute profiles, especially for shorter tests or low-quality item banks. For other attribute profiles, regardless of item quality or termination criterion, the MER methods, modified posterior-weighted Kullback-Leibler information (MPWKL), the posterior-weighted CDM discrimination index (PWCDI), and Shannon entropy (SHE) performed similarly and outperformed the posterior-weighted attribute-level CDM discrimination index (PWACDI) in classification accuracy and test efficiency, especially on short tests. MER with a zero-one loss function, MER-U0, MER-U1, and PWACDI used item banks more effectively than the other methods. Overall, these results show the feasibility of using MER in CD-CAT to increase classification accuracy for specific attribute profiles and so address different user needs.
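
The idea of selecting items by minimum expected risk can be sketched with textbook Bayesian decision theory. The following is a generic illustration, not the authors' MER formulation: it assumes DINA-type items, a zero-one loss, and a uniform prior over attribute profiles, and every function name, Q-vector, and guess/slip value is hypothetical. The cost-weighted variants named in the abstract (MER-U0, MER-U1) are not reproduced here, although a different `loss` matrix could be passed in to weight misclassifications unequally.

```python
import itertools
import numpy as np

def dina_success_prob(profiles, q_vector, guess, slip):
    """P(correct | profile) for a DINA-type item: 1 - slip if every required attribute is mastered, else guess."""
    mastered_all = np.all(profiles >= q_vector, axis=1)
    return np.where(mastered_all, 1.0 - slip, guess)

def expected_risk(posterior, p_correct, loss):
    """Average, over both possible responses, of the smallest posterior expected loss after that response."""
    risk = 0.0
    for likelihood in (p_correct, 1.0 - p_correct):   # response 1, then response 0
        joint = posterior * likelihood                # P(profile, response)
        marginal = joint.sum()                        # P(response)
        if marginal > 0:
            updated = joint / marginal                # posterior after the response
            risk += marginal * (loss @ updated).min() # best decision under that posterior
    return risk

def select_item(posterior, item_probs, loss):
    """Return the index of the candidate item with the smallest expected risk, plus all risks."""
    risks = [expected_risk(posterior, p, loss) for p in item_probs]
    return int(np.argmin(risks)), risks

# Two attributes give four profiles: (0,0), (0,1), (1,0), (1,1).
profiles = np.array(list(itertools.product([0, 1], repeat=2)))
posterior = np.full(len(profiles), 0.25)              # uniform prior (assumption)
loss = 1.0 - np.eye(len(profiles))                    # zero-one loss: loss[decision, true profile]

item_probs = [
    dina_success_prob(profiles, np.array([1, 0]), guess=0.2, slip=0.2),  # higher-quality item
    dina_success_prob(profiles, np.array([1, 0]), guess=0.4, slip=0.4),  # lower-quality item
    dina_success_prob(profiles, np.array([1, 1]), guess=0.2, slip=0.2),
]
best, risks = select_item(posterior, item_probs, loss)
print("selected item:", best, "expected risks:", np.round(risks, 3))
```

Under zero-one loss the expected risk reduces to one minus the expected maximum posterior probability, so the noisier item (larger guess and slip) is never preferred over the cleaner item with the same Q-vector in this toy setup.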

Chia-Ling Hsu, Wen-Chung Wang. "Reducing the Misclassification Costs of Cognitive Diagnosis Computerized Adaptive Testing: Item Selection With Minimum Expected Risk." Applied Psychological Measurement, vol. 46, no. 3, pp. 185-199. Published 2022-05-01. DOI: https://doi.org/10.1177/01466216211066610. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9073635/pdf/10.1177_01466216211066610.pdf
Citations: 0