A Diagnostic Tree Model for Adaptive Assessment of Complex Cognitive Processes Using Multidimensional Response Options

IF 1.9 3区心理学 Q2 EDUCATION & EDUCATIONAL RESEARCH Journal of Educational and Behavioral Statistics Pub Date : 2023-04-05 DOI:10.3102/10769986231158301

M. Davison, David J. Weiss, Joseph N. DeWeese, Ozge Ersan, Gina Biancarosa, Patrick C. Kennedy

{"title":"A Diagnostic Tree Model for Adaptive Assessment of Complex Cognitive Processes Using Multidimensional Response Options","authors":"M. Davison, David J. Weiss, Joseph N. DeWeese, Ozge Ersan, Gina Biancarosa, Patrick C. Kennedy","doi":"10.3102/10769986231158301","DOIUrl":null,"url":null,"abstract":"A tree model for diagnostic educational testing is described along with Monte Carlo simulations designed to evaluate measurement accuracy based on the model. The model is implemented in an assessment of inferential reading comprehension, the Multiple-Choice Online Causal Comprehension Assessment (MOCCA), through a sequential, multidimensional, computerized adaptive testing (CAT) strategy. Assessment of the first dimension, reading comprehension (RC), is based on the three-parameter logistic model. For diagnostic and intervention purposes, the second dimension, called process propensity (PP), is used to classify struggling students based on their pattern of incorrect responses. In the simulation studies, CAT item selection rules and stopping rules were varied to evaluate their effect on measurement accuracy along dimension RC and classification accuracy along dimension PP. For dimension RC, methods that improved accuracy tended to increase test length. For dimension PP, however, item selection and stopping rules increased classification accuracy without materially increasing test length. A small live-testing pilot study confirmed some of the findings of the simulation studies. Development of the assessment has been guided by psychometric theory, Monte Carlo simulation results, and a theory of instruction and diagnosis.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":" ","pages":""},"PeriodicalIF":1.9000,"publicationDate":"2023-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Educational and Behavioral Statistics","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.3102/10769986231158301","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}

引用次数: 0

Abstract

A tree model for diagnostic educational testing is described along with Monte Carlo simulations designed to evaluate measurement accuracy based on the model. The model is implemented in an assessment of inferential reading comprehension, the Multiple-Choice Online Causal Comprehension Assessment (MOCCA), through a sequential, multidimensional, computerized adaptive testing (CAT) strategy. Assessment of the first dimension, reading comprehension (RC), is based on the three-parameter logistic model. For diagnostic and intervention purposes, the second dimension, called process propensity (PP), is used to classify struggling students based on their pattern of incorrect responses. In the simulation studies, CAT item selection rules and stopping rules were varied to evaluate their effect on measurement accuracy along dimension RC and classification accuracy along dimension PP. For dimension RC, methods that improved accuracy tended to increase test length. For dimension PP, however, item selection and stopping rules increased classification accuracy without materially increasing test length. A small live-testing pilot study confirmed some of the findings of the simulation studies. Development of the assessment has been guided by psychometric theory, Monte Carlo simulation results, and a theory of instruction and diagnosis.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

使用多维反应选项的复杂认知过程适应性评估诊断树模型

诊断教育测试的树模型描述与蒙特卡罗模拟设计，以评估基于该模型的测量精度。该模型通过顺序的、多维的、计算机化的自适应测试(CAT)策略应用于推理阅读理解的评估——多项选择在线因果理解评估(MOCCA)中。第一个维度，阅读理解(RC)的评估是基于三参数逻辑模型。为了诊断和干预的目的，第二个维度，称为过程倾向(PP)，被用来根据错误反应的模式对挣扎的学生进行分类。在模拟研究中，采用不同的CAT项目选择规则和停止规则来评估它们对沿RC维度的测量精度和沿PP维度的分类精度的影响。对于RC维度，提高精度的方法倾向于增加测试长度。而对于维度PP，项目选择和停止规则在不显著增加测试长度的情况下提高了分类精度。一项小型的现场试验试点研究证实了模拟研究的一些发现。评估的发展以心理测量理论、蒙特卡罗模拟结果以及教学和诊断理论为指导。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Journal of Educational and Behavioral Statistics Multiple-

CiteScore

4.40

自引率

4.20%

发文量

期刊介绍： Journal of Educational and Behavioral Statistics, sponsored jointly by the American Educational Research Association and the American Statistical Association, publishes articles that are original and provide methods that are useful to those studying problems and issues in educational or behavioral research. Typical papers introduce new methods of analysis. Critical reviews of current practice, tutorial presentations of less well known methods, and novel applications of already-known methods are also of interest. Papers discussing statistical techniques without specific educational or behavioral interest or focusing on substantive results without developing new statistical methods or models or making novel use of existing methods have lower priority. Simulation studies, either to demonstrate properties of an existing method or to compare several existing methods (without providing a new method), also have low priority. The Journal of Educational and Behavioral Statistics provides an outlet for papers that are original and provide methods that are useful to those studying problems and issues in educational or behavioral research. Typical papers introduce new methods of analysis, provide properties of these methods, and an example of use in education or behavioral research. Critical reviews of current practice, tutorial presentations of less well known methods, and novel applications of already-known methods are also sometimes accepted. Papers discussing statistical techniques without specific educational or behavioral interest or focusing on substantive results without developing new statistical methods or models or making novel use of existing methods have lower priority. Simulation studies, either to demonstrate properties of an existing method or to compare several existing methods (without providing a new method), also have low priority.