Journal of Educational and Behavioral Statistics: Latest Articles

Using Item Scores and Distractors to Detect Item Compromise and Preknowledge
IF 2.4; Zone 3 (Psychology); Q2 Education & Educational Research; Pub Date: 2023-04-20; DOI: 10.3102/10769986231159923
Kylie Gorney, James A. Wollack, S. Sinharay, Carol Eckerly
Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item scores and distractors to simultaneously detect CI and EWP. The false positive rate and true positive rate are evaluated for both items and examinees using detailed simulations. A real data example is also provided using data from an information technology certification exam.
Journal of Educational and Behavioral Statistics, vol. 48, pp. 636-660.
Citations: 1
An Explicit Form With Continuous Attribute Profile of the Partial Mastery DINA Model
IF 2.4; Zone 3 (Psychology); Q2 Education & Educational Research; Pub Date: 2023-04-10; DOI: 10.3102/10769986231159436
Tian Shu, Guanzhong Luo, Zhaosheng Luo, Xiaofeng Yu, Xiaojun Guo, Yujun Li
Cognitive diagnosis models (CDMs) are the statistical framework for cognitive diagnostic assessment in education and psychology. They generally assume that subjects’ latent attributes are dichotomous (mastery or nonmastery), which seems quite deterministic. As an alternative to dichotomous attribute mastery, recent literature has drawn attention to a continuous attribute mastery format. To obtain finer-grained estimates of subjects’ attribute mastery for more precise diagnosis and guidance, this article proposes an equivalent but more explicit form of the partial-mastery deterministic inputs, noisy “and” gate (DINA) model, termed the continuous attribute profile (CAP)-DINA form. A parameter estimation algorithm based on this form, using Bayesian techniques with a Markov chain Monte Carlo algorithm, is also presented. Two simulation studies are then conducted to explore its parameter recovery and its behavior under model misspecification, and the results demonstrate that the CAP-DINA form performs robustly with satisfactory efficiency in both respects. A real data study of an English test also indicates that it has a better model fit than DINA.
Journal of Educational and Behavioral Statistics, vol. 48, pp. 573-602.
Citations: 1
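For orientation, the standard dichotomous DINA item response function, which the abstract above takes as its starting point, can be sketched as follows. This is the textbook DINA model and not the authors' CAP-DINA form; the function name `dina_prob` and all Q-matrix, slip, and guess values are hypothetical and chosen only for illustration.

```python
import numpy as np

def dina_prob(alpha, q, slip, guess):
    """P(correct response) per item under the standard DINA model.

    alpha: (K,) 0/1 attribute mastery vector for one examinee
    q:     (J, K) 0/1 Q-matrix (attributes required by each item)
    slip, guess: (J,) slip and guessing parameters per item
    """
    alpha = np.asarray(alpha)
    q = np.asarray(q)
    # eta_j = 1 only if the examinee has mastered every attribute item j requires
    eta = np.all(alpha >= q, axis=1).astype(float)
    # P(X_j = 1) = (1 - s_j)^eta_j * g_j^(1 - eta_j)
    return (1.0 - slip) ** eta * guess ** (1.0 - eta)

# Hypothetical values, for illustration only
q = np.array([[1, 0], [1, 1]])    # item 1 needs attribute 1; item 2 needs both
slip = np.array([0.1, 0.2])
guess = np.array([0.2, 0.25])
probs = dina_prob([1, 0], q, slip, guess)  # examinee masters attribute 1 only
```

Under these assumed values the examinee answers item 1 correctly unless they slip, and can only guess on item 2. A continuous attribute profile would replace the 0/1 entries of `alpha` with graded mastery values, which this sketch does not attempt.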
Is It Who You Are or Where You Are? Accounting for Compositional Differences in Cross-Site Treatment Effect Variation
IF 2.4; Zone 3 (Psychology); Q2 Education & Educational Research; Pub Date: 2023-04-10; DOI: 10.3102/10769986231155427
Benjamin Lu, E. Ben-Michael, A. Feller, Luke W. Miratrix
In multisite trials, learning about treatment effect variation across sites is critical for understanding where and for whom a program works. Unadjusted comparisons, however, capture “compositional” differences in the distributions of unit-level features as well as “contextual” differences in site-level features, including possible differences in program implementation. Our goal in this article is to adjust site-level estimates for differences in the distribution of observed unit-level features: If we can reweight (or “transport”) each site to have a common distribution of observed unit-level covariates, the remaining treatment effect variation captures contextual and unobserved compositional differences across sites. This allows us to make apples-to-apples comparisons across sites, parceling out the amount of cross-site effect variation explained by systematic differences in populations served. In this article, we develop a framework for transporting effects using approximate balancing weights, where the weights are chosen to directly optimize unit-level covariate balance between each site and the common target distribution. We first develop our approach for the general setting of transporting the effect of a single-site trial. We then extend our method to multisite trials, assess its performance via simulation, and use it to analyze a series of multisite trials of adult education and vocational training programs. In our application, we find that distributional differences are potentially masking cross-site variation. Our method is available in the balancer R package.
Journal of Educational and Behavioral Statistics, vol. 48, pp. 420-453.
Citations: 2
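The idea of reweighting a site to match a common covariate distribution can be illustrated with a deliberately simplified sketch. It computes minimum-norm weights that balance covariate means exactly; the approach in the abstract uses approximate balancing weights, typically with further constraints such as nonnegativity, so this is a related toy version and not the authors' method or the balancer package. The data, the target means, and the function name `mean_balancing_weights` are all hypothetical.

```python
import numpy as np

def mean_balancing_weights(X, target):
    """Minimum-norm weights w with X.T @ w == target and sum(w) == 1.

    X:      (n, p) unit-level covariates for one site
    target: (p,) covariate means of the common target distribution
    Note: weights are not constrained to be nonnegative in this sketch.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Stack the balance constraints and the sum-to-one constraint
    A = np.vstack([X.T, np.ones(n)])
    b = np.append(np.asarray(target, dtype=float), 1.0)
    # For an underdetermined consistent system, lstsq returns the
    # minimum-norm solution
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

# Hypothetical site data (5 units, 2 covariates) and target means
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0], [2.0, 1.0]])
target = np.array([0.5, 0.5])
w = mean_balancing_weights(X, target)
reweighted_means = X.T @ w
```

A site-level effect estimate would then use `w` to reweight unit outcomes within each arm, so that remaining cross-site differences are not driven by the balanced covariates.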
A Diagnostic Tree Model for Adaptive Assessment of Complex Cognitive Processes Using Multidimensional Response Options
IF 2.4; Zone 3 (Psychology); Q2 Education & Educational Research; Pub Date: 2023-04-05; DOI: 10.3102/10769986231158301
M. Davison, David J. Weiss, Joseph N. DeWeese, Ozge Ersan, Gina Biancarosa, Patrick C. Kennedy
A tree model for diagnostic educational testing is described along with Monte Carlo simulations designed to evaluate measurement accuracy based on the model. The model is implemented in an assessment of inferential reading comprehension, the Multiple-Choice Online Causal Comprehension Assessment (MOCCA), through a sequential, multidimensional, computerized adaptive testing (CAT) strategy. Assessment of the first dimension, reading comprehension (RC), is based on the three-parameter logistic model. For diagnostic and intervention purposes, the second dimension, called process propensity (PP), is used to classify struggling students based on their pattern of incorrect responses. In the simulation studies, CAT item selection rules and stopping rules were varied to evaluate their effect on measurement accuracy along dimension RC and classification accuracy along dimension PP. For dimension RC, methods that improved accuracy tended to increase test length. For dimension PP, however, item selection and stopping rules increased classification accuracy without materially increasing test length. A small live-testing pilot study confirmed some of the findings of the simulation studies. Development of the assessment has been guided by psychometric theory, Monte Carlo simulation results, and a theory of instruction and diagnosis.
Citations: 0
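The three-parameter logistic (3PL) model mentioned for the RC dimension has a standard form, sketched below. This is the generic model without the optional scaling constant, not the MOCCA calibration; the function name `p_3pl` and the item parameters are hypothetical.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination, b: item difficulty,
    c: lower asymptote (pseudo-guessing parameter)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item parameters, for illustration only
p_at_difficulty = p_3pl(theta=0.5, a=1.2, b=0.5, c=0.2)
p_low_ability = p_3pl(theta=-3.0, a=1.2, b=0.5, c=0.2)
```

When ability equals difficulty, the probability sits halfway between the guessing floor `c` and 1, and for very low ability it approaches `c`.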
The Restricted DINA Model: A Comprehensive Cognitive Diagnostic Model for Classroom-Level Assessments
IF 2.4; Zone 3 (Psychology); Q2 Education & Educational Research; Pub Date: 2023-03-30; DOI: 10.3102/10769986231158829
P. Nájera, F. J. Abad, Chia-Yi Chiu, M. Sorrel
The nonparametric classification (NPC) method has been proven to be a suitable procedure for cognitive diagnostic assessments at a classroom level. However, its nonparametric nature prevents a model likelihood from being obtained, hindering the exploration of crucial psychometric aspects, such as model fit or reliability. Reporting the reliability and validity of scores is imperative in any applied context. The present study proposes the restricted deterministic input, noisy “and” gate (R-DINA) model, a parametric cognitive diagnosis model based on the NPC method that provides the same attribute profile classifications as the nonparametric method while allowing a model likelihood to be derived and, subsequently, fit and reliability indices to be computed. The suitability of the new proposal is examined by means of an exhaustive simulation study and a real data illustration. The results show that the R-DINA model properly recovers the posterior probabilities of attribute mastery, thus becoming a suitable alternative for comprehensive small-scale diagnostic assessments.
Citations: 1
Finding the Right Grain-Size for Measurement in the Classroom
IF 2.4; Zone 3 (Psychology); Q2 Education & Educational Research; Pub Date: 2023-03-30; DOI: 10.3102/10769986231159006
M. Wilson
This article introduces a new framework for articulating how educational assessments can be related to teacher uses in the classroom. It articulates three levels of assessment: macro (use of standardized tests), meso (externally developed items), and micro (on-the-fly in the classroom). The first level is the usual context for educational measurement, but one of the contributions of this article is that it mainly focuses on the latter two levels. Co-ordination of the content across these two levels can be achieved using the concept of a construct map, which articulates the substantive target property at levels of detail that are appropriate for both teacher planning and within-classroom use. This article then describes a statistical model designed to span these two levels and discusses how best to relate this to the macrolevel. Results from a curriculum and instruction development project on the topic of measurement in the elementary school are demonstrated, showing how they are empirically related.
Citations: 0
Optimizing Diagnostic Classification Models Application Considering Real-Life Constraints
IF 2.4; Zone 3 (Psychology); Q2 Education & Educational Research; Pub Date: 2023-03-30; DOI: 10.3102/10769986231159137
Kun Su, R. Henson
This article provides a process for carefully evaluating whether a content domain is suitable for diagnostic classification models (DCMs), optimized steps for constructing a test blueprint for applying DCMs, and a real-life example illustrating this process. The content domains were carefully evaluated using a set of criteria purposely defined to improve the success rate of DCM implementation. Given the domain, the Q-matrix is determined by a simulation-based approach using correct classification rates as criteria. Finally, a physics test based on the final Q-matrix was developed, administered, and analyzed by the author and the subject-matter experts (SMEs).
Citations: 1
Expertise on Offer: Why Isn’t Anyone Buying?
IF 2.4; Zone 3 (Psychology); Q2 Education & Educational Research; Pub Date: 2023-03-29; DOI: 10.3102/10769986231160671
H. Braun
It is a much-lamented fact that research with the potential to inform or influence education policy instead remains policy inert. There are many reasons for this frustrating state of affairs, including a lack of strategic thinking on the part of researchers on how to successfully accomplish outreach—as opposed to communication with peers (in-reach). Another, and a principal focus of this article, is the failure of researchers to appreciate the power of employing compelling narratives to bring their findings to the attention of policymakers and other stakeholders. Accordingly, this article presents some examples of narratives specifically designed for outreach and discusses some of their features. It also considers the challenges in gaining traction with counternarratives once a particular narrative has achieved currency. Researchers should also be mindful of the tenor of the times, with experts now often viewed with skepticism, if not downright hostility. In some quarters, excessive reliance on technocrats is even seen as a threat to democratic governance. The article concludes with some recommendations on how to appropriately enhance the role of research in education policymaking.
Journal of Educational and Behavioral Statistics, vol. 48, pp. 547-572.
Citations: 0
Detecting Item Preknowledge Using Revisits With Speed and Accuracy
IF 2.4; Zone 3 (Psychology); Q2 Education & Educational Research; Pub Date: 2023-02-28; DOI: 10.3102/10769986231153403
Onur Demirkaya, Ummugul Bezirhan, Jinming Zhang
Examinees with item preknowledge tend to obtain inflated test scores that undermine test score validity. With the availability of process data collected in computer-based assessments, the research on detecting item preknowledge has progressed on using both item scores and response times. Item revisit patterns of examinees can also be utilized as an additional source of information. This study proposes a new statistic for detecting item preknowledge when compromised items are known by utilizing the hierarchical speed–accuracy revisits model. By simultaneously evaluating abnormal changes in the latent abilities, speeds, and revisit propensities of examinees, the procedure was found to provide greater statistical power and stronger substantive evidence that an examinee had indeed benefited from item preknowledge.
Journal of Educational and Behavioral Statistics, vol. 48, pp. 521-542.
Citations: 0
A Causal Latent Transition Model With Multivariate Outcomes and Unobserved Heterogeneity: Application to Human Capital Development
IF 2.4 Tier 3 Psychology Q2 EDUCATION & EDUCATIONAL RESEARCH Pub Date: 2023-02-09 DOI: 10.3102/10769986221150033
F. Bartolucci, F. Pennoni, G. Vittadini
To evaluate the effect of a policy or treatment with pre- and post-treatment outcomes, we propose an approach based on a transition model that can be applied with multivariate outcomes and accounts for unobserved heterogeneity. The model is based on potential versions of discrete latent variables representing the individual characteristic of interest and can be cast within the hidden (latent) Markov literature for panel data. It can therefore be estimated by maximum likelihood in a relatively simple way. The approach extends the difference-in-differences method in that it can deal with multivariate outcomes. Moreover, causal effects can be expressed in terms of transition probabilities. The proposal is validated through a simulation study and applied to evaluate educational programs administered to pupils in the sixth and seventh grades of middle school. These programs are carried out in an Italian region to improve non-cognitive skills. We study whether they also affect students' cognitive skills (CSs) in Italian and Mathematics in the eighth grade, exploiting the pretreatment test scores available in the fifth grade. The main conclusion is that educational programs aimed at developing non-cognitive abilities help the best students maintain their higher cognitive abilities over time.
Journal of Educational and Behavioral Statistics, vol. 48, pp. 387-419.
Citations: 0
Journal
Journal of Educational and Behavioral Statistics