
Latest Publications in Applied Measurement in Education

Efficient Assessment of Students’ Proportional Reasoning
IF 1.5 · Tier 4 (Education) · Q3 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2022-01-02 · DOI: 10.1080/08957347.2022.2034825 · Vol. 35, pp. 46–62
Michele B. Carney, Katie Paulding, Joe Champion
ABSTRACT Teachers need ways to efficiently assess students’ cognitive understanding. One promising approach involves easily adapted and administered item types that yield quantitative scores that can be interpreted in terms of whether or not students likely possess key understandings. This study illustrates an approach to analyzing response process validity evidence from item types for assessing two important aspects of proportional reasoning. Data include results from an interview protocol used with 33 middle school students to compare their responses to prototypical item types to their conceptions of composed unit and multiplicative comparison. The findings provide validity evidence in support of the score interpretations for the item types but also detail important item specifications and caveats. Discussion includes recommendations for extending the research for examining response process validity evidence in support of claims related to cognitive interpretations of scores for other key mathematical conceptions.
Citations: 1
Personalized Online Learning, Test Fairness, and Educational Measurement: Considering Differential Content Exposure Prior to a High Stakes End of Course Exam
IF 1.5 · Tier 4 (Education) · Q3 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2022-01-02 · DOI: 10.1080/08957347.2022.2034824 · Vol. 35, pp. 1–16
Daniel Katz, A. Huggins-Manley, Walter L. Leite
ABSTRACT According to the Standards for Educational and Psychological Testing (2014), one aspect of test fairness concerns examinees having comparable opportunities to learn prior to taking tests. Meanwhile, many researchers are developing platforms enhanced by artificial intelligence (AI) that can personalize curriculum to individual student needs. This leads to a larger overarching question: When personalized learning leads to students having differential exposure to curriculum throughout the K-12 school year, how might this affect test fairness with respect to summative, end-of-year high-stakes tests? As a first step, we traced the differences in content exposure associated with personalized learning and more traditional learning paths. To better understand the implications of differences in content coverage, we conducted a simulation study to evaluate the degree to which curriculum exposure varied across students in a particular AI-enhanced learning platform for Algebra instruction with high-school students. Results indicate that AI-enhanced personalized learning may pose threats to test fairness as opportunity-to-learn on K-12 summative high-stakes tests. We discuss the implications given different perspectives of the role of testing in education.
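The core idea of the simulation described in this abstract can be illustrated with a toy example. The sketch below is not the authors' simulation; under invented exposure rates and item-solving probabilities, it simply shows how unequal content coverage before a summative exam can depress scores for otherwise comparable students.

```python
# Toy simulation (hypothetical parameters): expected end-of-course scores when
# only a fraction of the tested content was covered before the exam.
import random

random.seed(0)
N_ITEMS = 40          # length of the hypothetical end-of-course exam
N_STUDENTS = 500      # simulated examinees per condition

def simulate_score(exposure_rate, p_seen=0.75, p_unseen=0.35):
    """Score for one student when each item's content was covered with
    probability `exposure_rate`; solving probability drops for unseen content."""
    score = 0
    for _ in range(N_ITEMS):
        covered = random.random() < exposure_rate
        p_correct = p_seen if covered else p_unseen
        score += random.random() < p_correct
    return score

full = [simulate_score(1.00) for _ in range(N_STUDENTS)]
partial = [simulate_score(0.70) for _ in range(N_STUDENTS)]

print(f"mean score, full coverage:    {sum(full) / N_STUDENTS:.1f} / {N_ITEMS}")
print(f"mean score, partial coverage: {sum(partial) / N_STUDENTS:.1f} / {N_ITEMS}")
```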
Citations: 1
Determining Reliability of Daily Measures: An Illustration with Data on Teacher Stress
IF 1.5 · Tier 4 (Education) · Q3 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2022-01-02 · DOI: 10.1080/08957347.2022.2034822 · Vol. 35, pp. 63–79
Thijmen van Alphen, S. Jak, Joost Jansen in de Wal, J. Schuitema, T. Peetsma
ABSTRACT Intensive longitudinal data is increasingly used to study state-like processes such as changes in daily stress. Measures aimed at collecting such data require the same level of scrutiny regarding scale reliability as traditional questionnaires. The most prevalent methods used to assess reliability of intensive longitudinal measures are based on generalizability theory or a multilevel factor analytic approach. However, recent improvements made to the factor analytic approach may not be readily applicable for all researchers. Therefore, this article illustrates a five-step approach for determining reliability of daily data, which is one type of intensive longitudinal data. First, we show how the proposed reliability equations are applied. Next, we illustrate how these equations are used as part of our five-step approach with empirical data, originating from a study investigating changes in daily stress of secondary school teachers. The results are a within-level (ωw) and a between-level (ωb) reliability score. Mplus syntax for these examples is included and discussed. As such, this paper anticipates the need for comprehensive guides for the analysis of daily data.
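The abstract reports within-level (ωw) and between-level (ωb) reliability from a multilevel factor model. As a rough illustration of what those quantities are, the sketch below computes composite reliability (omega) from factor loadings and residual variances at each level; all numbers are hypothetical placeholders, and the article's own five-step procedure and Mplus syntax are not reproduced here.

```python
# Minimal sketch (hypothetical estimates): composite reliability (omega) at the
# within- and between-person levels of a multilevel factor model for a
# four-item daily stress scale.

def omega(loadings, residual_variances):
    """McDonald's omega: (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances)."""
    lam = sum(loadings)
    return lam ** 2 / (lam ** 2 + sum(residual_variances))

# Placeholder values, not estimates from the article
within_loadings = [0.62, 0.58, 0.70, 0.55]
within_residuals = [0.40, 0.45, 0.35, 0.50]
between_loadings = [0.80, 0.75, 0.85, 0.78]
between_residuals = [0.20, 0.25, 0.15, 0.22]

print(f"omega_within  = {omega(within_loadings, within_residuals):.3f}")
print(f"omega_between = {omega(between_loadings, between_residuals):.3f}")
```

Between-level loadings are typically larger relative to their residuals, which is why ωb usually exceeds ωw in applications like this.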
Citations: 2
Between- versus Within-Examinee Variability in Test-Taking Effort and Test Emotions during a Low-Stakes Test
IF 1.5 · Tier 4 (Education) · Q3 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2021-10-02 · DOI: 10.1080/08957347.2021.1987905 · Vol. 34, pp. 285–300
B. Perkins, D. Pastor, S. Finney
ABSTRACT When tests are low stakes for examinees, meaning there are little to no personal consequences associated with test results, some examinees put little effort into their performance. To understand the causes and consequences of diminished effort, researchers correlate test-taking effort with other variables, such as test-taking emotions and test performance. Most studies correlate examinees’ overall level of test-taking effort with other variables, with fewer studies considering variables related to changing effort levels during testing. To understand if fluctuations in effort during testing relate to fluctuations in emotions, we collected effort and emotions (anger, boredom, emotionality, enjoyment, pride, worry) data from 768 university students three times during a low-stakes institutional accountability test. Examinees greatly varied in their average levels of effort and average levels of emotions during testing; relatively less variability was observed in these variables during testing. Average levels of emotions were predictive of effort, but fluctuations in emotions during testing were not. Our findings indicate that researchers should consider the proportion of intraindividual and interindividual variability in effort when considering predictors of test-taking effort.
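The between- versus within-examinee contrast in this abstract amounts to a variance decomposition of repeated effort ratings. The sketch below illustrates that decomposition (summarized as an intraclass correlation) on invented data for three hypothetical examinees; it is not the authors' analysis.

```python
# Minimal sketch (invented data): splitting repeated effort ratings into
# between-examinee and within-examinee variance, summarized as an ICC.
import statistics

ratings = {                      # three measurement occasions per examinee
    "examinee_1": [4.0, 3.5, 3.0],
    "examinee_2": [2.0, 2.5, 2.0],
    "examinee_3": [5.0, 4.5, 5.0],
}

person_means = {p: statistics.mean(r) for p, r in ratings.items()}
grand_mean = statistics.mean(person_means.values())

# between-person: how far each examinee's average sits from the grand mean
between_var = statistics.mean([(m - grand_mean) ** 2 for m in person_means.values()])
# within-person: how far each rating sits from that examinee's own average
within_var = statistics.mean(
    [(x - person_means[p]) ** 2 for p, r in ratings.items() for x in r]
)

icc = between_var / (between_var + within_var)
print(f"between = {between_var:.3f}, within = {within_var:.3f}, ICC = {icc:.3f}")
```

A high ICC corresponds to the pattern the abstract describes: examinees differ a lot from one another, while each examinee's own ratings fluctuate relatively little during testing.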
Citations: 1
Detecting Differential Item Functioning Using Cognitive Diagnosis Models: Applications of the Wald Test and Likelihood Ratio Test in a University Entrance Examination
IF 1.5 · Tier 4 (Education) · Q3 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2021-10-02 · DOI: 10.1080/08957347.2021.1987906 · Vol. 34, pp. 262–284
Roghayeh Mehrazmay, B. Ghonsooly, J. de la Torre
ABSTRACT The present study aims to examine gender differential item functioning (DIF) in the reading comprehension section of a high stakes test using cognitive diagnosis models. Based on the multiple-group generalized deterministic, noisy “and” gate (MG G-DINA) model, the Wald test and likelihood ratio test are used to detect DIF. The flagged items are further inspected to find the attributes they measure, and the probabilities of correct response are checked across latent profiles to gain insights into the potential reasons for the occurrence of DIF. In addition, attribute and latent class prevalence are examined across males and females. The three items displaying large DIF involve three attributes, namely Vocabulary, Main Idea, and Details. The results indicate that females have lower probabilities of correct response across all latent profiles, and fewer females have mastered all the attributes. Moreover, the findings show that the same attribute mastery profiles are prevalent across genders. Finally, the results of the DIF analysis are used to select models that could replace the complex MG G-DINA without significant loss of information.
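The DIF tests referenced here compare group-specific item parameters under the MG G-DINA model. As a simplified illustration of the Wald-test idea only, the sketch below tests the difference between two hypothetical group estimates of a single item parameter; the actual G-DINA Wald test compares whole parameter vectors using their covariance matrices.

```python
# Minimal sketch (hypothetical estimates): a two-group Wald test for a single
# item parameter, e.g. the probability of success for one attribute profile.
from scipy.stats import chi2

def wald_dif(est_focal, se_focal, est_reference, se_reference):
    """Wald statistic (df = 1) for the difference between two independent estimates."""
    diff = est_focal - est_reference
    w = diff ** 2 / (se_focal ** 2 + se_reference ** 2)
    return w, chi2.sf(w, df=1)

w, p = wald_dif(est_focal=0.58, se_focal=0.04, est_reference=0.71, se_reference=0.05)
print(f"Wald = {w:.2f}, p = {p:.4f}")
```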
Citations: 2
Reconceptualizing Rapid Responses as a Speededness Indicator in High-Stakes Assessments
IF 1.5 · Tier 4 (Education) · Q3 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2021-10-02 · DOI: 10.1080/08957347.2021.1987904 · Vol. 34, pp. 312–326
R. Feinberg, D. Jurich, S. Wise
ABSTRACT Previous research on rapid responding tends to implicitly consider examinees as either engaging in solution behavior or purely guessing. However, particularly in a high-stakes testing context, examinees perceiving that they are running out of time may consider the remaining items for less time than necessary to provide a fully informed response, but longer than a truly rapid guess. This partial consideration results in a response that misrepresents true ability, but with accuracy above the level of pure chance. To address this limitation of existing methodology, we propose an empirical approach that attempts to disentangle fully and partially informed responses to be used as a preliminary measure of the extent to which speededness may be distorting test score validity. We first illustrate and validate the approach using an experimental dataset in which the amount of time per item was manipulated. Next, applications of this approach are demonstrated using observational data in a more realistic context through four operational exams in which speededness is unknown.
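A common way to operationalize rapid responses is a response-time threshold, for example a fixed fraction of an item's median response time. The sketch below applies that heuristic to invented response times; it is offered only as background on the construct and is not the empirical approach the authors propose for separating fully and partially informed responses.

```python
# Minimal sketch (invented response times): flagging rapid responses with a
# normative-threshold rule, here 10% of the item's median response time.
import statistics

item_rt_seconds = [45.0, 52.0, 3.1, 61.0, 2.4, 48.5, 39.0, 55.2]

threshold = 0.10 * statistics.median(item_rt_seconds)
rapid_flags = [rt < threshold for rt in item_rt_seconds]

print(f"threshold = {threshold:.2f}s, rapid responses flagged: {sum(rapid_flags)}")
```

Responses falling between such a rapid-guessing threshold and a typical solution time are the "partially informed" cases the article argues deserve separate treatment.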
Citations: 0
Characterizing the Latent Classes in a Mixture IRT Model Using DIF
IF 1.5 · Tier 4 (Education) · Q3 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2021-10-02 · DOI: 10.1080/08957347.2021.1987900 · Vol. 34, pp. 301–311
Tuğba Karadavut
ABSTRACT Mixture IRT models address the heterogeneity in a population by extracting latent classes and allowing item parameters to vary between latent classes. Once the latent classes are extracted, they need to be further examined to be characterized. Some approaches have been adopted in the literature for this purpose. These approaches examine either the examinee or the item characteristics conceptually or statistically. In this study, we propose a two-step procedure for characterizing the latent classes. First, a DIF analysis can be conducted to determine the items that function differentially between the latent classes using the latent class membership information. Then, the characteristics of the items with DIF can be further examined to use this information for characterizing the latent classes. We provided an empirical example to illustrate this procedure.
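The first step of the proposed procedure is a DIF analysis that treats the extracted latent classes as comparison groups. As a simplified stand-in for that step, the sketch below computes a Mantel-Haenszel odds ratio for one item across invented total-score strata; the article itself may rely on IRT-based DIF tests rather than this contingency-table approach.

```python
# Minimal sketch (invented counts): a Mantel-Haenszel odds ratio for one item,
# comparing the two extracted latent classes within total-score strata.

# Each stratum: (class1_correct, class1_incorrect, class2_correct, class2_incorrect)
strata = [
    (30, 20, 18, 32),
    (40, 10, 28, 22),
    (45, 5, 38, 12),
]

numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
mh_odds_ratio = numerator / denominator

# Values far from 1 suggest the item favors one latent class after matching.
print(f"Mantel-Haenszel odds ratio = {mh_odds_ratio:.2f}")
```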
Citations: 0
A Method for Displaying Incremental Validity with Expectancy Charts
IF 1.5 · Tier 4 (Education) · Q3 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2021-10-02 · DOI: 10.1080/08957347.2021.1987902 · Vol. 34, pp. 251–261
Samuel D Lee, Philip T. Walmsley, P. Sackett, N. Kuncel
ABSTRACT Providing assessment validity information to decision makers in a clear and useful format is an ongoing challenge for the educational and psychological measurement community. We identify issues with a previous approach to a graphical presentation, noting that it is mislabeled as presenting incremental validity, when in fact it displays the effects of using predictors in a multiple hurdle fashion. We offer a straightforward technique for displaying incremental validity among predictors in reference to a criterion measure.
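An expectancy chart reports, for each band of predictor scores, the proportion of people who meet a success criterion. The sketch below generates those band-wise success rates from simulated data; it does not reproduce the incremental-validity display the authors propose, only the basic quantity an expectancy chart plots.

```python
# Minimal sketch (simulated data): the numbers behind an expectancy chart --
# the proportion of "successful" outcomes within each predictor-score band.
import random

random.seed(1)
n = 200
scores = [random.gauss(0, 1) for _ in range(n)]
# success becomes more likely at higher predictor scores (hypothetical criterion)
success = [random.random() < 0.30 + 0.15 * (s > 0) + 0.15 * (s > 1) for s in scores]

bands = [
    ("low  (z < 0)", lambda s: s < 0),
    ("mid  (0 <= z < 1)", lambda s: 0 <= s < 1),
    ("high (z >= 1)", lambda s: s >= 1),
]

for label, in_band in bands:
    outcomes = [ok for s, ok in zip(scores, success) if in_band(s)]
    print(f"{label:18s} success rate = {sum(outcomes) / len(outcomes):.2f} (n = {len(outcomes)})")
```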
Citations: 0
The Consideration of Admissions Testing at Colleges and Universities: A Perspective
IF 1.5 · Tier 4 (Education) · Q3 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2021-10-02 · DOI: 10.1080/08957347.2021.1987907 · Vol. 34, pp. 237–239
K. Geisinger
Three of the papers in this issue consider college admissions testing, and a fourth high-stakes testing. I am not entirely sure that there is a more controversial topic today in higher education, ev...
Citations: 1
Comparing School Reports and Empirical Estimates of Relative Reliance on Tests Vs Grades in College Admissions
IF 1.5 · Tier 4 (Education) · Q3 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2021-10-02 · DOI: 10.1080/08957347.2021.1987903 · Vol. 34, pp. 240–250
P. Sackett, M. S. Sharpe, N. Kuncel
ABSTRACT The literature is replete with references to a disproportionate reliance on admission test scores (e.g., the ACT or SAT) in the college admissions process. School-reported reliance on test scores and grades has been used to study this question, generally indicating relatively equal reliance on the two, with a slightly higher endorsement of grades. As an alternative, we develop an empirical index of relative reliance on tests and grades, and compare school-reported estimates with empirical evidence of relative reliance. Using a dataset from 174 U.S. colleges and universities, we examine the degree to which applicants and enrolled students differ on the SAT and on high school GPA in each school, and develop an index of empirical relative reliance on test scores vs. grades. We find that schools tend to select on test scores and high school grades relatively equally, with the empirical reliance index showing slightly more reliance on test scores and school-reported reliance estimates showing slightly more reliance on grades.
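The empirical index described here compares how strongly enrolled students are selected on test scores versus high school grades. The sketch below computes standardized applicant-versus-enrolled differences for each predictor and their ratio as a rough relative-reliance indicator, using invented summary statistics rather than the authors' data or exact index.

```python
# Minimal sketch (invented summary statistics): standardized applicant-versus-
# enrolled differences on the SAT and on high school GPA for one school, and
# their ratio as a rough relative-reliance indicator.

def standardized_gap(mean_enrolled, mean_applicants, sd_applicants):
    """How far the enrolled group sits above the applicant pool, in applicant SD units."""
    return (mean_enrolled - mean_applicants) / sd_applicants

d_sat = standardized_gap(mean_enrolled=1280, mean_applicants=1210, sd_applicants=140)
d_gpa = standardized_gap(mean_enrolled=3.72, mean_applicants=3.55, sd_applicants=0.40)

print(f"d_SAT = {d_sat:.2f}, d_GPA = {d_gpa:.2f}")
print(f"relative reliance (SAT / GPA) = {d_sat / d_gpa:.2f}")
```

A ratio near 1 indicates roughly equal selection on the two predictors, which is the pattern the abstract reports across schools.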
Citations: 0