Pub Date: 2024-02-02 | DOI: 10.1080/08957347.2024.2311927
Comparing Examinee-Based and Response-Based Motivation Filtering Methods in Remote Low-Stakes Testing
Sarah Alahmadi, Christine E. DeMars
Large-scale educational assessments are sometimes considered low-stakes, increasing the possibility of confounding true performance level with low motivation. These concerns are amplified in remote...
Pub Date: 2023-12-06 | DOI: 10.1080/08957347.2023.2274573
Analyzing Complete Generalizability Theory Designs Using Structural Equation Models
Walter P. Vispoel, Hyeri Hong, Hyeryung Lee, Terrence D. Jorgensen
We illustrate how to analyze complete generalizability theory (GT) designs using structural equation modeling software (lavaan in R), compare results to those obtained from numerous ANOVA-based pac...
Pub Date: 2023-11-19 | DOI: 10.1080/08957347.2023.2274570
Validity: An Integrated Approach to Test Score Meaning and Use, by Gregory J. Cizek, New York, Routledge, 2020, 190 pp., 55.00 (Paperback)
Tony Albano
Published in Applied Measurement in Education (Ahead of Print, 2023)
Pub Date: 2023-11-16 | DOI: 10.1080/08957347.2023.2274565
Recruitment and Retention of Racially and Ethnically Minoritized Graduate Students in Educational Measurement Programs
Jennifer Randall, Joseph Rios
Building on the extant literature on recruitment and retention within the field of STEM and undergraduate education, we sought to explore the recruitment and retention experiences of racially and e...
Pub Date: 2023-11-08 | DOI: 10.1080/08957347.2023.2274567
Detecting Item Parameter Drift in Small Sample Rasch Equating
Daniel Jurich, Chunyan Liu
ABSTRACT: Screening items for parameter drift helps protect against serious validity threats and ensure score comparability when equating forms. Although many high-stakes credentialing examinations operate with small sample sizes, few studies have investigated methods to detect drift in small sample equating. This study demonstrates that several newly researched drift detection strategies can improve equating accuracy under certain conditions with small samples where some anchor items display item parameter drift. Results showed that the recently proposed mINFIT and mOUTFIT methods, as well as the more conventional Robust-z, helped mitigate the adverse effects of drifting anchor items in conditions with higher drift levels or with more than 75 examinees. In contrast, the Logit Difference approach excessively removed invariant anchor items. The discussion provides recommendations on how practitioners working with small samples can use the results to make more informed decisions regarding item parameter drift.

Disclosure statement: No potential conflict of interest was reported by the author(s).

Supplementary material: Supplemental data for this article can be accessed online at https://doi.org/10.1080/08957347.2023.2274567

Notes:
1. In certain testing designs, some items may be reused as non-anchor items on future forms. Although IPD can occur on those items, we use the traditional IPD definition as specific to differential functioning in the items reused to serve as the equating anchor set.
2. In IRT, the old form anchor item parameter estimates can also come from a pre-calibrated bank. However, we use the old and new form terminology because the simulation design involves directly equating to a previous form.
3. For example, if an item drifted from b = 0 to b = 1 between Forms 1 and 2 in the 1.0 magnitude condition, it would be treated as having a true b of 1.0 if selected for Form 3.
Pub Date: 2023-11-08 | DOI: 10.1080/08957347.2023.2274572
Bayesian Logistic Regression: A New Method to Calibrate Pretest Items in Multistage Adaptive Testing
TsungHan Ho
ABSTRACT: An operational multistage adaptive test (MST) requires a large item bank and a continuing effort to replenish it, given long-term concerns about test security and validity. New items should be pretested and linked to the item bank before being used operationally. Fluctuations in linking item volume in MST, however, call into question the quality of the link to the reference scale. In this study, various calibration/linking methods, along with a newly proposed Bayesian logistic regression (BLR) method, were evaluated against the test characteristic curve method using simulated MST response data, in terms of item parameter recovery. Results for the BLR method were promising because of its estimation stability and robustness across the studied conditions. The findings of the present study should help inform practitioners about the utility of implementing the pretest item calibration method in MST.

Disclosure statement: No potential conflict of interest was reported by the author(s).
Pub Date: 2023-11-06 | DOI: 10.1080/08957347.2023.2274568
Change in Engagement During Test Events: An Argument for Weighted Scoring?
Steven L. Wise, G. Gage Kingsbury, Meredith L. Langi
ABSTRACT: Recent research has provided evidence that performance change during a student's test event can indicate the presence of test-taking disengagement. Meaningful performance change implies that some portions of the test event reflect assumed maximum performance better than others and, because disengagement tends to diminish performance, lower-performing portions are less likely to reflect maximum performance than higher-performing portions. This empirical study explored the use of differential weighting of item responses during scoring, with weighting schemes representing either declining or increasing performance. Results indicated that weighted scoring could substantially decrease the score distortion due to disengagement factors and thereby improve test score validity. The study findings support the use of scoring procedures that manage disengagement by adapting to student test-taking behavior.

Disclosure statement: The authors have no known conflicts of interest to disclose.

Notes:
1. What constitutes "construct-irrelevant" depends on how the target construct is conceptualized. For example, Borgonovi and Biecek (2016) argued that academic endurance should be considered part of what PISA is intended to measure, because academic endurance is positively associated with a student's success later in life. It is unclear, however, how universally this conceptualization is adopted by those interpreting PISA results.
2. Such comparisons between first- and second-half test performance require the assumption that the two halves are reasonably equivalent in terms of content representation if IRT-based scoring is used.
3. Half-test MLE standard errors in Math and Reading were around 4.2 and 4.8, respectively.
4. These intervals are not intended to correspond to the critical regions used to assess statistical significance under the AMC method. For example, classifying PD < -10 points as a large decline represents a less conservative criterion than the critical region used by Wise and Kingsbury (2022).
Pub Date: 2023-06-08 | DOI: 10.1080/08957347.2023.2222031 | Vol. 36, pp. 255-268
The Promise of Assessments That Advance Social Justice: An Indigenous Example
Pōhai Kūkea Shultz, Kerry S. Englert
ABSTRACT: In the United States, systemic racism against people of color was brought to the forefront of discourse throughout 2020, highlighting the ongoing inequities faced by intentionally marginalized groups in policing, health, and education. No community of color is immune from these inequities, and the activism in 2020 and the consequences of the pandemic have made systemic inequities impossible to ignore. In the Hawaiʻi context, social and racial injustice has resulted in cultural and language loss (among other markers of colonization), but it is within this loss that we can see the potential for the most significant evolution of assessment practices that champion self-determination and social justice. We illustrate how injustices can be addressed through the development of assessments centered in advocacy of and accountability to our communities of color. It is time for us to reimagine what self-determination and social justice in all assessment systems can and should look like.
Pub Date: 2023-05-31 | DOI: 10.1080/08957347.2023.2214656 | Vol. 36, pp. 193-215
The Standards Will Never Be Enough: A Racial Justice Extension
Mya Poe, M. Oliveri, N. Elliot
ABSTRACT: Since 1952, the Standards for Educational and Psychological Testing has provided criteria for developing and evaluating educational and psychological tests and testing practice. Yet we argue that the foundations, operations, and applications in the Standards are no longer sufficient to meet the current U.S. testing demands for fairness for all test takers. We propose racial justice extensions as principled ways to extend the Standards, through intentional actions focused on race and targeted at educational policies, processes, and outcomes in specific settings. To inform these extensions, we focus on four social-justice concepts: intersectionality, derived from Black Feminist Theory; responsibility, derived from moral philosophy; disparate impact, derived from legal reasoning; and situatedness, derived from social learning theories. We demonstrate these extensions and concepts in action by applying them to case studies of nursing licensure and placement testing.
Pub Date: 2023-05-27 | DOI: 10.1080/08957347.2023.2217555 | Vol. 36, pp. 216-241
Shifting Educational Measurement from an Agent of Systemic Racism to an Anti-Racist Endeavor
Michaeline Russell
ABSTRACT: In recent years, issues of race, racism, and social justice have garnered increased attention across the nation. Although some aspects of social justice, particularly cultural sensitivity and test bias, have received similar attention within the field of educational measurement, a sharp focus on racism has eluded the field. This manuscript focuses narrowly on racism. Drawing on an expansive body of work in the field of sociology, several key theories of race and racism advanced over the past century are presented. Elements of these theories are then integrated into a model of systemic racism. This model is used to identify some of the ways in which educational measurement supports systemic racism as it operates in the United States. I then explore ways in which an anti-racist frame could be applied to combat the system of racism and reorient our work to support racial liberation.