
Latest Publications in Educational Measurement: Issues and Practice

An Evaluation of Automatic Item Generation: A Case Study of Weak Theory Approach
IF 2.0 | Tier 4 (Education) | Q2 Social Sciences | Pub Date: 2022-10-06 | DOI: 10.1111/emip.12529
Yanyan Fu, Edison M. Choe, Hwanggyu Lim, Jaehwa Choi

This case study applied the weak theory of Automatic Item Generation (AIG) to generate isomorphic item instances (i.e., unique but psychometrically equivalent items) for a large-scale assessment. Three representative instances were selected from each item template (i.e., model) and pilot-tested. In addition, a new analytical framework, differential child item functioning (DCIF) analysis, based on the existing differential item functioning statistics, was applied to evaluate the psychometric equivalency of item instances within each template. The results showed that, out of 23 templates, nine successfully generated isomorphic instances, five required minor revisions to make them isomorphic, and the remaining templates required major modifications. The results and insights obtained from the AIG template development procedure may help item writers and psychometricians effectively develop and manage the templates that generate isomorphic instances.
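Weak-theory AIG of the kind evaluated here fixes an item's problem structure in a template and lets surface features (names, numbers) vary within constraints, so that the generated instances are intended to be interchangeable. The sketch below is a minimal illustration of that generation step only; the template, variable pools, and answer rule are hypothetical and are not taken from the study, and the DCIF evaluation of the resulting instances is not shown.

```python
import itertools
import random

# Hypothetical weak-theory template: the problem structure is fixed and only
# surface variables (a name and two numbers) change, so the generated
# instances are intended to be isomorphic (psychometrically interchangeable).
TEMPLATE = "{name} buys {n} notebooks at ${price} each. How much does {name} spend?"

VARIABLES = {
    "name": ["Ana", "Ben", "Chris"],
    "n": [3, 4, 6],        # kept in a narrow range so difficulty stays comparable
    "price": [2, 5, 10],
}

def generate_instances(template, variables, k=3, seed=1):
    """Return k distinct item instances with their keyed answers."""
    rng = random.Random(seed)
    combos = list(itertools.product(*variables.values()))
    rng.shuffle(combos)
    instances = []
    for combo in combos[:k]:
        values = dict(zip(variables.keys(), combo))
        instances.append({
            "stem": template.format(**values),
            "key": values["n"] * values["price"],   # answer rule for this template
        })
    return instances

for instance in generate_instances(TEMPLATE, VARIABLES):
    print(instance["stem"], "->", instance["key"])
```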

Citations: 0
ITEMS Corner Update: Announcing Two Significant Changes to ITEMS
IF 2.0 | Tier 4 (Education) | Q2 Social Sciences | Pub Date: 2022-09-08 | DOI: 10.1111/emip.12524
Brian C. Leventhal

In addition to an exciting new module on Multidimensional Item Response Theory (MIRT) equating, there are two important announcements regarding the Instructional Topics in Educational Measurement Series (ITEMS). After much discussion with authors, learners, the educational measurement community, and other stakeholders of ITEMS, I am pleased to announce (1) the transfer of the ITEMS portal to the National Council on Measurement in Education (NCME) website and (2) a new digital module format.

Transfer of the ITEMS portal to the NCME website: In 2018, I, along with Matthew Gaertner, led efforts to launch the new NCME website on the Higher Logic platform. Besides bringing a modern look and feel to the organization's web presence, the platform was selected for its flexibility and customizability. In the years since, traffic to the NCME website has continued to increase, and there has been a significant increase in site content (e.g., software database, special interest group community pages). In April of this year, just prior to the NCME Annual Conference, the website committee, led by Erin Banjanovic, released a much-needed reorganization of the content on the site. This wonderful overhaul has made navigating the NCME website easier, with content now in more logical locations. However, noticeably absent from the NCME website has been the ITEMS portal. As a reminder, ITEMS is a publication from NCME that has a brief summary published in the Educational Measurement: Issues and Practice journal, with the primary digital content on the ITEMS portal, freely available after registration. The ITEMS portal has been a Learning Management System-based website with many features and ripe for extension. From the user's perspective, though, it can be complex to navigate, requires navigating away from the primary NCME website, and requires a separate log-in.

It is at this time that I am pleased to announce that the ITEMS portal is now available on the NCME website at the following link: https://www.ncme.org/itemsportal

Transferring the ITEMS portal to the NCME website has several immediate benefits. First, all modules will remain free of charge but will no longer require additional registration. Second, they will have a different organizational structure, improving navigation across modules and enabling more efficient access to key information. Finally, they will fall under the NCME brand, with the same look and feel as all other content on the NCME website.

Although this issue marks the launch of the ITEMS portal on the NCME website, the transfer of content remains a work in progress. For now, both the old and new ITEMS portals will be available, and all links to the old ITEMS portal will remain functional. However, I would strongly advise all who embed or link to content to begin updating to the portal on the NCME website. Nearly all of the content has been shifted, but if you notice anything missing or have suggestions for enhancing navigation, please do not hesitate to email me at [email protected]. Over the coming months, I will be revising and enhancing the look, feel, and navigation of the new ITEMS portal on the NCME website, so do not be surprised if the portal has been updated each time you visit.

New digital module format: Another advantage of the new portal is the ability to fully customize each module. Over the past year since being appointed editor, I have spoken with many module authors and learners. A common theme emerged: a love for the digital modules, but a steep learning curve (and substantial time investment) to develop a module in the Learning Management System, along with complexity and limitations in using the modules. With the help of the flexible ITEMS portal on the NCME website, and after several brainstorming sessions, I am pleased to announce a new ITEMS module format. Modules will remain digital but with streamlined interactive features, enabling users to get to the content more quickly. Each module will include an introductory video summary outlining the learning objectives of the module as a whole, followed by several sections, each with its own learning objectives, a video of roughly 10 minutes, and interactive learning checks. The sections can be completed in any order, although the modules will be designed as a linear presentation. Videos will be available to view on the site or to download, allowing course instructors to embed portions of a module in their courses, professionals to share specific content with stakeholders, and learners to download portions for offline viewing. I emphasize that, as with other published material, the module should be cited appropriately when embedded for alternative uses; module citations will be provided on the portal for convenience.

I am pleased to announce the first digital module in the new format, on MIRT equating, authored by Stella Y. Kim. In this module, Dr. Kim provides a review of MIRT models, a summary of the recent literature on MIRT equating, the challenges of MIRT equating compared to unidimensional IRT equating, and a step-by-step guide to performing MIRT equating. She also illustrates the methods with activities using flexMIRT and RAGE-RGEQUATE. I am especially grateful to Dr. Kim for her patience with the new development process and for her trust in my vision for the new ITEMS module format. I encourage learners interested in MIRT equating to complete this module, and I encourage all ITEMS module learners to explore the new format. For authors, a further benefit of the new format is on the back end: there is no new software to learn, development time is dramatically reduced, and nearly all of the effort goes into developing content. I have worked, and will continue to work, with authors to make this process as seamless as possible, so I encourage anyone interested in producing an ITEMS module to contact me directly. There is an exciting lineup of modules in development, but I am happy to talk with anyone who has an idea for a module!

Citations: 0
Issue Cover
IF 2.0 | Tier 4 (Education) | Q2 Social Sciences | Pub Date: 2022-09-08 | DOI: 10.1111/emip.12445
{"title":"Issue Cover","authors":"","doi":"10.1111/emip.12445","DOIUrl":"https://doi.org/10.1111/emip.12445","url":null,"abstract":"","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2022-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12445","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"137801146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An Investigation of the Nature and Consequence of the Relationship between IRT Difficulty and Discrimination
IF 2.0 | Tier 4 (Education) | Q2 Social Sciences | Pub Date: 2022-09-08 | DOI: 10.1111/emip.12522
Sandra M. Sweeney, Sandip Sinharay, Matthew S. Johnson, Eric W. Steinhauer

The focus of this paper is on the empirical relationship between item difficulty and item discrimination. Two studies—an empirical investigation and a simulation study—were conducted to examine the association between item difficulty and item discrimination under classical test theory and item response theory (IRT), and the effects of the association on various quantities of interest. Results from the empirical investigation show that item difficulty and item discrimination are negatively correlated under classical test theory, mostly negatively correlated under the two-parameter logistic model, and mostly positively correlated under the three-parameter logistic model; the magnitude of the correlation varied over the different data sets. Results from the simulation study reveal that a failure to incorporate the correlation between item difficulty and item discrimination in IRT simulations may provide the investigator with inaccurate values of important quantities of interest, and may lead to incorrect operational decisions. Implications to practice and future directions are discussed.
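For readers who want to probe this dependence themselves, the sketch below shows one way to build a correlation between difficulty and discrimination into a 2PL simulation and then inspect how it surfaces in classical item statistics. The correlation, variances, and sample sizes are illustrative assumptions, not the conditions used in the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_persons, rho = 40, 2000, -0.4   # assumed values for illustration

# Draw (b, log a) from a bivariate normal so difficulty and discrimination
# are correlated by construction: var(b) = 1, var(log a) = 0.09.
cov = [[1.0, rho * 0.3], [rho * 0.3, 0.09]]
b, log_a = rng.multivariate_normal([0.0, 0.0], cov, size=n_items).T
a = np.exp(log_a)

theta = rng.standard_normal(n_persons)

# 2PL response probabilities and simulated 0/1 responses
p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))
x = rng.binomial(1, p)

# Classical item statistics from the simulated data
p_values = x.mean(axis=0)                       # CTT difficulty (p-values)
total = x.sum(axis=1)
item_rest_r = np.array([np.corrcoef(x[:, j], total - x[:, j])[0, 1]
                        for j in range(n_items)])  # CTT discrimination

print("cor(b, a) built into the generating model:", np.corrcoef(b, a)[0, 1].round(2))
print("cor(p-value, item-rest r) in the data:", np.corrcoef(p_values, item_rest_r)[0, 1].round(2))
```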

Citations: 0
Digital Module 29: Multidimensional Item Response Theory Equating
IF 2.0 | Tier 4 (Education) | Q2 Social Sciences | Pub Date: 2022-09-08 | DOI: 10.1111/emip.12525
Stella Y. Kim

In this digital ITEMS module, Dr. Stella Kim provides an overview of multidimensional item response theory (MIRT) equating. Traditional unidimensional item response theory (IRT) equating methods impose the sometimes untenable restriction on data that only a single ability is assessed. This module discusses potential sources of multidimensionality and presents potential consequences of multidimensionality on equating. To remedy these effects, MIRT equating can be used as a viable alternative to traditional methods of IRT equating. In conducting MIRT equating, the choice of an appropriate MIRT model is necessary, and thus the module describes several existing MIRT models and illustrates each using hypothetical examples. After a brief description of MIRT models, an extensive review of the current literature is presented to identify gaps in the literature on MIRT equating. Then, the steps for conducting MIRT observed-score equating are described. Finally, the module discusses practical considerations in applying MIRT equating to testing practices and suggests potential areas of research for future studies.
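As background for the module's topic, the compensatory multidimensional 2PL model expresses the probability of a correct response as a logistic function of a weighted sum of abilities, P(X = 1 | θ) = 1 / (1 + exp(−(a·θ + d))). Below is a minimal sketch with made-up parameter values, not drawn from the module's own examples; the observed-score equating machinery the module covers is not shown.

```python
import numpy as np

def mirt_2pl_prob(theta, a, d):
    """Compensatory multidimensional 2PL: P(X=1 | theta) = logistic(a . theta + d).

    theta : (n_dims,) ability vector
    a     : (n_dims,) discrimination (slope) vector
    d     : scalar intercept (related to overall item easiness)
    """
    return 1.0 / (1.0 + np.exp(-(np.dot(a, theta) + d)))

# Hypothetical item that measures mostly the first dimension
a = np.array([1.4, 0.3])
d = -0.5

for theta in ([0.0, 0.0], [1.0, -1.0], [-1.0, 1.0]):
    print(theta, round(mirt_2pl_prob(np.array(theta), a, d), 3))
```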

Citations: 2
On the Cover: Person Infit Density Contour
IF 2.0 | Tier 4 (Education) | Q2 Social Sciences | Pub Date: 2022-09-08 | DOI: 10.1111/emip.12526
Yuan-Ling Liaw
{"title":"On the Cover: Person Infit Density Contour","authors":"Yuan-Ling Liaw","doi":"10.1111/emip.12526","DOIUrl":"10.1111/emip.12526","url":null,"abstract":"","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2022-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43771777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Special Case of Brennan's Index for Tests That Aim to Select a Limited Number of Students: A Monte Carlo Simulation Study
IF 2.0 | Tier 4 (Education) | Q2 Social Sciences | Pub Date: 2022-09-08 | DOI: 10.1111/emip.12528
Serkan Arikan, Eren Can Aybek

Many scholars have compared various item discrimination indices in real or simulated data. Item discrimination indices, such as the item-total correlation, the item-rest correlation, and the IRT item discrimination parameter, provide information about individual differences among all participants. However, some tests aim to select a very limited number of students, examinees, or candidates for allocation to schools and job positions. Thus, there is a need to evaluate the performance of CTT and IRT item discrimination indices when the test purpose is to select a limited number of students. The purpose of the current Monte Carlo study is to evaluate item discrimination indices in the case of selecting a limited number of high-achieving students. The results showed that a special case of Brennan's index, B10–90, provided more accurate information for this specific test purpose. Additionally, the effects of various factors, such as test length, the ability distribution of examinees, and the variance of item difficulty, on item discrimination indices were investigated. The performance of each item discrimination index is discussed in detail.
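Brennan's B index is an upper-lower discrimination index: the proportion answering an item correctly in an upper score group minus the proportion correct in a lower score group. The sketch below is written under the assumption that B10–90 places the cut so that the top 10% of examinees by total score form the upper group and the remaining 90% form the lower group; the toy data and the handling of ties at the cut are illustrative choices, not the authors' implementation.

```python
import numpy as np

def brennan_b(responses, upper_frac=0.10):
    """B index per item: p(correct | upper group) - p(correct | lower group).

    responses : (n_persons, n_items) 0/1 matrix
    upper_frac: fraction of examinees treated as the selected (upper) group
    """
    responses = np.asarray(responses)
    total = responses.sum(axis=1)
    cut = np.quantile(total, 1.0 - upper_frac)
    upper = total >= cut              # ties at the cut fall into the upper group here
    lower = ~upper
    return responses[upper].mean(axis=0) - responses[lower].mean(axis=0)

# Toy data: 1,000 examinees, 20 Rasch-like items of varying difficulty
rng = np.random.default_rng(42)
theta = rng.standard_normal(1000)
b = np.linspace(-2, 2, 20)
x = rng.binomial(1, 1 / (1 + np.exp(-(theta[:, None] - b))))

print("B10-90 per item:", np.round(brennan_b(x, upper_frac=0.10), 2))
```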

Citations: 0
Supporting the Interpretive Validity of Student-Level Claims in Science Assessment with Tiered Claim Structures
IF 2.0 | Tier 4 (Education) | Q2 Social Sciences | Pub Date: 2022-08-06 | DOI: 10.1111/emip.12523
Sanford R. Student, Brian Gong

We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from summative assessments can support. As a solution, we propose tiered claims, which explicitly distinguish between claims about what students have done or can do on test items—which are typically easier to support under current test designs—and claims about what students could do in the broader domain of performances described by the standards, for which novel evidence is likely required. We discuss the positive implications of tiered claims for test construction, validation, and reporting of results.

Citations: 0
Average Rank and Adjusted Rank Are Better Measures of College Student Success than GPA
IF 2.0 | Tier 4 (Education) | Q2 Social Sciences | Pub Date: 2022-08-01 | DOI: 10.1111/emip.12521
Donald Wittman

I show that there are better measures of student college performance than grade point average (GPA) by undertaking a fine-grained empirical investigation of grading within a large public university. The value of using GPA as a measure of comparative performance is undermined by academically weaker students taking courses where the grading is more generous. In fact, college courses composed of weaker performing students (whether measured by their relative performance in other classes, SAT scores, or high school GPA) have higher average grades. To partially correct for idiosyncratic grading across classes, alternative measures, student class rank and the student's average class rank, are introduced. In comparison to a student's lower-division grade, the student's lower-division rank is a better predictor of the student's grade in the upper-division course. Course rank and course grade are adjusted to account for different levels of academic competitiveness across courses (more precisely, student fixed-effects are derived). SAT scores and high school GPA are then used to predict college performance. Higher explained variation (R2) is obtained when the dependent variable is average class rank rather than GPA. Still higher explained variation occurs when the dependent variable is adjusted rank.
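The rank-based measures proposed here replace each course grade with the student's standing among classmates in that course and then average those standings across the student's courses. The sketch below computes a simple "average class rank" as a mean within-course percentile rank; the data frame, column names, and percentile-rank definition are illustrative assumptions, and the paper's fixed-effects adjustment for course competitiveness is not reproduced.

```python
import pandas as pd

# Toy enrollment records: one row per (student, course) with the earned grade points
grades = pd.DataFrame({
    "student": ["s1", "s2", "s3", "s1", "s2", "s4"],
    "course":  ["math101", "math101", "math101", "hist200", "hist200", "hist200"],
    "grade":   [3.7, 3.0, 2.3, 4.0, 3.3, 3.7],
})

# Within-course percentile rank (0-1), so a generously graded course
# cannot inflate a student's standing the way it inflates GPA.
grades["class_rank"] = grades.groupby("course")["grade"].rank(pct=True)

summary = grades.groupby("student").agg(
    gpa=("grade", "mean"),
    avg_class_rank=("class_rank", "mean"),
)
print(summary.sort_values("avg_class_rank", ascending=False))
```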

Citations: 0
Reconceptualization of Coefficient Alpha Reliability for Test Summed and Scaled Scores
IF 2.0 | Tier 4 (Education) | Q2 Social Sciences | Pub Date: 2022-07-07 | DOI: 10.1111/emip.12520
Rashid S. Almehrizi

Coefficient alpha persists as the most commonly reported reliability coefficient in research. The assumptions for its use are, however, not well understood. The current paper challenges the commonly used expressions of coefficient alpha and argues that, while these expressions are correct when estimating reliability for summed scores, they are not appropriate for extending coefficient alpha to correctly estimate the reliability of nonlinearly transformed scaled scores such as percentile ranks and stanines. The current paper reconceptualizes coefficient alpha as the complement of the ratio of two unbiased estimates of the summed score variance: the conditional summed score variance assuming uncorrelated item scores (which gives the error score variance) and the unconditional summed score variance incorporating intercorrelated item scores (which gives the observed score variance). Using this reconceptualization, a new equation for coefficient generalized alpha is introduced for scaled scores. Coefficient alpha is a special case of this new equation, since the latter reduces to coefficient alpha when the scaled scores are the summed scores themselves. Two applications (a cognitive and a psychological assessment) are used to compare the performance (estimation and bootstrap confidence intervals) of the reliability coefficients for different scaled scores. Results support the new equation for coefficient generalized alpha and compare it to coefficient generalized beta for parallel test forms. Coefficient generalized alpha produced different reliability values, which were larger than those of coefficient generalized beta, for the different scaled scores.
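For ordinary summed scores, the abstract notes that the generalized equation reduces to coefficient alpha itself. The sketch below computes that familiar summed-score formula, alpha = k/(k − 1) · (1 − Σ item variances / variance of the summed score), on toy data; the paper's generalized-alpha extension to nonlinearly scaled scores is its own contribution and is not reproduced here.

```python
import numpy as np

def coefficient_alpha(item_scores):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of summed score)."""
    x = np.asarray(item_scores, dtype=float)
    k = x.shape[1]
    sum_item_vars = x.var(axis=0, ddof=1).sum()
    total_var = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - sum_item_vars / total_var)

# Toy data: 500 examinees, 10 dichotomous items sharing a common factor
rng = np.random.default_rng(7)
theta = rng.standard_normal((500, 1))
items = (theta + rng.standard_normal((500, 10)) > 0).astype(int)

print("alpha =", round(coefficient_alpha(items), 3))
```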

Citations: 0