Communicating Measurement Outcomes with (Better) Graphics
J. Carl Setzer and Zhongmin Cui. Educational Measurement: Issues and Practice, June 15, 2022. doi:10.1111/emip.12519

Data visualization is a core part of communicating measurement research and outcomes. Measurement professionals use data visualization in several phases of research, from exploration to communication, yet it has not received enough attention in the measurement field. While many measurement graphics are relatively standard, many others are not, and the visualizations that appear in measurement journals vary widely in quality and effectiveness. This article provides an overview of current data visualization trends in measurement and offers some general tips, with examples, for effective data visualization. It is not a comprehensive treatise on data visualization, so we also point readers to resources for additional reading. Finally, we call on the measurement community to pay greater attention to the details of data visualization, and on measurement training programs to emphasize statistical reasoning through data visualization.
On the Cover: Indicators for Item Preknowledge
Yuan-Ling Liaw. Educational Measurement: Issues and Practice, June 15, 2022. doi:10.1111/emip.12507
Modeling Slipping Effects in a Large-Scale Assessment with Innovative Item Formats
Ismail Cuhadar and Salih Binici. Educational Measurement: Issues and Practice, June 13, 2022. doi:10.1111/emip.12508

This study employs the 4-parameter logistic (4PL) item response theory model to account for unexpected incorrect responses, or slipping effects, observed in a large-scale Algebra 1 End-of-Course assessment that includes several innovative item formats. It investigates whether modeling the misfit at the upper asymptote has any practical impact on the parameter estimates and, with a simulation study, how much bias the parameter estimates incur when slipping effects are ignored. Findings from the empirical data indicate that the impact of ignoring slipping effects is negligible when abilities are used to classify students into performance levels; however, an impact does appear toward the extreme ends of the ability continuum when individual abilities are of interest. Findings from the simulations reveal that when the proportion of items with slipping effects is small (20%), ignoring the misfit has no practical importance; however, when the proportion is moderate to large (50%-80%), abilities are generally underestimated at both ends of the ability scale. When an upper asymptote parameter was used to model the slipping effects, the items were generally easier and more discriminating than under the model that ignored the slipping effects.
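The upper-asymptote ("slipping") parameter the abstract describes is the d parameter of the 4PL model. As a minimal illustration (the parameter values here are hypothetical, not taken from the study), the 4PL response probability can be computed as:

```python
import math

def p_4pl(theta, a, b, c, d):
    """4-parameter logistic (4PL) item response function.

    c is the lower asymptote (guessing); d is the upper asymptote.
    Setting d < 1 models slipping: even very able examinees retain
    a nonzero chance of answering incorrectly.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

# A high-ability examinee (theta = 3) on the same item under the
# 3PL (d = 1) versus a 4PL with a slipping ceiling of d = 0.95.
p_3pl = p_4pl(theta=3.0, a=1.2, b=0.0, c=0.2, d=1.0)
p_slip = p_4pl(theta=3.0, a=1.2, b=0.0, c=0.2, d=0.95)
print(round(p_3pl, 3), round(p_slip, 3))  # 0.979 0.93
```

Because d caps the correct-response probability below 1, fixing d = 1 (as the 3PL does) forces the model to explain high-ability errors through the other parameters, which is consistent with the underestimation the simulations report at the upper end of the ability scale.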
NCME Presidential Address 2021: Assessment Research and Practice in the Post-COVID-19 Era
Ye Tong. Educational Measurement: Issues and Practice, May 30, 2022. doi:10.1111/emip.12509

COVID-19 is disrupting assessment practices and accelerating change. With a special focus on K-12 and credentialing exams, this article describes the series of changes observed during the pandemic, the solutions assessment providers have implemented, and the long-term impact on future practices. It also highlights the importance of a balanced assessment system, the use of assessments both for learning and of learning, and the use of assessments to support social justice, equity, and inclusion. These desired uses and outcomes will continue to challenge assessment and measurement experts in how they design, develop, and implement assessments moving forward.
Adjusting for Ability Differences of Equating Samples When Randomization Is Suboptimal
Sooyeon Kim and Michael E. Walker. Educational Measurement: Issues and Practice, April 2, 2022. doi:10.1111/emip.12506

Test equating requires collecting data to link the scores from different forms of a test. Problems arise when the equating samples are not equivalent and the test forms to be linked share no common items by which to measure or adjust for the group nonequivalence. Using data from five operational test forms, we created five pairs of research forms, one pair per form, such that the equating relationship within each pair was known. We then compared five approaches to adjusting for group nonequivalence in a situation where not only was group equivalence questionable but the number of common items was small. We used a resampling approach to evaluate the linking accuracy of adjustments based on sample weights obtained via minimum discriminant information adjustment (MDIA) from test takers' collateral (demographic) information, on a weak anchor of only three items, or on a mix of both. Overall, using both the MDIA sample weights and the weak anchor produced the most accurate results, while the direct (random groups) linking method, which assumes group equivalence, was least accurate due to nontrivial bias. For all five research forms, however, combining collateral information with anchor items improved linking accuracy only marginally over using the weak anchor alone.
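MDIA reweights a sample so that its weighted covariate moments match a target population while the weights stay as close as possible, in the Kullback-Leibler sense, to uniform; the solution takes the exponential-tilting form w_i proportional to exp(lambda * x_i). The following is a minimal single-covariate sketch, not the authors' implementation (the operational procedure matches several demographic margins simultaneously), and the data are hypothetical:

```python
import math

def mdia_weights(x, target_mean, iters=50):
    """Single-covariate MDIA: find weights w_i proportional to
    exp(lambda * x_i), closest to uniform in the Kullback-Leibler
    sense, whose weighted mean of x equals target_mean.
    lambda is found by Newton's method."""
    lam = 0.0
    for _ in range(iters):
        w = [math.exp(lam * xi) for xi in x]
        s = sum(w)
        w = [wi / s for wi in w]                                # normalize
        m = sum(wi * xi for wi, xi in zip(w, x))                # weighted mean
        v = sum(wi * (xi - m) ** 2 for wi, xi in zip(w, x))     # weighted variance
        if v < 1e-12:
            break
        lam += (target_mean - m) / v                            # Newton step
    return w

# Hypothetical example: reweight an equating sample whose binary
# covariate averages 0.40 to match a reference-population mean of 0.50.
sample = [0, 0, 0, 1, 1]
w = mdia_weights(sample, 0.50)
```

After reweighting, the weighted covariate mean matches the target, so downstream linking statistics are computed as if the sample had been drawn from the reference population with respect to that covariate.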
On the Cover: Turning the Page
Yuan-Ling Liaw. Educational Measurement: Issues and Practice, March 17, 2022. doi:10.1111/emip.12502
Digital Module 28: Unusual Things That Usually Occur in a Credentialing Testing Program
Richard A. Feinberg, Carol Morrison, and Mark R. Raymond. Educational Measurement: Issues and Practice, March 17, 2022. doi:10.1111/emip.12500

Formal graduate education in a measurement-related field provides a solid foundation for professionals who work on credentialing examinations. Those foundational skills are then expanded and refined over time as practitioners encounter complex and nuanced challenges that textbooks do not cover or that go beyond the contexts textbooks describe. For instance, as most of us who work on operational testing programs are (sometimes painfully) aware, real data can be very messy. Unanticipated situations often arise that can create a range of problems, from threats to score validity to unexpected financial costs and even longer-term reputational damage. In practice, solutions for these situations are not always straightforward, often requiring a compromise among psychometric best practices, business resources, and the needs of the customer. In this module we discuss some of these unusual challenges that usually occur in a credentialing program. First, we provide a high-level summary of the main components of the assessment lifecycle and the different roles within a testing organization. Next, we propose a framework for qualifying risk, along with various considerations and potential actions for managing these challenges. Lastly, we integrate this information by presenting a few scenarios that occur in practice, intended to help learners think through team-based problem-solving and align recommended actions with the context and magnitude of the challenge from a psychometric perspective.
Welcome to the first issue of the Instructional Topics in Educational Measurement Series (ITEMS) of my 3-year tenure as editor. I first want to thank André Rupp, the outgoing editor, for his patience and time discussing technical details of the modern ITEMS module format (more on his work below). I also want to thank Susan Davis-Becker and Michael Peabody for their work leading the Publications Committee and guidance through the transition process. Finally, I am thankful for the assistance from Deborah Harris, outgoing editor of Educational Measurement: Issues and Practice (EM:IP), and Zhongmin Cui, incoming editor, with whom I look forward to working over our tenures.
I am grateful for the opportunity to lead ITEMS. As an academic, I expend considerable energy teaching future professionals of our field. However, being an educator is much more than simply teaching. It also requires advising, mentorship, and recruiting. I envision ITEMS as a confluence of these on a grand scale. Of course, teaching and learning is at the core of ITEMS. However, informing others about educational measurement involves more than teaching a technical short course on item response theory or reliability. It also requires catering to the needs of the learner, providing resources to fill gaps in knowledge or to review material learned years ago, as well as giving guidance to young professionals about the opportunities in the field.
Since 1987, ITEMS has been published as a part of EM:IP. Until May 2017, ITEMS modules were published as didactic articles, appealing to practitioners seeking more technical expertise, K–12 teachers desiring resources to learn about assessment and measurement, and graduate students wishing to supplement their education. After a hiatus of just over a year, ITEMS returned in 2018 in a new form. To bring ITEMS into the 21st century (Rupp, 2018), then-editor André Rupp reimagined ITEMS for a digital landscape. Over the next three and a half years, twenty-seven high-quality digital modules were published. These digital modules blended voice-narrated content with applied examples, exercise sets, software exemplars, and other unique resources only possible on a digital platform. The modules would not have been possible without the expertise and dedication of the authors who developed these incredible products. The behind-the-scenes instructional design team also deserves recognition: these volunteers worked countless hours to implement the vision of what we now see as digital ITEMS modules. Having learned the intricacies of building a module myself, I find it abundantly clear that these digital modules would not have been possible without the instructional team's effort. Thank you, Xi Lu and Jonathan Lehrfeld, for your substantial contributions.
As I begin my tenure, I plan to build on the accomplished history of ITEMS with new modules appealing to the varied interests both within and beyond the NCME community. That includes practitioners and professionals in K–12 assessment, licensure and certification, higher education assessment, and classroom assessment. I plan to collaborate with professionals from multiple disciplines to amplify diverse perspectives that have historically not been prominent in the educational measurement community. I encourage professionals interested in authoring a module to contact me directly. This includes senior and early-career professionals, as well as advanced graduate students, who wish to bring different perspectives to the development of digital instructional content.
I will also work with professionals within the community to develop modules focused on the profession itself. ITEMS offers a platform to help educate the public, stakeholders, and the NCME community itself about what we do as a profession. This can inspire students to study in the field, show graduate students the opportunities that different employment settings offer, and inform the public and our constituents about the level of thought and expertise behind sound assessment and measurement decisions.
I am excited to introduce the first module of my tenure. In this issue, Richard Feinberg, Carol Morrison, and Mark Raymond present the module "Unusual Things That Usually Occur in a Credentialing Testing Program," in which they illustrate the components of the assessment lifecycle, outline the units, roles, and handoffs within a testing organization, and provide examples of risks and common problem areas that arise in operational testing. As Andrew Ho put it during a recent NCME webinar discussion, "one thing that has become clear ... is what a wonderful licensure and certification testing community we have in NCME, and how we sometimes talk about them" (Ho et al., 2021). These professionals represent important voices and unique perspectives in the community.
Brian C. Leventhal, "ITEMS Corner: Educating the Educational Measurement Community." Educational Measurement: Issues and Practice, March 17, 2022. doi:10.1111/emip.12501