Country differences in response styles (RS) may jeopardize the cross-country comparability of Likert-type scales. When the primary goal is to adjust for RS rather than to investigate them, it seems advantageous to impose minimal assumptions on RS structures and to leverage information from multiple scales for RS measurement. Using PISA 2015 background questionnaire data, we investigate such an adjustment procedure and explore its impact on cross-country comparisons in contrast to customary analyses and RS adjustments that (a) leave RS unconsidered, (b) incorporate stronger assumptions on RS structure, and/or (c) use only a few selected scales for RS measurement. Our findings suggest that not only the decision whether to adjust for RS but also how to adjust may heavily affect cross-country comparisons. This concerns both the assumptions on RS structures and the scales employed for RS measurement. We derive implications for RS adjustments in cross-country comparisons and strongly advocate taking model uncertainty into account.
The purpose of this study was to explore high school course-taking sequences and their relationship to college enrollment. Specifically, we implemented sequence analysis to discover common course-taking trajectories in math, science, and English language arts using high school transcript data from a recent nationally representative survey. Through sequence clustering, we reduced the complexity of the sequences and examined representative course-taking sequences. Classification tree, random forest, and multinomial logistic regression analyses were used to explore the relationship between the course sequences students complete and their postsecondary outcomes. Results showed that distinct representative course-taking sequences can be identified for all students as well as for student subgroups. More advanced and complex course-taking sequences were associated with postsecondary enrollment.
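For readers curious about what such a workflow can look like in code, the sketch below is a minimal, hypothetical illustration rather than the study's actual pipeline: it clusters simulated per-grade course-level sequences using a simple Hamming distance with average-linkage hierarchical clustering, then fits a random forest relating cluster membership (and final-year course level) to an enrollment indicator. All data, codings, variable names, and parameter choices are invented for illustration.

```python
# Minimal sketch (hypothetical data and codings), not the study's pipeline.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# 200 simulated students x 4 grades; math course level coded 0-5
# (e.g., 0 = no math, ..., 5 = calculus); binary enrollment outcome.
sequences = rng.integers(0, 6, size=(200, 4))
enrolled = rng.integers(0, 2, size=200)

# Hamming distance between equal-length sequences stands in for the
# optimal-matching dissimilarities often used in sequence analysis.
dist = pdist(sequences, metric="hamming")
clusters = fcluster(linkage(dist, method="average"), t=4, criterion="maxclust")

# Relate cluster membership and final-year course level to enrollment.
X = np.column_stack([clusters, sequences[:, -1]])
forest = RandomForestClassifier(random_state=0).fit(X, enrolled)
print(forest.feature_importances_)
```

In an actual analysis, one would inspect the representative sequences within each cluster before modeling outcomes, rather than relying on simulated codes as above.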
The Cross-Classified Mixed Effects Model (CCMEM) has been demonstrated by measurement specialists to be a flexible framework for evaluating reliability. Reliability can be estimated from the variance components of the test scores. Building on this work, the present study extends the CCMEM to the evaluation of validity evidence. Validity is viewed as the coherence among the elements of a measurement system. As such, validity can be evaluated through the fixed and random effects that the user reasons to be desired or undesired. Using data from the ePIRLS 2016 Reading Assessment, we demonstrate how to obtain evidence for reliability and validity with the CCMEM. We conclude with a discussion of the practicality and benefits of this validation method.
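As a concrete, simplified illustration of the variance-component logic, a generalizability-style reliability coefficient for a k-item score can be written as below, assuming a design in which persons are crossed with items; the notation is ours and need not match the exact decomposition used in the study.

```latex
% Sketch of a reliability coefficient built from variance components,
% assuming a persons-by-items crossed design with k items.
\[
  \hat{\rho} =
  \frac{\hat{\sigma}^2_{p}}
       {\hat{\sigma}^2_{p} + \dfrac{\hat{\sigma}^2_{pi,e}}{k}}
\]
% \hat{\sigma}^2_{p}: person variance component;
% \hat{\sigma}^2_{pi,e}: confounded person-by-item interaction and residual variance.
```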
The article offers practical suggestions for how measurement researchers and psychometricians can respond to calls for social responsibility in assessment. The underlying assumption is that personalizing large-scale assessment improves the chances that assessment and the use of test scores will contribute to equity in education. This article describes a spectrum of standardization and personalization in large-scale assessment. Informed by a review of existing theories, models, and frameworks in the context of current and developing technologies, and with a social justice lens, we propose steps to take, as part of assessment research and development, to contribute to the science of personalizing large-scale assessment in technically defensible ways.
This issue marks one year into my tenure as editor of the Instructional Topics in Educational Measurement Series (ITEMS). I will summarize and reflect on the achievements of the past year, outline the new ITEMS module production process, and introduce the new module published in this issue of Educational Measurement: Issues and Practice (EM:IP).
Over the past year, three new modules have been published: Unusual Things That Usually Occur in a Credentialing Testing Program (Feinberg et al., 2022), Multidimensional Item Response Theory Equating (Kim, 2022), and Validity and Educational Testing: Purposes and Uses of Educational Tests (Lewis & Sireci, 2022). Each of these modules has been a great addition to the ITEMS library, with the latter two appearing in the new format released in mid-2022.
Among the many benefits of the new format, modules are now more accessible on a variety of devices (e.g., desktop, phone, tablet) in both online and offline modes. The production process has also been simplified. Over the next few issues of EM:IP, I will take a deep dive into the process of designing a module for this nontraditional publication. The goal is threefold: (1) to educate readers about the behind-the-scenes process; (2) to showcase the extensive work that module development requires; and (3) to attract readers as potential authors who understand the value of taking the time to produce such a useful resource.
As noted, I will discuss these steps in more detail in the upcoming issues of EM:IP. Reconceptualizing ITEMS modules into this new form was only one of two initiatives I undertook in 2022. For the other, I worked to shift the ITEMS portal from a stand-alone website to the NCME website. As noted in the last issue of EM:IP, this has successfully been completed with evidence of several learners accessing the new ITEMS portal.
For 2023, I look forward to the production of several new and engaging ITEMS modules. I am excited to announce the first module of 2023, Digital Module 31: Testing Accommodations for Students with Disabilities, authored by Dr. Benjamin Lovett. In this module, Dr. Lovett describes common testing accommodations, explains how testing accommodations can reduce construct-irrelevant variance and increase fairness, and discusses best practices along with common problems in current practice. In this five-section module, Dr. Lovett provides video versions of the content as well as an interactive activity using two case studies.
If you are interested in learning more about the ITEMS module development process, authoring a module, or being involved in some other capacity, please reach out to me at [email protected].
Students with disabilities often take tests under different conditions than their peers do. Testing accommodations, which involve changes to test administration that maintain test content, include extending time limits, presenting written text through auditory means, and taking a test in a private room with fewer distractions. For some students with disabilities, accommodations such as these are necessary for fair assessment; without accommodations, invalid interpretations would be made on the basis of these students’ scores. However, when misapplied, accommodations can diminish fairness, introduce new sources of construct-irrelevant variance, and lead to invalid interpretations of test scores. This module provides a psychometric framework for thinking about accommodations and then explicates an accommodations decision-making framework that includes a variety of considerations. Problems with current accommodations practices are discussed, along with potential solutions and future directions. The module is accompanied by exercises that allow participants to apply their understanding.