Using Pictorial Glossaries as an Accommodation for English Learners: An Exploratory Study
Pub Date: 2019-05-24 | DOI: 10.1080/10627197.2019.1615371 | Educational Assessment, 24(1), 235–265
Sultan Turkan, Alexis A. López, René Lawless, Florencia Tolentino
ABSTRACT In this article we explore the use of pictorial glossaries as an accommodation for English learners (ELs) at entry and emerging levels of English language proficiency. Drawing on survey responses from 98 middle school ELs and cognitive interviews with 10 of the survey participants, we examined the participants’ preferences and experiences with using accommodations and explored how some of them responded to NAEP mathematics items using pictorial glossaries. Our findings showed that the participants viewed the use of pictures, videos, and translations as useful, but they had little experience using these types of accommodations. We also found that the pictorial glosses sometimes helped ELs understand the local meaning of words in the mathematics problems but not the global meaning conveyed at the sentence level.
{"title":"Using Pictorial Glossaries as an Accommodation for English Learners: An Exploratory Study","authors":"Sultan Turkan, Alexis A. López, René Lawless, Florencia Tolentino","doi":"10.1080/10627197.2019.1615371","DOIUrl":"https://doi.org/10.1080/10627197.2019.1615371","url":null,"abstract":"ABSTRACT In this article we explore the use of pictorial glossaries as an accommodation for English learners (ELs) with entry and emerging levels of English language proficiency. Drawing on survey responses from 98 middle school ELs and cognitive interviews with 10 of the survey participants, we examined the participants’ preferences and experiences with using accommodations and explored how some of them responded to NAEP mathematics items using pictorial glossaries. Our findings showed that the participants viewed the use of pictures, videos, and translations as useful, but they did not have a lot of experience using these types of accommodations. Also, we found that the pictorial glosses sometimes helped ELs with understanding the local meaning of words in the mathematics problems but not with the global meaning conveyed at the sentence level in the problems.","PeriodicalId":46209,"journal":{"name":"Educational Assessment","volume":"24 1","pages":"235 - 265"},"PeriodicalIF":1.5,"publicationDate":"2019-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/10627197.2019.1615371","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48707267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Generalized Scoring Process to Measure Collaborative Problem Solving in Online Environments
Pub Date: 2019-05-16 | DOI: 10.1080/10627197.2019.1615372 | Educational Assessment, 24(1), 213–234
C. Scoular, E. Care
ABSTRACT Recent educational and psychological research has highlighted shifting workplace requirements and the changes required to equip the emerging workforce with skills for the 21st century. These shifts highlight the issues surrounding, and drive the importance of, new methods of assessment. This study addresses some of these issues by describing a scoring process for measuring collaborative problem solving (CPS) in online environments. The method presented, from conceptualization to implementation, centers on its generalizable application, presenting a systematic process of identifying, coding, and scoring behavior patterns in log-stream data generated from assessments. Item Response Theory was used to investigate the psychometric properties of the behavior patterns. The goal of this study was to present an approach that informs new measurement practices in relation to sociocognitive latent traits and their processes. The generalized scoring process provides an efficient approach for developing measures of social and cognitive skills in online environments.
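The identify → code → score pipeline the abstract describes can be illustrated with a minimal sketch. The event names, pattern definitions, and one-occurrence scoring rule below are hypothetical placeholders, not the authors' instrument; only the overall sequence applied to log-stream data is what the sketch shows.

```python
# Minimal sketch of coding and scoring behavior patterns from log-stream
# data. Event names, pattern mappings, and the scoring rule are invented.
import pandas as pd

# Each row is one logged event from the collaborative task.
log = pd.DataFrame({
    "student": ["s1", "s1", "s1", "s2", "s2"],
    "event":   ["chat_send", "resource_share", "chat_send",
                "chat_send", "action_attempt"],
})

# 1. Identify: map raw events to theoretically defined behavior patterns.
patterns = {
    "chat_send":      "communication",
    "resource_share": "cooperation",
    "action_attempt": "task_regulation",
}
log["pattern"] = log["event"].map(patterns)

# 2. Code: tally how often each student exhibits each pattern.
counts = log.pivot_table(index="student", columns="pattern",
                         aggfunc="size", fill_value=0)

# 3. Score: dichotomize each pattern (shown / not shown) so the resulting
# indicators can be treated as items and calibrated with an IRT model.
scored = (counts >= 1).astype(int)
print(scored)
```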
{"title":"A Generalized Scoring Process to Measure Collaborative Problem Solving in Online Environments","authors":"C. Scoular, E. Care","doi":"10.1080/10627197.2019.1615372","DOIUrl":"https://doi.org/10.1080/10627197.2019.1615372","url":null,"abstract":"ABSTRACT Recent educational and psychological research has highlighted shifting workplace requirements and change required to equip the emerging workforce with skills for the 21st century. The emergence of these highlights the issues, and drives the importance, of new methods of assessment. This study addresses some of the issues by describing a scoring process for measuring collaborative problem solving (CPS) in online environments. The method presented, from conceptualization to implementation, centers on its generalizable application, presenting a systematic process of identifying, coding, and scoring behavior patterns in log stream data generated from assessments. Item Response Theory was used to investigate the psychometric properties of behavior patterns. The goal of this study was to present an approach that informs new measurement practices in relation to sociocognitive latent traits and their processes. The generalized scoring process provides an efficient approach to develop measures of social and cognitive skills in online environments.","PeriodicalId":46209,"journal":{"name":"Educational Assessment","volume":"24 1","pages":"213 - 234"},"PeriodicalIF":1.5,"publicationDate":"2019-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/10627197.2019.1615372","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46399201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Validity of a Special Education Teacher Observation System
Pub Date: 2019-05-01 | DOI: 10.1080/10627197.2019.1702461 | Educational Assessment, 25(1), 31–46
Evelyn S. Johnson, Angela R. Crawford, Laura A. Moylan, Yuzhu Z. Zheng
ABSTRACT This manuscript describes the comprehensive validation work undertaken to develop the Recognizing Effective Special Education Teachers (RESET) observation system, which was designed to provide evaluations of special education teachers’ ability to effectively implement evidence-based practices and to provide specific, actionable feedback to teachers on how to improve instruction. Following the guidance for developing effective educator evaluation systems, we employed the Evidence-Centered Design framework, articulated the claims and inferences to be made with RESET, and conducted a series of studies to collect evidence to evaluate its validity. Our efforts and results to date are described, and implications for practice and further research are discussed.
{"title":"Validity of a Special Education Teacher Observation System","authors":"Evelyn S. Johnson, Angela R. Crawford, Laura A. Moylan, Yuzhu Z. Zheng","doi":"10.1080/10627197.2019.1702461","DOIUrl":"https://doi.org/10.1080/10627197.2019.1702461","url":null,"abstract":"ABSTRACT This manuscript describes the comprehensive validation work undertaken to develop the Recognizing Effective Special Education Teachers (RESET) observation system, which was designed to provide evaluations of special education teachers’ ability to effectively implement evidence-based practices and to provide specific, actionable feedback to teachers on how to improve instruction. Following the guidance for developing effective educator evaluation systems, we employed the Evidence-Centered Design framework, articulated the claims and inferences to be made with RESET, and conducted a series of studies to collect evidence to evaluate its validity. Our efforts and results to date are described, and implications for practice and further research are discussed.","PeriodicalId":46209,"journal":{"name":"Educational Assessment","volume":"25 1","pages":"31 - 46"},"PeriodicalIF":1.5,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/10627197.2019.1702461","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48948656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Are Test and Academic Disengagement Related? Implications for Measurement and Practice
Pub Date: 2019-03-13 | DOI: 10.1080/10627197.2019.1575723 | Educational Assessment, 24(1), 119–134
J. Soland, N. Jensen, Tran D. Keys, Sharon Bi, Emily Wolk
ABSTRACT A vast literature investigates academic disengagement among students, including its ultimate manifestation, dropping out of school. Research also shows that test disengagement can be a problem for many inferences educators and policymakers wish to draw from test scores. However, few studies consider whether academic and test disengagement are related. In this study, we examine whether behaviors indicative of academic disengagement like chronic absenteeism and course failures are related to behaviors indicative of test disengagement like rapidly guessing on items. We also examine whether social-emotional factors like low academic self-efficacy and self-management, which research suggests are the root causes of academic disengagement, are also related to rapid guessing behavior. Our results provide evidence that academic and test disengagement are related, including through a common association with poor self-management. The implications of this connection for measurement and practice are discussed.
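Rapid guessing is typically operationalized with a response-time threshold. The sketch below shows one common form of that computation; the 3-second threshold, the 10% flag rate, and the data are illustrative assumptions, not values from the study.

```python
# Sketch of flagging rapid guesses with a response-time threshold, one
# common operationalization of test disengagement. Threshold and cutoff
# values here are illustrative only.
import pandas as pd

responses = pd.DataFrame({
    "student":    ["s1", "s1", "s2", "s2"],
    "item":       ["i1", "i2", "i1", "i2"],
    "rt_seconds": [45.0, 2.1, 38.5, 52.0],
})

THRESHOLD = 3.0  # seconds; real studies often use item-specific thresholds
responses["rapid_guess"] = responses["rt_seconds"] < THRESHOLD

# Per-student rate of rapid guessing, which can then be related to
# academic-disengagement indicators such as absences or course failures.
rg_rate = responses.groupby("student")["rapid_guess"].mean()
flagged = rg_rate[rg_rate > 0.10]  # students above an illustrative cutoff
print(rg_rate, flagged, sep="\n")
```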
{"title":"Are Test and Academic Disengagement Related? Implications for Measurement and Practice","authors":"J. Soland, N. Jensen, Tran D. Keys, Sharon Bi, Emily Wolk","doi":"10.1080/10627197.2019.1575723","DOIUrl":"https://doi.org/10.1080/10627197.2019.1575723","url":null,"abstract":"ABSTRACT A vast literature investigates academic disengagement among students, including its ultimate manifestation, dropping out of school. Research also shows that test disengagement can be a problem for many inferences educators and policymakers wish to draw from test scores. However, few studies consider whether academic and test disengagement are related. In this study, we examine whether behaviors indicative of academic disengagement like chronic absenteeism and course failures are related to behaviors indicative of test disengagement like rapidly guessing on items. We also examine whether social-emotional factors like low academic self-efficacy and self-management, which research suggests are the root causes of academic disengagement, are also related to rapid guessing behavior. Our results provide evidence that academic and test disengagement are related, including through a common association with poor self-management. The implications of this connection for measurement and practice are discussed.","PeriodicalId":46209,"journal":{"name":"Educational Assessment","volume":"24 1","pages":"119 - 134"},"PeriodicalIF":1.5,"publicationDate":"2019-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/10627197.2019.1575723","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45941028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Immediate and Delayed Effect of Dynamic Assessment Approaches on EFL Learners’ Oral Narrative Performance and Anxiety
Pub Date: 2019-02-13 | DOI: 10.1080/10627197.2019.1578169 | Educational Assessment, 24(1), 135–154
Masoomeh Estaji, Mahsa Farahanynia
ABSTRACT The present study aimed to investigate the effect of two major approaches to Dynamic Assessment, namely the interventionist and interactionist approaches, on learners’ oral narrative performance and anxiety. To this end, 34 Iranian EFL learners were assigned to an Interactionist Group (InA.G) and an Interventionist Group (InV.G). Initially, both groups were given the Foreign Language Classroom Anxiety Scale and a pretest of speaking. In the treatment phase, the InV.G was asked to narrate a video and received instruction on their errors. The InA.G narrated the video while receiving scaffolding during narration. Then both groups were given a posttest and, two weeks later, a delayed posttest. The results indicated that both groups’ oral performance significantly increased while their anxiety decreased. Finally, a semi-structured interview was conducted; its results revealed that the InA.G experienced more anxiety, mostly due to a sense of interruption and of losing face.
{"title":"The Immediate and Delayed Effect of Dynamic Assessment Approaches on EFL Learners’ Oral Narrative Performance and Anxiety","authors":"Masoomeh Estaji, Mahsa Farahanynia","doi":"10.1080/10627197.2019.1578169","DOIUrl":"https://doi.org/10.1080/10627197.2019.1578169","url":null,"abstract":"ABSTRACT The present study aimed to investigate the effect of two major approaches of Dynamic Assessment, namely, interventionist and interactionist approaches, on learners’ oral narrative performance and anxiety. To this end, 34 Iranian EFL learners were assigned to an Interactionist Group (InA.G) and Interventionist Group (InV.G). Initially, both groups were given the Foreign Language Classroom Anxiety Scale and a pretest of speaking. In the treatment phase, the InV.G was asked to narrate a video and received instructions on their errors. The InA.G narrated the video while being provided with scaffolding during narration. Then both groups were given a posttest and, two weeks later, a delayed posttest. The results indicated that both groups’ oral performance significantly increased, while their anxiety reduced. In the end, a semi-structured interview was conducted whose results revealed that the InA.G experienced more anxiety mostly due to feeling a sense of interruption and losing face.","PeriodicalId":46209,"journal":{"name":"Educational Assessment","volume":"24 1","pages":"135 - 154"},"PeriodicalIF":1.5,"publicationDate":"2019-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/10627197.2019.1578169","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44206486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scenario-Based Assessments in Writing: An Experimental Study
Pub Date: 2019-01-16 | DOI: 10.1080/10627197.2018.1557515 | Educational Assessment, 24(1), 73–90
Mo Zhang, P. V. van Rijn, P. Deane, R. Bennett
ABSTRACT Writing from source text is critical for developing college-and-career readiness because it is required in advanced academic environments and many vocations. Scenario-based assessment (SBA) represents one approach to measuring this ability. In such an assessment, the scenario presents an issue that the student is to read and write about. Before writing, lead-in exercises are presented to encourage the examinee to engage with the source materials and to model the process used in a classroom writing project. This study experimentally manipulated a middle-school assessment design to understand whether (1) the lead-in/essay structure increased scores erroneously, with a concomitant decrease in the test’s technical quality, and (2) the presence of a single unifying scenario affected scores or score meaning. In general, the SBA design did not appear to artificially increase total-test or essay scores. As importantly, it functioned as well as, and sometimes better than, the alternative designs in terms of the measurement characteristics examined.
{"title":"Scenario-Based Assessments in Writing: An Experimental Study","authors":"Mo Zhang, P. V. van Rijn, P. Deane, R. Bennett","doi":"10.1080/10627197.2018.1557515","DOIUrl":"https://doi.org/10.1080/10627197.2018.1557515","url":null,"abstract":"ABSTRACT Writing from source text is critical for developing college-and-career readiness because it is required in advanced academic environments and many vocations. Scenario-based assessment (SBA) represents one approach to measuring this ability. In such assessment, the scenario presents an issue that the student is to read and write about. Before writing, lead-in exercises are presented to encourage the examinee to engage with the source materials and to model the process used in a classroom writing project. This study experimentally manipulated a middle-school assessment design to understand if (1) the lead-in/essay structure increased scores erroneously with a concomitant decrease in test technical quality, and (2) the presence of a single unifying scenario affected scores or score meaning. In general, the SBA design did not appear to artificially increase total-test or essay scores. As importantly, it functioned as well as, sometimes better than, the alternative designs in terms of the measurement characteristics examined.","PeriodicalId":46209,"journal":{"name":"Educational Assessment","volume":"24 1","pages":"73 - 90"},"PeriodicalIF":1.5,"publicationDate":"2019-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/10627197.2018.1557515","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45041997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating Teacher Effectiveness Using Classroom Observations: A Rasch Analysis of the Rater Effects of Principals
Pub Date: 2019-01-09 | DOI: 10.1080/10627197.2018.1564272 | Educational Assessment, 24(1), 91–118
E. Jones, C. Bergin
ABSTRACT In most U.S. schools, teachers are evaluated using observation of teaching practice (OTP). This study investigates rater effects on OTP ratings among 421 principals in an authentic teacher evaluation system. Many-facet Rasch analysis (MFR) using a block of shared ratings revealed that principals generally (a) differentiated between more and less effective teachers, (b) rated their teachers with leniency (i.e., overused higher rating categories), and (c) differentiated between teaching practices (e.g., Cognitive Engagement vs. Classroom Management) with minimal halo effect. Individual principals varied significantly in degree of leniency, and approximately 12% of principals exhibited severe rater bias. Implications for use of OTP ratings for evaluating teachers’ effectiveness are discussed. Strengths and limitations of MFR to analyze rater effects in OTP are also discussed.
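A full many-facet Rasch analysis is normally run in specialized software (e.g., FACETS). As a descriptive analogue of the leniency estimate, a block of shared ratings lets each principal's mean deviation from the consensus on each rated episode be computed directly; the data, scale, and column names below are invented for illustration and do not reproduce the study's model.

```python
# Descriptive analogue of rater leniency on a block of shared ratings:
# a principal's leniency is their mean deviation from the episode-level
# consensus. Invented data; a 1-7 rubric is assumed.
import pandas as pd

ratings = pd.DataFrame({
    "principal": ["p1", "p1", "p2", "p2", "p3", "p3"],
    "episode":   ["e1", "e2", "e1", "e2", "e1", "e2"],
    "rating":    [5, 4, 6, 6, 4, 3],
})

# Consensus (mean) rating for each shared teaching episode.
consensus = ratings.groupby("episode")["rating"].transform("mean")

# Positive leniency = rates above consensus; negative = severe.
ratings["deviation"] = ratings["rating"] - consensus
leniency = ratings.groupby("principal")["deviation"].mean()
print(leniency.sort_values(ascending=False))
```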
{"title":"Evaluating Teacher Effectiveness Using Classroom Observations: A Rasch Analysis of the Rater Effects of Principals","authors":"E. Jones, C. Bergin","doi":"10.1080/10627197.2018.1564272","DOIUrl":"https://doi.org/10.1080/10627197.2018.1564272","url":null,"abstract":"ABSTRACT In most U.S. schools, teachers are evaluated using observation of teaching practice (OTP). This study investigates rater effects on OTP ratings among 421 principals in an authentic teacher evaluation system. Many-facet Rasch analysis (MFR) using a block of shared ratings revealed that principals generally (a) differentiated between more and less effective teachers, (b) rated their teachers with leniency (i.e., overused higher rating categories), and (c) differentiated between teaching practices (e.g., Cognitive Engagement vs. Classroom Management) with minimal halo effect. Individual principals varied significantly in degree of leniency, and approximately 12% of principals exhibited severe rater bias. Implications for use of OTP ratings for evaluating teachers’ effectiveness are discussed. Strengths and limitations of MFR to analyze rater effects in OTP are also discussed.","PeriodicalId":46209,"journal":{"name":"Educational Assessment","volume":"24 1","pages":"118 - 91"},"PeriodicalIF":1.5,"publicationDate":"2019-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/10627197.2018.1564272","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42072623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Testing Accommodations and the Measurement of Student Academic Growth
Pub Date: 2019-01-02 | DOI: 10.1080/10627197.2018.1545571 | Educational Assessment, 24(1), 57–72
H. Buzick
ABSTRACT Using two states’ grades 3 through 8 state assessment databases, this study documents the extent to which students were assigned testing accommodations for English language arts (ELA) or mathematics in only one of two consecutive years. The percentage of students with disabilities who were assigned accommodations in the current year only, or in the prior year only, in a given grade statewide was not trivial, sometimes exceeding 25%. The relationship between inconsistent assignment to any accommodations and both students’ prior proficiency level and aggregate growth is also documented. Group differences were observed at the state level. No practical differences were observed when covariates for inconsistent assignment were included in school value-added models, but very few schools had a substantial proportion of students with disabilities assigned accommodations inconsistently in a given grade. Implications for research and practice are discussed.
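The core quantity here, the share of students assigned accommodations in only one of two consecutive years, reduces to an exclusive-or across year-level flags. The sketch below shows that computation on invented data; the column names and values are hypothetical.

```python
# Sketch of counting students assigned accommodations in only one of two
# consecutive years. Data and column names are hypothetical.
import pandas as pd

year1 = pd.DataFrame({"student": ["a", "b", "c", "d"],
                      "accommodated_y1": [True, True, False, False]})
year2 = pd.DataFrame({"student": ["a", "b", "c", "d"],
                      "accommodated_y2": [True, False, True, False]})

merged = year1.merge(year2, on="student")
# XOR: accommodated in exactly one of the two years.
merged["inconsistent"] = merged["accommodated_y1"] ^ merged["accommodated_y2"]

# Share of students whose accommodation assignment changed between years.
pct_inconsistent = merged["inconsistent"].mean() * 100
print(f"{pct_inconsistent:.0f}% assigned accommodations in only one year")
```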
{"title":"Testing Accommodations and the Measurement of Student Academic Growth","authors":"H. Buzick","doi":"10.1080/10627197.2018.1545571","DOIUrl":"https://doi.org/10.1080/10627197.2018.1545571","url":null,"abstract":"ABSTRACT Using two states’ grades 3 through 8 state assessment databases, this study documents the extent to which students were assigned testing accommodations for ELA or mathematics in only one of two consecutive years. The percentage of students with disabilities who were assigned accommodations in the current year only or in the prior year only in a given grade statewide was not trivial, sometimes exceeding 25%. The relationship between inconsistent assignment to any accommodations and both students’ prior proficiency level and aggregate growth is also documented. Group differences were observed at the state level. No practical differences were observed when covariates for inconsistent assignment were included in school value-added models, but very few schools had a substantial proportion of students with disabilities assigned accommodations inconsistently in a given grade. Implications for research and practice are discussed.","PeriodicalId":46209,"journal":{"name":"Educational Assessment","volume":"24 1","pages":"57 - 72"},"PeriodicalIF":1.5,"publicationDate":"2019-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/10627197.2018.1545571","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"59626278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Developing a Measure of Spanish Phonological Awareness for Preschool Age Children: Spanish Individual Growth and Development Indicators
Pub Date: 2018-11-23 | DOI: 10.1080/10627197.2018.1545570 | Educational Assessment, 24(1), 33–56
Alisha K. Wackerle-Hollman, Lillian Durán, S. Brunner, José Palma, Theresa L. Kohlmeier, Michael C. Rodriguez
ABSTRACT Spanish speakers in the United States are a steadily increasing population, up by 233% since 1980. Given the growing population of dual language learners (DLLs) and the large numbers of Spanish-speaking children enrolled in pre-kindergarten programs, addressing the educational needs of preschool-aged DLLs has become a national imperative. Specifically, the intersection of this growing population and the dearth of appropriate assessment tools to evaluate DLLs’ early language and literacy skills creates a need for assessments that accurately measure preschool performance. This manuscript reports on the iterative design process for a measure of Spanish phonological awareness for preschool-aged DLLs: the Spanish Individual Growth and Development Indicators (S-IGDI) Primeros Sonidos. We employed a measure design framework to develop the measure and tested item function in a study of 970 four- to five-year-old DLLs. Results, including item-level analyses and evidence regarding construct and criterion validity, are reported.
{"title":"Developing a Measure of Spanish Phonological Awareness for Preschool Age Children: Spanish Individual Growth and Development Indicators","authors":"Alisha K. Wackerle-Hollman, Lillian Durán, S. Brunner, José Palma, Theresa L. Kohlmeier, Michael C. Rodriguez","doi":"10.1080/10627197.2018.1545570","DOIUrl":"https://doi.org/10.1080/10627197.2018.1545570","url":null,"abstract":"ABSTRACT Spanish speakers in the United States are a steadily increasing population, up by 233% since 1980. Given the growing population of dual language learners (DLLs) and the large numbers of Spanish-speaking children enrolled in pre-kindergarten programs, addressing the educational needs of preschool-aged DLLs has become a national imperative. Specifically, the intersection of this growing population and the dearth of appropriate assessment tools to evaluate DLLs early language and literacy skills creates a need for assessments that accurately measure preschool performance. This manuscript reports on the iterative design process of a measure of Spanish phonological awareness for preschool-aged DLLs: Spanish Individual Growth and Development Indicators (S-IGDI) Primeros Sonidos. We employed measure design framework to develop the measure and tested item function within a study of 970, 4–5 year old DLLs. Results, including item level analyses and evidence regarding construct and criterion validity are reported.","PeriodicalId":46209,"journal":{"name":"Educational Assessment","volume":"24 1","pages":"33 - 56"},"PeriodicalIF":1.5,"publicationDate":"2018-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/10627197.2018.1545570","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43291229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Impact of Item Stem Format on the Dimensional Structure of Mathematics Assessments
Pub Date: 2018-11-13 | DOI: 10.1080/10627197.2018.1545569 | Educational Assessment, 24(1), 13–32
A. Kan, O. Bulut, D. Cormier
ABSTRACT Item stem formats can alter the cognitive complexity as well as the type of abilities required for solving mathematics items. Consequently, it is possible that item stem formats can affect the dimensional structure of mathematics assessments. This empirical study investigated the relationship between item stem format and the dimensionality of mathematics assessments. A sample of 671 sixth-grade students was given two forms of a mathematics assessment in which mathematical expression (ME) items and word problems (WP) were used to measure the same content. The effects of mathematical language and reading abilities in responding to ME and WP items were explored using unidimensional and multidimensional item response theory models. The results showed that WP and ME items appear to differ with regard to the underlying abilities required to answer these items. Hence, the multidimensional model fit the response data better than the unidimensional model. For the accurate assessment of mathematics achievement, students’ reading and mathematical language abilities should also be considered when implementing mathematics assessments with ME and WP items.
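The study's dimensionality question was answered by comparing unidimensional and multidimensional IRT models. A quick descriptive probe of the same question is an eigendecomposition of the inter-item correlation matrix, where more than one dominant eigenvalue is consistent with multidimensionality. The simulation below is a sketch under assumed loadings (WP items drawing on both math and reading, ME items on math only), not the authors' analysis.

```python
# Descriptive dimensionality probe on simulated 0/1 responses: if ME and
# WP items tap partly different abilities, the inter-item correlation
# matrix should show more than one dominant eigenvalue. Loadings invented.
import numpy as np

rng = np.random.default_rng(0)
n_students = 671  # matches the sample size reported above
math_ability = rng.normal(size=n_students)
reading_ability = rng.normal(size=n_students)

# Five ME items load on math only; five WP items load on math and reading.
me = (math_ability[:, None] + rng.normal(size=(n_students, 5))) > 0
wp = ((0.6 * math_ability + 0.6 * reading_ability)[:, None]
      + rng.normal(size=(n_students, 5))) > 0
responses = np.hstack([me, wp]).astype(float)

# Eigenvalues of the 10 x 10 inter-item correlation matrix, largest first.
eigvals = np.linalg.eigvalsh(np.corrcoef(responses, rowvar=False))[::-1]
print("Leading eigenvalues:", np.round(eigvals[:3], 2))
# Two clearly dominant eigenvalues would be consistent with the
# two-dimensional (math + reading) structure the study reports.
```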
{"title":"The Impact of Item Stem Format on the Dimensional Structure of Mathematics Assessments","authors":"A. Kan, O. Bulut, D. Cormier","doi":"10.1080/10627197.2018.1545569","DOIUrl":"https://doi.org/10.1080/10627197.2018.1545569","url":null,"abstract":"ABSTRACT Item stem formats can alter the cognitive complexity as well as the type of abilities required for solving mathematics items. Consequently, it is possible that item stem formats can affect the dimensional structure of mathematics assessments. This empirical study investigated the relationship between item stem format and the dimensionality of mathematics assessments. A sample of 671 sixth-grade students was given two forms of a mathematics assessment in which mathematical expression (ME) items and word problems (WP) were used to measure the same content. The effects of mathematical language and reading abilities in responding to ME and WP items were explored using unidimensional and multidimensional item response theory models. The results showed that WP and ME items appear to differ with regard to the underlying abilities required to answer these items. Hence, the multidimensional model fit the response data better than the unidimensional model. For the accurate assessment of mathematics achievement, students’ reading and mathematical language abilities should also be considered when implementing mathematics assessments with ME and WP items.","PeriodicalId":46209,"journal":{"name":"Educational Assessment","volume":"24 1","pages":"13 - 32"},"PeriodicalIF":1.5,"publicationDate":"2018-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/10627197.2018.1545569","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42649952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}