A meta-analysis on the predictive validity of English language proficiency assessments for college admissions
Pub Date: 2022-08-16 | DOI: 10.1177/02655322221112364
Samuel D. Ihlenfeldt, Joseph A. Rios
For institutions where English is the primary language of instruction, English assessments for admissions such as the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) give admissions decision-makers a sense of a student’s skills in academic English. Despite this explicit purpose, these exams have also been used to predict academic success. In this study, we meta-analytically synthesized 132 effect sizes from 32 studies containing validity evidence of academic English assessments to determine whether different assessments (a) predicted academic success (as measured by grade point average [GPA]) and (b) did so comparably. Overall, assessments had a weak positive correlation with academic achievement (r = .231, p < .001). Additionally, no significant differences were found in the predictive power of the IELTS and TOEFL exams. No moderators were significant, indicating that these findings held across school type, school level, and publication type. Although significant, the overall correlation was low; practitioners are therefore cautioned against using standardized English-language proficiency test scores in isolation, in place of a holistic application review, during the admissions process.
{"title":"A meta-analysis on the predictive validity of English language proficiency assessments for college admissions","authors":"Samuel D. Ihlenfeldt, Joseph A. Rios","doi":"10.1177/02655322221112364","DOIUrl":"https://doi.org/10.1177/02655322221112364","url":null,"abstract":"For institutions where English is the primary language of instruction, English assessments for admissions such as the Test of English as a Foreign Language (TOEFL) and International English Language Testing System (IELTS) give admissions decision-makers a sense of a student’s skills in academic English. Despite this explicit purpose, these exams have also been used for the practice of predicting academic success. In this study, we meta-analytically synthesized 132 effect sizes from 32 studies containing validity evidence of academic English assessments to determine whether different assessments (a) predicted academic success (as measured by grade point average [GPA]) and (b) did so comparably. Overall, assessments had a weak positive correlation with academic achievement (r = .231, p < .001). Additionally, no significant differences were found in the predictive power of the IELTS and TOEFL exams. No moderators were significant, indicating that these findings held true across school type, school level, and publication type. Although significant, the overall correlation was low; thus, practitioners are cautioned from using standardized English-language proficiency test scores in isolation in lieu of a holistic application review during the admissions process.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"40 1","pages":"276 - 299"},"PeriodicalIF":4.1,"publicationDate":"2022-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46240900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Book Review: Multilingual Testing and Assessment
Pub Date: 2022-08-15 | DOI: 10.1177/02655322221114895
Beverly A. Baker
From both a theoretical and an empirical perspective, this volume addresses the challenges of testing learners of multiple school languages. The author states that “This volume is intended as a non-technical resource to offer help and guidance to all those who work in education with multilingual populations” (p. 1). In that sense, it is not a book about the assessment of language per se (although she presents a research study in which she collects information on students’ language proficiency). Rather, it is intended primarily for non-language specialists: educators working with multilingual learners of all subjects. As she states throughout the work, the author addresses what she sees as limitations in both theoretical and empirical work that considers two languages only, claiming that such work offers limited insights for those working with speakers of more than two languages. The author is motivated by the fair assessment of all students, including linguistically and culturally minoritized students. What follows is a summary and critical discussion of the book, beginning with an overview of each chapter and then directing critical commentary at a few chapters in particular (Chapters 2, 5, and 7). Given the repetition of ideas across the chapters, I assume that many of them have been designed to be read on a stand-alone basis. I have chosen these chapters as the focus of my comments because, in my view, they form the core of the book: they contain the theoretical approach undergirding the author’s work, the practical guidance in the form of the author’s “integrated approach,” and the details of her empirical study.
{"title":"Book Review: Multilingual Testing and Assessment","authors":"Beverly A. Baker","doi":"10.1177/02655322221114895","DOIUrl":"https://doi.org/10.1177/02655322221114895","url":null,"abstract":"From both a theoretical and an empirical perspective, this volume addresses the challenges of testing learners of multiple school language(s). The author states that “This volume is intended as a non-technical resource to offer help and guidance to all those who work in education with multilingual populations” (p. 1). In that sense, it is not a book about the assessment of language per se (although she presents a research study in which she collects information on students’ language proficiency). Rather, it is intended primarily for non-language specialists; to educators working with multilingual learners of all subjects. As she states throughout the work, the author addresses what she sees as limitations in both theoretical and empirical work that consider two languages only, claiming that this work has limited insights for those working with speakers of more than two languages. The author is motivated by the fair assessment of all students, including linguistically and culturally minoritized students. What follows are a summary and critical comments of the book, beginning with an overview of each of the chapters then directing a critical commentary to a few chapters in particular (Chapters 2, 5, and 7). Given the repetition of ideas across the chapters, I assume that many of these chapters have been designed to be read on a stand-alone basis. I have chosen these chapters to focus my comments because in my view they form the core of the book—they contain the theoretical approach undergirding the author’s work, the practical guidance in the form of the author’s “integrated approach,” and the details of her empirical study.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"40 1","pages":"184 - 188"},"PeriodicalIF":4.1,"publicationDate":"2022-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45693580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comparing holistic and analytic marking methods in assessing speech act production in L2 Chinese
Pub Date: 2022-08-09 | DOI: 10.1177/02655322221113917
Shuai Li, Ting-hui Wen, Xian Li, Yali Feng, Chuan Lin
This study compared holistic and analytic marking methods for their effects on parameter estimation (of examinees, raters, and items) and on rater cognition in assessing speech act production in L2 Chinese. Seventy American learners of Chinese completed an oral Discourse Completion Test assessing requests and refusals. Four first-language (L1) Chinese raters evaluated the examinees’ oral productions using two four-point rating scales. The holistic scale addressed five dimensions simultaneously: communicative function, prosody, fluency, appropriateness, and grammaticality; the analytic scale examined each of the five dimensions with a separate sub-scale. The raters scored the dataset twice, once with each marking method, in counterbalanced order, and verbalized their scoring rationale after performing each rating. Results revealed that both marking methods led to high reliability and produced highly correlated scores; however, analytic marking showed better assessment quality in terms of higher reliability and measurement precision, higher percentages of Rasch model fit for examinees and items, and more balanced reference to the rating criteria among raters during the scoring process.
{"title":"Comparing holistic and analytic marking methods in assessing speech act production in L2 Chinese","authors":"Shuai Li, Ting-hui Wen, Xian Li, Yali Feng, Chuan Lin","doi":"10.1177/02655322221113917","DOIUrl":"https://doi.org/10.1177/02655322221113917","url":null,"abstract":"This study compared holistic and analytic marking methods for their effects on parameter estimation (of examinees, raters, and items) and rater cognition in assessing speech act production in L2 Chinese. Seventy American learners of Chinese completed an oral Discourse Completion Test assessing requests and refusals. Four first-language (L1) Chinese raters evaluated the examinees’ oral productions using two four-point rating scales. The holistic scale simultaneously included the following five dimensions: communicative function, prosody, fluency, appropriateness, and grammaticality; the analytic scale included sub-scales to examine each of the five dimensions. The raters scored the dataset twice with the two marking methods, respectively, and with counterbalanced order. They also verbalized their scoring rationale after performing each rating. Results revealed that both marking methods led to high reliability and produced scores with high correlation; however, analytic marking possessed better assessment quality in terms of higher reliability and measurement precision, higher percentages of Rasch model fit for examinees and items, and more balanced reference to rating criteria among raters during the scoring process.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"40 1","pages":"249 - 275"},"PeriodicalIF":4.1,"publicationDate":"2022-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41463902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Book Review: Challenges in Language Testing Around the World: Insights for Language Test Users
Pub Date: 2022-07-28 | DOI: 10.1177/02655322221113189
Atta Gebril
With the increasing role of tests worldwide, language professionals and other stakeholders are regularly involved in a wide range of assessment-related decisions in their local contexts. Such decisions vary in the stakes associated with them, and many are high-stakes. Regardless of the stakes, assessment contexts tend to share something in common: the challenges that test users encounter on a daily basis. To make matters worse, many test users operate in instructional settings with little knowledge about assessment. Taylor (2009) points to the lack of assessment literacy materials accessible to different stakeholders, arguing that available materials are “highly technical or too specialized for language educators seeking to understand basic principles and practice in assessment” (p. 23). On a related note, assessment literacy training tends to be offered in a one-size-fits-all manner that does not tap into the unique characteristics of local contexts. This practice contradicts what researchers have reported in the literature, since assessment literacy is perceived as “a social and co-constructed construct,” “no longer viewed as passive accumulation of knowledge and skills” (Yan & Fan, 2021, p. 220), and tends to be shaped by a number of contextual factors, such as linguistic background and teaching experience (Crusan et al., 2016). In light of these issues, the current volume taps into the existing challenges in different assessment and instructional settings. It is rare in our field to find a volume dedicated mainly to such challenges. There is a general sense that practitioners prefer to avoid such a negative tone when reading or writing about language assessment practices. In addition, practitioners generally have neither the incentives and resources needed for publishing nor access to a suitable platform for sharing such experiences. Challenges in Language Testing Around the World: Insights for Language Test Users by Betty Lanteigne, Christine Coombe, and James Dean Brown is a good addition to the existing body of knowledge, since it offers a closer look at “things that could get overlooked, misapplied, misinterpreted, misused” in different assessment projects (Lanteigne et al., 2021, p. v). The authors are also to be commended for the international nature of the experiences reported in this volume.
{"title":"Book Review: Challenges in Language Testing Around the World: Insights for Language Test Users","authors":"Atta Gebril","doi":"10.1177/02655322221113189","DOIUrl":"https://doi.org/10.1177/02655322221113189","url":null,"abstract":"With the increasing role of tests worldwide, language professionals and other stakeholders are regularly involved in a wide range of assessment-related decisions in their local contexts. Such decisions vary in terms of the stakes associated with them, with many involving high-stakes decisions. Regardless of the nature of the stakes, assessment contexts tend to share something in common: the challenges that test users encounter on a daily basis. To make things even worse, many test users operate in an instructional setting with little knowledge about assessment. Taylor (2009) refers to the lack of assessment literacy materials that are accessible to different stakeholders, arguing that such materials are “highly technical or too specialized for language educators seeking to understand basic principles and practice in assessment” (p. 23). On a related note, assessment literacy training tends to be offered in a one-size-fits-all manner and does not tap into the unique characteristics of local contexts. This view is in contradiction with what different researchers reported in the literature since assessment literacy is perceived as “a social and co-constructed construct,” “no longer viewed as passive accumulation of knowledge and skills” (Yan & Fan, 2021, p. 220), and tends to be impacted by a number of contextual factors, such as linguistic background and teaching experience (Crusan et al., 2016). In light of these issues, the current volume taps into the existing challenges in different assessment/instructional settings. It is rare in our field to find a volume dedicated mainly to challenges in different assessment/instructional settings. Usually, there is a general sense that practitioners do not prefer such a negative tone when reading or writing about language assessment practices. In addition, practitioners generally do not have the incentives and resources needed for publishing, nor do they have access to a suitable platform for sharing such experiences. Challenges in Language Testing Around the World: Insights for Language Test Users by Betty Lanteigne, Christine Coombe, and James Dean Brown is a good addition to the existing body of knowledge since it offers a closer look at “things that could get overlooked, misapplied, misinterpreted, misused” in different assessment projects (Lanteigne et al., 2021, p. v.). Another perspective that the authors have to be commended on is related to the international nature of the experiences reported in this volume. 1113189 LTJ0010.1177/02655322221113189Language TestingBook reviews book-reviews2022","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"40 1","pages":"180 - 183"},"PeriodicalIF":4.1,"publicationDate":"2022-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44568134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Who succeeds and who fails? Exploring the role of background variables in explaining the outcomes of L2 language tests
Pub Date: 2022-07-24 | DOI: 10.1177/02655322221100115
Ann-Kristin Helland Gujord
This study explores whether and to what extent the background information supplied by 10,155 immigrants who took an official language test in Norwegian affected their chances of passing one, two, or all three parts of the test. The background information included in the analysis was prior education, region (location of their home country), language (first language [L1] background, knowledge of English), second language (hours of second language [L2] instruction, L2 use), L1 community (years of residence, contact with L1 speakers), age, and gender. An ordered logistic regression analysis revealed that eight of the hypothesised explanatory variables significantly impacted the dependent variable (test result). Several of the significant variables relate to pre-immigration conditions, such as educational opportunities earlier in life. The findings have implications for language testing and also, to some extent, for the understanding of variation in learning outcomes.
{"title":"Who succeeds and who fails? Exploring the role of background variables in explaining the outcomes of L2 language tests","authors":"Ann-Kristin Helland Gujord","doi":"10.1177/02655322221100115","DOIUrl":"https://doi.org/10.1177/02655322221100115","url":null,"abstract":"This study explores whether and to what extent the background information supplied by 10,155 immigrants who took an official language test in Norwegian affected their chances of passing one, two, or all three parts of the test. The background information included in the analysis was prior education, region (location of their home country), language (first language [L1] background, knowledge of English), second language (hours of second language [L2] instruction, L2 use), L1 community (years of residence, contact with L1 speakers), age, and gender. An ordered logistic regression analysis revealed that eight of the hypothesised explanatory variables significantly impacted the dependent variable (test result). Several of the significant variables relate to pre-immigration conditions, such as educational opportunities earlier in life. The findings have implications for language testing and also, to some extent, for the understanding of variation in learning outcomes.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"40 1","pages":"227 - 248"},"PeriodicalIF":4.1,"publicationDate":"2022-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45523649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A sequential approach to detecting differential rater functioning in sparse rater-mediated assessment networks
Pub Date: 2022-05-12 | DOI: 10.1177/02655322221092388
Stefanie A. Wind
Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting DRF may be limited in sparse rating designs, where it is not possible for every rater to score every student. In these designs, there is limited information with which to detect DRF. Sparse designs can also exacerbate the impact of artificial DRF, which occurs when raters are inaccurately flagged for DRF due to statistical artifacts. In this study, a sequential method is adapted from previous research on differential item functioning (DIF) that allows researchers to detect DRF more accurately and distinguish between true and artificial DRF. Analyses of data from a rater-mediated writing assessment and a simulation study demonstrate that the sequential approach results in different conclusions about which raters exhibit DRF. Moreover, the simulation study results suggest that the sequential procedure results in improved accuracy in DRF detection across a variety of rating design conditions. Practical implications for language testing research are discussed.
{"title":"A sequential approach to detecting differential rater functioning in sparse rater-mediated assessment networks","authors":"Stefanie A. Wind","doi":"10.1177/02655322221092388","DOIUrl":"https://doi.org/10.1177/02655322221092388","url":null,"abstract":"Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting DRF may be limited in sparse rating designs, where it is not possible for every rater to score every student. In these designs, there is limited information with which to detect DRF. Sparse designs can also exacerbate the impact of artificial DRF, which occurs when raters are inaccurately flagged for DRF due to statistical artifacts. In this study, a sequential method is adapted from previous research on differential item functioning (DIF) that allows researchers to detect DRF more accurately and distinguish between true and artificial DRF. Analyses of data from a rater-mediated writing assessment and a simulation study demonstrate that the sequential approach results in different conclusions about which raters exhibit DRF. Moreover, the simulation study results suggest that the sequential procedure results in improved accuracy in DRF detection across a variety of rating design conditions. Practical implications for language testing research are discussed.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"40 1","pages":"209 - 226"},"PeriodicalIF":4.1,"publicationDate":"2022-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43741835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using instructor judgment, learner corpora, and DIF to develop a placement test for Spanish L2 and heritage learners
Pub Date: 2022-05-01 | DOI: 10.1177/02655322221076033
Melissa A. Bowles
This study details the development of a local test designed to place university Spanish students (n = 719) into one of four course levels and to distinguish traditional L2 learners from early bilinguals on the basis of their linguistic knowledge, regardless of the variety of Spanish they were exposed to. Early bilinguals include two groups: heritage learners (HLs), who were exposed to Spanish in their homes and communities growing up, and early L2 learners with extensive Spanish exposure (often through dual immersion education), who are increasingly enrolling in university Spanish courses and tend to pattern with HLs. Expert instructor judgment and learner corpora contributed to item development, and 12 of 15 written multiple-choice test items targeting early-acquired vocabulary showed differential item functioning (DIF) favoring HLs according to the Mantel–Haenszel procedure. Recursive partitioning revealed that vocabulary score correctly identified 597/603 (99%) of L2 learners as such, and that the six HLs whose vocabulary scores incorrectly identified them as L2 learners were in the lowest placement groups. Vocabulary scores also correctly identified 100% of the early L2 learners in the sample (n = 7) as having a heritage profile. Implications for the local context and for placement testing in general are provided.
{"title":"Using instructor judgment, learner corpora, and DIF to develop a placement test for Spanish L2 and heritage learners","authors":"Melissa A. Bowles","doi":"10.1177/02655322221076033","DOIUrl":"https://doi.org/10.1177/02655322221076033","url":null,"abstract":"This study details the development of a local test designed to place university Spanish students (n = 719) into one of the four different course levels and to distinguish between traditional L2 learners and early bilinguals on the basis of their linguistic knowledge, regardless of the variety of Spanish they were exposed to. Early bilinguals include two groups—heritage learners (HLs), who were exposed to Spanish in their homes and communities growing up, and early L2 learners with extensive Spanish exposure, often through dual immersion education, who are increasingly enrolling in university Spanish courses and tend to pattern with HLs. Expert instructor judgment and learner corpora contributed to item development, and 12 of 15 written multiple-choice test items targeting early-acquired vocabulary had differential item functioning (DIF) according to the Mantel–Haenszel procedure, favoring HLs. Recursive partitioning revealed that vocabulary score correctly identified 597/603 (99%) of L2 learners as such, and the six HLs whose vocabulary scores incorrectly identified them as L2 learners were in the lowest placement groups. Vocabulary scores also correctly identified 100% of the early L2 learners in the sample (n = 7) as having a heritage profile. Implications for the local context and for placement testing in general are provided.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"39 1","pages":"355 - 376"},"PeriodicalIF":4.1,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46971326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Local placement test retrofit and building language assessment literacy with teacher stakeholders: A case study from Colombia
Pub Date: 2022-04-14 | DOI: 10.1177/02655322221076153
Gerriet Janssen
This article provides a single, common-case study of a test retrofit project at one Colombian university. It reports on how the project was carried out and describes the different areas of language assessment literacy it afforded local teacher stakeholders. The project was successful in that it modified the test constructs and item types while drawing stronger connections between the curriculum and the placement instrument. It also established a conceptual framework for the test and produced a psychometrically more robust test form. The project intersected with different social forces, which shaped its outcome in various ways. It also illustrates how test retrofit provided local teachers with opportunities for language assessment literacy and with evidence-based knowledge about their students’ language proficiency. The study concludes that local assessment projects can benefit local teachers, especially in terms of increased language assessment literacy. Intrinsic to a project’s sustainability are long-term financial commitment and dedicated, institutionally established time assigned to teacher participants. The study also concludes that project leadership requires both assessment and political skill sets to conduct defensible research while compelling institutions to see the potential benefits of an ongoing test development or retrofit project.
{"title":"Local placement test retrofit and building language assessment literacy with teacher stakeholders: A case study from Colombia","authors":"Gerriet Janssen","doi":"10.1177/02655322221076153","DOIUrl":"https://doi.org/10.1177/02655322221076153","url":null,"abstract":"This article provides a single, common-case study of a test retrofit project at one Colombian university. It reports on how the test retrofit project was carried out and describes the different areas of language assessment literacy the project afforded local teacher stakeholders. This project was successful in that it modified the test constructs and item types, while drawing stronger connections between the curriculum and the placement instrument. It also established a conceptual framework for the test and produced a more robust test form, psychometrically. The project intersected with different social forces, which impacted the project’s outcome in various ways. The project also illustrates how test retrofit provided local teachers with opportunities for language assessment literacy and with evidence-based knowledge about their students’ language proficiency. The study concludes that local assessment projects have the capacity to benefit local teachers, especially in terms of increased language assessment literacy. Intrinsic to a project’s sustainability are long-term financial commitment and institutionally established dedicated time, assigned to teacher participants. The study also concludes that project leadership requires both assessment and political skill sets, to conduct defensible research while compelling institutions to see the potential benefits of an ongoing test development or retrofit project.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"39 1","pages":"377 - 400"},"PeriodicalIF":4.1,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48953721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Test Review: The International English Language Testing System (IELTS)
Pub Date: 2022-04-04 | DOI: 10.1177/02655322221086211
J. Read
{"title":"Test Review: The International English Language Testing System (IELTS)","authors":"J. Read","doi":"10.1177/02655322221086211","DOIUrl":"https://doi.org/10.1177/02655322221086211","url":null,"abstract":"","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"39 1","pages":"679 - 694"},"PeriodicalIF":4.1,"publicationDate":"2022-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48370702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}