
Latest publications in Language Testing

Book Review: Assessing Academic English for Higher Education Admissions
IF 4.1, CAS Tier 1 (Literature), Q1 Arts and Humanities. Pub Date: 2022-08-24. DOI: 10.1177/02655322221118069
Diane Schmitt
Citations: 0
A meta-analysis on the predictive validity of English language proficiency assessments for college admissions
IF 4.1, CAS Tier 1 (Literature), Q1 Arts and Humanities. Pub Date: 2022-08-16. DOI: 10.1177/02655322221112364
Samuel D. Ihlenfeldt, Joseph A. Rios
For institutions where English is the primary language of instruction, English assessments for admissions such as the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) give admissions decision-makers a sense of a student's skills in academic English. Despite this explicit purpose, these exams have also been used to predict academic success. In this study, we meta-analytically synthesized 132 effect sizes from 32 studies containing validity evidence for academic English assessments to determine whether different assessments (a) predicted academic success (as measured by grade point average [GPA]) and (b) did so comparably. Overall, assessments had a weak positive correlation with academic achievement (r = .231, p < .001). Additionally, no significant differences were found in the predictive power of the IELTS and TOEFL exams. No moderators were significant, indicating that these findings held across school type, school level, and publication type. Although significant, the overall correlation was low; thus, practitioners are cautioned against using standardized English-language proficiency test scores in isolation in lieu of a holistic application review during the admissions process.
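The synthesis step behind a result like the pooled r = .231 is conventionally done on Fisher's z scale rather than on raw correlations. A minimal fixed-effect sketch of that pooling, using hypothetical effect sizes and sample sizes rather than the study's data:

```python
import math

def pool_correlations(rs, ns):
    """Fixed-effect pooling of Pearson correlations via Fisher's z.

    Each r is mapped to z = atanh(r), weighted by n - 3 (the inverse
    of z's sampling variance), averaged, and back-transformed.
    """
    zs = [math.atanh(r) for r in rs]
    ws = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    return math.tanh(z_bar)

# Hypothetical effect sizes and sample sizes, not the 32 studies' data
rs = [0.18, 0.25, 0.31, 0.22]
ns = [120, 85, 240, 60]
pooled = pool_correlations(rs, ns)  # a weak positive pooled correlation
```

A full meta-analysis would typically add a random-effects variance component and moderator tests; this sketch shows only the core weighting logic.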
Citations: 2
Book Review: Multilingual Testing and Assessment
IF 4.1, CAS Tier 1 (Literature), Q1 Arts and Humanities. Pub Date: 2022-08-15. DOI: 10.1177/02655322221114895
Beverly A. Baker
From both a theoretical and an empirical perspective, this volume addresses the challenges of testing learners of multiple school languages. The author states that "This volume is intended as a non-technical resource to offer help and guidance to all those who work in education with multilingual populations" (p. 1). In that sense, it is not a book about the assessment of language per se (although she presents a research study in which she collects information on students' language proficiency). Rather, it is intended primarily for non-language specialists: educators working with multilingual learners across all subjects. As she states throughout the work, the author addresses what she sees as limitations in theoretical and empirical work that considers only two languages, claiming that such work offers limited insights for those working with speakers of more than two languages. The author is motivated by the fair assessment of all students, including linguistically and culturally minoritized students. What follows is a summary and critical discussion of the book, beginning with an overview of each chapter and then offering critical commentary on a few chapters in particular (Chapters 2, 5, and 7). Given the repetition of ideas across the chapters, I assume that many of them have been designed to be read on a stand-alone basis. I have chosen these chapters as the focus of my comments because in my view they form the core of the book: they contain the theoretical approach undergirding the author's work, the practical guidance in the form of the author's "integrated approach," and the details of her empirical study.
Citations: 0
Comparing holistic and analytic marking methods in assessing speech act production in L2 Chinese
IF 4.1, CAS Tier 1 (Literature), Q1 Arts and Humanities. Pub Date: 2022-08-09. DOI: 10.1177/02655322221113917
Shuai Li, Ting-hui Wen, Xian Li, Yali Feng, Chuan Lin
This study compared holistic and analytic marking methods for their effects on parameter estimation (of examinees, raters, and items) and rater cognition in assessing speech act production in L2 Chinese. Seventy American learners of Chinese completed an oral Discourse Completion Test assessing requests and refusals. Four first-language (L1) Chinese raters evaluated the examinees' oral productions using two four-point rating scales. The holistic scale simultaneously covered five dimensions: communicative function, prosody, fluency, appropriateness, and grammaticality; the analytic scale included a sub-scale for each of the five dimensions. The raters scored the dataset twice, once with each marking method, in counterbalanced order. They also verbalized their scoring rationale after performing each rating. Results revealed that both marking methods led to high reliability and produced highly correlated scores; however, analytic marking showed better assessment quality in terms of higher reliability and measurement precision, higher percentages of Rasch model fit for examinees and items, and more balanced reference to rating criteria among raters during the scoring process.
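The holistic-analytic score comparison rests on agreement between two sets of ratings of the same performances. As a simplified illustration (not the study's Rasch analysis), a quadratically weighted kappa on hypothetical 4-point ratings:

```python
def quadratic_weighted_kappa(a, b):
    """Chance-corrected agreement between two ratings of the same
    examinees, penalizing disagreements by squared score distance."""
    n = len(a)
    observed = sum((x - y) ** 2 for x, y in zip(a, b)) / n
    # expected disagreement if the two score sets were independent
    expected = sum((x - y) ** 2 for x in a for y in b) / (n * n)
    return 1 - observed / expected

# Hypothetical holistic vs. analytic scores on a 4-point scale
holistic = [3, 2, 4, 1, 3, 2, 4, 3]
analytic = [3, 2, 4, 2, 3, 1, 4, 3]
agreement = quadratic_weighted_kappa(holistic, analytic)
```

Values near 1 indicate near-identical rank ordering of examinees under the two methods; a many-facet Rasch model, as used in the study, additionally separates rater severity and item difficulty from examinee ability.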
Citations: 2
Book Review: Challenges in Language Testing Around the World: Insights for Language Test Users
IF 4.1, CAS Tier 1 (Literature), Q1 Arts and Humanities. Pub Date: 2022-07-28. DOI: 10.1177/02655322221113189
Atta Gebril
With the increasing role of tests worldwide, language professionals and other stakeholders are regularly involved in a wide range of assessment-related decisions in their local contexts. Such decisions vary in the stakes associated with them, and many are high-stakes. Regardless of the stakes, assessment contexts tend to share something in common: the challenges that test users encounter on a daily basis. To make matters worse, many test users operate in instructional settings with little knowledge about assessment. Taylor (2009) notes the lack of assessment literacy materials accessible to different stakeholders, arguing that existing materials are "highly technical or too specialized for language educators seeking to understand basic principles and practice in assessment" (p. 23). On a related note, assessment literacy training tends to be offered in a one-size-fits-all manner that does not tap into the unique characteristics of local contexts. This view contradicts what researchers have reported in the literature, since assessment literacy is perceived as "a social and co-constructed construct," "no longer viewed as passive accumulation of knowledge and skills" (Yan & Fan, 2021, p. 220), and tends to be shaped by a number of contextual factors, such as linguistic background and teaching experience (Crusan et al., 2016). In light of these issues, the current volume taps into the existing challenges in different assessment/instructional settings. It is rare in our field to find a volume dedicated mainly to challenges in different assessment/instructional settings. Usually, there is a general sense that practitioners do not prefer such a negative tone when reading or writing about language assessment practices. In addition, practitioners generally have neither the incentives and resources needed for publishing nor access to a suitable platform for sharing such experiences.
Challenges in Language Testing Around the World: Insights for Language Test Users by Betty Lanteigne, Christine Coombe, and James Dean Brown is a good addition to the existing body of knowledge, since it offers a closer look at "things that could get overlooked, misapplied, misinterpreted, misused" in different assessment projects (Lanteigne et al., 2021, p. v). Another perspective on which the authors should be commended is the international nature of the experiences reported in this volume.
Citations: 1
Who succeeds and who fails? Exploring the role of background variables in explaining the outcomes of L2 language tests
IF 4.1, CAS Tier 1 (Literature), Q1 Arts and Humanities. Pub Date: 2022-07-24. DOI: 10.1177/02655322221100115
Ann-Kristin Helland Gujord
This study explores whether and to what extent the background information supplied by 10,155 immigrants who took an official language test in Norwegian affected their chances of passing one, two, or all three parts of the test. The background information included in the analysis was prior education, region (location of their home country), language (first language [L1] background, knowledge of English), second language (hours of second language [L2] instruction, L2 use), L1 community (years of residence, contact with L1 speakers), age, and gender. An ordered logistic regression analysis revealed that eight of the hypothesised explanatory variables significantly impacted the dependent variable (test result). Several of the significant variables relate to pre-immigration conditions, such as educational opportunities earlier in life. The findings have implications for language testing and also, to some extent, for the understanding of variation in learning outcomes.
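The ordered logistic regression used here models the cumulative probability of reaching each outcome level (passing one, two, or all three parts) as a function of the background variables. A proportional-odds sketch with assumed coefficients and cutpoints, all values hypothetical rather than estimates from the study:

```python
import math

def ordered_logit_probs(x, betas, cutpoints):
    """Category probabilities under a proportional-odds (ordered
    logistic) model: P(Y <= j) = logistic(c_j - x'beta), and each
    category probability is a difference of adjacent cumulatives."""
    eta = sum(b * v for b, v in zip(betas, x))
    cum = [1 / (1 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# All values hypothetical: two scaled predictors (e.g., years of prior
# education, hours of L2 instruction) and three thresholds separating
# the outcomes "passed 0 / 1 / 2 / all 3 parts"
probs = ordered_logit_probs([1.2, 0.5], [0.8, 0.6], [-0.5, 0.7, 1.9])
```

The four probabilities sum to 1; a positive coefficient shifts probability mass toward the higher outcome categories, which is how a predictor like prior education would raise the chance of passing more test parts.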
Citations: 2
A sequential approach to detecting differential rater functioning in sparse rater-mediated assessment networks
IF 4.1, CAS Tier 1 (Literature), Q1 Arts and Humanities. Pub Date: 2022-05-12. DOI: 10.1177/02655322221092388
Stefanie A. Wind
Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting DRF may be limited in sparse rating designs, where it is not possible for every rater to score every student. In these designs, there is limited information with which to detect DRF. Sparse designs can also exacerbate the impact of artificial DRF, which occurs when raters are inaccurately flagged for DRF due to statistical artifacts. In this study, a sequential method is adapted from previous research on differential item functioning (DIF) that allows researchers to detect DRF more accurately and distinguish between true and artificial DRF. Analyses of data from a rater-mediated writing assessment and a simulation study demonstrate that the sequential approach results in different conclusions about which raters exhibit DRF. Moreover, the simulation study results suggest that the sequential procedure results in improved accuracy in DRF detection across a variety of rating design conditions. Practical implications for language testing research are discussed.
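The purification idea behind a sequential approach (flag, remove from the baseline, re-estimate, repeat) can be sketched in a deliberately simplified form. This illustrative screen uses raw mean severity rather than the Rasch-based DRF statistics the study employs, and all rater names and the threshold are hypothetical:

```python
def flag_severe_raters(scores, threshold=1.0, max_iter=10):
    """Illustrative sequential screen for severe or lenient raters.

    scores maps rater -> list of awarded scores. A rater is flagged
    when its mean score differs from the mean of the currently
    unflagged raters by more than `threshold`; the baseline is then
    recomputed without flagged raters and the screen repeats until
    no new flags appear (a purification loop).
    """
    flagged = set()
    for _ in range(max_iter):
        clean = [s for r, ss in scores.items() if r not in flagged for s in ss]
        if not clean:  # everyone flagged: nothing left to compare against
            break
        baseline = sum(clean) / len(clean)
        new = {r for r, ss in scores.items()
               if r not in flagged
               and abs(sum(ss) / len(ss) - baseline) > threshold}
        if not new:
            break
        flagged |= new
    return flagged

# Hypothetical ratings on a 6-point scale; rater "C" is noticeably harsh
scores = {"A": [4, 5, 4, 4], "B": [4, 4, 5, 5], "C": [2, 3, 2, 2]}
flagged = flag_severe_raters(scores)
```

Recomputing the baseline after each round of flagging is what guards against artificial DRF: a genuinely severe rater drags down the initial baseline, which can make accurate raters look lenient until the severe rater is removed from the comparison.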
Citations: 0
Using instructor judgment, learner corpora, and DIF to develop a placement test for Spanish L2 and heritage learners
IF 4.1, CAS Tier 1 (Literature), Q1 Arts and Humanities. Pub Date: 2022-05-01. DOI: 10.1177/02655322221076033
Melissa A. Bowles
This study details the development of a local test designed to place university Spanish students (n = 719) into one of the four different course levels and to distinguish between traditional L2 learners and early bilinguals on the basis of their linguistic knowledge, regardless of the variety of Spanish they were exposed to. Early bilinguals include two groups—heritage learners (HLs), who were exposed to Spanish in their homes and communities growing up, and early L2 learners with extensive Spanish exposure, often through dual immersion education, who are increasingly enrolling in university Spanish courses and tend to pattern with HLs. Expert instructor judgment and learner corpora contributed to item development, and 12 of 15 written multiple-choice test items targeting early-acquired vocabulary had differential item functioning (DIF) according to the Mantel–Haenszel procedure, favoring HLs. Recursive partitioning revealed that vocabulary score correctly identified 597/603 (99%) of L2 learners as such, and the six HLs whose vocabulary scores incorrectly identified them as L2 learners were in the lowest placement groups. Vocabulary scores also correctly identified 100% of the early L2 learners in the sample (n = 7) as having a heritage profile. Implications for the local context and for placement testing in general are provided.
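The Mantel-Haenszel procedure mentioned here pools 2x2 tables of correct/incorrect counts across ability strata into a common odds ratio. A minimal sketch with hypothetical counts for one item (not the study's data); a value far from 1 would flag the item for DIF:

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio across ability strata.

    Each stratum is (a, b, c, d):
      a = reference group correct,  b = reference group incorrect,
      c = focal group correct,      d = focal group incorrect.
    Values far from 1 suggest the item functions differently for the
    two groups at matched ability.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical counts for one vocabulary item, stratified by total
# score band; focal = heritage learners, reference = L2 learners
strata = [(30, 20, 28, 7), (45, 15, 30, 3), (50, 5, 20, 1)]
alpha_mh = mantel_haenszel_or(strata)  # < 1 here: item favors the focal group
```

With this orientation of the table, an odds ratio below 1 means the focal group has higher odds of answering correctly at matched ability, which is the direction ("favoring HLs") the study reports for 12 of its 15 vocabulary items.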
Citations: 1
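The Mantel–Haenszel procedure used in the abstract above can be sketched in a few lines: examinees are matched on total score, and for each score stratum a 2×2 table of group × item response is accumulated into a common odds ratio. The score strata, group labels, and counts below are hypothetical illustrations for a single item, not data from the study.

```python
import math
from collections import defaultdict

def mantel_haenszel_alpha(scores, groups, responses):
    """Mantel-Haenszel common odds ratio for one test item.

    scores    -- total test scores, used as matching strata
    groups    -- 'R' (reference group) or 'F' (focal group)
    responses -- 1 if the item was answered correctly, else 0
    """
    # Per-stratum 2x2 counts: [ref correct, ref wrong, focal correct, focal wrong]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for s, g, r in zip(scores, groups, responses):
        strata[s][(0 if g == 'R' else 2) + (0 if r else 1)] += 1

    num = den = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        if n:
            num += a * d / n
            den += b * c / n
    return num / den  # alpha < 1: the item favors the focal group

# Hypothetical data: two score strata; the focal group outperforms on the item
spec = [  # (stratum, group, response, count)
    (1, 'R', 1, 6), (1, 'R', 0, 4), (1, 'F', 1, 8), (1, 'F', 0, 2),
    (2, 'R', 1, 3), (2, 'R', 0, 7), (2, 'F', 1, 7), (2, 'F', 0, 3),
]
rows = [(s, g, r) for s, g, r, n in spec for _ in range(n)]
scores, groups, responses = zip(*rows)

alpha = mantel_haenszel_alpha(scores, groups, responses)
delta = -2.35 * math.log(alpha)  # ETS delta scale; |delta| > 1.5 is notable DIF
print(round(alpha, 3), round(delta, 2))  # -> 0.259 3.17
```

With alpha well below 1 (positive delta), this hypothetical item would be flagged as favoring the focal group, which is the direction of DIF the study reports for its vocabulary items favoring heritage learners.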
Local placement test retrofit and building language assessment literacy with teacher stakeholders: A case study from Colombia
IF 4.1 1区 文学 Q1 Arts and Humanities Pub Date : 2022-04-14 DOI: 10.1177/02655322221076153
Gerriet Janssen
This article provides a single, common-case study of a test retrofit project at one Colombian university. It reports on how the test retrofit project was carried out and describes the different areas of language assessment literacy the project afforded local teacher stakeholders. This project was successful in that it modified the test constructs and item types, while drawing stronger connections between the curriculum and the placement instrument. It also established a conceptual framework for the test and produced a more robust test form, psychometrically. The project intersected with different social forces, which impacted the project’s outcome in various ways. The project also illustrates how test retrofit provided local teachers with opportunities for language assessment literacy and with evidence-based knowledge about their students’ language proficiency. The study concludes that local assessment projects have the capacity to benefit local teachers, especially in terms of increased language assessment literacy. Intrinsic to a project’s sustainability are long-term financial commitment and institutionally established dedicated time, assigned to teacher participants. The study also concludes that project leadership requires both assessment and political skill sets, to conduct defensible research while compelling institutions to see the potential benefits of an ongoing test development or retrofit project.
{"title":"Local placement test retrofit and building language assessment literacy with teacher stakeholders: A case study from Colombia","authors":"Gerriet Janssen","doi":"10.1177/02655322221076153","DOIUrl":"https://doi.org/10.1177/02655322221076153","url":null,"abstract":"This article provides a single, common-case study of a test retrofit project at one Colombian university. It reports on how the test retrofit project was carried out and describes the different areas of language assessment literacy the project afforded local teacher stakeholders. This project was successful in that it modified the test constructs and item types, while drawing stronger connections between the curriculum and the placement instrument. It also established a conceptual framework for the test and produced a more robust test form, psychometrically. The project intersected with different social forces, which impacted the project’s outcome in various ways. The project also illustrates how test retrofit provided local teachers with opportunities for language assessment literacy and with evidence-based knowledge about their students’ language proficiency. The study concludes that local assessment projects have the capacity to benefit local teachers, especially in terms of increased language assessment literacy. Intrinsic to a project’s sustainability are long-term financial commitment and institutionally established dedicated time, assigned to teacher participants. 
The study also concludes that project leadership requires both assessment and political skill sets, to conduct defensible research while compelling institutions to see the potential benefits of an ongoing test development or retrofit project.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":null,"pages":null},"PeriodicalIF":4.1,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48953721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Test Review: The International English Language Testing System (IELTS)
IF 4.1 1区 文学 Q1 Arts and Humanities Pub Date : 2022-04-04 DOI: 10.1177/02655322221086211
J. Read
{"title":"Test Review: The International English Language Testing System (IELTS)","authors":"J. Read","doi":"10.1177/02655322221086211","DOIUrl":"https://doi.org/10.1177/02655322221086211","url":null,"abstract":"","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":null,"pages":null},"PeriodicalIF":4.1,"publicationDate":"2022-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48370702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5