
Language Testing: Latest Articles

A scoping review of research on second language test preparation
IF 4.1 | Zone 1 (Literature) | Q1 Arts and Humanities | Pub Date: 2024-05-31 | DOI: 10.1177/02655322241249754
Shanshan He, Anne-Marie Sénécal, Laura Stansfield, Ruslan Suvorov
Test preparation has garnered considerable attention in second language (L2) education due to the significant implications that successful performance on a language test may have for academic advancement, future career opportunities, and immigration prospects. Meanwhile, an overemphasis on test preparation has been criticized for encouraging the cultivation of construct-irrelevant test-taking strategies at the expense of developing general language proficiency. To systematically explore how test preparation has been investigated in the literature, we conducted a scoping review of 66 studies on L2 test preparation. Specifically, this study examined the key characteristics of publications on test preparation, the main themes explored, the study and participant characteristics, as well as the essential aspects of their research methodologies. The results of this review revealed various trends in the literature on L2 test preparation, such as the exclusive focus on English as the target language, the lack of diversity in stakeholders as participants, the dominance of international language tests, and the paucity of experimental studies that utilize advanced statistical techniques. In addition to interpreting the results of our analysis, we discuss the implications of this scoping review and outline several directions for future research on test preparation.
Citations: 0
The moderating role of L2 proficiency in the predictive power of L1 fluency on L2 utterance fluency
IF 4.1 | Zone 1 (Literature) | Q1 Arts and Humanities | Pub Date: 2024-04-17 | DOI: 10.1177/02655322241241851
Shungo Suzuki, Judit Kormos
The current study examined the extent to which first language (L1) utterance fluency measures can predict second language (L2) fluency and how L2 proficiency moderates the relationship between L1 and L2 fluency. A total of 104 Japanese-speaking learners of English completed different argumentative speech tasks in their L1 and L2. Their speaking performance was analysed using measures of speed, breakdown, and repair fluency. L2 proficiency was operationalised as cognitive fluency. Two factor scores of cognitive fluency—linguistic resources and processing speed—were computed based on performance in a set of linguistic knowledge tests capturing vocabulary knowledge, morphosyntactic processing, and articulatory skills. A series of generalised linear mixed-effects models revealed small-to-moderate effect sizes for the predictive power of L1 utterance fluency measures on their L2 counterparts. Moderator effects of L2 proficiency were found only in speed fluency measures. The relationship between L1 and L2 speed fluency was weaker for L2 learners with wider L2 linguistic resources. Conversely, for those with faster L2 processing speed, the L1-L2 link tended to be stronger. These findings indicate that the L1-L2 fluency link is subject to the complex interplay of phonological differences between learners’ L1 and L2 and their L2 proficiency, offering implications for diagnostic speaking assessment.
Citations: 0
The effect of viewing visual cues in a listening comprehension test on second language learners’ test-taking process and performance: An eye-tracking study
IF 4.1 | Zone 1 (Literature) | Q1 Arts and Humanities | Pub Date: 2024-04-17 | DOI: 10.1177/02655322241239356
Suh Keong Kwon, Guoxing Yu
In this study, we examined the effect of visual cues in a second language listening test on test takers’ viewing behaviours and their test performance. Fifty-seven learners of English in Korea took a video-based listening test, with their eye movements recorded, and 23 of them were interviewed individually after the test. The participants viewed the visual cues longer than the items in the multiple-choice questions. Looking at the correct answer choice was related to a higher test score, while looking at the speaker(s) in the video and the distractors of the test items to a lower test score. Viewing the PowerPoint slides showed mixed effects on test performance, depending on different eye-movement measures. Stimulated-recall interviews shed further light on the possible reasons for the different patterns of the participants’ eye movements. Overall, the participants held the positive view that the visual cues aided them in comprehending the aural input and in completing the listening tasks more successfully. We discuss these findings in relation to the authenticity of tasks and the construct relevance of video-based listening tests.
Citations: 0
Book review: From assessment to feedback by Inez De Florio
IF 4.1 | Zone 1 (Literature) | Q1 Arts and Humanities | Pub Date: 2024-04-17 | DOI: 10.1177/02655322241246574
Salomé Villa Larenas
Citations: 0
Developing internet-based Tests of Aptitude for Language Learning (TALL): An open research endeavour
IF 4.1 | Zone 1 (Literature) | Q1 Arts and Humanities | Pub Date: 2024-04-16 | DOI: 10.1177/02655322241241849
Junlan Pan, Emma Marsden
Tests of Aptitude for Language Learning (TALL) is an openly accessible internet-based battery to measure the multifaceted construct of foreign language aptitude, using language domain–specific instruments and L1-sensitive instructions and stimuli. This brief report introduces the components of this theory-informed battery and methodological considerations for developing it into an open research instrument. It also presents the preliminary results from the initial validation of TALL carried out on data collected from Chinese L1 participants (n = 165) from a university setting who took two rounds of tests (with counterbalanced test items) with a minimum 30-day interval. The results of data analyses at subtest, item, and battery levels suggest that, in general, TALL has satisfactory reliability and can be used to measure aptitude conceptualized in the theoretical frameworks on which it has been developed. This report also highlights the value of TALL as a convenient data collection tool openly accessible to any researcher for free, its potential for facilitating an open data pool for high-quality syntheses of aptitude-related research findings, and its implications for Open Research practices in testing language-related constructs.
Citations: 0
All types of experience are equal, but some are more equal: The effect of different types of experience on rater severity and rater consistency
IF 4.1 | Zone 1 (Literature) | Q1 Arts and Humanities | Pub Date: 2024-04-10 | DOI: 10.1177/02655322241239362
Reeta Neittaanmäki, Iasonas Lamprianou
This article focuses on rater severity and consistency and their relation to different types of rater experience over a long period of time. The article is based on longitudinal data collected from 2009 to 2019 from the second language Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. The study investigated whether rater severity and consistency are affected differently by different types of rater experience and by skipping rating sessions. The data consisted of 45 rating sessions with 104 raters and 59,899 examinees and were analyzed using the Many-Facets Rasch model and generalized linear mixed models. The results showed that when the raters gained more rating experience, they became slightly more lenient, but different types of experience had quantitatively different magnitudes of impact. In addition, skipping rating sessions, and in that way disconnecting from the rater community, increased the likelihood of a rater being inconsistent. Finally, we provide methodological recommendations for future research and consider implications for practice.
Citations: 0
Communal factors in rater severity and consistency over time in high-stakes oral assessment
IF 4.1 | Zone 1 (Literature) | Q1 Arts and Humanities | Pub Date: 2024-04-10 | DOI: 10.1177/02655322241239363
Reeta Neittaanmäki, Iasonas Lamprianou
This article focuses on rater severity and consistency and their relation to major changes in the rating system in a high-stakes testing context. The study is based on longitudinal data collected from 2009 to 2019 from the second language (L2) Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. We investigated whether rater severity and consistency changed over that period and whether the changes could be explained by major changes in the rating system, such as the change of lead examiner, the modus of rating and training (on-site or remote), and the composition of the rater group. The data consisted of 45 rating sessions with 104 raters and 59,899 examinees and were analysed using the Many-Facets Rasch model and generalized linear mixed models. The analyses indicated that raters as a group became somewhat more lenient over time. In addition, the results showed that the rater community and its practices, the lead examiners, and the modus of rating and training can influence the rating behaviour. Finally, we elaborate on implications for both research and practice.
Citations: 0
Test score comparison tables: How well are they serving test users?
IF 4.1 | Zone 1 (Literature) | Q1 Arts and Humanities | Pub Date: 2024-03-26 | DOI: 10.1177/02655322241239348
U. Knoch, Jason Fan
While several test concordance tables have been published, the research underpinning such tables has rarely been examined in detail. This study aimed to survey the publicly available studies or documentation underpinning the test concordance tables of the providers of four major international language tests, all accepted by the Australian Department of Home Affairs for Australian visa purposes. To evaluate the concordance studies, we first identified the good practice principles in concordance research through a review of both the relevant literature and leading professional standards in the field of educational measurement and language assessment. Next, we reviewed the concordance studies against the identified good practice principles. Our findings revealed that the information supplied by test providers varied, with some making the full research papers available, whereas others provided little information about their underpinning research. None of the concordance studies fulfilled all the good practice principles. Based on the findings of this study, we offer recommendations for future concordance research in the field of language testing as well as suggestions for practice.
Citations: 0
Book review: L2 Writing Assessment: An Evolutionary Perspective
IF 4.1 | Zone 1 (Literature) | Q1 Arts and Humanities | Pub Date: 2024-03-22 | DOI: 10.1177/02655322241239355
Khaled Barkaoui
Citations: 0
Evaluating methodological enhancements to the Yes/No Angoff standard-setting method in language proficiency assessment
IF 4.1 | Zone 1 (Literature) | Q1 Arts and Humanities | Pub Date: 2024-02-12 | DOI: 10.1177/02655322231222600
Tia M. Fechter, Heeyeon Yoon
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent mixed-methods study design. The study used the Yes/No ratings as the baseline method in two rounds of ratings, while differentiating the two methods by incorporating item maps and an Ordered Item Booklet, each of which is an integral tool of the Mapmark and the Bookmark methods. The results showed that the internal validity evidence is similar across both methods, especially after Round 2 ratings. When procedural validity evidence was considered, however, a preference emerged for the method where panelists conducted the initial ratings without knowledge of the empirical item difficulty information, and then such information was provided on an item map as part of the Round 1 feedback. The findings highlight the importance of evaluating both internal and procedural validity evidence when considering standard-setting methods.
Citations: 0