Pub Date: 2024-05-31 | DOI: 10.1177/02655322241249754
Shanshan He, Anne-Marie Sénécal, Laura Stansfield, Ruslan Suvorov
Test preparation has garnered considerable attention in second language (L2) education due to the significant implications that successful performance on a language test may have for academic advancement, future career opportunities, and immigration prospects. Meanwhile, an overemphasis on test preparation has been criticized for encouraging the cultivation of construct-irrelevant test-taking strategies at the expense of developing general language proficiency. To systematically explore how test preparation has been investigated in the literature, we conducted a scoping review of 66 studies on L2 test preparation. Specifically, this study examined the key characteristics of publications on test preparation, the main themes explored, the study and participant characteristics, as well as the essential aspects of their research methodologies. The results of this review revealed various trends in the literature on L2 test preparation, such as the exclusive focus on English as the target language, the lack of diversity in stakeholders as participants, the dominance of international language tests, and the paucity of experimental studies that utilize advanced statistical techniques. In addition to interpreting the results of our analysis, we discuss the implications of this scoping review and outline several directions for future research on test preparation.
Title: A scoping review of research on second language test preparation (Language Testing)
Pub Date: 2024-04-17 | DOI: 10.1177/02655322241241851
Shungo Suzuki, Judit Kormos
The current study examined the extent to which first language (L1) utterance fluency measures can predict second language (L2) fluency and how L2 proficiency moderates the relationship between L1 and L2 fluency. A total of 104 Japanese-speaking learners of English completed different argumentative speech tasks in their L1 and L2. Their speaking performance was analysed using measures of speed, breakdown, and repair fluency. L2 proficiency was operationalised as cognitive fluency. Two factor scores of cognitive fluency—linguistic resources and processing speed—were computed based on performance in a set of linguistic knowledge tests capturing vocabulary knowledge, morphosyntactic processing, and articulatory skills. A series of generalised linear mixed-effects models revealed small-to-moderate effect sizes for the predictive power of L1 utterance fluency measures on their L2 counterparts. Moderator effects of L2 proficiency were found only in speed fluency measures. The relationship between L1 and L2 speed fluency was weaker for L2 learners with wider L2 linguistic resources. Conversely, for those with faster L2 processing speed, the L1-L2 link tended to be stronger. These findings indicate that the L1-L2 fluency link is subject to the complex interplay of phonological differences between learners’ L1 and L2 and their L2 proficiency, offering implications for diagnostic speaking assessment.
Title: The moderating role of L2 proficiency in the predictive power of L1 fluency on L2 utterance fluency (Language Testing)
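A moderation analysis of the kind described above can be sketched with a linear mixed-effects model. This is a minimal illustration on simulated data, not the authors' actual specification: the variable names, effect sizes, task structure, and the single random intercept are all invented for the example.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n_speakers, n_tasks = 100, 3

# Speaker-level predictors, constant across each speaker's tasks.
l1_speed = np.repeat(rng.normal(0, 1, n_speakers), n_tasks)
proficiency = np.repeat(rng.normal(0, 1, n_speakers), n_tasks)
speaker = np.repeat(np.arange(n_speakers), n_tasks)
speaker_intercept = np.repeat(rng.normal(0, 0.3, n_speakers), n_tasks)

# A negative interaction weakens the L1-L2 link at higher proficiency,
# mirroring the direction reported for linguistic resources.
l2_speed = (0.5 * l1_speed + 0.4 * proficiency
            - 0.2 * l1_speed * proficiency
            + speaker_intercept
            + rng.normal(0, 0.5, n_speakers * n_tasks))

df = pd.DataFrame({"speaker": speaker, "l1_speed": l1_speed,
                   "proficiency": proficiency, "l2_speed": l2_speed})

# Random intercept per speaker; the fixed part includes the interaction
# term that carries the moderation effect.
model = smf.mixedlm("l2_speed ~ l1_speed * proficiency",
                    df, groups=df["speaker"]).fit()
print(model.params[["l1_speed", "l1_speed:proficiency"]])
```

A positive `l1_speed` coefficient alongside a negative `l1_speed:proficiency` coefficient would correspond to an L1-L2 fluency link that weakens as proficiency rises.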
Pub Date: 2024-04-17 | DOI: 10.1177/02655322241239356
Suh Keong Kwon, Guoxing Yu
In this study, we examined the effect of visual cues in a second language listening test on test takers’ viewing behaviours and their test performance. Fifty-seven learners of English in Korea took a video-based listening test, with their eye movements recorded, and 23 of them were interviewed individually after the test. The participants viewed the visual cues longer than the items in the multiple-choice questions. Looking at the correct answer choice was related to a higher test score, whereas looking at the speaker(s) in the video and at the distractors of the test items was related to a lower test score. Viewing the PowerPoint slides showed mixed effects on test performance, depending on different eye-movement measures. Stimulated-recall interviews shed further light on the possible reasons for the different patterns of the participants’ eye movements. Overall, the participants held the positive view that the visual cues aided them in comprehending the aural input and in completing the listening tasks more successfully. We discuss these findings in relation to the authenticity of tasks and the construct relevance of video-based listening tests.
Title: The effect of viewing visual cues in a listening comprehension test on second language learners’ test-taking process and performance: An eye-tracking study (Language Testing)
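Eye-tracking analyses of this kind typically aggregate raw fixations into dwell time per area of interest (AOI) before relating viewing behaviour to scores. Below is a minimal sketch of that aggregation step; the AOI names and fixation records are invented for illustration and are not the study's data.

```python
import pandas as pd

# Hypothetical fixation log: each row is one fixation on an AOI.
fixations = pd.DataFrame({
    "aoi": ["speaker", "slide", "correct_option", "distractor",
            "slide", "correct_option"],
    "duration_ms": [420, 610, 350, 180, 290, 400],
})

# Total dwell time per AOI, and each AOI's share of total viewing time.
dwell = fixations.groupby("aoi")["duration_ms"].sum()
dwell_prop = dwell / dwell.sum()
print(dwell_prop.round(3))
```

Measures like `dwell_prop` per AOI (speaker, slides, answer options) are the kind of predictor that can then be correlated with item-level test performance.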
Pub Date: 2024-04-17 | DOI: 10.1177/02655322241246574
Salomé Villa Larenas
Title: Book review: From assessment to feedback by Inez De Florio (Language Testing)
Pub Date: 2024-04-10 | DOI: 10.1177/02655322241239362
Reeta Neittaanmäki, Iasonas Lamprianou
This article focuses on rater severity and consistency and their relation to different types of rater experience over a long period of time. The article is based on longitudinal data collected from 2009 to 2019 from the second language Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. The study investigated whether rater severity and consistency are affected differently by different types of rater experience and by skipping rating sessions. The data consisted of 45 rating sessions with 104 raters and 59,899 examinees and were analyzed using the Many-Facets Rasch model and generalized linear mixed models. The results showed that when the raters gained more rating experience, they became slightly more lenient, but different types of experience had quantitatively different magnitudes of impact. In addition, skipping rating sessions, and in that way disconnecting from the rater community, increased the likelihood that a rater would be inconsistent. Finally, we provide methodological recommendations for future research and consider implications for practice.
Title: All types of experience are equal, but some are more equal: The effect of different types of experience on rater severity and rater consistency (Language Testing)
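The study estimated severity with the Many-Facets Rasch model; a much simpler descriptive proxy can illustrate what "rater severity" means operationally. The sketch below scores each rater by how far their ratings fall below the consensus for the same examinees. All raters, scales, and scores are invented, and this consensus-deviation proxy is a stand-in for, not a reproduction of, the MFRM analysis.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_exam, raters = 300, ["A", "B", "C", "D"]
# Hypothetical true severities on a 0-5 rating scale (D most severe:
# a severe rater's scores run below the examinee's "true" level).
true_severity = {"A": 0.0, "B": -0.3, "C": 0.1, "D": -0.6}

records = []
for e in range(n_exam):
    ability = rng.normal(3.0, 0.8)
    for r in raters:
        records.append({"exam": e, "rater": r,
                        "score": ability + true_severity[r]
                                 + rng.normal(0, 0.4)})
df = pd.DataFrame(records)

# Severity proxy: mean deviation of a rater's score from the average
# score the same examinee received across all raters.
exam_mean = df.groupby("exam")["score"].transform("mean")
severity = (df["score"] - exam_mean).groupby(df["rater"]).mean().sort_values()
print(severity)  # most severe (most negative) raters listed first
```

Tracking such a per-rater index across rating sessions is one way to visualise drift toward leniency before fitting a full measurement model.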
Pub Date: 2024-04-10 | DOI: 10.1177/02655322241239363
Reeta Neittaanmäki, Iasonas Lamprianou
This article focuses on rater severity and consistency and their relation to major changes in the rating system in a high-stakes testing context. The study is based on longitudinal data collected from 2009 to 2019 from the second language (L2) Finnish-speaking subtest in the National Certificates of Language Proficiency in Finland. We investigated whether rater severity and consistency changed over that period and whether the changes could be explained by major changes in the rating system, such as the change of lead examiner, the modus of rating and training (on-site or remote), and the composition of the rater group. The data consisted of 45 rating sessions with 104 raters and 59,899 examinees and were analysed using the Many-Facets Rasch model and generalized linear mixed models. The analyses indicated that raters as a group became somewhat more lenient over time. In addition, the results showed that the rater community and its practices, the lead examiners, and the modus of rating and training can influence rating behaviour. Finally, we elaborate on implications for both research and practice.
Title: Communal factors in rater severity and consistency over time in high-stakes oral assessment (Language Testing)
Pub Date: 2024-01-10 | DOI: 10.1177/02655322231219998
Haiwei Zhang, Peng Sun, Yaowaluk Bianglae, Winda Widiawati
To address the needs of the continually growing number of Chinese language learners, the present study developed and provided initial validation of a 100-item Chinese vocabulary proficiency test (CVPT) for learners of Chinese as a second/foreign language (CS/FL). Using Item Response Theory, the test was administered to 170 CS/FL learners from Indonesia and 354 CS/FL learners from Thailand, who were required to translate or explain the meanings of the Chinese words in Indonesian or Thai. The results provided preliminary evidence for the construct validity of the CVPT for measuring CS/FL learners’ receptive Chinese vocabulary knowledge in terms of content, substantive, structural, generalizability, and external aspects. The translation-based CVPT measures vocabulary proficiency through learners’ performance on a vocabulary translation task, potentially revealing the depth of test-takers’ vocabulary knowledge. Such a test could inform Chinese vocabulary instruction and the design of future Chinese vocabulary measurement tools.
Title: The development of a Chinese vocabulary proficiency test (CVPT) for learners of Chinese as a second/foreign language (Language Testing)
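The Rasch (one-parameter) model at the core of IRT-based test development gives P(correct) = 1 / (1 + exp(-(theta - b))), where theta is learner ability and b is item difficulty. The sketch below is a hedged illustration on simulated data, not the study's actual estimation: it recovers the difficulty ordering of five invented items with a simple logit-of-proportion-correct approximation rather than full IRT calibration.

```python
import numpy as np

rng = np.random.default_rng(7)
n_learners = 2000
# Hypothetical true difficulties for five vocabulary items, easy to hard.
difficulties = np.array([-1.5, -0.5, 0.0, 0.8, 1.6])

# Simulate dichotomous responses under the Rasch model.
theta = rng.normal(0, 1, n_learners)
p = 1 / (1 + np.exp(-(theta[:, None] - difficulties[None, :])))
responses = rng.binomial(1, p)          # shape: learners x items

# Rough difficulty estimate: negative logit of the proportion correct.
# (The scale is compressed relative to true b, but the ordering holds.)
prop_correct = responses.mean(axis=0)
est_difficulty = -np.log(prop_correct / (1 - prop_correct))
print(est_difficulty)
```

In practice, the content, substantive, and structural validity evidence the abstract mentions would come from fitting and checking a proper IRT model, but the ordering logic is the same: harder items attract fewer correct (or correctly translated) responses.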
Pub Date: 2024-01-03 | DOI: 10.1177/02655322231223105
Tony Clark, Emma Bruce
This article is temporarily under embargo.
Title: Open Science should be welcomed by test providers but grounded in pragmatic caution: A response to Winke (Language Testing)
Pub Date: 2024-01-01 | DOI: 10.1177/02655322231203234
Talia Isaacs, Paula M. Winke
This Editorial comes at a time when the after-effects of the acute phase of the COVID-19 pandemic are still being felt but when, in most countries around the world, there has been some easing of restrictions and a return to (quasi-)normalcy. In the language testing and assessment community, many colleagues relished the opportunity to meet and participate in events at the 44th annual Language Testing Research Colloquium (LTRC) in New York in July 2023. This was after 4 years of LTRC exclusively being held online due to public health concerns, restrictions on movement, and other policy-related and logistical matters. In the context of this Editorial, which comes out annually, we find it liberating to be able to focus on matters that are non-pandemic related. In terms of the day-to-day business of managing the journal, we have moved beyond a time of crisis, as reflected in the removal of a note about pandemic effects in our Author and Reviewer e-mail invitation templates. In this annual address, we note a changing of the guard in the editorial team, which will have come into effect by the time this Editorial is published, as well as some elements of continuity. We also reflect on developments over the past year while briefly touching on what lies ahead.
Title: Purposeful turns for more equitable and transparent publishing in language testing and assessment (Language Testing, pp. 3-8)