Making each point count: Revising a local adaptation of the Jacobs et al. (1981) ESL COMPOSITION PROFILE rubric
Pub Date: 2023-12-30 DOI: 10.1177/02655322231217979
Yu-Tzu Chang, Ann Tai Choe, Daniel Holden, Daniel R. Isbell
In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al. (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016–2021, including 434 essays) using many-facet Rasch measurement demonstrated that the 20-point rating scales of the Jacobs et al. rubric functioned poorly due to (a) questionably small distinctions in writing quality between successive score categories and (b) the presence of several disordered categories. We reanalyzed the score data after collapsing the 20-point scales into 4-point scales to simulate a revision to the rubric. This reanalysis appeared promising, with well-ordered and distinct score categories, and only a trivial decrease in person separation reliability. After implementing this revision to the rubric, we examined data from recent administrations (2022–2023, including 93 essays) to evaluate scale functioning. As in the simulation, scale categories were well-ordered and distinct in operational rating. Moreover, no raters demonstrated exceedingly poor fit using the revised rubric. Findings hold implications for other programs adopting/adapting the PROFILE or a similar rubric.
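For readers interested in trying a similar reanalysis, the sketch below shows one way to recode 20-point analytic ratings into 4-point bands before re-estimating a many-facet Rasch model. It is a minimal Python illustration only: the file name, the rubric category names, and the even five-points-per-band split are assumptions made here, not the collapsing rules used in the study.

```python
# Minimal sketch: collapse 20-point analytic ratings into 4-point bands
# before re-running a many-facet Rasch analysis. Band boundaries are an
# illustrative even split, not the study's actual recoding scheme.
import pandas as pd

def collapse_to_4(score_20: int) -> int:
    """Map a 1-20 rating onto a 1-4 band (five original points per band)."""
    return min((score_20 - 1) // 5 + 1, 4)

ratings = pd.read_csv("placement_ratings.csv")  # hypothetical file of analytic scores
for category in ["content", "organization", "language", "mechanics"]:  # assumed category names
    ratings[f"{category}_4pt"] = ratings[category].apply(collapse_to_4)

# The recoded columns can then be exported for re-estimation in an MFRM
# program such as Facets, or analyzed with an R/Python Rasch package.
ratings.to_csv("placement_ratings_4pt.csv", index=False)
```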
{"title":"Making each point count: Revising a local adaptation of the Jacobs et al. (1981) ESL COMPOSITION PROFILE rubric","authors":"Yu-Tzu Chang, Ann Tai Choe, Daniel Holden, Daniel R. Isbell","doi":"10.1177/02655322231217979","DOIUrl":"https://doi.org/10.1177/02655322231217979","url":null,"abstract":"In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al. (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016–2021, including 434 essays) using many-facet Rasch measurement demonstrated that the 20-point rating scales of the Jacobs et al. rubric functioned poorly due to (a) questionably small distinctions in writing quality between successive score categories and (b) the presence of several disordered categories. We reanalyzed the score data after collapsing the 20-point scales into 4-point scales to simulate a revision to the rubric. This reanalysis appeared promising, with well-ordered and distinct score categories, and only a trivial decrease in person separation reliability. After implementing this revision to the rubric, we examined data from recent administrations (2022–2023, including 93 essays) to evaluate scale functioning. As in the simulation, scale categories were well-ordered and distinct in operational rating. Moreover, no raters demonstrated exceedingly poor fit using the revised rubric. Findings hold implications for other programs adopting/adapting the PROFILE or a similar rubric.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":" 2","pages":""},"PeriodicalIF":4.1,"publicationDate":"2023-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139140765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comparing two formats of data-driven rating scales for classroom assessment of pragmatic performance with roleplays
Pub Date: 2023-11-29 DOI: 10.1177/02655322231210217
Yunwen Su, Sun-Young Shin
Rating scales that language testers design should be tailored to the specific test purpose and score use, and should reflect the target construct. Researchers have long argued for the value of data-driven scales in classroom performance assessment because such scales are specific to pedagogical tasks and objectives, carry rich descriptors that offer useful diagnostic information, and exhibit robust content representativeness and stable measurement properties. This sequential mixed-methods study compares two multi-criteria, data-driven rating scales for pragmatic performance that differ in format. Both were developed from roleplays performed by 43 second-language learners of Mandarin: the hierarchical-binary (HB) scale was built through close analysis of the performance data, and the multi-trait (MT) scale was derived from the HB scale, retaining the same criteria but in an analytic format. Results revealed an influence of format, albeit a limited one: the MT scale showed a marginal advantage over the HB scale in overall reliability, practicality, and discriminatory power, though the measurement properties of the two scales were largely comparable. All raters were positive about the pedagogical value of both scales. Rater perceptions of the ease of use and effectiveness of the two scales provide further insight into scale functioning.
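As a rough illustration of how the overall reliability of two scale formats might be compared, the sketch below computes Cronbach's alpha for each format from a persons-by-raters (or persons-by-criteria) score matrix. The file names and array layout are assumptions; it does not reproduce the study's own analyses (e.g., the MFRM-based comparisons).

```python
# Minimal sketch, under assumed data shapes: compare the internal-consistency
# reliability of two rating-scale formats applied to the same examinees.
# Rows = examinees, columns = raters (or criteria); no header row assumed.
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a 2-D score array (rows = persons, cols = items)."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

hb_scores = np.loadtxt("hb_ratings.csv", delimiter=",")  # hierarchical-binary format (hypothetical file)
mt_scores = np.loadtxt("mt_ratings.csv", delimiter=",")  # multi-trait format (hypothetical file)

print(f"HB alpha: {cronbach_alpha(hb_scores):.3f}")
print(f"MT alpha: {cronbach_alpha(mt_scores):.3f}")
```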
{"title":"Comparing two formats of data-driven rating scales for classroom assessment of pragmatic performance with roleplays","authors":"Yunwen Su, Sun-Young Shin","doi":"10.1177/02655322231210217","DOIUrl":"https://doi.org/10.1177/02655322231210217","url":null,"abstract":"Rating scales that language testers design should be tailored to the specific test purpose and score use as well as reflect the target construct. Researchers have long argued for the value of data-driven scales for classroom performance assessment, because they are specific to pedagogical tasks and objectives, have rich descriptors to offer useful diagnostic information, and exhibit robust content representativeness and stable measurement properties. This sequential mixed methods study compares two data-driven rating scales with multiple criteria that use different formats for pragmatic performance. They were developed using roleplays performed by 43 second-language learners of Mandarin—the hierarchical-binary (HB) scale, developed through close analysis of performance data, and the multi-trait (MT) scale derived from the HB, which has the same criteria but takes the format of an analytic scale. Results revealed the influence of format, albeit to a limited extent: MT showed a marginal advantage over HB in terms of overall reliability, practicality, and discriminatory power, though measurement properties of the two scales were largely comparable. All raters were positive about the pedagogical value of both scales. This study reveals that rater perceptions of the ease of use and effectiveness of both scales provide further insights into scale functioning.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"52 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2023-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139210387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Triangulating NLP-based analysis of rater comments and MFRM: An innovative approach to investigating raters’ application of rating scales in writing assessment
Pub Date: 2023-11-29 DOI: 10.1177/02655322231210231
Huiying Cai, Xun Yan
Rater comments tend to be qualitatively analyzed to indicate raters’ application of rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful, behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The data consisted of ratings on 987 essays by 36 raters (a total of 3948 analytic scores and 1974 rater comments) on a post-admission English Placement Test (EPT) at a large US university. We computed a set of comment-based features based on the analytic components and evaluative language the raters used to infer whether raters were aligned to the scale. For data triangulation, we performed correlation analyses between the MFRM measures of rater performance and the comment-based measures. Although the EPT raters showed overall satisfactory performance, we found meaningful associations between rater comments and performance features. In particular, raters with higher precision and fit to what the Rasch model predicts used more analytic components and used evaluative language more similar to the scale descriptors. These findings suggest that NLP techniques have the potential to help language testers analyze rater comments and understand rater behavior.
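The sketch below illustrates, under assumed inputs, the general triangulation idea: derive simple comment-based features with off-the-shelf NLP tools and correlate them with rater measures from an MFRM analysis. The file names, column names, component terms, and feature definitions are hypothetical and are not the study's actual feature set.

```python
# Minimal sketch: quantify rater comments (similarity to scale descriptors,
# number of analytic components mentioned) and correlate the features with
# Rasch-based rater fit. All file/column names are illustrative assumptions.
import pandas as pd
from scipy.stats import pearsonr
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

comments = pd.read_csv("rater_comments.csv")        # columns: rater_id, comment
rasch = pd.read_csv("facets_rater_measures.csv")    # columns: rater_id, infit_msq
scale_descriptor = open("scale_descriptors.txt").read()  # rubric band descriptors

# Feature 1: how similar is each rater's pooled comment text to the scale wording?
pooled = comments.groupby("rater_id")["comment"].apply(" ".join).reset_index()
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(pooled["comment"].tolist() + [scale_descriptor])
pooled["descriptor_similarity"] = cosine_similarity(tfidf[:-1], tfidf[-1]).ravel()

# Feature 2: how many analytic components does each rater mention, on average?
components = ["content", "organization", "vocabulary", "grammar"]  # assumed component terms
comments["n_components"] = comments["comment"].str.lower().apply(
    lambda text: sum(term in text for term in components)
)
pooled = pooled.merge(
    comments.groupby("rater_id")["n_components"].mean().reset_index(), on="rater_id"
)

# Triangulate: correlate comment-based features with Rasch-based rater fit.
merged = pooled.merge(rasch, on="rater_id")
for feature in ["descriptor_similarity", "n_components"]:
    r, p = pearsonr(merged[feature], merged["infit_msq"])
    print(f"{feature}: r = {r:.2f}, p = {p:.3f}")
```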
{"title":"Triangulating NLP-based analysis of rater comments and MFRM: An innovative approach to investigating raters’ application of rating scales in writing assessment","authors":"Huiying Cai, Xun Yan","doi":"10.1177/02655322231210231","DOIUrl":"https://doi.org/10.1177/02655322231210231","url":null,"abstract":"Rater comments tend to be qualitatively analyzed to indicate raters’ application of rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful, behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The data consisted of ratings on 987 essays by 36 raters (a total of 3948 analytic scores and 1974 rater comments) on a post-admission English Placement Test (EPT) at a large US university. We computed a set of comment-based features based on the analytic components and evaluative language the raters used to infer whether raters were aligned to the scale. For data triangulation, we performed correlation analyses between the MFRM measures of rater performance and the comment-based measures. Although the EPT raters showed overall satisfactory performance, we found meaningful associations between rater comments and performance features. In particular, raters with higher precision and fit to what the Rasch model predicts used more analytic components and used evaluative language more similar to the scale descriptors. These findings suggest that NLP techniques have the potential to help language testers analyze rater comments and understand rater behavior.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"2 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2023-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139212101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Argument-based validation of Academic Collocation Tests
Pub Date: 2023-10-21 DOI: 10.1177/02655322231198499
Thi My Hang Nguyen, Peter Gu, Averil Coxhead
Despite extensive research on assessing collocational knowledge, valid measures of academic collocations remain elusive. With the present study, we begin an argument-based approach to validate two Academic Collocation Tests (ACTs) that assess the ability to recognize and produce academic collocations (i.e., two-word units such as key element and well established) in written contexts. A total of 343 tertiary students completed a background questionnaire (including demographic information, IELTS scores, and learning experience), the ACTs, and the Vocabulary Size Test. Forty-four participants also took part in post-test interviews to share reflections on the tests and retook the ACTs verbally. The findings showed that the scoring inference based on analyses of test item characteristics, testing conditions, and scoring procedures was partially supported. The generalization inference, based on the consistency of item measures and testing occasions, was justified. The extrapolation inference, drawn from correlations with other measures and factors such as collocation frequency and learning experience, received partial support. Suggestions for increasing the degree of support for the inferences are discussed. The present study reinforces the value of validation research and generates the momentum for test developers to continue this practice with other vocabulary tests.
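As one concrete example of the item-characteristic evidence that can feed a scoring inference, the sketch below computes item facility and corrected item-total (point-biserial) discrimination from a dichotomously scored response matrix. The file name and 0/1 data layout are assumptions for illustration, not the authors' materials or procedure.

```python
# Minimal sketch: classical item analysis on an assumed 0/1 response matrix
# (rows = test takers, columns = items), the kind of evidence often cited
# when evaluating a scoring inference.
import numpy as np
import pandas as pd

responses = pd.read_csv("act_item_responses.csv")  # hypothetical dichotomous item data

facility = responses.mean()                         # proportion correct per item
total = responses.sum(axis=1)
discrimination = {
    item: np.corrcoef(responses[item], total - responses[item])[0, 1]  # corrected item-total r
    for item in responses.columns
}

item_stats = pd.DataFrame({"facility": facility, "discrimination": pd.Series(discrimination)})
print(item_stats.round(2))
```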
{"title":"Argument-based validation of Academic Collocation Tests","authors":"Thi My Hang Nguyen, Peter Gu, Averil Coxhead","doi":"10.1177/02655322231198499","DOIUrl":"https://doi.org/10.1177/02655322231198499","url":null,"abstract":"Despite extensive research on assessing collocational knowledge, valid measures of academic collocations remain elusive. With the present study, we begin an argument-based approach to validate two Academic Collocation Tests (ACTs) that assess the ability to recognize and produce academic collocations (i.e., two-word units such as key element and well established) in written contexts. A total of 343 tertiary students completed a background questionnaire (including demographic information, IELTS scores, and learning experience), the ACTs, and the Vocabulary Size Test. Forty-four participants also took part in post-test interviews to share reflections on the tests and retook the ACTs verbally. The findings showed that the scoring inference based on analyses of test item characteristics, testing conditions, and scoring procedures was partially supported. The generalization inference, based on the consistency of item measures and testing occasions, was justified. The extrapolation inference, drawn from correlations with other measures and factors such as collocation frequency and learning experience, received partial support. Suggestions for increasing the degree of support for the inferences are discussed. The present study reinforces the value of validation research and generates the momentum for test developers to continue this practice with other vocabulary tests.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135512996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Revisiting raters’ accent familiarity in speaking tests: Evidence that presentation mode interacts with accent familiarity to variably affect comprehensibility ratings
Pub Date: 2023-10-14 DOI: 10.1177/02655322231200808
Michael D. Carey, Stefan Szocs
This controlled experimental study investigated the interaction of variables associated with rating the pronunciation component of high-stakes English-language-speaking tests such as IELTS and TOEFL iBT. One hundred experienced raters who were all either familiar or unfamiliar with Brazilian-accented English or Papua New Guinean Tok Pisin-accented English, respectively, were presented with speech samples in audio-only or audio-visual mode. Two-way ordinal regression with post hoc pairwise comparisons found that the presentation mode interacted significantly with accent familiarity to increase comprehensibility ratings (χ² = 88.005, df = 3, p < .0001), with presentation mode having a stronger effect in the interaction than accent familiarity (χ² = 59.328, df = 1, p < .0001). Based on odds ratios, raters were significantly more likely to score comprehensibility higher when the presentation mode was audio-visual (compared to audio-only) for both the unfamiliar (91% more likely) and familiar speakers (92.3% more likely). The results suggest that semi-direct speaking tests using audio-only or audio-visual modes of presentation should be evaluated through research to ascertain how accent familiarity and presentation mode interact to variably affect comprehensibility ratings. Such research may be beneficial to investigate the virtual modes of speaking test delivery that have emerged post-COVID-19.
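The sketch below shows, under assumed data and variable names, how a proportional-odds (ordinal logistic) model with a presentation-mode by accent-familiarity interaction can be fit and its coefficients expressed as odds ratios. It is not the authors' exact two-way ordinal regression or post hoc pairwise procedure.

```python
# Minimal sketch: ordinal logistic regression of comprehensibility ratings on
# presentation mode, accent familiarity, and their interaction, with odds
# ratios. Column names and the input file are hypothetical placeholders.
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

ratings = pd.read_csv("ratings_long.csv")  # hypothetical long-format file: one row per rating

# Dummy-code the two binary factors and their interaction. No intercept column:
# OrderedModel estimates category thresholds instead of a constant.
X = pd.DataFrame({
    "audio_visual": (ratings["mode"] == "audio_visual").astype(int),
    "familiar": (ratings["familiar"] == "yes").astype(int),
})
X["av_x_familiar"] = X["audio_visual"] * X["familiar"]

model = OrderedModel(ratings["comprehensibility"], X, distr="logit")
result = model.fit(method="bfgs", disp=False)
print(result.summary())

# Odds ratios for the predictors (excluding the estimated threshold parameters).
odds_ratios = np.exp(result.params[: X.shape[1]])
print(odds_ratios)
```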
{"title":"Revisiting raters’ accent familiarity in speaking tests: Evidence that presentation mode interacts with accent familiarity to variably affect comprehensibility ratings","authors":"Michael D. Carey, Stefan Szocs","doi":"10.1177/02655322231200808","DOIUrl":"https://doi.org/10.1177/02655322231200808","url":null,"abstract":"This controlled experimental study investigated the interaction of variables associated with rating the pronunciation component of high-stakes English-language-speaking tests such as IELTS and TOEFL iBT. One hundred experienced raters who were all either familiar or unfamiliar with Brazilian-accented English or Papua New Guinean Tok Pisin-accented English, respectively, were presented with speech samples in audio-only or audio-visual mode. Two-way ordinal regression with post hoc pairwise comparisons found that the presentation mode interacted significantly with accent familiarity to increase comprehensibility ratings (χ² = 88.005, df = 3, p < .0001), with presentation mode having a stronger effect in the interaction than accent familiarity (χ² = 59.328, df = 1, p < .0001). Based on odds ratios, raters were significantly more likely to score comprehensibility higher when the presentation mode was audio-visual (compared to audio-only) for both the unfamiliar (91% more likely) and familiar speakers (92.3% more likely). The results suggest that semi-direct speaking tests using audio-only or audio-visual modes of presentation should be evaluated through research to ascertain how accent familiarity and presentation mode interact to variably affect comprehensibility ratings. Such research may be beneficial to investigate the virtual modes of speaking test delivery that have emerged post-COVID-19.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135804141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Our validity looks like justice. Does yours?
Pub Date: 2023-10-07 DOI: 10.1177/02655322231202947
Jennifer Randall, Mya Poe, David Slomp, Maria Elena Oliveri
Educational assessments, from kindergarten to 12th grade (K-12) to licensure, have a long, well-documented history of oppression and marginalization. In this paper, we (the authors) ask the field of educational assessment/measurement to actively disrupt the White supremacist and racist logics that fuel this marginalization and re-orient itself toward assessment justice. We describe how a justice-oriented, antiracist validity (JAV) approach to validation processes can support assessment justice efforts, specifically with respect to language assessment. Relying on antiracist principles and critical quantitative methodologies, a JAV approach proposes a set of critical questions to consider when gathering validity evidence, with potential utility for language testers.
{"title":"Our validity looks like justice. Does yours?","authors":"Jennifer Randall, Mya Poe, David Slomp, Maria Elena Oliveri","doi":"10.1177/02655322231202947","DOIUrl":"https://doi.org/10.1177/02655322231202947","url":null,"abstract":"Educational assessments, from kindergarden to 12th grade (K-12) to licensure, have a long, well-documented history of oppression and marginalization. In this paper, we (the authors) ask the field of educational assessment/measurement to actively disrupt the White supremacist and racist logics that fuel this marginalization and re-orient itself toward assessment justice. We describe how a justice-oriented, antiracist validity (JAV) approach to validation processes can support assessment justice efforts, specifically with respect to language assessment. Relying on antiracist principles and critical quantitative methodologies, a JAV approach proposes a set of critical questions to consider when gathering validity evidence, with potential utility for language testers.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"298 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135254616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Language assessment accommodations: Issues and challenges for the future
Pub Date: 2023-10-01 DOI: 10.1177/02655322231186222
Lynda Taylor, Jayanti Banerjee
Several papers reference the concept of universal design as the preferred theoretical foundation for language test design and development (Christensen et al., 2023; Kim et al., 2023; Guzman-Orth et al., 2023). This approach, originally derived from the field of architecture in the United States (Case, 2008), proposes a set of principles whereby assessments are intentionally and proactively designed, from the earliest stage of construction, to be maximally accessible to all users, regardless of any special needs they may have (cf. the planning and design of public buildings to be disabled-friendly). In the context of language test design and construction, this typically entails giving all test takers access to a broad range of universal but optional tools (e.g., magnifier, colour overlay) to enhance test accessibility. Recent technological developments for text …
{"title":"Language assessment accommodations: Issues and challenges for the future","authors":"Lynda Taylor, Jayanti Banerjee","doi":"10.1177/02655322231186222","DOIUrl":"https://doi.org/10.1177/02655322231186222","url":null,"abstract":"Several papers reference the concept of universal design as the preferred theoretical foundation for language test design and development (Christensen et al., 2023; Kim et al., 2023; Guzman-Orth et al., 2023). This approach, originally derived from the field of architecture in the United States (Case, 2008), proposes a set of principles whereby assessments are intentionally and proactively designed from the earliest stage of construction to be maximally accessible to all users, regardless of any special needs they may have (cf., the planning and design of public buildings as disabled-friendly). In the context of language test design and construction, this typically entails giving all test takers access to a broad range of universal but optional tools (e.g., magnifier, colour overlay) to enhance test accessibility. Recent technological developments for text","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135605480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accommodations in language testing and assessment: Safeguarding equity, access, and inclusion
Pub Date: 2023-10-01 DOI: 10.1177/02655322231186221
Lynda Taylor, Jayanti Banerjee
… the development and implementation of a special accommodations policy associated with a large-scale, localized computer-based language test designed to assess the English skills needed in the Singaporean workplace context. They analyzed both operational test data and interview data to investigate three main lines of enquiry: different stakeholders’ perceptions of the appropriateness and effectiveness of the accommodations; the impact of the accommodations on test-takers’ future opportunities; and stakeholder perceptions of key factors that play a role in accommodations. Their findings prompted recommendations on improving special accommodations policy development, dissemination …
{"title":"Accommodations in language testing and assessment: Safeguarding equity, access, and inclusion","authors":"Lynda Taylor, Jayanti Banerjee","doi":"10.1177/02655322231186221","DOIUrl":"https://doi.org/10.1177/02655322231186221","url":null,"abstract":"the development and implementation of a special accommodations policy associated with a large-scale, localized computer-based language test designed to assess the English skills needed in the Singaporean workplace context. They analyzed both operational test data and interview data to investigate three main lines of enquiry: different stakeholders’ perceptions of the appropriateness and effectiveness of the accommodations; the impact of the accommodations on test-takers’ future opportunities; and stakeholder perceptions of key factors that play a role in accommodations. Their findings prompted recommendations on improving special accommodations policy development, dissemination","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135605482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Book review: J. Fox and N. Artemeva, Reconsidering Context in Language Assessment: Transdisciplinary Perspectives, Social Theories, and Validity
Pub Date: 2023-09-25 DOI: 10.1177/02655322231199501
Susy Macqueen
{"title":"Book review: J. Fox and N. Artemeva. <i>Reconsidering Context in Language Assessment: Transdisciplinary Perspectives, Social Theories, and Validity</i>","authors":"Susy Macqueen","doi":"10.1177/02655322231199501","DOIUrl":"https://doi.org/10.1177/02655322231199501","url":null,"abstract":"","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"116 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135815696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}