{"title":"开发一种创新的引出式模仿任务用于高效英语水平评估","authors":"Larry Davis, John Norris","doi":"10.1002/ets2.12338","DOIUrl":null,"url":null,"abstract":"<p>The elicited imitation task (EIT), in which language learners listen to a series of spoken sentences and repeat each one verbatim, is a commonly used measure of language proficiency in second language acquisition research. The <i>TOEFL</i>® <i>Essentials</i>™ test includes an EIT as a holistic measure of speaking proficiency, referred to as the “Listen and Repeat” task type. In this report, we describe the design considerations that informed the development of the EIT for TOEFL Essentials. We also report the results of a series of investigations conducted during the prototyping and pilot phases of test development, which were undertaken with the goal of confirming task design specifications, evaluating scoring performance, and obtaining initial validity evidence to support score interpretation and use of the EIT in the TOEFL Essentials test. We found that task design variables generally performed as expected. The length of input sentence was strongly associated with performance (Pearson <i>r</i> = .88), consistent with the construct measured by the EIT, while other task variables not directly related to the EIT construct did not impact performance (e.g., graphics, speaker accent, and response time). Scorers drawn from TOEFL iBT test raters were able to score responses consistently with over 98% exact or adjacent interrater agreement on a 6-point scale, and scores on the pilot version of the EIT were highly reliable (Cronbach's α = .93 on the 15-item pilot version). Correlations between EIT scores and other measures were generally as expected: Correlations with other speaking tasks were high (.78–.84) and slightly to somewhat lower for other language measures (.73 for writing, .68 for listening, and .57 for reading). Correlation with an independent measure of holistic language proficiency (C-test) was moderately high (.69), as expected. We discuss the study findings in terms of the TOEFL Essentials test validity argument and point out limitations to the current results along with future research needs. Overall, we believe that the findings provide initial support to warrant the use of the EIT as operationalized in the TOEFL Essentials test.</p>","PeriodicalId":11972,"journal":{"name":"ETS Research Report Series","volume":"2021 1","pages":"1-30"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ets2.12338","citationCount":"7","resultStr":"{\"title\":\"Developing an Innovative Elicited Imitation Task for Efficient English Proficiency Assessment\",\"authors\":\"Larry Davis, John Norris\",\"doi\":\"10.1002/ets2.12338\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>The elicited imitation task (EIT), in which language learners listen to a series of spoken sentences and repeat each one verbatim, is a commonly used measure of language proficiency in second language acquisition research. The <i>TOEFL</i>® <i>Essentials</i>™ test includes an EIT as a holistic measure of speaking proficiency, referred to as the “Listen and Repeat” task type. In this report, we describe the design considerations that informed the development of the EIT for TOEFL Essentials. We also report the results of a series of investigations conducted during the prototyping and pilot phases of test development, which were undertaken with the goal of confirming task design specifications, evaluating scoring performance, and obtaining initial validity evidence to support score interpretation and use of the EIT in the TOEFL Essentials test. We found that task design variables generally performed as expected. The length of input sentence was strongly associated with performance (Pearson <i>r</i> = .88), consistent with the construct measured by the EIT, while other task variables not directly related to the EIT construct did not impact performance (e.g., graphics, speaker accent, and response time). Scorers drawn from TOEFL iBT test raters were able to score responses consistently with over 98% exact or adjacent interrater agreement on a 6-point scale, and scores on the pilot version of the EIT were highly reliable (Cronbach's α = .93 on the 15-item pilot version). Correlations between EIT scores and other measures were generally as expected: Correlations with other speaking tasks were high (.78–.84) and slightly to somewhat lower for other language measures (.73 for writing, .68 for listening, and .57 for reading). Correlation with an independent measure of holistic language proficiency (C-test) was moderately high (.69), as expected. We discuss the study findings in terms of the TOEFL Essentials test validity argument and point out limitations to the current results along with future research needs. Overall, we believe that the findings provide initial support to warrant the use of the EIT as operationalized in the TOEFL Essentials test.</p>\",\"PeriodicalId\":11972,\"journal\":{\"name\":\"ETS Research Report Series\",\"volume\":\"2021 1\",\"pages\":\"1-30\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ets2.12338\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ETS Research Report Series\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/ets2.12338\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Social Sciences\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ETS Research Report Series","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/ets2.12338","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Social Sciences","Score":null,"Total":0}
Developing an Innovative Elicited Imitation Task for Efficient English Proficiency Assessment
The elicited imitation task (EIT), in which language learners listen to a series of spoken sentences and repeat each one verbatim, is a commonly used measure of language proficiency in second language acquisition research. The TOEFL® Essentials™ test includes an EIT as a holistic measure of speaking proficiency, referred to as the “Listen and Repeat” task type. In this report, we describe the design considerations that informed the development of the EIT for TOEFL Essentials. We also report the results of a series of investigations conducted during the prototyping and pilot phases of test development, which were undertaken with the goal of confirming task design specifications, evaluating scoring performance, and obtaining initial validity evidence to support score interpretation and use of the EIT in the TOEFL Essentials test. We found that task design variables generally performed as expected. The length of input sentence was strongly associated with performance (Pearson r = .88), consistent with the construct measured by the EIT, while other task variables not directly related to the EIT construct did not impact performance (e.g., graphics, speaker accent, and response time). Scorers drawn from TOEFL iBT test raters were able to score responses consistently with over 98% exact or adjacent interrater agreement on a 6-point scale, and scores on the pilot version of the EIT were highly reliable (Cronbach's α = .93 on the 15-item pilot version). Correlations between EIT scores and other measures were generally as expected: Correlations with other speaking tasks were high (.78–.84) and slightly to somewhat lower for other language measures (.73 for writing, .68 for listening, and .57 for reading). Correlation with an independent measure of holistic language proficiency (C-test) was moderately high (.69), as expected. We discuss the study findings in terms of the TOEFL Essentials test validity argument and point out limitations to the current results along with future research needs. Overall, we believe that the findings provide initial support to warrant the use of the EIT as operationalized in the TOEFL Essentials test.