Developing a New Task to Measure Speech Perception Ability: Is the Word Count Task Valid and Reliable?
DOI: 10.20622/jltajournal.23.0_17
Toshihide O’ki
To become an advanced second language listener, a learner needs good speech perception ability. Previous research on measuring this ability often used integrated-skills tasks (e.g., repetition tasks and dictation tasks), but their validity and reliability are questionable because learners' productive skills affect their task performance. This study attempted to develop an original discrete-point task, the word count task, in which learners count and report the number of words in blanks. To evaluate the task's validity and reliability, two comparable studies with dictation tasks were conducted with university students in Japan. The second study, revised based on the first, found that the reliability coefficient of the word count task, expressed by Cronbach's α, reached .85, slightly exceeding the reliability of the dictation tasks. Moreover, correlations with the dictation tasks were significantly positive, with moderate to strong relationships, indicating that the word count task has sufficient criterion-related validity. In addition, a listening strategy survey conducted to explore the cognitive processes involved in the task showed that phonological processing is more dominant than meaning processing in the word count task. These findings seem to corroborate the applicability of the word count task to research and classroom assessment, but further research is necessary to reevaluate its validity using the other methods mentioned in this study. The first study attempted to examine the validity and reliability of a prototype of the word count task (WCT) and to reveal how the task should be revised. As explained, there is no established task for measuring speech perception ability; therefore, the validity of the WCT was tested through its relationship with a dictation task (DT). As already stated, DTs have shortcomings as a measure of speech perception ability, but given their popularity in educational settings in Japan, they are the best available counterpart to the WCT.
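As a rough illustration of the statistics the abstract reports, the sketch below (Python, with entirely invented data; `wct_items` and `dt_totals` are hypothetical placeholders, not the study's dataset) computes Cronbach's α from an item-score matrix and a criterion correlation between WCT and DT total scores.

```python
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """item_scores: (n_examinees, n_items) matrix of item-level scores."""
    k = item_scores.shape[1]
    item_var_sum = item_scores.var(axis=0, ddof=1).sum()  # sum of per-item variances
    total_var = item_scores.sum(axis=1).var(ddof=1)       # variance of total scores
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

rng = np.random.default_rng(0)
ability = rng.normal(size=(110, 1))                        # latent listening ability
# Hypothetical dichotomous WCT item responses driven by that ability
wct_items = (ability + rng.normal(size=(110, 30)) > 0).astype(float)
# Hypothetical DT totals correlated with the same underlying ability
dt_totals = wct_items.sum(axis=1) + rng.normal(0, 3, size=110)

print(f"Cronbach's alpha = {cronbach_alpha(wct_items):.2f}")
r = np.corrcoef(wct_items.sum(axis=1), dt_totals)[0, 1]
print(f"WCT-DT criterion correlation r = {r:.2f}")
```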
{"title":"Developing a New Task to Measure Speech Perception Ability: Is the Word Count Task Valid and Reliable?","authors":"Toshihide O’ki","doi":"10.20622/jltajournal.23.0_17","DOIUrl":"https://doi.org/10.20622/jltajournal.23.0_17","url":null,"abstract":"To become an advanced second language listener, a learner needs to have good speech perception ability. Previous research that focused on measuring this ability often utilized integrated-skills tasks (e.g., repetition tasks and dictation tasks), but their validity and reliability are questionable because learners’ productive skills affect their task performance. This study attempted to develop an original discrete-point task called the word count task, in which learners count and report the number of words in blanks. To evaluate the task’s validity and reliability, two comparable studies with dictation tasks were conducted with university students in Japan. The second study, which was revised based on the first study, revealed that the reliability coefficient of the word count task expressed by Cronbach’s (cid:302)(cid:3) reached .85, slightly exceeding the reliability of the dictation tasks. Moreover, correlations with dictation tasks were found to be significantly positive with moderate to strong relationships, meaning the word count task demonstrated sufficient criterion-related validity. Moreover, the listening strategy survey conducted to explore cognitive processes involved in the task showed that phonological processing is more dominant than meaning processing in the word count task. These findings seem to corroborate the applicability of the word count task to research and classroom assessment, but further research is necessary to reevaluate its validity using other methods mentioned in this study. The first study attempted to examine the validity and reliability of a prototype of word count task (WCT) and reveal how the task should be revised. As explained, there is no established task to measure speech perception ability; therefore, the validity of WCT was tested based on its relationship with a dictation task (DT). As already stated, DTs have shortcomings as a measure of speech perception ability, but due to their popularity in educational settings in Japan, using one is the best counterpart to the WCT.","PeriodicalId":249185,"journal":{"name":"JLTA Journal","volume":"306 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122695240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Evaluating Language Assessments From an Ethics Perspective: Suggestions for a New Agenda
DOI: 10.20622/jltajournal.23.0_3
Antony Kunnan
Introduction
The dominant 20th-century approach to the evaluation of language assessments was the Standards-based approach. The Standards most evaluators referred to are the American Psychological Association (APA), American Educational Research Association (AERA), and National Council on Measurement in Education (NCME) Standards (1999, 2014). These standards (mainly a list of test qualities such as validity and reliability and, of late, consequences and fairness) were developed from best practices at assessment institutions and had loose connections to theories of educational and psychological measurement. The "Test Usefulness" concept proposed by Bachman and Palmer (1996) was a popular example of the Standards approach. In the early part of the 21st century, Kane (1992) and Bachman and Palmer (2010) proposed an Argument-based approach using Toulmin's way of structuring arguments with claims, warrants, backing, and rebuttals. This approach provided a framework for evaluating language assessments; Bachman and Palmer's (2010) "Assessment Use Argument" (AUA) is an example. While both approaches give researchers ways to conduct evaluations, they share a weakness: they generally lack an articulated philosophical grounding. In the Standards approach, it is never articulated why the listed standards, and not others, are important. In the Argument approach, which aspects are to be included as claims and warrants is left to the assessment developer, with the evaluator following the developer's lead, which is a critical problem. To remedy this situation, I propose an Ethics-based approach to assessment evaluation. The framework that implements the approach harnesses the dual concepts of fair assessments and just institutions, leading to the Principle of Fairness and the Principle of Justice, respectively.
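To make the Toulmin structure concrete, here is a minimal illustrative sketch representing one link of an argument (claim, warrants, backing, rebuttals) as a plain data structure. This is not Kunnan's or Bachman and Palmer's formal machinery, and all field values are invented examples.

```python
from dataclasses import dataclass, field

@dataclass
class ToulminLink:
    claim: str                  # what the assessment use argument asserts
    warrants: list[str]         # why the evidence licenses the claim
    backing: list[str] = field(default_factory=list)    # support for the warrants
    rebuttals: list[str] = field(default_factory=list)  # conditions that would defeat the claim

# Hypothetical example of one link in an argument about a speaking test
link = ToulminLink(
    claim="Scores reflect speaking ability relevant to the target domain.",
    warrants=["Tasks elicit the abilities defined in the construct."],
    backing=["Expert judgment study", "Pilot discourse analysis"],
    rebuttals=["Raters attend to accent rather than the rubric criteria."],
)
print(link.claim)
```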
{"title":"Evaluating Language Assessments From an Ethics Perspective: Suggestions for a New Agenda","authors":"Antony Kunnan","doi":"10.20622/jltajournal.23.0_3","DOIUrl":"https://doi.org/10.20622/jltajournal.23.0_3","url":null,"abstract":"Introduction The dominant 20th century approach to the evaluation of language assessments was the Standards-based approach. The Standards most evaluators referred to are the American Psychological Association (APA), American Educational Research Association (AERA), National Council on Measurement in Education (NCME) Standards (1999, 2014). These standards (mainly a list of test qualities such as validity and reliability, and of late, consequences and fairness) were developed from best practices at assessment institutions and had loose connections to theories of educational and psychological measurement. The “Test Usefulness” concept proposed by Bachman and Palmer (1996) was a popular example of the Standards approach. In the early part of the 21st century, Kane (1992) and Bachman and Palmer (2010) proposed an Argument-based approach using Toulmin’s way of structuring arguments with claims, warrants, backing and rebuttals. This approach provided a framework for evaluating language assessments. Bachman and Palmer’s (2010) “Assessment Use Argument” (AUA) is an example of this approach. While both approaches provide ways for researchers to conduct evaluations, they have a weakness, and that is they generally lack an articulated philosophical grounding. This lack of philosophical grounding can be seen in the Standards approach in which why the listed standards are important and not others is not articulated. In the Argument approach, what aspects are to be included as claims and warrants is left the assessment developer with the evaluator following them which is a critical problem. To remedy this situation, I am proposing an Ethics-based approach to assessment evaluation. The framework that implements the approach harnesses the dual concepts of fair assessments and just institutions leading to the Principle of Fairness and Principle of Justice, respectively.","PeriodicalId":249185,"journal":{"name":"JLTA Journal","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127291797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

The Effects of Retelling on Reading Comprehension: Focusing on Different Levels of Comprehension and Non-Textual Information in Retelling Protocols
DOI: 10.20622/jltajournal.24.0_23
S. Ito
{"title":"The Effects of Retelling on Reading Comprehension: Focusing on Different Levels of Comprehension and Non-Textual Information in Retelling Protocols","authors":"S. Ito","doi":"10.20622/jltajournal.24.0_23","DOIUrl":"https://doi.org/10.20622/jltajournal.24.0_23","url":null,"abstract":"","PeriodicalId":249185,"journal":{"name":"JLTA Journal","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129351778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Comparison Between Holistic and Analytic Rubrics of a Paired Oral Test
DOI: 10.20622/jltajournal.23.0_57
Rie Koizumi, Yo In’nami, Makoto Fukazawa
The current study aimed to reveal similarities and differences between a holistic and an analytic rubric used to assess speaking performance on a paired oral test. To this end, the speaking performances of 110 Japanese university students in paired oral interaction were rated both holistically and analytically. Comparisons between the two rubrics using many-facet Rasch measurement showed that both worked effectively, with the analytic rubric working slightly better in terms of global fit, test-taker and task separation, test-taker and task reliability, standard errors, and the percentage of test takers with overfit. Correlation and regression analyses indicated a strong relationship between the two rubrics (r = .84), and the Interactive communication and Fluency analytic criteria substantially explained holistic scores (adjusted R² = .71). The results suggest that teachers can obtain similar results with either rubric type and that, if they select an analytic rubric, a priority would be to include the Interactive communication and Fluency criteria.
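As a rough illustration of the reported analyses, the sketch below (Python with numpy and statsmodels, using invented scores rather than the authors' data) computes a Pearson correlation between holistic and analytic totals and an OLS regression of holistic scores on two analytic criteria, reporting adjusted R² as in the abstract.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 110
# Hypothetical analytic criterion scores for each test taker
interactive = rng.normal(3, 1, n)
fluency = 0.6 * interactive + rng.normal(0, 0.8, n)
holistic = 0.5 * interactive + 0.4 * fluency + rng.normal(0, 0.5, n)
analytic_total = interactive + fluency     # stand-in for the summed analytic rubric

# Strength of the holistic-analytic relationship
r = np.corrcoef(holistic, analytic_total)[0, 1]
print(f"r = {r:.2f}")

# Regress holistic scores on the two analytic criteria
X = sm.add_constant(np.column_stack([interactive, fluency]))
model = sm.OLS(holistic, X).fit()
print(f"adjusted R^2 = {model.rsquared_adj:.2f}")
```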
{"title":"Comparison Between Holistic and Analytic Rubrics of a Paired Oral Test","authors":"Rie Koizumi, Yo In’nami, Makoto Fukazawa","doi":"10.20622/jltajournal.23.0_57","DOIUrl":"https://doi.org/10.20622/jltajournal.23.0_57","url":null,"abstract":"The current study aimed to reveal similarities and differences between a holistic and an analytic rubric used in assessing speaking performance in a paired oral test. To this end, speaking performances of 110 Japanese university students produced in paired oral interaction were evaluated by raters, holistically and analytically. The comparisons made between the two rubrics using many-facet Rasch measurement showed that both worked effectively, with the analytic rubric working slightly better in terms of a better global fit, a better test-taker and task separation, higher test-taker and task reliability, smaller standard errors, and a smaller percentage of test takers with overfit. Correlation and regression analysis indicated a strong relationship between the two (r = .84) and the Interactive communication and Fluency analytic criteria substantially explained holistic scores (adjusted R2 = .71). Results suggest that teachers can obtain similar results with either rubric type and, if they select an analytic one, a priority would be to include Interactive communication and Fluency criteria.","PeriodicalId":249185,"journal":{"name":"JLTA Journal","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128564312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Japanese language testing in American universities: Research agenda in the new normal
DOI: 10.20622/jltajournal.25.0_3
Kimi Kondo-Brown (近藤ブラウン妃美)
{"title":"Japanese language testing in American universities: Research agenda in the new normal","authors":"妃美 近藤ブラウン","doi":"10.20622/jltajournal.25.0_3","DOIUrl":"https://doi.org/10.20622/jltajournal.25.0_3","url":null,"abstract":"","PeriodicalId":249185,"journal":{"name":"JLTA Journal","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133079644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}