
Assessing Writing: Latest publications

Which gender provides more specific peer feedback? Gender and assessment training’s effects on peer feedback specificity and intrapersonal factors
IF 5.5 · CAS Tier 1 (Literature) · Q1 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2025-10-01 · Epub Date: 2025-10-07 · DOI: 10.1016/j.asw.2025.100987
José Carlos G. Ocampo , Ernesto Panadero , David Zamorano , Iván Sánchez-Iglesias
This study investigated the effects of assessor gender (male vs. female), fictitious assessee gender (male vs. female), and assessment training (with vs. without) on peer feedback specificity (i.e. localisation and focus) and intrapersonal factors (i.e. trust in the self as an assessor and discomfort). It involved 240 undergraduate psychology students (120 men, 120 women), with half receiving assessment training and the other half only the task instructions. Participants were divided into eight subgroups based on training condition and self-reported gender, and provided peer feedback on three writing samples (of poor, average, and excellent quality) attributed to fictitious male or female peer assessees in Eduflow. A total of 3017 peer feedback segments were analysed, revealing that trained and untrained male and female assessors were comparable in most peer feedback specificity categories when assessing fictitious male or female assessees. Nonetheless, female assessors excelled in certain categories of peer feedback specificity, while male assessors demonstrated competencies in other categories. Results also showed that assessors who received assessment training provided localised peer feedback across all the writing samples. Finally, gender and training did not affect participants’ trust in their own abilities or their (dis)comfort when providing peer feedback.
Citations: 0
Criterion validity evidence and alternate form reliability of curriculum-based measures of written expression for eighth grade students
IF 4.2 · CAS Tier 1 (Literature) · Q1 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2025-10-01 · Epub Date: 2025-06-16 · DOI: 10.1016/j.asw.2025.100958
John Elwood Romig , Amanda A. Olsen , Elizabeth Medina , Anna Tulloh
A significant majority of secondary-level students struggle to meet grade-level expectations for writing. Progress monitoring with curriculum-based measurement (CBM) is one possible strategy for shaping instruction towards improved student outcomes. However, relatively little research has examined curriculum-based measures of writing with secondary students. This study included 89 8th-grade participants who completed one CBM writing task weekly for 11 weeks and the Test of Written Language – 4 in the 12th week. Spearman’s rank correlations were calculated to determine the alternate-form reliability and criterion validity evidence of the CBM tasks. We found alternate-form reliability and criterion validity evidence to be weaker than established thresholds in the field but approaching what has been found with other writing assessments. Educators should use caution when interpreting results of CBM in writing and consider alternative writing assessments for screening purposes.
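The study's alternate-form reliability and criterion validity checks both rest on Spearman's rank correlations; a minimal sketch of that analysis, with invented weekly scores (the study's data are not reproduced here) and SciPy assumed to be available:

```python
from scipy.stats import spearmanr

# Hypothetical weekly CBM writing scores for the same eight students on
# two alternate forms, plus a criterion measure (e.g., a standardized
# writing test). All numbers are invented.
form_a    = [12, 18, 9, 22, 15, 11, 20, 17]
form_b    = [14, 16, 10, 21, 13, 12, 19, 18]
criterion = [55, 70, 48, 82, 60, 52, 68, 75]

# Alternate-form reliability: rank-order agreement between the two forms.
rho_forms, p_forms = spearmanr(form_a, form_b)

# Criterion validity evidence: agreement between one form and the criterion.
rho_crit, p_crit = spearmanr(form_a, criterion)

print(f"alternate-form rho = {rho_forms:.2f} (p = {p_forms:.4f})")
print(f"criterion rho      = {rho_crit:.2f} (p = {p_crit:.4f})")
```

Spearman's rho depends only on rank order, so it tolerates the skewed score distributions that timed writing probes often produce, which may be why it was preferred here over Pearson's r.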
Citations: 0
Exploring the cross-lingual influence of linguistic complexity in second language writing assessment
IF 5.5 · CAS Tier 1 (Literature) · Q1 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2025-10-01 · Epub Date: 2025-08-16 · DOI: 10.1016/j.asw.2025.100951
Sara Geremia , Thomas Gaillat , Nicolas Ballier , Andrew J. Simpkin
This paper explores the influence of the L1 on the linguistic complexity of English learners’ writing. It relies on features extracted from texts and modelled using a statistical learning framework. Linguistic complexity is assessed automatically in terms of proficiency levels across different L1s. We investigate whether proficiency grading by humans matches clusters of learner writings based on the similarity of linguistic features. We then use complexity metrics to automatically assess proficiency levels in samples of writing from different L1s. We focus on variable importance to understand which features best discriminate between levels. Analytic clusters of linguistic complexity data do not map well onto proficiency levels, which bodes poorly for the relevance of using language complexity metrics for level prediction. However, assessing L1 influence on linguistic complexity through a multinomial logistic regression with elastic-net regularisation shows significant results: the models predict the proficiency levels of students of different L1s.
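The model family named above, multinomial logistic regression with elastic-net regularisation, can be sketched as follows on simulated data; the features, labels, and effect sizes are invented, and scikit-learn is assumed (the study's actual feature set and CEFR levels differ):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Simulated setup: rows are learner texts, columns are complexity metrics
# (e.g., mean sentence length, lexical diversity); the label is a
# proficiency level (0/1/2). Everything here is invented for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
levels = rng.integers(0, 3, size=120)
X[:, 0] += levels          # make feature 0 informative
X[:, 1] -= 0.5 * levels    # make feature 1 weakly informative

# Multinomial logistic regression with elastic-net regularisation:
# l1_ratio blends the L1 (sparsity) and L2 (shrinkage) penalties.
model = LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.5, C=1.0, max_iter=5000)
model.fit(X, levels)

# Variable-importance proxy: mean absolute coefficient per feature.
importance = np.abs(model.coef_).mean(axis=0)
print("mean |coef| per feature:", importance.round(2))
print("training accuracy:", round(model.score(X, levels), 2))
```

The L1 part of the penalty drives uninformative features towards zero, which is what makes coefficient magnitudes usable as a rough variable-importance reading of which complexity metrics discriminate between levels.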
Citations: 0
The effect of metacognitive instruction with indirect written corrective feedback on secondary students’ engagement and functional adequacy in L2 writing
IF 4.2 · CAS Tier 1 (Literature) · Q1 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2025-10-01 · Epub Date: 2025-06-25 · DOI: 10.1016/j.asw.2025.100962
Miseong Kim, Phil Hiver
This study explored how metacognitive instruction (MI) combined with indirect written corrective feedback (WCF) influences students’ engagement with WCF and their functional adequacy (FA) in L2 writing. Fifty-four intermediate-level Korean secondary school students participated, divided into a treatment group (WCF + MI) and a comparison group (WCF only). Over 13 weeks, students completed five argumentative writing tasks, receiving WCF after each task. They also completed a self-report survey on their engagement with WCF. Results from the pretest, immediate posttest, and delayed posttest revealed that students in the treatment group showed increased behavioral engagement over time, although this pattern was inconsistent across all engagement dimensions. Overall, FA scores improved significantly across time points, but no significant differences were observed between groups. Furthermore, engagement with WCF did not significantly predict FA performance in either group at either posttest. These findings suggest that pairing MI with WCF may encourage behavioral engagement, but its impact on writing quality remains inconclusive. While preliminary, the results highlight the potential of MI as a tool in the feedback process and suggest the need for further research using broader engagement measures and longer instructional periods to better understand how MI and WCF can jointly support L2 writing development.
Citations: 0
Investigating a customized generative AI chatbot for automated essay scoring in a disciplinary writing task
IF 4.2 · CAS Tier 1 (Literature) · Q1 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2025-10-01 · Epub Date: 2025-06-19 · DOI: 10.1016/j.asw.2025.100959
Ge Lan , Yi Li , Jie Yang , Xuanzi He
The release of ChatGPT in 2022 has greatly influenced research on language assessment. Recently, there has been a burgeoning trend of investigating whether Generative (Gen) AI tools can be used for automated essay scoring (AES); however, most of this research has focused on common academic genres or exam writing tasks. To respond to the call for further investigations in recent studies, our study investigated the relationship between writing scores from a GenAI tool and scores from English teachers at a university in Hong Kong. We built a Chatbot to imitate the exact procedures English teachers need to follow before marking student writing. The Chatbot was applied to score 254 technical progress reports produced by engineering students in a disciplinary English course. Then we conducted correlation tests to examine the relationships between the Chatbot and English teachers on their total scores and four analytical scores (i.e., task fulfillment, language, organization, and formatting). The findings show a positive and moderate correlation on the total score (r = 0.424). For the analytical scores, the correlations are different across the four analytical domains, with stronger correlations on language (r = 0.364) and organization (r = 0.316) and weaker correlations on task fulfillment (r = 0.275) and formatting (r = 0.186). The findings indicate that GenAI has limited capacity for automated assessment as a whole but also that a customized Chatbot has greater potential for assessing language and organization domains than task fulfillment and formatting domains. Implications are also provided for similar future research.
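The reported r values are consistent with Pearson correlations between paired scores; a minimal sketch with invented chatbot and teacher scores shows how such an agreement check can be run (SciPy assumed):

```python
from scipy.stats import pearsonr

# Invented paired scores for ten reports on the same 0-20 scale:
# one score from the chatbot, one from a human rater.
chatbot = [14, 11, 16, 9, 13, 17, 10, 15, 12, 18]
teacher = [13, 12, 15, 10, 11, 18, 11, 14, 14, 16]

r, p = pearsonr(chatbot, teacher)
print(f"r = {r:.3f}, p = {p:.4f}")
```

By the conventions common in this literature, |r| near 0.1 is weak, 0.3 moderate, and 0.5 strong, which is why the study's total-score r = 0.424 is read as a positive, moderate correlation.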
Citations: 0
Comparative judgment in L2 writing assessment: Reliability and validity across crowdsourced, community-driven, and trained rater groups of judges
IF 4.2 · CAS Tier 1 (Literature) · Q1 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2025-07-01 · Epub Date: 2025-04-04 · DOI: 10.1016/j.asw.2025.100937
Peter Thwaites , Pauline Jadoulle , Magali Paquot
Several recent studies have explored the use of comparative judgement (CJ) for assessing second language writing. One of the claimed advantages of this method is that it generates valid assessments even when judgements are made by individuals outside the traditional language assessment community. However, evidence in support of this claim largely concerns concurrent validity, i.e. the extent to which CJ rating scales generated by various groups of judges correlate with rubric-based assessments. Little evidence exists for the construct validity of using CJ in L2 writing assessment. The present study addresses this by exploring what judges pay attention to while making comparative judgements. Three distinct groups of judges assessed the same set of 25 English L2 argumentative essays, leaving comments after each of their decisions. These comments were then analysed to explore the construct relevance and construct representativeness of each judge group’s rating scale. The results suggest that these scales differ in the extent to which they can be considered valid assessments of the target essays.
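Comparative-judgement rating scales are typically derived from pairwise decisions with a Bradley-Terry model; the sketch below uses invented decisions over four hypothetical essays and a plain minorisation-maximisation fit (the study's own fitting procedure is not described in this abstract):

```python
from collections import defaultdict

# Invented pairwise decisions (winner, loser) from hypothetical
# comparative-judgement rounds over four essays.
decisions = [
    (0, 1), (0, 1), (1, 0),   # essay 0 beats essay 1 twice, loses once
    (0, 2), (0, 2),
    (0, 3), (0, 3),
    (1, 2), (1, 2), (2, 1),
    (1, 3), (1, 3),
    (2, 3), (2, 3), (3, 2),
]

n = 4
wins = defaultdict(int)    # total wins per essay
games = defaultdict(int)   # comparisons per unordered pair
for w, l in decisions:
    wins[w] += 1
    games[(min(w, l), max(w, l))] += 1

# Bradley-Terry strengths via minorisation-maximisation updates.
p = [1.0] * n
for _ in range(200):
    new = []
    for i in range(n):
        denom = sum(games[(min(i, j), max(i, j))] / (p[i] + p[j])
                    for j in range(n) if j != i)
        new.append(wins[i] / denom)
    total = sum(new)
    p = [v * n / total for v in new]   # normalise to a fixed scale

ranking = sorted(range(n), key=lambda i: -p[i])
print("strengths:", [round(v, 2) for v in p])
print("ranking (best first):", ranking)
```

The fitted strengths form the interval-like scale that CJ studies then correlate with rubric scores, which is how the concurrent-validity evidence mentioned above is usually produced.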
Citations: 0
The impact of self-revision, machine translation, and ChatGPT on L2 writing: Raters’ assessments, linguistic complexity, and error correction
IF 4.2 · CAS Tier 1 (Literature) · Q1 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2025-07-01 · Epub Date: 2025-05-01 · DOI: 10.1016/j.asw.2025.100950
Minjoo Kim , Yuah V. Chon
This study explores how learners in a South Korean high school English as a Foreign Language (EFL) context can effectively use neural machine translation (MT) and ChatGPT to enhance their L2 writing. While recent AI tools offer significant potential for supporting human writing feedback, a comparative analysis of how these tools impact writing outcomes—compared to when L2 writers independently proofread and revise their writing—has not been fully examined. To address this gap, a controlled experiment was conducted using three distinct proofreading interventions—self-proofreading (SP), MT-assisted proofreading (MAP), and ChatGPT-assisted proofreading (CAP). Learners were encouraged to first compose their texts in their L2 and then use either MT through inverse translation or ChatGPT through a structured proofreading process. The findings revealed that learners using MAP and CAP demonstrated substantial improvements in overall writing quality compared to those relying solely on SP. CAP users, in particular, produced longer texts, exhibited greater lexical diversity, and constructed more complex sentences, although this was accompanied by reduced verb cohesion. Both MAP and CAP significantly reduced grammatical errors, but did not affect prepositional errors. These findings provide practical recommendations for integrating MT and ChatGPT into L2 writing pedagogy.
Citations: 0
Trinka: Facilitating academic writing through an intelligent writing evaluation system
IF 4.2 · CAS Tier 1 (Literature) · Q1 EDUCATION & EDUCATIONAL RESEARCH · Pub Date: 2025-07-01 · Epub Date: 2025-05-28 · DOI: 10.1016/j.asw.2025.100953
Jessie S. Barrot
In recent years, intelligent writing evaluation (IWE) systems have gained significant attention due to their ability to enhance the writing process and improve content quality through artificial intelligence (AI) and natural language processing (NLP). This technology review focuses on Trinka, an advanced IWE system tailored for academic writing, which delivers context-aware feedback beyond basic grammar corrections. Key features of Trinka include an AI content detector, academic phrase bank, journal finder, citation checker, inclusive language recommendations, and plagiarism detection—tools specifically designed to meet the needs of scholars and researchers. The review also examines how Trinka can be integrated into second language (L2) writing instruction, highlighting its potential to enhance learning and assessment. Finally, the paper addresses limitations, such as user over-reliance, privacy concerns, and financial accessibility, urging educators and writers to adopt a critical and responsible approach to Trinka’s use in academic contexts.
Citations: 0
Unveiling the precursors of negative emotions in second language writing through control-value theory: An explanatory sequential design approach
IF 4.2 CAS Tier 1 (Literature) Q1 EDUCATION & EDUCATIONAL RESEARCH Pub Date : 2025-07-01 Epub Date : 2025-05-05 DOI: 10.1016/j.asw.2025.100949
Haijing Zhang , Fangwei Huang
With the emergence of the psychological focus on second language acquisition, research on second language (L2) writing has gradually transitioned to a comprehensive exploration of the writing process. However, few studies have explored the potential trigger mechanism of negative emotions in L2 writing, especially in learning Chinese as a second language (CSL). To fill this gap, the explanatory sequential design was employed to investigate the relationships among CSL learners’ writing self-efficacy, perceived writing task value, writing anger, and writing boredom based on the control-value theory. The quantitative results illustrate that 1) writing self-efficacy positively predicts perceived writing task value, writing anger, and writing boredom; 2) perceived writing task value negatively predicts writing anger and writing boredom; and 3) perceived writing task value mediates the relationship between writing self-efficacy and writing anger/boredom. The qualitative results add insight to the L2 writing process, revealing that 1) writing self-efficacy exhibited dialectical tension during the writing process; 2) perceived writing task value illustrated contextual immediacy in L2 writing; and 3) writing anger/boredom demonstrated dynamism throughout the procedure of completing the L2 writing task. These results extend the application scope and deepen the theoretical understanding of control-value theory, offering significant pedagogical implications for L2 education.
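The mediation finding reported above (perceived writing task value mediating the link between writing self-efficacy and writing anger/boredom) can be illustrated with the classic product-of-coefficients approach: regress the mediator on the predictor (path a), regress the outcome on both (path b and the direct effect), and take a·b as the indirect effect. The sketch below uses simulated data with hypothetical effect sizes — it is not the study's data or analysis code, just a minimal illustration of the technique:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 240  # mirrors the sample size reported in the study

# Hypothetical simulated data mirroring the reported directions of effects:
# self-efficacy -> task value (positive, path a); task value -> boredom (negative, path b)
efficacy = rng.normal(size=n)
task_value = 0.5 * efficacy + rng.normal(scale=0.8, size=n)
boredom = -0.4 * task_value + 0.1 * efficacy + rng.normal(scale=0.8, size=n)

def ols(X, y):
    """Least-squares slope coefficients, with an intercept column added."""
    X = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]  # drop the intercept

a = ols(efficacy.reshape(-1, 1), task_value)[0]                      # path a
b, c_prime = ols(np.column_stack([task_value, efficacy]), boredom)   # path b, direct effect
indirect = a * b  # indirect (mediated) effect of efficacy on boredom via task value

print(f"a={a:.2f}, b={b:.2f}, indirect={indirect:.2f}, direct={c_prime:.2f}")
```

With these simulated paths, a comes out positive and b negative, so the indirect effect a·b is negative — the sign pattern the abstract describes for task value as a mediator. (A full analysis would also bootstrap a confidence interval for a·b.)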
Citations: 0
Making things happen: A study of grammatical metaphors in L2 writing scripts
IF 4.2 CAS Tier 1 (Literature) Q1 EDUCATION & EDUCATIONAL RESEARCH Pub Date : 2025-07-01 Epub Date : 2025-05-12 DOI: 10.1016/j.asw.2025.100939
Nicholas Glasson, Andrew Kitney
The notion of grammatical metaphor (GM) (Halliday, 1985) is essentially where a writer can shift an action or quality into being a ‘thing’. As in most senses of metaphor, the goal is to “represent something as something else” (McGrath & Liardét, 2023, p.33).
This study investigated the use of grammatical metaphor (GM) in Linguaskill writing exam responses across CEFR proficiency levels (below-B1 to C1 or above). It analysed the presence of a pre-existing GM list (see McGrath & Liardét, 2023) to explore GM frequency in L2 responses, the correlative relationship with proficiency scores and qualitatively explored candidate responses in terms of how GMs were used. Results show a moderate positive correlation between proficiency and GM use, with a dominance of process-to-thing shifts (e.g., transform→transformation) and emergence of GM use from lower to higher proficiency levels. This underscores GM's significance in crafting academically valued meanings in L2 contexts, suggesting its potential for informing instructional and assessment practices.
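The "moderate positive correlation between proficiency and GM use" reported above is the kind of result a rank correlation captures: CEFR bands are ordinal, so Spearman's rho (Pearson correlation of ranks) is a natural fit. The sketch below uses invented GM counts, not the study's data, and breaks rank ties by position for simplicity rather than averaging tied ranks as a full Spearman implementation would:

```python
import numpy as np

# Hypothetical GM counts per script by CEFR band (below-B1=0 ... C1+=3);
# illustrative values only, not data from the study.
cefr_band = np.array([0, 0, 1, 1, 2, 2, 3, 3])
gm_count  = np.array([1, 0, 2, 3, 4, 3, 6, 5])

def spearman(x, y):
    """Spearman rho as Pearson correlation of ranks.
    Ties are broken by position (kind='stable'), a simplification:
    a textbook Spearman would assign tied values their average rank."""
    rx = np.argsort(np.argsort(x, kind="stable"), kind="stable")
    ry = np.argsort(np.argsort(y, kind="stable"), kind="stable")
    return np.corrcoef(rx, ry)[0, 1]

rho = spearman(cefr_band, gm_count)
print(f"Spearman rho = {rho:.2f}")  # → Spearman rho = 0.93
```

For real data with many ties, `scipy.stats.spearmanr` handles tied ranks properly and also returns a p-value.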
Citations: 0