
Assessing Writing — Latest Publications

Judgment accuracy in primary school EFL writing assessment: Do text characteristics matter?
IF 4.2 | CAS Tier 1 (Literature) | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-06-26 | DOI: 10.1016/j.asw.2025.100957
Ruth Trüb, Jens Möller, Julian Lohmann, Thorben Jansen, Stefan D. Keller
Assessing the writing competence of pupils learning English as a foreign language (EFL) at primary school is challenging. This study examined a largely unexplored topic, namely the role of text characteristics in writing assessment, and analysed judgment accuracy differentiated by nine aspects of text quality (communicative effect, level of detail, coherence, cohesion, complexity of syntax and grammar, correctness of syntax and grammar, vocabulary, orthography and punctuation). Two hundred pre-service teachers assessed four randomly assigned texts from learners in grade six. Their assessments were compared to the existing ratings of two experts from a previous study. We found relative judgment accuracy between r = .34 and .60 for the nine assessment criteria, with vocabulary assessed significantly more accurately than almost all other criteria. Orthography, punctuation, and the complexity and correctness of syntax and grammar were rated significantly more accurately than cohesion, level of detail, communicative effect and coherence. The pre-service teachers assessed most criteria more strictly and with higher variability than the experts. The results suggest that teacher education should offer pre-service teachers concrete opportunities to practise writing assessment, implement activities to strengthen the assessment of content- and structure-related criteria, and help them adjust their assessment rigour.
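Relative judgment accuracy of this kind is typically operationalised as the correlation between a rater group's scores and an expert benchmark, computed separately per criterion. The sketch below illustrates that computation; the data layout, column names, and scores are hypothetical, not the study's materials.

```python
# Minimal sketch (hypothetical data, not the authors' code): per-criterion
# judgment accuracy as the correlation between pre-service teacher ratings
# and expert benchmark ratings.
import pandas as pd
from scipy.stats import pearsonr

# Long format: one row per (text, criterion) pair, with both ratings.
ratings = pd.DataFrame({
    "text_id":   [1, 2, 3, 4, 1, 2, 3, 4],
    "criterion": ["vocabulary"] * 4 + ["coherence"] * 4,
    "teacher":   [3, 4, 2, 5, 3, 2, 4, 3],  # pre-service teacher rating
    "expert":    [3, 5, 2, 4, 2, 3, 3, 4],  # expert benchmark rating
})

# Judgment accuracy per criterion: teacher-expert correlation.
for criterion, group in ratings.groupby("criterion"):
    r, p = pearsonr(group["teacher"], group["expert"])
    print(f"{criterion}: r = {r:.2f}")
```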
Citations: 0
The effect of metacognitive instruction with indirect written corrective feedback on secondary students’ engagement and functional adequacy in L2 writing
IF 4.2 | CAS Tier 1 (Literature) | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-06-25 | DOI: 10.1016/j.asw.2025.100962
Miseong Kim, Phil Hiver
This study explored how metacognitive instruction (MI) combined with indirect written corrective feedback (WCF) influences students’ engagement with WCF and their functional adequacy (FA) in L2 writing. Fifty-four intermediate-level Korean secondary school students participated, divided into a treatment group (WCF + MI) and a comparison group (WCF only). Over 13 weeks, students completed five argumentative writing tasks, receiving WCF after each task. They also completed a self-report survey on their engagement with WCF. Results from the pretest, immediate posttest, and delayed posttest revealed that students in the treatment group showed increased behavioral engagement over time, although this pattern was inconsistent across all engagement dimensions. Overall, FA scores improved significantly across time points, but no significant differences were observed between groups. Furthermore, engagement with WCF did not significantly predict FA performance in either group at either posttest. These findings suggest that pairing MI with WCF may encourage behavioral engagement, but its impact on writing quality remains inconclusive. While preliminary, the results highlight the potential of MI as a tool in the feedback process and suggest the need for further research using broader engagement measures and longer instructional periods to better understand how MI and WCF can jointly support L2 writing development.
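The finding that engagement did not significantly predict FA performance implies a regression of posttest FA scores on engagement. A minimal sketch of such an analysis follows, with invented scores and hypothetical variable names; the authors' actual model specification may differ.

```python
# Minimal sketch (invented data): does self-reported engagement with WCF
# predict functional adequacy (FA) at posttest, controlling for group?
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "fa_post":    [3.2, 4.1, 2.8, 3.9, 3.5, 4.4, 3.0, 3.7],
    "engagement": [2.9, 3.8, 2.5, 3.1, 3.6, 4.0, 2.7, 3.3],
    "group":      ["treatment", "treatment", "comparison", "comparison"] * 2,
})

model = smf.ols("fa_post ~ engagement + C(group)", data=df).fit()
# A non-significant engagement coefficient would mirror the study's finding.
print(model.summary())
```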
Citations: 0
Comparing GPT-based approaches in automated writing evaluation
IF 4.2 | CAS Tier 1 (Literature) | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-06-24 | DOI: 10.1016/j.asw.2025.100961
Yingying Liu, Xiaofei Lu, Huilei Qi
Large language models (LLMs) like OpenAI’s GPT models show significant promise in automated writing evaluation (AWE). However, recent research has mainly focused on non-fine-tuned GPT models, with limited attention to fine-tuned models and to potential factors influencing performance, such as model type, prompting strategy, and dataset characteristics. This study compares six GPT-based approaches for evaluating TOEFL argumentative writing, namely GPT-3.5 zero-shot, GPT-3.5 few-shot, GPT-4 zero-shot, GPT-4 few-shot, and two fine-tuning methods. We assess the impact of model type (GPT-3.5 vs. GPT-4), prompting strategy (zero-shot vs. few-shot), fine-tuning, class imbalance, and dataset shift on performance. Our findings reveal that fine-tuned GPT models consistently outperform non-fine-tuned GPT-4 models, which in turn outperform GPT-3.5 models. Few-shot prompting does not show clear advantages over zero-shot prompting in this study. Additionally, class imbalance and dataset shift negatively affect model accuracy and reliability. These results offer valuable insights into the effectiveness of different GPT-based approaches and the factors that influence their performance in AWE.
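The zero-shot versus few-shot contrast comes down to whether scored example essays are included in the prompt. Below is a minimal sketch of both conditions using the OpenAI chat completions API; the rubric wording, model name, and example essays are placeholders, not the study's prompts.

```python
# Minimal sketch (placeholder rubric and examples; assumes OPENAI_API_KEY is
# set in the environment): zero-shot vs. few-shot essay scoring.
from openai import OpenAI

client = OpenAI()

RUBRIC = "Score the following TOEFL argumentative essay from 1 to 5."
FEW_SHOT = [  # scored examples prepended only in the few-shot condition
    {"role": "user", "content": "Essay: <example essay text>"},
    {"role": "assistant", "content": "Score: 4"},
]

def score_essay(essay: str, few_shot: bool = False) -> str:
    messages = [{"role": "system", "content": RUBRIC}]
    if few_shot:
        messages += FEW_SHOT
    messages.append({"role": "user", "content": f"Essay: {essay}"})
    resp = client.chat.completions.create(model="gpt-4", messages=messages)
    return resp.choices[0].message.content
```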
Citations: 0
Improving writing feedback quality and self-efficacy of pre-service teachers in Gen-AI contexts: An experimental mixed-method design
IF 4.2 | CAS Tier 1 (Literature) | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-06-19 | DOI: 10.1016/j.asw.2025.100960
Siyu Zhu, Qingyang Li, Yuan Yao, Jialin Li, Xinhua Zhu
The rapid advancement of Generative AI (Gen-AI), such as ChatGPT, presents both opportunities and challenges for teacher education. For pre-service teachers (PSTs), Gen-AI offers new tools to enhance the efficiency and quality of writing feedback. However, it also raises concerns, as many PSTs lack classroom experience, confidence in giving feedback, and knowledge of how to effectively integrate AI-generated content into instructional practice. To address these issues, this study adopted a pre-post experimental design to examine the effects of targeted training on PSTs’ provision of writing feedback, focusing on feedback quality, self-efficacy, and their relationship in ChatGPT-supported contexts. Across a two-week training program with 30 PSTs, Wilcoxon signed-rank tests on the content-analysis scores showed significant improvements in feedback quality and self-efficacy. Semi-structured interviews with eight participants identified cognitive changes and enhanced ChatGPT operational skills as key drivers of these improvements. We reaffirmed that mastery and vicarious experiences are crucial for enhancing teacher self-efficacy. Furthermore, a reciprocal relationship was observed between quality and self-efficacy in providing ChatGPT-assisted feedback. This study contributes to the broader discourse on ChatGPT in education and offers specific strategies for effectively incorporating new technology into teacher training.
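For reference, a Wilcoxon signed-rank test on paired pre- and post-training scores takes only a few lines; the scores below are invented for illustration.

```python
# Minimal sketch (invented scores): pre-post Wilcoxon signed-rank test on
# paired feedback-quality ratings for the same PSTs.
from scipy.stats import wilcoxon

pre  = [2.0, 3.0, 2.5, 3.5, 2.0, 3.0, 2.5, 4.0]
post = [3.0, 3.5, 3.0, 4.0, 3.5, 3.5, 3.0, 4.5]

stat, p = wilcoxon(pre, post)
print(f"W = {stat}, p = {p:.4f}")  # a small p indicates a reliable shift
```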
Citations: 0
Investigating a customized generative AI chatbot for automated essay scoring in a disciplinary writing task
IF 4.2 | CAS Tier 1 (Literature) | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-06-19 | DOI: 10.1016/j.asw.2025.100959
Ge Lan, Yi Li, Jie Yang, Xuanzi He
The release of ChatGPT in 2022 has greatly influenced research on language assessment. Recently, there has been a burgeoning trend of investigating whether Generative (Gen) AI tools can be used for automated essay scoring (AES); however, most of this research has focused on common academic genres or exam writing tasks. Responding to calls for further investigation in recent studies, our study examined the relationship between writing scores from a GenAI tool and scores from English teachers at a university in Hong Kong. We built a Chatbot to imitate the exact procedures English teachers follow before marking student writing. The Chatbot was applied to score 254 technical progress reports produced by engineering students in a disciplinary English course. We then conducted correlation tests to examine the relationships between the Chatbot and English teachers on their total scores and four analytical scores (i.e., task fulfillment, language, organization, and formatting). The findings show a moderate positive correlation for the total score (r = 0.424). For the analytical scores, the correlations differ across the four analytical domains, with stronger correlations for language (r = 0.364) and organization (r = 0.316) and weaker correlations for task fulfillment (r = 0.275) and formatting (r = 0.186). The findings indicate that GenAI has limited capacity for automated assessment as a whole, but also that a customized Chatbot has greater potential for assessing language and organization than task fulfillment and formatting. Implications are also provided for similar future research.
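Comparing automated and human scores domain by domain reduces to correlating the two raters' score vectors for each analytic criterion, as sketched below; the scores are invented, and the study's exact correlation procedure may differ.

```python
# Minimal sketch (invented scores): correlating Chatbot scores with teacher
# scores for each analytic domain.
import pandas as pd

domains = ["task_fulfillment", "language", "organization", "formatting"]
chatbot = pd.DataFrame({
    "task_fulfillment": [3, 4, 2, 5, 3, 4],
    "language":         [3, 3, 2, 5, 4, 4],
    "organization":     [2, 4, 3, 5, 3, 3],
    "formatting":       [3, 4, 2, 4, 3, 5],
})
teacher = pd.DataFrame({
    "task_fulfillment": [2, 4, 3, 4, 3, 3],
    "language":         [3, 4, 2, 4, 3, 4],
    "organization":     [3, 3, 2, 5, 3, 4],
    "formatting":       [2, 4, 3, 3, 4, 3],
})

for d in domains:
    r = chatbot[d].corr(teacher[d])  # Pearson by default
    print(f"{d}: r = {r:.3f}")
```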
Citations: 0
Criterion validity evidence and alternate form reliability of curriculum-based measures of written expression for eighth grade students
IF 4.2 | CAS Tier 1 (Literature) | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-06-16 | DOI: 10.1016/j.asw.2025.100958
John Elwood Romig, Amanda A. Olsen, Elizabeth Medina, Anna Tulloh
A significant majority of students in secondary grades struggle to meet grade-level expectations for writing. Progress monitoring with curriculum-based measurement (CBM) is one possible strategy for shaping instruction towards improved student outcomes. However, relatively little research has examined curriculum-based measures for writing among students in secondary grades. This study included 89 eighth-grade participants who completed one curriculum-based measurement writing task weekly for 11 weeks and completed the Test of Written Language – 4 in the 12th week. Spearman’s rank correlations were calculated to determine the alternate form reliability and criterion validity evidence of the curriculum-based measurement tasks. We found alternate form reliability and criterion validity evidence to be weaker than established thresholds in the field but approaching what has been found with other writing assessments. Educators should use caution when interpreting results of CBM in writing and consider alternative writing assessments for screening purposes.
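Both statistics reported here are rank correlations: alternate-form reliability correlates adjacent weekly probes, and criterion validity correlates a probe with the Test of Written Language – 4. A minimal sketch with invented scores:

```python
# Minimal sketch (invented scores): Spearman correlations for alternate-form
# reliability (adjacent weekly CBM probes) and criterion validity evidence
# (CBM probe vs. the TOWL-4 criterion measure).
from scipy.stats import spearmanr

week_1 = [12, 18, 9, 22, 15, 11, 20, 14]     # CBM writing scores, week 1
week_2 = [14, 17, 10, 21, 13, 12, 19, 16]    # same students, week 2
towl_4 = [85, 98, 76, 110, 90, 82, 105, 88]  # criterion test scores

rho_forms, _ = spearmanr(week_1, week_2)
rho_criterion, _ = spearmanr(week_1, towl_4)
print(f"alternate-form rho = {rho_forms:.2f}")
print(f"criterion validity rho = {rho_criterion:.2f}")
```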
Citations: 0
Editorial introduction, Assessing writing Tools & Tech Forum 2025
IF 4.2 | CAS Tier 1 (Literature) | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-06-07 | DOI: 10.1016/j.asw.2025.100956
Kelly Hartwell, Laura Aull
{"title":"Editorial introduction, Assessing writing Tools & Tech Forum 2025","authors":"Kelly Hartwell,&nbsp;Laura Aull","doi":"10.1016/j.asw.2025.100956","DOIUrl":"10.1016/j.asw.2025.100956","url":null,"abstract":"","PeriodicalId":46865,"journal":{"name":"Assessing Writing","volume":"65 ","pages":"Article 100956"},"PeriodicalIF":4.2,"publicationDate":"2025-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144230522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A large-scale corpus for assessing source-based writing quality: ASAP 2.0
IF 4.2 | CAS Tier 1 (Literature) | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-06-03 | DOI: 10.1016/j.asw.2025.100954
Scott A. Crossley, Perpetual Baffour, L. Burleigh, Jules King
This paper introduces ASAP 2.0, a dataset of ∼25,000 source-based argumentative essays from U.S. secondary students. The corpus addresses the shortcomings of the original ASAP corpus by including demographic data, consistent scoring rubrics, and source texts. ASAP 2.0 aims to support the development of unbiased, sophisticated Automatic Essay Scoring (AES) systems that can foster improved educational practices by providing summative feedback to students. The corpus is designed for broad accessibility with the hope of facilitating research into writing quality and AES system biases.
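For researchers picking up the corpus, a natural first step is to inspect the score distribution and demographic coverage, for example to spot the class imbalance that AES work must handle. The sketch below assumes a hypothetical file name and column names; consult the released dataset for the actual schema.

```python
# Minimal sketch (hypothetical file and column names): first-pass checks on
# an ASAP 2.0-style corpus before training an AES model.
import pandas as pd

essays = pd.read_csv("asap_2.csv")  # placeholder path

# Score distribution: reveals class imbalance across rubric levels.
print(essays["score"].value_counts(normalize=True).sort_index())

# Mean score by demographic group: a quick check for potential bias.
print(essays.groupby("demographic_group")["score"].mean())
```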
Citations: 0
Potentials and pitfalls of Google Gemini in writing: Implications for educators
IF 4.2 | CAS Tier 1 (Literature) | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-05-31 | DOI: 10.1016/j.asw.2025.100955
Hieu Manh Do
Recent developments in artificial intelligence (AI) have led to the emergence of chatbots as an effective tool for language learning. One such tool is Google Gemini, which engages writers and researchers in natural and human-like interactive experiences. Google Gemini offers significant benefits for improving efficiency and collaboration in academic writing but also presents challenges related to accuracy, ethical considerations, and potential impacts on writer creativity. Thus, this tech review aims to explore the potential benefits and limitations of Google Gemini in writing. This review also concludes with recommendations for writing instructors and suggestions for future researchers in the field.
Citations: 0
Trinka: Facilitating academic writing through an intelligent writing evaluation system
IF 4.2 | CAS Tier 1 (Literature) | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-05-28 | DOI: 10.1016/j.asw.2025.100953
Jessie S. Barrot
In recent years, intelligent writing evaluation (IWE) systems have gained significant attention due to their ability to enhance the writing process and improve content quality through artificial intelligence (AI) and natural language processing (NLP). This technology review focuses on Trinka, an advanced IWE system tailored for academic writing, which delivers context-aware feedback beyond basic grammar corrections. Key features of Trinka include an AI content detector, academic phrase bank, journal finder, citation checker, inclusive language recommendations, and plagiarism detection—tools specifically designed to meet the needs of scholars and researchers. The review also examines how Trinka can be integrated into second language (L2) writing instruction, highlighting its potential to enhance learning and assessment. Finally, the paper addresses limitations, such as user over-reliance, privacy concerns, and financial accessibility, urging educators and writers to adopt a critical and responsible approach to Trinka’s use in academic contexts.
Citations: 0