
Assessing Writing: Latest Publications

How reliable and valid is peer evaluation in adolescents’ L2 argumentative writing?
IF 5.5 | Tier 1, Literature | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-11-29 | DOI: 10.1016/j.asw.2025.100992
Albert W. Li, Steve Graham
Peer evaluation is widely recognized for its educational benefits; however, its reliability and validity, particularly among adolescent second-language (L2) writers at the early stages of English language and literacy development, remain insufficiently explored. This explanatory sequential mixed-methods study investigated the reliability and validity of peer evaluation in English argumentative writing among 35 Grade 10 and 37 Grade 12 students from a public high school in Beijing, China. Twelve of the participating students (six at each grade) were interviewed about the validity, reliability, and value of peer evaluation. The findings indicated that peer evaluations demonstrated high levels of reliability and validity, with peer-assessed writing scores closely aligning with inter-teacher assessments. Notably, variations were observed among Grade 10 students, particularly in the evaluation of lower-order writing skills, such as grammar and vocabulary, which exhibited reduced validity. These results underscore the potential of peer evaluation in assessing higher-order content-level writing across varying levels of L2 English writing proficiency. The study also highlights areas where adolescent L2 writers may require additional support to enhance the effectiveness of peer evaluation practices in English argumentative writing. Implications for improving English argumentative writing instruction and refining peer evaluation strategies in high school L2 English classrooms are discussed.
Citations: 0
Development of a Genre Adherence Rubric (GAR) for applied linguistics research articles
IF 5.5 | Tier 1, Literature | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-11-15 | DOI: 10.1016/j.asw.2025.100991
Mahsa Alinasab, Javad Gholami, Zhila Mohammadnia
This study reports the development of a four-descriptor Genre Adherence Rubric (GAR) for Research Articles (RAs) based on two pilot studies. To this end, we designed, implemented, and assessed a genre-oriented RA writing course in a master's program in applied linguistics. The instructional package contained knowledge-giving and hands-on materials and tasks on moves/steps, their sequencing, and linguistic features in RA sections. The participants were asked to revise their first draft RAs following the course. We defined, developed, and piloted the GAR, which primarily consisted of move obligation, optionality, and sequencing, and used it to rate the original and revised RAs. The first pilot and scorer feedback showed that language needs to be included as an additional descriptor. In the second pilot study, implementing the four-prong GAR yielded meaningful differences in another set of revised RAs. As a novel attempt to rate RAs and similar scholarly writings through genre lenses and apart from opening new avenues for research, the GAR presented in this paper warrants further confirmation or modification. Given the ever-growing importance of scholarly writing and publishing, the findings have tenable implications for journal editors, publishers, and academic writing instructors to adopt or adapt GAR-like RA appraisal scales.
Citations: 0
Generative artificial intelligence for automated essay scoring: Exploring teacher agency through an ecological perspective
IF 5.5 | Tier 1, Literature | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-11-11 | DOI: 10.1016/j.asw.2025.100990
Jessie S. Barrot
Generative artificial intelligence (AI) is increasingly used in writing assessment, particularly for automated essay scoring (AES) and for generating formative feedback within automated writing evaluation (AWE). While AI-driven AES enhances efficiency and consistency, concerns regarding accuracy, bias, and ethical implications raise critical questions about its role in assessment. This paper examines the impact of generative AI on teacher agency through an ecological perspective, which considers agency as shaped by personal, institutional, and sociocultural factors. The analysis highlights the need for teachers to critically mediate AI-generated scores and feedback to align them with pedagogical goals, ensuring AI functions as an assistive tool rather than a determinant of assessment outcomes. Although AI can streamline assessment, over-reliance risks diminishing teachers’ evaluative expertise and reinforcing biases embedded in AI systems. Ethical concerns, including transparency, data privacy, and fairness, further complicate its adoption. To address these challenges, this paper proposes a framework for responsible AI integration that prioritizes bias mitigation, data security, and teacher-driven decision-making. The discussion concludes with pedagogical implications and directions for future research on AI-assisted writing assessment.
Citations: 0
Integrating move analysis and sentence reconstruction in automated writing evaluation for L2 academic writers
IF 5.5 | Tier 1, Literature | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-10-01 | DOI: 10.1016/j.asw.2025.100984
Bo-Ren Mau, Hui-Hsien Feng
Artificial intelligence has been widely utilized to assist L2 writers through automated writing evaluation (AWE) systems, which offer grammatical feedback. However, for English academic writing, such feedback is apparently insufficient to address the complexities of academic discourse. While genre-based AWE systems employ move analysis, they offer move detections as corrective feedback (CF) without addressing language use issues and are developed using limited datasets. Additionally, general-purpose large language models (LLMs; e.g., ChatGPT) may lack specialized mechanisms for accurately identifying rhetorical moves and providing genre-specific feedback in academic writing contexts. To address these limitations, this study proposes GURUS, a genre-based AWE system grounded in second language acquisition theories. It provides indirect CF by classifying moves with probabilistic scores, and direct CF through sentence reconstruction. GURUS is implemented as a web-based application using an ensemble learning model and transformer-based LLMs. By offering indirect and direct CF, GURUS promotes learner-machine interaction, prompting learners to notice discrepancies between their writing and the reconstructed sentences. GURUS was trained on over one million sentences with OMRC moves. Its classification performance was assessed using F1-score and Brier score; furthermore, semantic and rhetorical production were evaluated using BERTscore and human assessment. The results show that GURUS sufficiently classifies sentence moves and reconstructs sentences while retaining semantic integrity. Given that GURUS holds promise in academic writing instruction, this study also discusses its implementation to bolster learners’ genre awareness and proficiency in move-based abstract writing.
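The abstract reports F1 and Brier scores for move classification (plus BERTScore for reconstruction quality). As a minimal, hypothetical sketch of the first two metrics, and not the GURUS evaluation code itself, macro F1 and a multiclass Brier score can be computed like this in Python; the move labels and probability values below are invented:

```python
# Minimal sketch: scoring a rhetorical-move classifier with macro F1 and a
# multiclass Brier score. Labels and predicted probabilities are hypothetical.
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([0, 1, 2, 3, 1])        # gold move indices for five sentences
proba = np.array([                        # model's per-class probabilities
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.1, 0.2, 0.6, 0.1],
    [0.1, 0.1, 0.2, 0.6],
    [0.2, 0.3, 0.4, 0.1],                 # misclassified: gold move 1, predicted 2
])
y_pred = proba.argmax(axis=1)             # hard predictions from probabilities

macro_f1 = f1_score(y_true, y_pred, average="macro")

# Multiclass Brier score: mean squared distance between the probability
# vector and the one-hot encoding of the gold label.
one_hot = np.eye(proba.shape[1])[y_true]
brier = np.mean(np.sum((proba - one_hot) ** 2, axis=1))

print(f"macro F1 = {macro_f1:.3f}, Brier = {brier:.3f}")
```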
Citations: 0
Using ChatGPT to score essays and short-form constructed responses
IF 5.5 | Tier 1, Literature | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-10-01 | DOI: 10.1016/j.asw.2025.100988
Mark D. Shermis
This study evaluates the effectiveness of ChatGPT-4o in scoring essays and short-form constructed responses compared to human raters and traditional machine learning models. Using data from the Automated Student Assessment Prize (ASAP), ChatGPT’s performance was assessed across multiple predictive models, including linear regression, random forest, gradient boost, and XGBoost. Results indicate that while ChatGPT’s gradient boost model achieved quadratic weighted kappa (QWK) scores close to human raters for some datasets, overall performance remained inconsistent, particularly for short-form responses. The study highlights key challenges, including variability in scoring accuracy, potential biases, and limitations in aligning ChatGPT’s predictions with human scoring standards. While ChatGPT demonstrated efficiency and scalability, its leniency and variability suggest that it should not yet replace human raters in high-stakes assessments. Instead, a hybrid approach combining AI with empirical scoring models may improve reliability. Future research should focus on refining AI-driven scoring models through enhanced fine-tuning, bias mitigation, and validation with broader datasets. Ethical considerations, including fairness in automated scoring and data security, must also be addressed. This study concludes that ChatGPT holds promise as a supplementary tool in educational assessment but requires further development to ensure validity and fairness.
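Quadratic weighted kappa (QWK), the agreement statistic used throughout this comparison, can be computed with scikit-learn. The sketch below uses invented score vectors and is not the study's evaluation pipeline:

```python
# Minimal sketch: quadratic weighted kappa (QWK) between human and model
# essay scores. The score vectors are hypothetical examples.
from sklearn.metrics import cohen_kappa_score

human_scores = [2, 3, 4, 3, 1, 4, 2, 3]   # human rater's holistic scores
model_scores = [2, 3, 3, 3, 2, 4, 2, 4]   # automated (e.g., ChatGPT-based) scores

qwk = cohen_kappa_score(human_scores, model_scores, weights="quadratic")
print(f"QWK = {qwk:.3f}")   # 1.0 = perfect agreement, 0 = chance-level agreement
```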
Citations: 0
The development of syntactic complexity in integrated writing: A focus on fine-grained measures
IF 5.5 | Tier 1, Literature | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-10-01 | DOI: 10.1016/j.asw.2025.100983
Seyyed Ehsan Golparvar, J. Elliott Casal, Hamideh Abolhasani
Past studies on complexity development in L2 writing have largely examined syntactic complexity, measured by traditional length-based measures, in independent writing, rather than integrated writing. The present research adds to the literature by investigating the development of clausal and phrasal indices of syntactic complexity in two integrated writing tasks (graph writing and summary writing) via fine-grained measures. The participants wrote three graph-based essays in the first semester, and three summary writing essays in the second semester. The results of mixed-effects modeling demonstrated significant changes in both phrasal and clausal complexity and syntactic variety at the phrasal level, evidencing growth towards a more phrasal complex academic style. Task type also had a significant impact on these measures; higher values were generally found in summary writing responses. The results are discussed in light of previous literature as well as the characteristics of the source texts, and theoretical and methodological implications for writing complexity research are offered.
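A mixed-effects model of the kind described here can be sketched with statsmodels; the data file, column names, and random-intercept structure below are assumptions for illustration, not the authors' actual model specification:

```python
# Minimal sketch: a linear mixed-effects model for one fine-grained
# syntactic-complexity index measured repeatedly per student across tasks.
# The CSV file and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("complexity_measures.csv")  # assumed columns: student, time, task_type, complex_np_per_clause

model = smf.mixedlm(
    "complex_np_per_clause ~ time * task_type",  # fixed effects: time, task type, interaction
    data=df,
    groups=df["student"],                        # random intercept for each student
)
result = model.fit()
print(result.summary())
```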
Citations: 0
Linguistic predictors of L2 writing performance: Variations across genres
IF 5.5 | Tier 1, Literature | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-10-01 | DOI: 10.1016/j.asw.2025.100985
Weiwei Yang, Sara T. Cushing, Guoxing Yu
This study investigated how linguistic complexity (including lexical and syntactic complexity), accuracy, and fluency (CAF) predicted second language (L2) writing scores across four essay genres: narration, exposition, expo-argumentation and argumentation. Approximately 60 essays were collected on each of these genres on the same subject matter and were scored using a holistic rubric. Eight measures of complexity, accuracy and fluency were examined. Forward stepwise regression analysis based on Akaike Information Criterion Corrected (AICC) was conducted for each genre. The findings revealed a large amount of score variance explained by CAF: 61 % for the argumentative task and about 70 % for the other three tasks. Fluency was found to be a highly important score predictor for the narrative and expository tasks, while lexical sophistication was equally important or more important than fluency for the expo-argumentative and argumentative tasks. The regression model for the narrative task also differed from those for the expository, argumentative task types, regarding syntactic complexity predictors. Lexical diversity was generally less important in predicting scores than lexical sophistication. The implications of the findings for L2 writing scoring and automated essay scoring are discussed.
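Forward stepwise selection under AICc can be sketched as follows; the data file and the eight predictor names are hypothetical stand-ins for the CAF measures, and this is not the authors' analysis code:

```python
# Minimal sketch: forward stepwise selection of CAF predictors of writing
# scores using the corrected Akaike Information Criterion (AICc).
# Data and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

def aicc(fit):
    """AICc = AIC + 2k(k+1)/(n - k - 1); k counts regressors plus intercept,
    consistent with how statsmodels computes OLS AIC."""
    k, n = fit.df_model + 1, fit.nobs
    return fit.aic + (2 * k * (k + 1)) / (n - k - 1)

df = pd.read_csv("essay_scores.csv")   # assumed: 'score' plus eight CAF measures
candidates = ["lex_soph", "lex_div", "fluency", "accuracy",
              "clausal_cx", "phrasal_cx", "mlt", "dc_ratio"]
selected = []

while candidates:
    # AICc of every one-term extension of the current model.
    trials = {
        c: aicc(smf.ols(f"score ~ {' + '.join(selected + [c])}", data=df).fit())
        for c in candidates
    }
    best = min(trials, key=trials.get)
    current = aicc(smf.ols(f"score ~ {' + '.join(selected) or '1'}", data=df).fit())
    if trials[best] >= current:        # stop when no addition lowers AICc
        break
    selected.append(best)
    candidates.remove(best)

print("Selected predictors:", selected)
```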
Citations: 0
Exploring the scoring validity of holistic and dimension-based Comparative Judgements of young learners’ EFL writing
IF 5.5 | Tier 1, Literature | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-10-01 | DOI: 10.1016/j.asw.2025.100986
Rebecca Sickinger, John Pill, Tineke Brunfaut
Comparative Judgement (CJ) is a pairwise comparison evaluation method, typically conducted online. Multiple judges each compare the quality of a series of paired performances and, from their decisions, a rank order is constructed and scores calculated. Research across different educational contexts supports CJ’s reliability for evaluating written performances, permitting more precise scoring of scripts and dimension-focused evaluation. However, scant insights are available about the basis of judges’ evaluations. This issue is important because argument-based approaches to validation (common in the field of language testing and adopted in this study) require evidence to support claims about how scores are appropriate for test purpose. Therefore, we investigate the scoring validity of CJ, both when used holistically (the standard application of CJ) and when evaluating scripts by individual criteria (termed dimensions in the research context). Twenty-seven judges evaluated 300 scripts addressing two writing task types in a national English as a Foreign Language examination for young learners in Austria. Judges reported via questionnaires what they had focused on while judging. Subsequently, eight judges provided think-aloud data while evaluating 157 scripts, offering further insight into the writing features they considered and their decision-making during CJ. Findings showed that while most judges adopted a decision-making process similar to traditional rating methods, some adapted their method to accommodate the nature of CJ evaluation. Furthermore, results indicated that the judges considered construct-relevant criteria when using CJ, both holistically and by dimension, thus offering support to an argument for the appropriateness of using CJ in this context.
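CJ engines typically turn the judges' pairwise decisions into a rank order by fitting a pairwise-comparison model such as Bradley-Terry; the abstract does not name the specific model, so the sketch below is a generic illustration with invented comparison data, not the tool used in the study:

```python
# Minimal sketch: estimating script "quality" parameters from pairwise
# Comparative Judgement decisions with a simple Bradley-Terry MM update.
# The (winner, loser) pairs are hypothetical.
import numpy as np

n_scripts = 4
comparisons = [(0, 1), (1, 2), (2, 0), (3, 0),
               (3, 1), (2, 3), (0, 2), (3, 2)]   # judges' decisions

wins = np.zeros(n_scripts)
for winner, _ in comparisons:
    wins[winner] += 1

strength = np.ones(n_scripts)            # initial quality estimates
for _ in range(200):                     # minorization-maximization iterations
    denom = np.zeros(n_scripts)
    for i, j in comparisons:
        recip = 1.0 / (strength[i] + strength[j])
        denom[i] += recip
        denom[j] += recip
    strength = wins / denom
    strength /= strength.sum()           # normalise for identifiability

ranking = np.argsort(-strength)
print("Estimated quality:", np.round(strength, 3))
print("Rank order (best first):", ranking)
```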
Citations: 0
Which gender provides more specific peer feedback? Gender and assessment training’s effects on peer feedback specificity and intrapersonal factors
IF 5.5 | Tier 1, Literature | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-10-01 | DOI: 10.1016/j.asw.2025.100987
José Carlos G. Ocampo, Ernesto Panadero, David Zamorano, Iván Sánchez-Iglesias
This study investigated the effects of assessor gender (male vs. female), fictitious assessee gender (male vs. female), and assessment training (with vs. without) on peer feedback specificity (i.e. localisation and focus) and intrapersonal factors (i.e. trust in the self as an assessor and discomfort). This study involved 240 undergraduate psychology students (120 men, 120 women), with half receiving assessment training and the other half receiving the task instructions. Participants were divided into eight subgroups based on training condition and their self-reported gender to provide peer feedback to three writing samples (poor, average, excellent quality) by fictitious male or female peer assessees in Eduflow. A total of 3017 peer feedback segments were analysed, revealing that trained or untrained male and female assessors were comparable in most peer feedback specificity categories when assessing fictitious male or female assessees. Nonetheless, we also found that female assessors excelled in certain categories of peer feedback specificity, while male assessors also demonstrated competencies in other categories. Results also showed that assessors who received assessment training provided localised peer feedback in all the writing samples. Finally, gender and training did not affect participants’ trust in their abilities and (dis)comfort when providing peer feedback.
Citations: 0
GenAI and human assessments of L2 Chinese writing: Interrater reliability and rater bias
IF 5.5 | Tier 1, Literature | Q1 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2025-10-01 | DOI: 10.1016/j.asw.2025.100989
Yuan Lu, Xiaoying Liles, Xi Ma
This study examines generative artificial intelligence (GenAI), specifically ChatGPT and DeepSeek, and human assessments of Chinese as a second language (L2) writing, with a focus on interrater reliability, severity, consistency, and potential genre-based biases. Agreement and correlation analyses revealed substantial variability in interrater reliability among human raters, regardless of their rating experience. ChatGPT consistently demonstrated higher agreement with human raters than DeepSeek. The lowest levels of agreement were observed between DeepSeek and human raters as well as between the two GenAI raters. A Many-Facet Rasch Model analysis showed that ChatGPT tended to rate essays more leniently than DeepSeek and closely resembled experienced human raters in terms of severity, but DeepSeek’s severity aligned more closely with that of novice human raters. No significant genre-based biases were identified for GenAI and human raters. The observed differences in GenAI rating performance may likely result from distinctions in their large language models’ training data, computing capacities, model architectures, and functionalities. These findings offer evidence-based practical implications for the integration of GenAI tools in L2 Chinese writing assessment.
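As a minimal illustration of the agreement, correlation, and severity comparisons described (the study itself estimated severity with a Many-Facet Rasch Model; the scores below are invented), a simple rater comparison could look like this:

```python
# Minimal sketch: comparing a GenAI rater with a human rater on the same
# essays: exact/adjacent agreement, Pearson r, and mean score difference
# as a crude severity/leniency index. Scores are hypothetical.
import numpy as np
from scipy.stats import pearsonr

human = np.array([3, 4, 2, 5, 3, 4, 2, 3])
genai = np.array([3, 4, 3, 5, 4, 4, 3, 3])

exact = np.mean(human == genai)                  # identical scores
adjacent = np.mean(np.abs(human - genai) <= 1)   # within one score band
r, p = pearsonr(human, genai)
leniency = np.mean(genai - human)                # > 0 means GenAI rates higher

print(f"exact={exact:.2f}, adjacent={adjacent:.2f}, r={r:.2f}, leniency={leniency:+.2f}")
```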
Citations: 0