
Applied Corpus Linguistics: Latest Articles

Identifying ChatGPT-generated texts in EFL students’ writing: Through comparative analysis of linguistic fingerprints
Pub Date : 2024-09-26 DOI: 10.1016/j.acorp.2024.100106
The emergence of generative AI (GenAI) poses new challenges for L2 writing teachers. This study investigates the distinguishability of essays written by Japanese EFL learners from those generated by ChatGPT. In a partial replication of Herbold et al. (2023), 140 first-year university students wrote essays and completed a survey on ChatGPT use. Among them, 125 wrote independently, 13 used ChatGPT for proofreading, and two asked ChatGPT to write the entire essay. To create a comparative dataset, 123 additional essays were generated by ChatGPT, imitating the two texts. The resulting 263 essays were then analyzed using natural language processing (NLP) techniques, including automated linguistic analysis and machine learning classification with a random forest. The results reveal significant differences between human-written and ChatGPT-generated essays across all linguistic features, with the latter being easily identifiable. This study emphasizes the need for clear guidelines on the ethical use of AI in L2 writing, highlighting the potential risk of inappropriate AI use and the importance of fostering a shared understanding with learners of responsible AI integration in academic work.
Citations: 0
English podcasts for schoolchildren and their vocabulary demands
Pub Date : 2024-09-20 DOI: 10.1016/j.acorp.2024.100107
This exploratory study examines the vocabulary demands of English children's podcasts. A 359,153-word podcast corpus was created using the written transcripts of episodes from these popular children's podcasts: But Why, Circle Round, KidNuz, Smash Boom Best, and Wow in the World. The corpus was analyzed to determine the vocabulary size necessary to know 95% and 98% of the words in the English children's podcasts. The results showed that a vocabulary size of the most frequent 4,000 word families plus knowledge of proper nouns (PN), marginal words (MW), transparent compounds (TC) and acronyms (AC) provided 95.69% coverage of the children's podcast corpus, and a vocabulary size of 7,000 word families plus PN, MW, TC and AC reached 98.10% coverage, indicating that podcasts designed for children require a larger vocabulary size compared to general-audience podcasts designed for adults.
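Coverage figures of this kind come from a cumulative count: each running word in the corpus is checked against frequency-ranked word-family bands, and coverage is the share of tokens accounted for so far. A minimal Python sketch of that arithmetic follows. The function names, the representation of each band as a set of word forms, and the toy data are assumptions made for illustration; the study's actual word lists and software are not described in the abstract.

```python
from collections import Counter


def coverage_by_band(tokens, bands, supplementary=frozenset()):
    """Return cumulative coverage (%) of `tokens` after each frequency band.

    tokens:        list of lowercased word tokens from the corpus.
    bands:         list of disjoint sets; bands[0] holds word forms from the
                   most frequent 1,000 word families, bands[1] the next
                   1,000, and so on (hypothetical input format).
    supplementary: word forms treated as known from the start, e.g. proper
                   nouns, marginal words, transparent compounds, acronyms.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    # Tokens covered by the supplementary lists count before any band.
    known = sum(n for w, n in counts.items() if w in supplementary)
    cumulative = []
    for band in bands:
        known += sum(
            n for w, n in counts.items()
            if w in band and w not in supplementary
        )
        cumulative.append(100.0 * known / total)
    return cumulative


def bands_needed(cumulative, threshold):
    """Return the number of bands needed to reach `threshold` %, or None."""
    for i, pct in enumerate(cumulative, start=1):
        if pct >= threshold:
            return i
    return None


if __name__ == "__main__":
    # Toy corpus of 10 tokens; "zzz" is in no list, so coverage stays below 100%.
    toy = ["the", "cat", "sat", "on", "the", "mat", "nasa", "the", "cat", "zzz"]
    toy_bands = [{"the", "on"}, {"cat", "sat"}, {"mat"}]
    cum = coverage_by_band(toy, toy_bands, supplementary={"nasa"})
    print(cum, bands_needed(cum, 80.0))
```

With real data, each band would hold all member forms of 1,000 word families, and the 95% and 98% thresholds would be passed to `bands_needed` in place of the toy value.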
Citations: 0
Investigating spoken classroom interactions in linguistically heterogeneous learning groups – An interdisciplinary approach to process video-based data in second language acquisition classrooms
Pub Date : 2024-09-15 DOI: 10.1016/j.acorp.2024.100104
Speaking the local language is central to successful integration into society. The teacher's language in second language (L2) classrooms serves as a crucial tool in language learning. Heterogeneity of learners’ language proficiency levels challenges teachers to adapt their language and accompanying instructional behavior. We offer an approach to study language acquisition processes and how teachers adapt their instructional language. This article presents our language-independent guidelines for processing video-based data of classroom interactions and demonstrates their reliability in a German as Second Language (GSL) classroom. These guidelines enable transcriptions of spoken language in noisy environments and detailed annotations of non-verbal classroom behavior. We outline research avenues at the intersection of empirical education research and linguistics that become feasible through these resources, focusing on (non-)verbal adaptation strategies of teachers for learners at different proficiency levels. Our work directly fosters the interdisciplinary study of teacher-learner interactions, teacher competencies, and language acquisition.
Citations: 0
Capturing chronological variation in L2 speech through lexical measurements and regression analysis
Pub Date : 2024-09-15 DOI: 10.1016/j.acorp.2024.100105

This study aims to bridge gaps in current research by analyzing a longitudinal spoken learner corpus of low-proficiency English learners. We investigated the chronological variation in lexical measurements in second language (L2) speaking production, focusing on data from 104 low-proficiency learners elicited eight times over 23 months. Our findings show that measures such as the number of different words and type-token ratio are effective indicators of L2 speaking development, whereas the use of sophisticated vocabulary was not significantly correlated with learning duration. These results suggest that in the early stages of L2 acquisition, speaking skills are influenced primarily by lexical variation. This finding underscores the importance of lexical variation as a key factor in novice-level L2 speaking proficiency.
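The two lexical measures named above are simple counts, and tracking them over repeated elicitations amounts to fitting a trend line. The Python sketch below shows one way this could look. Whitespace tokenization, the function names, and the use of an ordinary least-squares slope are assumptions for illustration only; the abstract does not specify the study's tokenization or regression model.

```python
def lexical_measures(tokens):
    """Return token count, number of different words, and type-token ratio."""
    n_tokens = len(tokens)
    n_types = len(set(tokens))
    ttr = n_types / n_tokens if n_tokens else 0.0
    return {"tokens": n_tokens, "types": n_types, "ttr": ttr}


def ols_slope(x, y):
    """Least-squares slope of y on x, e.g. a lexical measure against session number."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x has zero variance; slope is undefined")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical learner utterance, tokenized on whitespace.
    sample = "i like dogs and i like cats".split()
    print(lexical_measures(sample))
    # Hypothetical counts of different words across four sessions.
    print(ols_slope([1, 2, 3, 4], [2, 4, 6, 8]))
```

Because the type-token ratio falls as texts get longer, comparisons across sessions are usually only meaningful when the samples are of similar length.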

Citations: 0
FreeTxt: A corpus-based bilingual free-text survey and questionnaire data analysis toolkit
Pub Date : 2024-08-23 DOI: 10.1016/j.acorp.2024.100103

Qualitative free-text responses (e.g. from questionnaires and surveys) pose a challenge to many companies and institutions which lack the expertise to analyse such data with ease. While a range of sophisticated tools for the analysis of text do exist, these are often expensive, difficult to use and/or inaccessible to non-expert users. These tools also lack support for the analysis of English and Welsh text, which can be a particular challenge in the bilingual context of Wales. This paper details the key functionalities of the first corpus-based ‘FreeTxt’ toolkit which has been designed to support the systematic analysis and visualisation of free-text data, as a direct response to these two key needs. This paper demonstrates how, by working in partnership, software engineers, natural language processing (NLP) experts and corpus linguists can collaborate with end-users and beneficiaries to provide effective solutions to real world problems. Through the development of FreeTxt (www.freetxt.app), we aimed to empower end-users to direct and lead their own analyses of both small-scale and more extensive datasets to maximise the reach and potential impact generated. The approaches reported here, and the bilingual toolkit developed, can be replicated and extended for use in other language contexts and across a range of public and professional sectors. FreeTxt is now available for the analysis of Welsh and/or English, for use by anyone in any sector in Wales and beyond.

Citations: 0
How is L2 pair interaction related to fluency and language use? A quantitative approach
Pub Date : 2024-08-10 DOI: 10.1016/j.acorp.2024.100102

Previous research examined L2 interaction by describing salient features exhibited in different patterns of peer interaction. These studies mostly used qualitative methods and focused on the collaborative aspect of this construct (Galaczi, 2008). The present study adopts a quantitative approach to explore and describe L2 interaction, utilizing the data of the Corpus of Collaborative Oral Tasks (CCOT). Specifically, it measures pairs’ interaction by creating a composite score of interactivity to understand the relationship between the dyads' degree of interactivity and their use of lexico-grammatical features as well as their L2 fluency. Pearson's correlation tests showed weak to moderate positive relationships between interactivity and discourse particles, response forms, wh-questions, and second person pronouns. Additionally, the tests revealed weak negative relationships between interactivity and both nominal forms and hesitations. Pearson's correlation tests also showed moderate relationships between interactivity and more fluent L2 speech: learners with higher interactivity levels tended to produce fewer silent pauses and faster speech rates. The study provides insights for scholars interested in L2 interaction. It suggests that some linguistic features were associated not only with collaborative behaviors (as reported in the literature) but also with interactivity as a broad aspect. Furthermore, it provides a description of how the act of turn taking might serve the fluency of higher-interactivity students, warranting further investigation of turn frequency among L2 test takers, as test raters might be influenced by the test candidates’ fluency. Finally, it reports that L2 interactivity exhibited a relationship pattern with linguistic features that resembles patterns reported in the literature of studies on native speakers of English.

Citations: 0
Mitigation in instructor feedback: A register analysis of written and spoken comments
Pub Date : 2024-07-25 DOI: 10.1016/j.acorp.2024.100101

Register is among the most important predictors of linguistic variation. In a register such as instructor feedback, linguistic features have particularly high stakes, as they can make feedback clearer, more detailed, and/or (de)motivating. Mitigation strategies (i.e., the use of hedges and other softeners) are frequently found in instructor feedback and are particularly influential in terms of the feedback's effectiveness. This study compares the patterns of mitigation strategies used in written and spoken feedback to gain insights into register variation. Written comments (provided electronically) and spoken comments (provided through screencast feedback, in which instructors share verbal feedback along with a screenshare of the student's essay) in the Writing Feedback Corpus (WFC) were analyzed. A total of 1,568 comments across these registers were manually coded for mitigation within head acts (core speech acts) and external modification in the surrounding discourse. Strategies were compared quantitatively using key feature analysis (Egbert & Biber, 2023). The findings indicate that feedback registers promote the use of different mitigation strategies and external modification strategies, with written feedback favoring interrogative syntax and unmitigated forms, and spoken feedback favoring personal attribution, hedges, and the nursery "we", as well as the external modifiers minimizer, positive comment, and reason. Implications for providing feedback on student writing are highlighted.

Citations: 0
Book review of Designing and Evaluating Language Corpora
Pub Date : 2024-07-18 DOI: 10.1016/j.acorp.2024.100100
Citations: 0
How do I know this Law corpus is reliable and valid? Using a representativeness argument for corpus validation
Pub Date : 2024-07-07 DOI: 10.1016/j.acorp.2024.100099
Jenny Kemp

Corpus findings are only useful if the corpus adequately represents the content and language of the target domain; yet few studies evaluate or report representativeness. This paper argues that corpus linguists should focus explicitly on the validation process. It introduces the innovative concept of a Representativeness Argument, which is an explicit statement of reliability and validity to enable defensible applications of a corpus for a specifically defined purpose and audience. Adapted from Toulmin's (1958/2003) argument model, its originality lies in its attention to both target domain and linguistic representativeness, and in the critical role played by expert judgements. To illustrate this approach, I present a representativeness argument for the 1.98-million-word ‘DSVC-IL’ corpus, which was compiled to investigate the discipline-specific vocabulary required for reading postgraduate International Law texts. The corpus is demonstrated to adequately represent target domain content, established by analysing modules and reading lists, and confirmed by experts. The language is shown to adequately reflect the domain through analysis of a 1026-flemma Single Word List, extracted using measures of frequency, keyness, range and evenness of distribution. List items are evenly distributed in randomly split corpus halves (rs = .98, p < .00). The list provides similar coverage of the DSVC-IL (26.37%) and other texts from the domain (23.87%). Moreover, Law experts confirmed that the majority of list items were Law words. Together, the evidence supports the usefulness of the corpus and list for their explicitly defined purpose.
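The split-half check reported above can be sketched in Python as follows: texts are randomly divided into two halves, each list item is counted in both halves, and the two sets of counts are compared with a rank correlation. Reading "rs" as Spearman's rank correlation is an assumption here, as are the function names, the seed parameter, and the toy texts; the paper's own procedure and software are not given in the abstract.

```python
import random
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's r; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def split_half_counts(texts, items, seed=0):
    """Randomly split `texts` (lists of tokens) in two; count `items` in each half."""
    indices = list(range(len(texts)))
    random.Random(seed).shuffle(indices)
    middle = len(indices) // 2
    halves = []
    for part in (indices[:middle], indices[middle:]):
        counts = Counter(tok for i in part for tok in texts[i])
        halves.append([counts[item] for item in items])
    return halves[0], halves[1]


if __name__ == "__main__":
    # Hypothetical tokenized texts and word-list items.
    toy_texts = [["treaty", "state"], ["state", "court"],
                 ["treaty", "treaty"], ["court", "state"]]
    half_a, half_b = split_half_counts(toy_texts, ["treaty", "state", "court"])
    print(half_a, half_b)
```

A value of `spearman(half_a, half_b)` close to 1 would indicate that list items keep a similar frequency ordering in both halves, which is the sense of "evenly distributed" used in the abstract.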

Citations: 0
Toward a tool for evaluating corpus-based word lists for use in English language teaching contexts
Pub Date : 2024-06-22 DOI: 10.1016/j.acorp.2024.100098
Sarah Alzeer, Paul Thompson

With the proliferation of large corpora and the availability of sophisticated corpus-analysis tools, the number of corpus-based word lists targeting different types of vocabulary has rapidly increased during the last 20 years. This wide variety of lists has caused problems for practitioners, for whom it is not always easy to decide which list is most useful for their purpose and context. Given the paucity of systematic guidance on how to evaluate word lists, this study aimed to construct an evaluation tool that is based on Nation's (2016) framework of critiquing word lists, but is reformulated for a different purpose and for different target users, in order to increase the applicability of information derived from corpus analysis (the word lists). Constructed based on a thorough literature review, and informed by practitioners’ views and uses of word lists, along with consultations with ELT practitioners and word list experts, the tool targets ELT practitioners such as teachers, curriculum and assessment coordinators, and materials developers involved in directing vocabulary acquisition. The tool caters to practitioners with different levels of expertise and knowledge—especially those who are unfamiliar with the intricacies of developing corpus-based word lists. This paper documents the development of the initial version of the evaluation tool, as well as its first iteration, drawing upon the insights of both word list experts and practitioners in ELT.

Citations: 0