Applied Corpus Linguistics: Latest Articles (Page 2)

Constructing China’s national image through political discourse: A corpus-based diachronic analysis of government work reports (2001–2025)

IF 2.1

Applied Corpus Linguistics

Pub Date : 2025-12-11 DOI: 10.1016/j.acorp.2025.100179

Liai Ma, Peter Crosthwaite

National image is one of the elements of a country’s soft power, and China’s Government Work Reports (GWRs) serve a critical function in shaping this national image, as the state constructs and communicates its political and economic narrative to both domestic and international audiences. This study addresses gaps in previous research by combining quantitative and qualitative approaches to the analysis of China’s national image, focusing on self-representation and the other-perspective. Specifically, it examines the English editions of 25 GWRs (2001–2025) using Corpus-Assisted Discourse Studies (CADS), tracing keywords and their collocates over time and interpreting these patterns within the context of national image construction. Findings reveal a clear “global integration–domestic stabilization–global engagement” trajectory. During the global integration phase (2001–2010), terms including “World Trade Organization”, “opening up”, and “rapid growth” dominate, underscoring China’s integration into the global economy. The domestic stabilization phase (2011–2015) foregrounds “structural adjustment”, narrowing “the rural–urban gap”, and “social harmony”, reflecting China’s efforts to manage internal social imbalances while maintaining stability. In the global engagement phase (2016–2025), phrases including “high-quality development”, “Belt and Road Initiative”, and “Chinese Path” signal China’s transformation from rule-taker to solution provider. Overall, China’s national image in its GWRs has shifted from that of a newcomer focused on speed to that of a responsible leader setting global standards. The study offers a model case of applying CADS to the GWRs and provides a comprehensive account of how China’s national image has been constructed and repositioned in international communication.

Volume 6, Issue 1, Article 100179

Citations: 0

Upgraded literacy: teacher training approaches to integrating corpus data and AI tools for school text readability adaptation

IF 2.1

Applied Corpus Linguistics

Pub Date : 2025-12-09 DOI: 10.1016/j.acorp.2025.100181

Madalina Chitez, Karla Csürös, Roxana Rogobete

This study examines how an inductive learning approach can foster e-literacy, defined as the ability to critically and effectively use digital and AI tools to support literacy. It presents the outcomes of a teacher training program carried out in Romania within a national professional development initiative, involving 56 in-service teachers across primary, lower secondary, and upper secondary levels. The training combined theoretical input with hands-on activities, introducing participants to corpus-based readability analysis and AI platforms for text adaptation. Teachers worked with tools such as LEMI, Text Inspector, ARTE, ChatGPT, and Perplexity. The corpus-based linguistic analysis indicates that teachers most often addressed challenges of vocabulary complexity and cognitive load. Participants used readability and AI tools to simplify syntactic structures, reformulate dense passages, and adapt discourse to students’ linguistic proficiency levels. Reflections further indicated that teachers came to view literacy-related digital and AI platforms in complementary roles: as simplifiers that made texts more accessible, as co-designers that supported creativity in instructional planning, and as validators that confirmed their professional judgment. The training strengthened the teachers’ digital pedagogical awareness and metalinguistic insight, positioning e-literacy as a key competence within an updated literacy paradigm capable of supporting inclusive, level-appropriate education.

Volume 6, Issue 1, Article 100181

Citations: 0

Topic-Specific corpus compilation: A componential approach to query formulation

IF 2.1

Applied Corpus Linguistics

Pub Date : 2025-12-09 DOI: 10.1016/j.acorp.2025.100180

Daniel Malone

This paper presents a methodological approach to topic-specific corpus compilation when retrieving texts from databases such as news archives or document repositories. When search terms exhibit flexible or context-dependent meanings, a high proportion of returned texts may be unrelated to the intended target concept, increasing processing workload and risking distortion of corpus-analysis results. In addressing this issue, the present paper proposes an approach to query formulation grounded in a componential analysis of the target concept’s meaning which identifies its key semantic attributes. These attributes are operationalised in a complex two-part query, referred to herein as the Dual-Group Query (DGQ). Each query group realises a defining semantic attribute, ensuring that retrieved texts express both components of the target concept. To enable systematic query expansion, the Relative Query Term Relevance method (Gabrielatos, 2007) is procedurally adapted to the DGQ model to evaluate candidate-term relevance prior to inclusion. Evaluation results show that, when applied to the Lone Wolf Corpus, a corpus of British press reporting on lone-actor/lone-wolf terrorism, the approach significantly improved retrieval efficiency (i.e., precision and recall) compared with two non-complex queries. More broadly, the proposed approach offers a replicable framework for corpus compilation in studies concerned with domain-specific topics, fine-grained concepts, or distinct sense relations of particular words.
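The two-group logic described in this abstract can be illustrated with a minimal Python sketch. It assumes that terms are combined with OR inside each group and with AND across the two groups, which is one plausible reading of the abstract rather than the paper's documented query syntax. The term lists, the sample texts, and the function names (`build_query`, `matches`, `precision_recall`) are hypothetical and are not taken from the paper.

```python
from typing import Iterable

# Hypothetical term groups; the abstract does not list the actual query terms.
GROUP_A = ["lone wolf", "lone actor"]        # assumed: the "acting alone" attribute
GROUP_B = ["terror", "attack", "extremis"]   # assumed: the "terrorism" attribute


def build_query(group_a: Iterable[str], group_b: Iterable[str]) -> str:
    """Join terms with OR inside each group and AND between the groups."""
    part_a = " OR ".join(f'"{t}"' for t in group_a)
    part_b = " OR ".join(f'"{t}"' for t in group_b)
    return f"({part_a}) AND ({part_b})"


def matches(text: str, group_a: Iterable[str], group_b: Iterable[str]) -> bool:
    """True when the text contains at least one term from each group."""
    lowered = text.lower()
    return any(t in lowered for t in group_a) and any(t in lowered for t in group_b)


def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy documents (invented for illustration); only document 0 is on topic.
DOCS = [
    "Police described the attacker as a lone wolf terrorist.",
    "A lone wolf was spotted near the village.",
    "The terror attack was planned by a cell.",
]
RELEVANT = {0}

dual = {i for i, d in enumerate(DOCS) if matches(d, GROUP_A, GROUP_B)}
simple = {i for i, d in enumerate(DOCS) if "lone wolf" in d.lower()}

print(build_query(GROUP_A, GROUP_B))
print("dual-group:", precision_recall(dual, RELEVANT))
print("single-term:", precision_recall(simple, RELEVANT))
```

On these toy texts the single-term query also returns the wildlife sentence, so its precision drops, while the two-group filter keeps only the text that expresses both attributes. Real archives would use the database's own query syntax and proper tokenisation rather than substring matching.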

Volume 6, Issue 1, Article 100180

Citations: 0

Designing, compiling and profiling the Corpus of Arts and Humanities Academic Texts (CAHAT): A new resource for English for Specific Academic Purposes (ESAP)

IF 2.1

Applied Corpus Linguistics

Pub Date : 2025-12-07 DOI: 10.1016/j.acorp.2025.100178

James O’Flynn

English for Academic Purposes (EAP) is broadly concerned with the use of English to perform academic tasks. Many corpus studies, though, have shown that the language used to perform academic tasks varies widely across the disciplines. Accordingly, EAP has come to be viewed as a continuum, with English for General Academic Purposes (EGAP) at one end and English for Specific Academic Purposes (ESAP) at the other. There is ever-growing interest in disciplinary corpus research in EAP, or ESAP research, but corpora to support it remain limited in availability and/or size. This paper therefore describes the development of a large and available ESAP corpus and then introduces it as the Corpus of Arts and Humanities Academic Texts (CAHAT). This c.25-million token (word) corpus of 288 PhD theses collected from three UK universities is organised into six disciplinary subcorpora and enriched with detailed text-external and text-internal metadata. The CAHAT is available on request for non-commercial ESAP research and pedagogy. The full corpus is available through Sketch Engine, while a miniature version of the corpus (c. 6.5 million tokens) is available via a free, purpose-made, user-friendly concordancing tool. The paper concludes by proposing potential applications of the CAHAT in ESAP research and pedagogy.

Volume 6, Issue 1, Article 100178

Citations: 0

CorGeS: The corpus of German suicide notes

IF 2.1

Applied Corpus Linguistics

Pub Date : 2025-12-06 DOI: 10.1016/j.acorp.2025.100177

Dana Roemling, Lucia Busso

This paper introduces CorGeS, a historic corpus of authentic German suicide notes written between the 1910s and 1930s. Originally compiled and transcribed by a police officer, the corpus offers a rare and valuable resource for both linguistic and historical inquiry. We describe the provenance and structure of the corpus, as well as the methodological and ethical considerations involved in working with such sensitive material. While suicide note analysis is well established in English-language research, German-language material remains understudied, making CorGeS an important contribution to multilingual and cross-cultural perspectives in suicide note analysis. To illustrate the potential of the corpus, we present a preliminary topic modelling analysis, highlighting key thematic patterns in the texts, before using corpus methods to explore the most prevalent item in the corpus in more detail. These early results demonstrate the diversity and emotional complexity of the notes and suggest several avenues for further research at the intersection of linguistics, history, and suicide note analysis.

Citations: 0

Enhancing learners’ academic writing skills: a comparative analysis of traditional and AI-assisted instruction approaches

IF 2.1

Applied Corpus Linguistics

Pub Date : 2025-12-04 DOI: 10.1016/j.acorp.2025.100176

Kholida Begmatova, Iroda Saydazimova

The study aimed to explore the impact of two instructional approaches, traditional and AI-assisted, in teaching academic writing to year-1 EAP students in an EMI university in Uzbekistan. Control group students (n = 75) learned to write a literature review within a traditional approach, developing a matrix of sources to organize and synthesize findings and receiving lecturers’ constructive feedback. An inductive approach was adopted in the instruction with the treatment group (n = 78), where students wrote their papers integrating AI chatbots at several stages, from narrowing the topic scope to responding to lecturers’ language-related instructive feedback on drafts, using AI tools to introduce corrections independently. A Learner Corpus comprised two subsets of texts produced in two instructional approaches, with two entries per subset, totaling 306 literature reviews. These texts were used for data analysis, which was performed with Coh-Metrix. This analysis involved the evaluation of linguistic properties and the overall writing quality of student papers produced within the two instructional methods. Also, thematic linguistic analysis was conducted to evaluate the academic features of the texts (n = 20). Our findings revealed that students demonstrated comparable readability levels, syntactic/grammatical range, sophistication levels, and cohesion across ideas in their writing. The papers produced within an AI-assisted approach had a higher semantic complexity, referential cohesion, and lexical diversity. Thematic linguistic analysis revealed three key areas in the academic features of students’ papers, including the use of referencing conventions, integration of cohesive devices, and demonstration of argumentation and critical analysis.

Volume 6, Issue 1, Article 100176

Citations: 0

Register alignment of ChatGPT-generated academic texts

IF 2.1

Applied Corpus Linguistics

Pub Date : 2025-12-02 DOI: 10.1016/j.acorp.2025.100174

Nur Yağmur Demir, Jesse Egbert

The rise of Artificial Intelligence (AI) tools such as ChatGPT has transformed language pedagogy and assessment. Despite their growing use in academic contexts—from classroom materials to standardized testing—questions remain about the register appropriateness of the texts they produce.

The humanlikeness of AI language must be defined not only by fluency or coherence, but by register appropriateness—functional language use that aligns with the situational characteristics of registers. This study investigates whether ChatGPT-generated academic texts mimic human-authored writing in two academic genres (journal articles and textbooks) across two disciplines (biology and history).

Using multi-dimensional analysis, we analyzed 200 texts (100 AI-generated and 100 human-authored) along three linguistic dimensions: (1) specialized information density vs. non-technical synthesis, (2) definition/evaluation of new concepts, and (3) author-centered stance. Our results reveal a mixed picture: while ChatGPT exhibits moderate success in mimicking register distinctions found in journal article registers, its performance is notably less aligned with textbooks. ChatGPT-generated textbook excerpts in biology, for instance, often resemble the dense, technical style of journal articles, as a result failing to match the simplified, pedagogically oriented discourse found in human-authored textbooks.

Our findings indicate that while ChatGPT can largely reproduce human-like register patterns in journal article writing, it struggles to achieve the same in textbook contexts, particularly within biology. Overall, the results suggest that ChatGPT-generated texts often lack sufficient functional appropriateness. We therefore recommend further quantitative linguistic analyses of AI-generated language and urge caution when using ChatGPT for content creation.

Volume 6, Issue 1, Article 100174

Citations: 0

Studying webcare across industries: from customer care to relationship management

IF 2.1

Applied Corpus Linguistics

Pub Date : 2025-12-02 DOI: 10.1016/j.acorp.2025.100175

Ursula Lutzky

Businesses regularly engage in digital business communication by addressing stakeholders and responding to their feedback online. While previous research has explored these interactions from diverse perspectives, industry-specific differences have not been studied extensively to date. This article addresses this gap in research by studying the digital interactions of US companies from three different industries (airlines, food and beverage, streaming services) to uncover their approach to communicating with stakeholders online, also known as webcare. The study is based on the US Corporate Twitter Corpus, which includes 4.4m English tweets posted by and addressed to US companies, such as American Airlines, Burger King and HBO, between September 2021 and February 2023. By carrying out a keyword analysis, it investigates differences in the digital communication of the industries studied and links them to the organizational goals of webcare, including customer care, marketing, and reputation and relationship management. The findings of the corpus linguistic analysis show that the three industries engage in webcare to different ends. While airlines have a clear focus on customer care, streaming services and food and beverage use webcare primarily for marketing and relationship management purposes, highlighting the role of user engagement in online interactions. These findings underline the importance of taking the industry into account when engaging in webcare research and interpreting its results, which may not be generalizable across industries. At the same time, they give insight into industry-specific practice by revealing differences in the organizational strategy and goal pursued when interacting with stakeholders online.
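For readers unfamiliar with keyword analysis, the following is a minimal Python sketch of how words in one subcorpus can be ranked against another. It assumes log-likelihood as the keyness statistic, which is a common choice in corpus linguistics; the abstract does not state which statistic or software the study used. The word counts and the names `log_likelihood` and `keywords` are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def log_likelihood(freq_target: int, freq_ref: int, size_target: int, size_ref: int) -> float:
    """Two-corpus log-likelihood (G2) for a single word."""
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    # Expected frequencies if the word were spread evenly over both corpora.
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    g2 = 0.0
    if freq_target > 0:
        g2 += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        g2 += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * g2


def keywords(target: Counter, reference: Counter, top_n: int = 3) -> list:
    """Rank words that are relatively more frequent in the target corpus."""
    size_target = sum(target.values())
    size_ref = sum(reference.values())
    scored = []
    for word, freq in target.items():
        ref_freq = reference.get(word, 0)
        # Keep only words over-represented in the target corpus.
        if freq / size_target > ref_freq / size_ref:
            scored.append((word, log_likelihood(freq, ref_freq, size_target, size_ref)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical toy counts, not data from the study.
airline = Counter({"flight": 40, "sorry": 30, "the": 130})
streaming = Counter({"watch": 45, "sorry": 5, "the": 150})
print(keywords(airline, streaming))
```

A word with equal relative frequency in both corpora scores zero, so function words such as "the" tend to drop out while industry-specific vocabulary rises to the top. Studies typically also apply a frequency threshold and an effect-size measure, which this sketch omits.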

Volume 6 (1), Article 100175

Citations: 0

iThink, therefore iCheck: Critical engagement with ChatGPT in linguistic analysis and learning

IF 2.1

Applied Corpus Linguistics

Pub Date : 2025-12-01 DOI: 10.1016/j.acorp.2025.100169

Pierfranca Forchini, Amanda C. Murphy

This study explores the integration of the free version of ChatGPT 4.0 into a graduate class, focusing on the tool’s ability to perform linguistic analysis and on students’ engagement with it through inductive learning. To address concerns about students’ uncritical use of ChatGPT, the research compares textual analyses of dialogs from two movies (drawn from the American Movie Corpus, https://americanmoviecorpus.net) carried out by the instructors and by ChatGPT. Adopting a quasi-experimental design, it examines how two groups of graduate students of English – one trained in linguistic analysis and the multidimensional analysis (MDA) framework, the other untrained – interacted with ChatGPT’s analyses and evaluated both the tool and the learning experience.

Both instructors and student groups used structured prompts to generate general textual and MDA-based analyses via ChatGPT. The instructors’ outputs (produced and collected at the same time as the students’ outputs) were analyzed to assess the tool’s performance, while the students’ reflections on the experiment served to evaluate the impact of prior training on their critical engagement.

The findings reveal that ChatGPT’s ability to perform both general and MDA-based analyses was limited, often inconsistent and inaccurate. Students with prior MDA training showed stronger data literacy and more critical engagement with the tool, while untrained students exhibited overreliance and misconceptions regarding ChatGPT’s capabilities. These results highlight the need for targeted instruction to foster analytical skills and reduce uncritical AI use.

This study contributes to ongoing debates on AI in education by underscoring the value of instructor guidance and structured training. It supports a pedagogical approach where AI is critically integrated into academic settings, encouraging informed and responsible student engagement.

Volume 5 (3), Article 100169

Citations: 0

ChatGPT for President! Presupposed content in politicians versus GPT-generated texts

IF 2.1

Applied Corpus Linguistics

Pub Date : 2025-12-01 DOI: 10.1016/j.acorp.2025.100156

Davide Garassino, Nicola Brocca, Viviana Masia

This study examines GPT-4’s capability to replicate linguistic strategies used in political discourse, focusing on its potential for manipulative language generation. As Large Language Models (LLMs) become increasingly popular for text generation, concerns have grown regarding their role in spreading fake news and propaganda. This research compares French and Italian political speeches with those generated by GPT-4, with an emphasis on presuppositions – a rhetorical device that may subtly influence audiences by packaging some content as already known at the moment of utterance. Through a corpus-based pragmatic analysis, this study assesses how well GPT-4 can mimic this persuasive strategy. Our findings show that, despite some apparent similarities, a closer look reveals important differences in the distribution and function of presuppositions in GPT-generated texts compared to politicians’ speeches. For example, GPT-generated texts often rely on change-of-state verbs used in fixed phrases, whereas politicians use presupposition triggers in more varied ways. Such differences, however, are challenging to detect with the ‘naked eye,’ and this represents a potential risk of LLMs in political and public discourse.

Volume 5 (3), Article 100156

Citations: 0