
Applied Corpus Linguistics: Latest Articles

“Too young to love”: A corpus-assisted critical discourse analysis of adolescent romance on Chinese social media
IF 2.1 Pub Date : 2025-12-01 DOI: 10.1016/j.acorp.2025.100167
Xurui Ling, Xingbing Liu
Adolescent romance in China navigates a complex sociocultural landscape, marked by a discursive tension between evolving developmental understanding and deeply entrenched anxieties. This study employs a corpus-assisted critical discourse analysis of 1,740 WeChat articles and Sina Weibo posts (2020–2024) to investigate contemporary representations of adolescent romance. Findings reveal a ‘negotiated space’ where the increasing normalization of adolescent romance directly contends with historical apprehensions. A notable shift is observed from 20th-century criticism-dominated narratives towards a contemporary discourse featuring emergent supportive voices that reframe romance as integral to development. While critical views persist, their focus has transformed from broad moral condemnations to specific concerns about academic impact and adolescents’ psycho-emotional maturity. These insights are valuable for educators, policymakers, and researchers aiming to understand contemporary China’s social dynamics and youth development, offering a perspective on how digital platforms shape and reflect societal attitudes toward adolescent romance.
Journal: Applied Corpus Linguistics, Volume 5, Issue 3, Article 100167
Citations: 0
Corpora and AI for inductive learning: Theory and practice
IF 2.1 Pub Date : 2025-12-01 DOI: 10.1016/j.acorp.2025.100165
Eniko Csomay, Reka R. Jablonkai, Hui Sun
In this ACORP special issue, contributors explore the rapidly evolving intersection of corpus linguistics and generative artificial intelligence (GenAI) in language learning, teaching, and research. As GenAI tools gain prominence alongside established corpus-based approaches, new pedagogical opportunities and challenges emerge for learners, teachers, and researchers alike. The articles in this issue collectively examine how corpora and GenAI can be integrated to enhance language analysis, genre awareness, writing development, and instructional design. Together, they offer critical insights into how these complementary technologies can inform DDL, promote critical digital literacies, and reshape the future of language education and applied linguistics research.
Journal: Applied Corpus Linguistics, Volume 5, Issue 3, Article 100165
Citations: 0
On corpus linguistics, the search for meaning, and a transversal–pluriversal turn in celebrating learner languaging
IF 2.1 Pub Date : 2025-11-26 DOI: 10.1016/j.acorp.2025.100172
Meng Huat Chau
This article revisits key insights from corpus linguistics, such as units of meaning and pattern grammar, in dialogue with cognitive linguistic understandings of form–meaning pairings, showing how meaning arises from patterned, contextualized, and emergent use rather than from isolated words. Foregrounding the language learner as a fully legitimate meaning maker alongside expert and other language users, it advances communicative meaningfulness as an ecological model grounded in relational resonance rather than formal accuracy or communicative effectiveness. Drawing on longitudinal corpus evidence from school students, the article demonstrates how learners rework patterned resources to express stance, negotiate values, and enact situated identities, revealing their languaging as meaning-in-motion. It further articulates a transversal–pluriversal turn in applied linguistics: transversal in its crossings of disciplinary, cultural, and linguistic boundaries; pluriversal in its affirmation of diverse epistemologies and ways of knowing. The article concludes that learners’ meaning making contributes to a living relational ecology of communication, positioning the study of corpora, learner languaging, and language as a whole as co-created, evolving, and interrelated resources. Such an orientation not only guides more inclusive, humane, and epistemically diverse practices in corpus linguistics and applied linguistics; it also, importantly, deepens and expands our shared human capacity for understanding and connection.
Journal: Applied Corpus Linguistics, Volume 6, Issue 1, Article 100172
Citations: 0
Leveraging large language models to supplement corpus-based inductive learning of Chinese as a second language
IF 2.1 Pub Date : 2025-11-20 DOI: 10.1016/j.acorp.2025.100170
Tiffany Tsz-Yin Pang
Corpus tools have proven effective for supporting inductive language learning by enabling learners to observe multiple examples, form hypotheses, and verify the hypotheses based on additional examples. However, when applied to Chinese as a Second Language (CSL), these tools encounter limitations that disrupt the observe-hypothesize-verify process. Sketch Engine, for example, misanalyzes Chinese word boundaries, topicalized objects, and ba-constructions, and provides inaccurate observational data that undermines the effectiveness of inductive learning. This paper proposes integrating Large Language Models (LLMs) with corpus tools to address the limitations. Using Sketch Engine and Claude Opus 4 as exemplars, I demonstrate how LLMs serve three pedagogical functions: (1) error detection to identify misanalyzed features in corpus outputs, (2) guided pattern discovery to help learners recognize linguistic regularities across examples, and (3) hypothesis verification to confirm/refine learners’ observations. Through analysis of specific Chinese features, I show how LLM integration maintains the discovery processes while ensuring accurate linguistic input for the learners. The proposed corpus-LLM integration represents an advancement in leveraging AI for language pedagogy. The paper concludes with future research directions for optimizing this integration in CSL acquisition, and emphasizes the need to balance technological innovation with pedagogical principles.
Journal: Applied Corpus Linguistics, Volume 6, Issue 1, Article 100170
Citations: 0
Not so fast? A comparative study of pre-service teachers’ lesson design using corpora and generative artificial intelligence
IF 2.1 Pub Date : 2025-11-19 DOI: 10.1016/j.acorp.2025.100168
Agnieszka Leńko-Szymańska
The integration of corpora and generative artificial intelligence (GenAI) in language teacher education presents both opportunities and challenges. While corpus-based approaches have long been promoted for data-driven learning (DDL), their adoption remains limited due to their complexity and time demands. In contrast, GenAI tools offer immediate, user-friendly access to linguistic data, yet raise concerns about authenticity and reliability. This study compares pre-service teachers’ use of corpora and GenAI in pedagogically oriented language analysis, lesson planning, and materials development. Conducted within a graduate-level course, the study examines student teachers’ approaches to corpus-based and AI-based lesson design, focusing on their ability to retrieve and analyse linguistic data, plan lessons, create learning materials, and reflect on the effectiveness of these tools. Findings indicate the considerable potential of both corpora and GenAI for supporting data-informed, inductive approaches to language learning and teaching. Yet, the results also reveal that while pre-service teachers demonstrated operational proficiency in using both tools, they struggled to extract meaningful linguistic insights and integrate their findings into cohesive pedagogical frameworks. The study highlights the need for targeted training to develop teachers’ analytical and pedagogical skills in working with both types of resources. Ultimately, it argues that rather than replacing corpora, GenAI should complement data-driven learning, reinforcing the importance of linguistic accuracy and pedagogical soundness in technology-enhanced language teaching.
Journal: Applied Corpus Linguistics, Volume 6, Issue 1, Article 100168
Citations: 0
A comparative analysis of AI-generated texts, corpus data, and speaker judgments: Subject honorification patterns in Korean
IF 2.1 Pub Date : 2025-11-19 DOI: 10.1016/j.acorp.2025.100171
Yejin Jung, Kathy MinHye Kim
Technological innovations can greatly enhance second language (L2) pragmatics instruction by providing learners with more natural and authentic communication opportunities. As Generative Artificial Intelligence (GenAI) tools become increasingly integrated into L2 teaching, questions arise as to whether they provide pedagogically appropriate input and how they can be used for inductive instruction (e.g., Data-driven Learning). To advance meaningful instructional approaches to Korean honorifics, understanding the nature of input is key; particularly, what exemplars of honorifics are available through GenAI and spoken corpora and how L2 learners perceive and evaluate different honorific forms. In response to these inquiries, we analyzed patterns of subject-verb honorific agreement in outputs from ChatGPT 4.0 and the NIKL Korean Dialogue Summarization Corpus (Study 1), and conducted an acceptability judgment test of four subject-verb honorific (mis)match forms (Study 2). We found that ChatGPT predominantly favored a subject-verb matched form, whereas corpus data reflected the highly complex, context-dependent use and variations of honorifics. L1 judgments aligned more closely with the corpus results, reflecting sensitivity to nuanced (mis)match forms, whereas L2 judgments closely mirrored ChatGPT’s patterns, lacking sensitivity beyond the matched forms. These results underscore the challenges associated with Korean honorification for both learners and educators, highlighting the need for more refined inductive teaching.
Journal: Applied Corpus Linguistics, Volume 6, Issue 1, Article 100171
Citations: 0
Exploring the potential of multiple CORE meanings in learning L2 verb-noun collocations: A corpus-based discovery learning approach
IF 2.1 Pub Date : 2025-11-16 DOI: 10.1016/j.acorp.2025.100166
Satoshi Yamagata, Gareth Carrol, Crayton Walker
Collocational knowledge is a critical component of second language (L2) learning. However, L2 learners often rely on first language (L1) translations, leading to the production of deviant collocations. To address this issue, this study investigates the pedagogical potential of teaching collocations through multiple CORE meanings (capitalised), in contrast to approaches that rely on a single core meaning of verbal nodes. Multiple CORE meanings are characterised not only by their typical nominal collocates, but also by other aspects of how they typically pattern. While previous accounts have tended to treat high-frequency verbal nodes as polysemous, we argue that many verbal nodes are better understood as examples of homonymy, which carries several semantically distinct CORE meanings (i.e., ‘draw’ meaning ‘to pull or move something’, ‘to divide something into two’, or ‘to make a picture’), and that this might offer a more logical way for learners to discover and learn collocational patterns. We first identified CORE meanings for six high-frequency verbal nodes through corpus-based analysis, and then tested their pedagogical potential with 240 EFL high school learners. Learners were taught verb-noun collocations using either a CORE meaning-based discovery approach or conventional L1 translations, and they completed a pre-test and two post-tests assessing productive recall and collocability judgement. Results showed that CORE meaning-based instruction enhanced productive recall, though the advantage did not extend to collocability judgement. These findings suggest that presenting learners with multiple CORE meanings can be a promising way to strengthen L2 collocational competence, although further refinement in instructional design is warranted.
Journal: Applied Corpus Linguistics, Volume 6, Issue 1, Article 100166
Citations: 0
Concordancing with AI: Applications of word and sentence embeddings
IF 2.1 Pub Date : 2025-11-12 DOI: 10.1016/j.acorp.2025.100164
Laurence Anthony
Concordancing is a central method in corpus research. It also plays an important role in the data-driven learning (DDL) classroom, where learners use Key-Word-In-Context (KWIC) analysis to discover and implicitly learn lexical and grammatical patterns in a specific language domain. Effective concordancing requires users to craft precise single- or multi-word queries that capture the language features of interest, but these queries can quickly become complex and error-prone. The method also relies on users selecting, ordering, and grouping results in order to interpret them in a meaningful way, which can also be a significant challenge.
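As a rough illustration of the KWIC display described in this abstract, the following Python sketch lines up each hit of a search word with a fixed window of context and then orders the hits by what follows the keyword. It is not taken from the paper: the function names `kwic` and `sort_by_right_context`, the window width, and the sample sentences are all hypothetical, and real concordancers offer far richer query and sorting options.

```python
import re

def kwic(text, keyword, width=30):
    """Return (left, hit, right) tuples for every whole-word match of keyword."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Pad both sides so the keyword lines up in a single column.
        lines.append((left.rjust(width), match.group(0), right.ljust(width)))
    return lines

def sort_by_right_context(lines):
    """Order concordance lines alphabetically by the text after the keyword."""
    return sorted(lines, key=lambda line: line[2].strip().lower())

if __name__ == "__main__":
    sample = "She will draw a picture. They draw the curtains. We draw conclusions."
    for left, hit, right in sort_by_right_context(kwic(sample, "draw", width=12)):
        print(f"{left} [{hit}] {right}")
```

Sorting by right-hand context is one simple way of grouping hits so that recurring patterns after the keyword become visible; sorting by left-hand context works the same way on the first tuple element.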
This paper proposes using word and sentence embedding models from the field of AI to facilitate both the querying of corpora and the interpretation of concordance results. First, the paper explains that embeddings are numerical representations of language that capture semantic and contextual information in a high-dimensional vector space. Next, the paper reports on three experiments using pre-trained embedding models (BERT, word2vec) looking at synonym searches, concordance grouping and ordering, and language variation analysis across two general English corpora. The results show that embeddings allow for ‘fuzzy’, nuanced, context-aware searches of corpus data without the need for meticulous query crafting, and enable the grouping and ordering of results in novel, interesting, and useful ways.
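The idea of a 'fuzzy' synonym search over embeddings can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the paper's method: the three-dimensional vectors in `TOY_EMBEDDINGS` are invented, whereas pre-trained models such as word2vec or BERT supply vectors with hundreds of dimensions, and the helper names `cosine_similarity` and `nearest_words` are hypothetical.

```python
import math

# Invented 3-dimensional vectors for illustration only; a real model
# would provide learned vectors with far more dimensions.
TOY_EMBEDDINGS = {
    "big": [0.90, 0.10, 0.05],
    "large": [0.85, 0.15, 0.10],
    "huge": [0.80, 0.05, 0.20],
    "tiny": [-0.70, 0.20, 0.10],
    "banana": [0.05, 0.90, -0.30],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_words(query, embeddings, top_n=2):
    """Rank all other words by cosine similarity to the query word."""
    query_vec = embeddings[query]
    scored = [
        (word, cosine_similarity(query_vec, vec))
        for word, vec in embeddings.items()
        if word != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

if __name__ == "__main__":
    for word, score in nearest_words("big", TOY_EMBEDDINGS):
        print(f"{word}\t{score:.3f}")
```

In a concordancer, the words returned by such a ranking could be used to expand a query, so that a search for one word also retrieves lines containing its nearest neighbours in the vector space.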
Applied Corpus Linguistics, vol. 5, no. 3, Article 100164.
Citations: 0
Discerning diachronic sinsign topic shifts: A case study of UK HIV news
IF: 2.1 | Pub Date: 2025-11-11 | DOI: 10.1016/j.acorp.2025.100163
Jiantao Zou, Xuri Tang
The emerging triangulation approach in corpus-based critical discourse analysis, and supra-lexical discursive component extraction in particular, faces the challenge of bridging macro-level analytical constructs (such as topics) with micro-level discursive realizations. This paper addresses this macro-micro divide in discerning sinsign topic shifts by proposing a framework. The framework uses unsupervised keyword extraction and word-embedding-based keyword clustering for topic shift identification, and it synthesizes collocation networks, sentiment analysis, and concordance reading to triangulate statistical topic shift patterns with fine-grained discursive realizations. Applying the framework to UK HIV news discourse, the case study identifies three diachronic shifts: a change from protection to prevention in HIV policy, destigmatization, and an increasing focus on the quality of life of people living with HIV, all validated through macro-micro triangulation.
Applied Corpus Linguistics, vol. 5, no. 3, Article 100163.
Citations: 0
Raising genre awareness through visualizing language features
IF: 2.1 | Pub Date: 2025-11-10 | DOI: 10.1016/j.acorp.2025.100162
John Blake, Maxim Mozgovoy
This paper introduces the Feature Visualizer, an open-access AI-powered tool designed to raise genre awareness among novice academic writers through inductive learning, a process that includes approaches such as discovery learning. The tool houses an annotated corpus of scientific research articles written by computer science majors and allows learners to explore authentic texts using on-demand visualizations and multimodal explanations. By engaging with the corpus, learners identify recurring language patterns and rhetorical structures at macro, meso, and micro levels, facilitating the bottom-up discovery of genre conventions. A longitudinal study with Japanese undergraduate computer science majors showed that the tool enhanced learners’ awareness of academic writing conventions and genre features. Focus group interviews further confirmed the usability and pedagogical value of the Feature Visualizer. We conclude by discussing practical applications for genre-based writing instruction informed by inductive learning principles.
Applied Corpus Linguistics, vol. 5, no. 3, Article 100162.
Citations: 0