Applied Corpus Linguistics: Latest Articles

A corpus-assisted discourse analysis of epistemic stances in tweets about U.S. police from 2013 to 2023
Pub Date : 2025-07-05 DOI: 10.1016/j.acorp.2025.100138
Mark Winston Visonà, Şebnem Kurt
Recently, public debate regarding law enforcement practices has extended into digital spaces, particularly on social media platforms such as Twitter (X). Prior research has focused on police-initiated communication both offline and online, yet few studies have explored how the public discusses policing on social media or whether this discussion has changed diachronically. The current study addresses these gaps via a corpus-assisted discourse analysis of a subset of tweets from four U.S. cities (Chicago, Houston, Los Angeles, and Washington, DC) posted between 2013 and 2023 containing the word ‘police’ and the epistemic marker ‘think/thought.’ By examining these tweets, the study analyzes how Twitter users position themselves or others on an epistemic gradient as more (K+) or less (K-) knowledgeable about specific aspects of policing. Using a mixed-methods approach that combines n-gram analysis with discourse analysis of stancetakers and tweet topics, this study identifies how key events shaped Twitter users’ attitudes towards U.S. policing practices over the last decade. Findings indicate that K+ tweets most frequently discussed police services, followed by crime/victims, with particular services like calling 911 and crimes involving vehicles debated by users. In K- tweets, users critiqued others’ knowledge of policing while police services remained the dominant topic with secondary topics like race varying more than in K+ tweets. This study thus contributes to our understanding of public perceptions of policing in online contexts and highlights epistemic stancetaking strategies used by Twitter users to involve others when discussing contentious issues related to law enforcement in the U.S.
Journal : Applied Corpus Linguistics, Volume 5, Issue 3, Article 100138
Citations: 0
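The corpus selection and n-gram step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the sample tweets are invented, the tokenizer is a simple regular expression, and the query matches only the exact forms 'think' and 'thought' (whether other inflections were included is not stated in the abstract).

```python
import re
from collections import Counter

# Hypothetical sample tweets, invented for illustration; not data from the study.
TWEETS = [
    "I think the police should respond faster to 911 calls",
    "You thought the police would help? Think again",
    "Traffic was terrible downtown today",
    "I think the police were right this time",
]

TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return TOKEN_RE.findall(text.lower())


def matches_query(tokens):
    """True if the tweet contains 'police' and the marker 'think' or 'thought'."""
    return "police" in tokens and any(t in {"think", "thought"} for t in tokens)


def ngram_counts(token_lists, n):
    """Count n-grams (as tuples) across a list of tokenized tweets."""
    counts = Counter()
    for tokens in token_lists:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Keep only tweets matching the query, then count trigrams in that subset.
subset = [toks for toks in map(tokenize, TWEETS) if matches_query(toks)]
trigrams = ngram_counts(subset, 3)
print(trigrams.most_common(3))
```

Frequent n-grams of this kind would only be a starting point; the study pairs them with qualitative analysis of stancetakers and topics, which this sketch does not attempt.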
A corpus-based analysis of gendered language in spoken religious discourse
Pub Date : 2025-06-16 DOI: 10.1016/j.acorp.2025.100137
Abdelhamid Elewa
This study employs corpus linguistics to analyze gendered language in religious discourse across three corpora: modern Arabic/English Friday sermons (MARC, MERC) and the early sayings of the Prophet Mohamed (Hadith). It specifically analyzes portrayals of women in sermons delivered exclusively by male preachers in Arabic and English, as well as in the Hadith. Quantitative comparisons of lexical density and collocational patterns reveal that modern sermons emphasize women’s physical appearance (e.g., attire) and traditional roles (e.g., motherhood), contrasting with early texts that acknowledge women’s individuality and agency. Semantic preference analysis shows that singular ‘woman’ in modern contexts collocates with moral deviation, while plural ‘women’ aligns with morality and collective protection. The study highlights how modern religious language in Arabic and English perpetuates gender stereotypes more conservatively than classical sources. It also emphasizes the potential of modern technology in revisiting religious literature and provides a comparative analysis of gendered language in Arabic and English for aligning doctrinal communication with gender equity goals.
Journal : Applied Corpus Linguistics, Volume 5, Issue 3, Article 100137
Citations: 0
Pub Date : 2025-06-14 DOI: 10.1016/j.acorp.2025.100136
Sak Lee
Journal : Applied Corpus Linguistics, Volume 5, Issue 3, Article 100136
Citations: 0
Pub Date : 2025-06-07 DOI: 10.1016/j.acorp.2025.100135
Amber Wanwen Wang
Journal : Applied Corpus Linguistics, Volume 5, Issue 3, Article 100135
Citations: 0
What can a corpus tell us about school writing? Findings, challenges, and future directions
Pub Date : 2025-05-15 DOI: 10.1016/j.acorp.2025.100134
Philip Durrant
Much child corpus research has focused on the language of school writing. In the first part of this paper, I discuss what this work can contribute to theory and educational practice. I then look in detail at a recent large-scale study conducted in England in order to illustrate, and critically discuss, key corpus methods and findings. In the final part of the paper, I discuss prospects for future work, focusing in particular on the issue of defining and analysing educational text types. I look at why this task is so central to valid research on school writing and why it has been so problematic, despite decades of attention from researchers. I then introduce and discuss the prospects for one promising approach to such analysis.
Journal : Applied Corpus Linguistics, Volume 5, Issue 2, Article 100134
Citations: 0
Vaccine and vaccination in parliamentary discourse in Brazilian Portuguese during COVID-19: Analysis of relational processes
Pub Date : 2025-05-08 DOI: 10.1016/j.acorp.2025.100133
Rodrigo Esteves de Lima Lopes
This paper studies the collocates of relational processes in sessions of the Câmara dos Deputados (Brazilian National Parliament) in which the lemma ‘vacina’ (vaccine) was discussed during the COVID-19 pandemic between 2020 and 2021. Brazil has historically had a noteworthy immunisation programme, implemented and maintained by SUS (Sistema Único de Saúde/Unified Health System), which has faced some challenges in the last few years, resulting in a decrease in child and adult vaccination. The focus is on how the ideological use of vaccination took over the debates in the Câmara dos Deputados, in order to analyse whether different political stances may be reflected in different language use and patterns. The corpus was compiled using Python scripts, and it is part of a larger Brazilian political language corpus (BrPoliCorpus). The corpus is structured into speeches from parties that endorsed and those that opposed the political views of the Brazilian President at the time. The study analysed the collocates and their statistical significance, followed by a qualitative analysis based on a Systemic-Functional Linguistics approach. The results show ideological polarisation in the discourse regarding vaccination.
Journal : Applied Corpus Linguistics, Volume 5, Issue 2, Article 100133
Citations: 0
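As a rough illustration of what "collocates and their statistical significance" can involve, the sketch below counts words within a fixed window around a node word and scores one collocate with the log-likelihood statistic (G2). The abstract does not say which association measure, window size, or contingency-table construction the study used, so all three are assumptions here, and the toy sentence is invented.

```python
import math
from collections import Counter


def window_collocates(tokens, node, span=4):
    """Count words occurring within `span` tokens of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - span), min(len(tokens), i + span + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def log_likelihood(o11, o12, o21, o22):
    """G2 statistic for a 2x2 contingency table of observed counts."""
    n = o11 + o12 + o21 + o22
    rows = (o11 + o12, o21 + o22)
    cols = (o11 + o21, o12 + o22)
    observed = ((o11, o12), (o21, o22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / n
                g2 += o * math.log(o / expected)
    return 2 * g2


# Invented toy text; 'é' (a form of 'ser') stands in for a relational process.
tokens = "a vacina é segura e a vacina é eficaz".split()
colls = window_collocates(tokens, "vacina", span=1)

# Assumed table layout: inside vs. outside the node's windows, collocate vs. other words.
freq = Counter(tokens)
window_total = sum(colls.values())
o11 = colls["é"]                       # 'é' inside the windows
o12 = window_total - o11               # other words inside the windows
o21 = freq["é"] - o11                  # 'é' outside the windows
o22 = len(tokens) - o11 - o12 - o21    # everything else
g2 = log_likelihood(o11, o12, o21, o22)
print(colls["é"], round(g2, 2))
```

On a corpus this small the score means little; the point is only the shape of the computation that would precede the qualitative Systemic-Functional analysis.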
Pub Date : 2025-05-02 DOI: 10.1016/j.acorp.2025.100132
Siaw-Fong Chung
Journal : Applied Corpus Linguistics, Volume 5, Issue 2, Article 100132
Citations: 0
Investigating writing development, cross-linguistic influence and feedback practices through a longitudinal corpus of children’s school writing
Pub Date : 2025-04-25 DOI: 10.1016/j.acorp.2025.100131
Hildegunn Dirdal, Eva Thue Vold
This article reports on our work with child writing from the TRAWL Corpus, a longitudinal and multilingual corpus of school writing. We give examples of our work on vocabulary and complexity development, which utilizes the longitudinal design of the corpus, and on feedback practices and student uptake facilitated by the fact that many of the texts in the corpus include teacher comments. These studies illustrate how corpus data can be used in case studies and qualitative studies and emphasize the need for fine-grained classifications in learner corpus research. The TRAWL Corpus includes texts written in the five languages most commonly taught in Norwegian schools. We explain how our work on syntactic complexity and feedback practices is currently being expanded by an exploration of similarities and differences between language subjects and of interactions between the languages of individual learners through the project MULTIWRITE. The article highlights the benefits of corpora of authentic school writing that reflect the realities of the educational context and therefore can provide findings that are directly relevant and useful for the practice field.
Journal : Applied Corpus Linguistics, Volume 5, Issue 2, Article 100131
Citations: 0
On the replicability of corpus-derived medical word lists
Pub Date : 2025-04-16 DOI: 10.1016/j.acorp.2025.100130
Cosmin Mihail Florescu, Ryosuke L. Ohniwa
Several English medical vocabulary lists have been developed using corpora compiled from a variety of medical texts including research articles and medical textbooks. List items have been identified for inclusion using criteria mostly adopted from previous studies focused on academic vocabulary. This study aims to employ a systematic approach in compiling a corpus to create a medical word list for learners of English aiming to study or practice medicine in an English-speaking country. A large corpus of medical textbooks (CoMeT; 28,384,681 running words) was created using SketchEngine and analyzed to extract high-frequency lemmas. Keyness and dispersion values for each lemma were plotted in a histogram to visualize clustering patterns. This visual map was used to determine threshold values separating a medical vocabulary subset from a general vocabulary subset. The replicability of the findings was evaluated using two corpora (one medical, one non-medical) different from CoMeT. The newly developed list (Core Medical List; CoMeL) comprising a total of 2881 lemmas was found to include significantly more medicine-specific words and to have higher replicability compared to existing lists. CoMeL may assist learners and educators in English for Medical Purposes programs, including those aiming to undertake challenging medical licensing examinations in English-speaking countries.
Journal : Applied Corpus Linguistics, Volume 5, Issue 2, Article 100130
Citations: 0
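The idea of separating a specialised subset from general vocabulary by keyness and dispersion thresholds, as described in the abstract above, can be sketched as follows. The abstract does not name the keyness or dispersion measures used, so this sketch assumes a smoothed ratio of per-million frequencies (in the style of Sketch Engine's "simple maths" score) and Gries's deviation of proportions (DP); the threshold values and lemma statistics are invented.

```python
def per_million(freq, corpus_size):
    """Normalize a raw frequency to occurrences per million tokens."""
    return freq * 1_000_000 / corpus_size


def simple_maths_keyness(freq_focus, size_focus, freq_ref, size_ref, smoothing=1.0):
    """Ratio of smoothed per-million frequencies (focus corpus over reference corpus)."""
    focus = per_million(freq_focus, size_focus) + smoothing
    ref = per_million(freq_ref, size_ref) + smoothing
    return focus / ref


def gries_dp(part_freqs, part_sizes):
    """Deviation of proportions: 0 means an even spread, values near 1 mean concentration."""
    total_freq = sum(part_freqs)
    total_size = sum(part_sizes)
    return 0.5 * sum(
        abs(v / total_freq - s / total_size)
        for v, s in zip(part_freqs, part_sizes)
    )


def select_specialised(stats, min_keyness, max_dp):
    """Keep lemmas that are both key (high keyness) and well dispersed (low DP)."""
    return sorted(
        lemma for lemma, (keyness, dp) in stats.items()
        if keyness >= min_keyness and dp <= max_dp
    )


# Hypothetical (keyness, DP) values, invented for illustration only.
toy_stats = {"artery": (50.0, 0.2), "the": (1.0, 0.05), "zebrafish": (80.0, 0.9)}
selected = select_specialised(toy_stats, min_keyness=10.0, max_dp=0.5)
print(selected)
```

In the study the thresholds were read off a plot of the two values rather than fixed in advance; the hard-coded cut-offs here are placeholders for that step.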
A corpus analysis of prepositional phrase-lexical bundles in academic writing: L2 writers from Indo-European and Non-Indo-European languages
Pub Date : 2025-03-18 DOI: 10.1016/j.acorp.2025.100128
Ku Yunjung (Yunie)
English prepositions, characterized by their high frequency and complex polysemy, pose significant challenges for L2 learners (e.g., Geluso, 2022). Despite their crucial role in lexical bundles, few studies have examined how L2 learners use prepositions within lexical bundles in academic writing. This study investigated prepositional phrase (PP)-lexical bundles among L2 writers from prepositional Indo-European (IE) languages, prepositionless non-Indo-European (NIE) languages, and native English speakers. The research aimed to analyze the variety, frequency, and functions of these bundles, as well as the impact of L1 backgrounds on prepositional usage. Data were extracted from the ETS Corpus and the Louvain Corpus of Native English Essays. The findings revealed that while there were no significant differences between the NIE and IE groups in bundle type and frequency, the IE group's prepositional usage more closely resembled that of native speakers (NSs), reflecting linguistic disparities tied to writers’ L1 backgrounds. Additionally, the functions of lexical bundles varied between non-native speakers (NNSs) and NSs, with both NNS groups exhibiting a greater reliance on discourse organizers and stance expressions. These results suggest pedagogical implications for teaching prepositions, particularly to learners from prepositionless languages.
Journal : Applied Corpus Linguistics, Volume 5, Issue 2, Article 100128
Citations: 0