
Applied Corpus Linguistics: Latest Articles

Creating and analysing a multimodal corpus of news texts with Google Cloud Vision's automatic image tagger
Pub Date : 2023-04-01 DOI: 10.1016/j.acorp.2023.100043
Paul Baker, Luke Collins

This study describes the creation and analysis of a small multimodal corpus of British news articles about obesity, where tags were assigned to images in the articles using the automatic tagger Google Cloud Vision. In order to illustrate the potential for analysis of image tags, the corpus analysis tool WordSmith was used to identify differences between newspapers in the ways that obesity was framed. Three forms of analysis were carried out – the first simply compared keywords across the newspapers, the second examined key visual tags and their collocates associated with each newspaper, while the third incorporated a combined analysis of words and image tags. The three analyses produced complementary findings, indicating the value in using Google Cloud Vision in creating and analysing multimodal corpora. The paper ends by reflecting on the method undertaken, while considering how additional research could improve our understanding of image tagging.
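As a rough illustration of how "key" image tags might be compared between two newspapers, the sketch below scores tags with the log-likelihood keyness statistic. This is an assumption made for illustration: the abstract does not say which statistic was configured in WordSmith, and the tag counts and the names `tags_paper_a`, `tags_paper_b` and `key_tags` are invented, not taken from the study.

```python
import math

def log_likelihood(freq_a, total_a, freq_b, total_b):
    """Log-likelihood keyness of one item in corpus A versus corpus B."""
    combined = freq_a + freq_b
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll

# Hypothetical tag counts per newspaper (invented, not from the study).
tags_paper_a = {"food": 40, "person": 25, "waist": 5}
tags_paper_b = {"food": 10, "person": 30, "waist": 20}

def key_tags(target, reference):
    """Tags relatively more frequent in target, ranked by keyness."""
    total_t, total_r = sum(target.values()), sum(reference.values())
    scores = {}
    for tag, f_t in target.items():
        f_r = reference.get(tag, 0)
        if f_t / total_t > f_r / total_r:
            scores[tag] = log_likelihood(f_t, total_t, f_r, total_r)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for tag, score in key_tags(tags_paper_a, tags_paper_b):
    print(f"{tag}\t{score:.2f}")
```

The same function could be applied to word counts, or to counts in which words and tags are pooled, which is one plausible reading of the combined analysis the abstract mentions.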

Citations: 0
The reception of public health messages during the COVID-19 pandemic
Pub Date : 2023-04-01 DOI: 10.1016/j.acorp.2022.100037
Emma McClaughlin , Sara Vilar-Lluch , Tamsin Parnell , Dawn Knight , Elena Nichele , Svenja Adolphs , Jérémie Clos , Giovanni Schiazza

Understanding the reception of public health messages in public-facing communications is of key importance to health agencies in managing crises, pandemics, and other health threats. Established public health communications strategies including self-efficacy messaging, fear appeals, and moralising messaging were all used during the Coronavirus pandemic. We explore the reception of public health messages to understand the efficacy of these established messaging strategies in the COVID-19 context. Taking a community-focussed approach, we combine a corpus linguistic analysis with methods of wider engagement, namely, a public survey and interactions with a Public Involvement Panel to analyse this type of real-world public health discourse.

Our findings indicate that effective health messaging content provides manageable instructions, which inspire public confidence that following the guidance is worthwhile. Messaging that appeals to the audience's morals or fears in order to provide a rationale for compliance can be polarising and divisive, producing a strongly negative emotional response from the public and potentially undermining social cohesion. Provenance of the messaging alongside text-external political factors also have an influence on messaging uptake. In addition, our findings highlight key differences in messaging uptake by audience age, which demonstrates the importance of tailored communications and the need to seek public feedback to test the efficacy of messaging with the relevant demographics. Our study illustrates the value of corpus linguistics to public health agencies and health communications professionals, and we share our recommendations for improving the public health messaging both in the context of the ongoing pandemic and for future novel and re-emerging infectious disease outbreaks.

Citations: 8
Epistemic stance in written L2 English: The role of task type, L2 proficiency, and authorial style
Pub Date : 2023-04-01 DOI: 10.1016/j.acorp.2022.100040
Maria Pyykönen

The present study examines the relationship between the use of epistemic stance expressions (i.e., hedges and boosters) and task type, L2 proficiency, and individual authorial style in 1,773 essays representing three different kinds of tasks (complaint, letter, and opinion) written by 591 Finnish L2 English speakers at four different levels of proficiency (CEFR levels B1-C2). The results of the study show that the frequency of both hedges and boosters is mainly governed by task type, as the opinion text contained a higher number of both hedges and boosters than the other tasks examined. Proficiency-related patterns were, nevertheless, also observed, as it was shown that in the complaint and opinion tasks, the frequency of both hedges and boosters tends to increase with proficiency, while in the letter task, the frequency of both ED types shows signs of decrease. Individual authorial style was shown to play a very limited role in the frequency of EDs in the data, but the results also suggest that the influence of authorial style may be greater with respect to boosters than it is with hedges.
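To make the notion of hedge and booster frequency concrete, here is a minimal sketch that normalises counts per 1,000 words so essays of different lengths can be compared. The marker lists `HEDGES` and `BOOSTERS`, the tokenisation, and the sample sentence are all hypothetical; the abstract does not give the study's actual inventory or counting procedure.

```python
import re

# Hypothetical, deliberately tiny marker lists (not the study's inventory).
HEDGES = {"maybe", "perhaps", "possibly", "might"}
BOOSTERS = {"certainly", "definitely", "clearly"}

def per_thousand(text, markers):
    """Frequency of the given markers per 1,000 word tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in markers)
    return 1000 * hits / len(tokens)

# Invented example sentence in the style of a complaint task.
essay = "Perhaps the service might improve, but it is clearly too slow."
print(round(per_thousand(essay, HEDGES), 1))
print(round(per_thousand(essay, BOOSTERS), 1))
```

A single-word lookup like this would miss multi-word expressions and cannot tell epistemic uses from other senses, so real studies typically check hits manually.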

Citations: 0
Trialling corpus search techniques for identifying person-first and identity-first language
Pub Date : 2023-04-01 DOI: 10.1016/j.acorp.2023.100046
Monika Bednarek, Carly Bray

This short ‘methods’ article compares results for six different corpus search techniques for identifying person-first language (e.g. person/people with obesity, person/people with mental illness) and identity-first language (e.g. obese person/people, mentally ill person/people) in a corpus. This distinction is relevant across a range of health contexts, including but not limited to obesity, diabetes, or mental illness. Consequently, there is considerable interest in corpus linguistics and beyond in identifying the frequency of such language in large corpora. However, there is no consensus regarding the specific corpus search techniques to be used for this purpose. This article therefore offers a relevant methodological contribution, based on a trial of six different search techniques. Results from each technique are compared with respect to four different parameters: raw frequency, proportional usage, number of types identified (a proxy for ‘recall’) and false positives (a proxy for ‘precision’). This comparison in turn provides a basis for recommendations for future corpus linguistic studies of person- and identity-first language. The corpus that we use for this trial is a 16.4 million word corpus with newspaper articles containing the word obesity or obese. However, the findings should be relevant to other kinds of identity where similar syntactic structures are at play for expressing identity-first and person-first language.
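One simple way to search for the two constructions is with regular expressions. The sketch below is not one of the six techniques trialled in the article, which the abstract does not specify; the patterns, the condition lists and the sample text are hypothetical and cover only the example phrases quoted above.

```python
import re

PERSON_NOUNS = r"(?:person|people)"
# Hypothetical, minimal condition lists based on the abstract's examples.
PERSON_FIRST = re.compile(
    rf"\b{PERSON_NOUNS} with (?:obesity|mental illness)\b", re.IGNORECASE
)
IDENTITY_FIRST = re.compile(
    rf"\b(?:obese|mentally ill) {PERSON_NOUNS}\b", re.IGNORECASE
)

def count_patterns(text):
    """Raw frequencies plus the proportional usage of person-first forms."""
    person_first = len(PERSON_FIRST.findall(text))
    identity_first = len(IDENTITY_FIRST.findall(text))
    total = person_first + identity_first
    share = person_first / total if total else None
    return {
        "person_first": person_first,
        "identity_first": identity_first,
        "person_first_share": share,
    }

sample = (
    "People with obesity face stigma. Obese people are often blamed. "
    "A person with mental illness may need support."
)
result = count_patterns(sample)
print(result)
```

Fixed patterns like these trade recall for precision: they miss variants such as "people living with obesity", which is the kind of trade-off the article's comparison of types identified and false positives addresses.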

Citations: 1
Becoming corpus literate: An in-service EFL teacher education framework for integrating corpora into EFL teaching
Pub Date : 2023-04-01 DOI: 10.1016/j.acorp.2023.100048
Cathryn Bennett, Elaine Uí Dhonnchadha

While much language teaching research has extolled the advantages of using corpora in the language learning classroom, uptake of corpora by language teachers remains low (Poole, 2020; Charles, 2020). Frankenberg-Garcia (2012a) reports that this is partly due to teachers having little time to prepare lessons with corpora given their busy teaching schedules. However, we also argue it is because teachers are not exposed to corpora sufficiently in pre-service teacher training programmes. This paper outlines a training framework for in-service EFL practitioners which will enable them to become corpus literate over a period of weeks. The framework utilises observational learning, and we gauge trainees’ corpus literacy development based on their reflections and experiences of teaching with corpora in class. Similar to previous corpus training research, teachers reported an overwhelmingly positive experience of learning to use a corpus for classroom materials design. Unlike previous trainings with corpora, teachers commented that using corpora did not require more time in lesson preparation when compared with traditional methods. Recommendations for future iterations of the training programme include the incorporation of additional corpora, tools and user interfaces.

Citations: 0
Book Reviews
Pub Date : 2023-03-01 DOI: 10.1016/j.acorp.2023.100055
Rickey Lu
Citations: 0
Hack your corpus analysis: How AI can assist corpus linguists deal with messy social media data
Pub Date : 2023-01-01 DOI: 10.1016/j.acorp.2023.100067
Michele Zappavigna
Citations: 1
Book review Vander Viana (2023) teaching English with corpora: A resource book
Pub Date : 2023-01-01 DOI: 10.1016/j.acorp.2023.100061
Larissa Goulart (Assistant Professor of Linguistics)
Citations: 0
The gap between intentions and reality: Reasons for EAP writers’ non-use of corpora
Pub Date : 2022-12-01 DOI: 10.1016/j.acorp.2022.100032
Maggie Charles

Over the last three decades, extensive research has been devoted to EAP students’ use of corpora for academic writing. However, corpus use has usually been ascertained immediately post-course; data on long-term use is sparse and little attention has been paid to those who give up using corpora. This study investigates the extent of corpus non-use and students’ reasons for discontinuing the practice in the long term. It draws on data from two questionnaires: (1) immediate post-course (ImmPQ); (2) delayed post-course (DelPQ) completed a year later. Participants were 182 graduates who took a six-week course during which they built and consulted do-it-yourself corpora in their own field. Results from ImmPQ showed that most students (63%) used their corpus regularly (≥ 1/week), but one year later DelPQ revealed that regular use had decreased to 36%. Although 87% of respondents to ImmPQ stated their intention to use their corpus in the future, DelPQ reported a total of 37% of non-users. There were 86 mentions of reasons for non-use; the most prevalent were: not doing any academic writing (29%), the use of other tools (20%), time issues and corpus issues (10% each). It is argued that students’ scarcity of time is a possible underlying cause of much non-use and the study suggests some ways in which long-term corpus take-up could be increased.

Citations: 2
Replication as a means of assessing corpus representativeness and the generalizability of specialized word lists
Pub Date : 2022-12-01 DOI: 10.1016/j.acorp.2022.100027
Don Miller

Considerable energy has gone into designing lists of words that are salient in discourse domains of varying breadth. Over the past two decades, most efforts in designing and validating corpus-based frequency lists have focused on three areas: corpus compilation, item selection criteria, and coverage-based demonstrations of list robustness. As a result, modern corpora are now often much larger and better balanced; the application of additional dispersion statistics allows for better targeting of items with desired distributions; and contemporary lexical frequency lists are proving increasingly efficient, providing ever higher coverage of target texts or achieving such coverage with fewer words. However, despite these important advances, relatively minimal attention has been paid to word list reliability—the extent to which lists can be generalized to the wider discourse domain that has been represented by the corpora upon which they are based. This study begins to address this gap, demonstrating via two word list development case studies (one for Environmental Science and one for Applied Linguistics) that adding iterative reliability analysis—via methodological replication with corpora of increasing size and comparison of items on resulting lists—can be used to: 1) inform corpus design beyond what Biber (1991) terms “situational” parameters, allowing us to see whether corpora are adequately representative of lexical distributions in target discourse domains; and 2) provide valuable insight into the degree of generalizability of word lists we have developed.
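The idea of replicating a word list on corpora of increasing size and comparing the resulting items can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the article's procedure: the list is built from raw frequency alone (no dispersion statistics), and `top_words`, `overlap` and the three sample texts are invented.

```python
from collections import Counter

def top_words(texts, n):
    """The n most frequent whitespace-delimited words across the texts."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word for word, _ in counts.most_common(n)}

def overlap(list_a, list_b):
    """Share of items in list_a that also appear in list_b."""
    return len(list_a & list_b) / len(list_a) if list_a else 0.0

# Invented mini-corpus; each string stands in for one text.
texts = [
    "soil water soil",
    "water carbon soil",
    "carbon carbon carbon soil",
]

small_list = top_words(texts[:2], 2)  # list from a smaller sub-corpus
full_list = top_words(texts, 2)       # list from the larger corpus
print(sorted(small_list), sorted(full_list), overlap(small_list, full_list))
```

If the overlap stays high as more texts are added, the list is stable with respect to corpus size; low or fluctuating overlap suggests the smaller corpus does not yet represent the lexical distribution of the domain.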

Citations: 1