
Applied Corpus Linguistics: Latest Articles

Corpus to curriculum: Developing word lists for adult learners of Welsh
Pub Date : 2023-08-01 DOI: 10.1016/j.acorp.2023.100052
Dawn Knight, Tess Fitzpatrick, Steve Morris, Bethan Tovey-Walsh, Helen Prosser, Emyr Davies

The launch of a language's first comprehensive general corpus promises a sea-change in teaching and learning resources. Effective transition from corpus to classroom is not necessarily straightforward, though; expert and end-user input is essential for the potential of the corpus resource to be realised. This paper outlines the process by which fit-for-purpose vocabulary lists were derived from the new National Corpus of Contemporary Welsh (Corpws Cenedlaethol Cymraeg Cyfoes – CorCenCC). The immediate purpose in this case was to inform the revision of A1 and A2 level course materials for adult learners. A longer-term aim was to put in place a method by which vocabulary lists for more advanced level learners and learners of different ages could be extracted and developed from the corpus. The new corpus means that for the first time, the Welsh language curriculum is able to use word frequency information; teaching and assessment materials in major languages have been informed by word frequencies for several decades. Raw frequency lists, though, include troublesome content, and can exclude items with high relevance to learners. This paper demonstrates how, by working in partnership, Welsh language curriculum writers, assessors, language experts and corpus linguists can effectively manipulate corpus data into curriculum content. The methods and approaches reported here are replicable for use in other language contexts.

Citations: 0
The interface between specialized translation and institutional translation: A selection of candidate terms validated by Aeronautical Meteorology corpora
Pub Date : 2023-08-01 DOI: 10.1016/j.acorp.2023.100051
Rafaela Araújo Jordão Rigaud Peixoto

The purpose of this work is to revise and expand an aeronautical meteorology glossary, available at REDEMET, a homepage hosted on the Department of Airspace Control website, taking into consideration corpus data in the field. For that, to best meet the needs of institutions and users, data were compiled from some segments of the Aeronautical Meteorology domain. During the compilation of this corpus, it was noticed that there was a great scarcity of specialized sources of this Aviation subdomain in English and, mainly, in Portuguese, including material by the Department of Airspace Control (DECEA), the only official Brazilian institution with the role of regulating standards relevant to Aeronautical Meteorology. By taking into account that a given government institution is considered an authoritative source concerning terms used in a specialized domain, it would be advisable to align professional and academic expertise, and institutional interests. Therefore, based on contributions of corpus linguistics theories, terminology, and institutional translation, this work relied on established parameters for the compilation and processing of information for inclusion in the corpus, and focused, in this first stage, on the selection of candidate terms, according to corpus analysis. The first results showed that institutional and academic segments present some subtleties regarding terminology, as, on the one hand, some words are more specific to the academic register and, on the other hand, there are different uses of terms in the institutional setting, by ICAO, WMO, or FAA.

Citations: 1
“I will say the picture of the background is not related to the words”: using corpus linguistics and focus groups to reveal how speakers of English as an additional language perceive the effectiveness of the phraseology and imagery in UK public health tweets during COVID-19
Pub Date : 2023-08-01 DOI: 10.1016/j.acorp.2023.100053
Christian Jones, David Oakey, Kay L. O'Halloran

This paper reports on an application of a multimodal corpus-based study into the effectiveness of public health information about COVID-19 for speakers of English as an additional language (EAL) in the UK. A corpus of information tweets from 13 UK public health agencies totalling 560,000 words, with concomitant images and videos, was collected between March 2020 and February 2021. The most frequent n-grams occurring across all 13 public health agencies, and sample images occurring alongside these, were identified. In this study, we examine how images and videos combine with the phraseology to shape these COVID-19 public health information messages. Following this, six illustrative tweets were used as prompts for three focus groups of EAL participants based in the UK representing a range of first languages and occupations. Data from the focus groups was analysed in order to identify how common public health phraseology and images were received, understood and responded to by participants and how they felt they could be amended to increase their effectiveness for EAL speakers. We conclude with suggestions for making the language of public health messages simpler and more direct, aligning images more clearly with the language used and removing linguistic ambiguity. These recommendations for how such messaging could be improved in future public health campaigns could ensure a more effective and inclusive public health response.
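
The step of identifying n-grams that occur across every agency can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `ngrams` and `shared_ngrams`, the whitespace tokenisation, and the toy tweets are all assumptions made for illustration only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def shared_ngrams(texts_by_source, n=3):
    """Count n-grams per source and keep those attested in every source.

    texts_by_source: dict mapping a source name to a list of texts.
    Returns a Counter with the summed frequency of each shared n-gram.
    """
    per_source = {}
    for source, texts in texts_by_source.items():
        counts = Counter()
        for text in texts:
            # Naive lowercase whitespace tokenisation (an assumption);
            # real tweets would need handling of URLs, hashtags and emoji.
            counts.update(ngrams(text.lower().split(), n))
        per_source[source] = counts

    # Keep only the n-grams that appear in all sources.
    common = set.intersection(*(set(c) for c in per_source.values()))

    total = Counter()
    for counts in per_source.values():
        for gram in common:
            total[gram] += counts[gram]
    return total


# Invented toy data, not drawn from the study's corpus.
toy = {
    "agency_a": ["stay at home and save lives", "please stay at home"],
    "agency_b": ["you must stay at home"],
}
result = shared_ngrams(toy, n=3)
print(result.most_common(5))
```

Requiring an n-gram to appear in every source, rather than ranking by raw frequency alone, keeps one prolific account from dominating the list.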

Citations: 0
Review of Deignan, Candarli, & Oxley (2023). The linguistic challenge of the transition to secondary school: A corpus study of academic language
Pub Date : 2023-08-01 DOI: 10.1016/j.acorp.2023.100049
Philip Durrant
Citations: 0
The corpus of United States state statutes—design, construction and use
Pub Date : 2023-08-01 DOI: 10.1016/j.acorp.2023.100047
Jesse Egbert, Margaret Wood

There is a need for more publicly available corpora of legal language. To help fill this gap, we have developed the Corpus of U.S. State Statutes, or CorUSSS, a new corpus comprising the statutory code from all 50 U.S. states. In total the corpus contains 1,785,742 texts, each of which represents the statutory text associated with a unique Universal Citation in one of the 50 U.S. states’ codes. This corpus provides us with the ability to explore language use in statutes within or across all 50 states. After motivating the need for this corpus, we describe its design and the methods we used to collect, clean and store the texts. We then report on a case study that illustrates the utility of this corpus for addressing important questions in statutory interpretation by investigating whether the word information can be used to refer to statements that are non-factual. We conclude with a call for researchers in law and corpus linguistics to rely on both legal and ordinary language when investigating questions of interpretation.

Citations: 0
Corpus approaches to the sociology of curricula: A methodological case study of human rights learning in Japan
Pub Date : 2023-08-01 DOI: 10.1016/j.acorp.2023.100057
Thomas George Meyer

This article discusses how corpus linguistic methods were adapted for critical research examining the content and pedagogy of human rights learning within texts approved for classroom use under Japan's official social studies curriculum. While human rights concepts are common facets of official curricula, designed to address social injustice and foster peaceful coexistence, such learning has undergone critical re-examination as being complicit in perpetuating social injustice. Drawing upon Bernstein's sociology of the curriculum and based on a corpus of upper-secondary curricular texts, this research is a pragmatic mixing of quantitative and qualitative methods that sought to understand the curriculum's potential role within learning for human rights. By way of empirical example, the article aims to inform future critical research designs within the sociology of the curriculum. Corpus-based analytical techniques were vital in demonstrating how the structuring of textbook human rights limits student engagement with social justice issues and functions instead to inculcate pride in Japanese ethnonationality.

Citations: 0
Knowledge and belief in the times of COVID-19: A comparative analysis of epistemicity in English newspaper discourse of two stages of the pandemic
Pub Date : 2023-08-01 DOI: 10.1016/j.acorp.2023.100054
Marta Carretero

This paper sets forth a quantitative analysis of expressions of epistemicity, a category covering the expression of commitment to the information transmitted and comprising epistemic modality and evidentiality, in a corpus of 400 newspaper articles from The Guardian concerning the COVID-19 pandemic. 200 articles were written in April 2020; the other 200 were written between January and April 2022, after massive vaccination and an extraordinary increase in medical knowledge. The analysis distinguishes between a number of subtypes of epistemic expressions and three kinds of authorial voice. The results show that the April 2020 articles contain more epistemic expressions, of both weak commitment (might, perhaps, apparently…) and strong commitment (know, clearly, surely…), which suggests a greater need to distinguish the known from the unknown in this period, due to the pervasive state of uncertainty. The analysis has social implications, since it gives readers an opportunity to appreciate the careful assessments of epistemicity found in the corpus and therefore to consider the convenience of obtaining information from quality media. These social implications, together with the methodology of the analysis, contribute to the potential of the paper for pedagogical applications.

Citations: 0
“Sorry to hear you're going through a difficult time”: Investigating online discussions of consumer debt
Pub Date : 2023-08-01 DOI: 10.1016/j.acorp.2023.100056
Robert Lawson, Ursula Lutzky, Andrew Kehoe, Matt Gee

As recent years have witnessed increasing pressure on personal finances, compounded by the current cost of living crisis, online forums have become an important resource for people dealing with financial precarity. In this article, we offer a corpus linguistic analysis of data from MoneySavingExpert.com, the UK's largest online money management advice forum, studying 207 threads and 41.4 million words of text posted from 2005 to 2021. Through measures of word frequency and word association, we uncover similarities and differences in language use on the debt-free wannabe (DFW) and mortgage-free wannabe (MFW) forums. Our findings show that the DFW forum focuses on interactive exchanges involving requests for help and offers of advice, while the MFW forum is characterised by goal setting and community building. We thus contribute new insights into the discursive construction of debt in digital media and provide further understanding of the role online forums play in supporting vulnerable people.

Citations: 0
ChatGPT: Friend or foe (to corpus linguists)?
Pub Date : 2023-07-13 DOI: 10.1016/j.acorp.2023.100065
Phoebe Lin

This short communication discusses the impact of ChatGPT on the field of corpus linguistics, particularly its potential as a concordancer. As a corpus linguist and app developer, the author reflects on how ChatGPT's ease of use, efficiency, and popularity could challenge traditional concordancers, and explores ways in which ChatGPT could be used to generate concordances and frequency lists.
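
For readers unfamiliar with what a traditional concordancer produces, the following minimal sketch builds keyword-in-context (KWIC) lines. It is an illustration only, not the tool or method discussed in the article: the function name `kwic`, the regex tokenisation, and the sample sentence are assumptions.

```python
import re


def kwic(text, keyword, width=3):
    """Return keyword-in-context lines as (left context, node, right context)."""
    # Simple word tokenisation (an assumption); punctuation is discarded.
    tokens = re.findall(r"\w+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample text for demonstration.
sample = "The corpus is large. A corpus can be searched with a concordancer."
lines = kwic(sample, "corpus", width=2)
for left, node, right in lines:
    print(f"{left:>12} | {node} | {right}")
```

Every line returned here is traceable to a position in a known text, which is the kind of provenance a generated concordance may not guarantee.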

Citations: 1
Generative AI and the end of corpus-assisted data-driven learning? Not so fast!
Pub Date : 2023-07-13 DOI: 10.1016/j.acorp.2023.100066
Peter Crosthwaite, Vit Baisa

This article explores the potential advantages of corpora over generative artificial intelligence (GenAI) in understanding language patterns and usage, while also acknowledging the potential of GenAI to address some of the main shortcomings of corpus-based data-driven learning (DDL). One of the main advantages of corpora is that we know exactly the domain of texts from which the corpus data is derived, something that we cannot track from current large language models underlying applications like ChatGPT. We know the texts that make up large general corpora such as BNC2014 and BAWE, and can even extract full texts from these corpora if needed. Corpora also allow for more nuanced analysis of language patterns, including the statistics behind multi-word units and collocations, which can be difficult for GenAI to handle. However, it is important to note that GenAI has its own strengths in advancing our understanding of language-in-use that corpora, to date, have struggled with. We therefore argue that by combining corpus and GenAI approaches, language learners can gain a more comprehensive understanding of how language works in different contexts than is currently possible using only a single approach.

Citations: 2