This article reviews Durrant (2023), Corpus linguistics for writing development.
{"title":"Review of Durrant (2023): Corpus linguistics for writing development","authors":"Joyce Lim","doi":"10.1075/ijcl.00059.lim","DOIUrl":"https://doi.org/10.1075/ijcl.00059.lim","url":null,"abstract":"This article reviews Corpus linguistics for writing development","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":"32 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141529520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Review of Dunn (2022): Natural Language Processing for Corpus Linguistics","authors":"Hanna Schmück","doi":"10.1075/ijcl.00057.sch","DOIUrl":"https://doi.org/10.1075/ijcl.00057.sch","url":null,"abstract":"","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":"42 14","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138946500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Review of Viana (2022): Teaching English with Corpora: A Resource Book","authors":"P. Pérez-Paredes","doi":"10.1075/ijcl.00056.per","DOIUrl":"https://doi.org/10.1075/ijcl.00056.per","url":null,"abstract":"","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":"33 9","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138950411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big corporations are a leading contributor to global carbon emissions and their investment decisions have a significant impact on the world’s ability to tackle climate change. This study combines corpus and discourse approaches to examine how major corporate emitters have responded to the Paris Agreement, how they legitimize their practices amid mounting public pressure, and how companies operating in high- and middle-income countries differ in their framing of climate change. The results show that carbon majors place increasing focus on climate issues, widely support the goals of the Paris Agreement, and are increasingly making net-zero pledges. However, close inspection of linguistic patterns reveals a troubling disconnect between proclaimed goals, the solutions advocated for, and the radical steps needed to address the escalating climate crisis. Companies from middle-income countries devote comparatively less attention to climate change, which points to the need for better coordinated global efforts to address this problem.
{"title":"Framing the path to net zero","authors":"Matteo Fuoli, Annika Beelitz","doi":"10.1075/ijcl.22123.fuo","DOIUrl":"https://doi.org/10.1075/ijcl.22123.fuo","url":null,"abstract":"Big corporations are a leading contributor to global carbon emissions and their investment decisions have a significant impact on the world’s ability to tackle climate change. This study combines corpus and discourse approaches to examine how major corporate emitters have responded to the Paris Agreement, how they legitimize their practices amid mounting public pressure, and how companies operating in high- and middle-income countries differ in their framing of climate change. The results show that carbon majors place increasing focus on climate issues, widely support the goals of the Paris Agreement, and are increasingly making net-zero pledges. However, close inspection of linguistic patterns reveals a troubling disconnect between proclaimed goals, the solutions advocated for, and the radical steps needed to address the escalating climate crisis. Companies from middle-income countries devote comparatively less attention to climate change, which points to the need for better coordinated global efforts to address this problem.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":"31 10","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138592265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The present study is a corpus-based discourse analysis of the metaphorical framing of Covid-19 in American political discourse. Drawing on data from a corpus of the White House briefings and statements, the study investigates the corpus profile of war and virus and illustrates how the Coronavirus is primarily represented as an enemy to go to war with, rather than a public health crisis to control and mitigate. The study further situates the militaristic framing of Covid-19 within the theoretical framework of moral panic and examines the discursive features that ultimately bridge the metaphorical representation of the pandemic and the construction of moral panic. The study points to nuanced discourse strategies used in the White House press briefings that reconstruct the enemy and regroup the Coronavirus with other so-called enemies of the United States, such as the Communists, as well as the Islamic radicals and the Latin gangs and cartels.
{"title":"Political framing of Covid-19","authors":"Ariana N Mohammadi","doi":"10.1075/ijcl.22087.moh","DOIUrl":"https://doi.org/10.1075/ijcl.22087.moh","url":null,"abstract":"Abstract The present study is a corpus-based discourse analysis of the metaphorical framing of Covid-19 in American political discourse. Drawing on data from a corpus of the White House briefings and statements, the study investigates the corpus profile of war and virus and illustrates how the Coronavirus is primarily represented as an enemy to go to war with, rather than a public health crisis to control and mitigate. The study further situates the militaristic framing of Covid-19 within the theoretical framework of moral panic and examines the discursive features that ultimately bridge the metaphorical representation of the pandemic and the construction of moral panic. The study points to nuanced discourse strategies used in the White House press briefings that reconstruct the enemy and regroup the Coronavirus with other so-called enemies of the United States, such as the Communists, as well as the Islamic radicals and the Latin gangs and cartels.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":"11 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134991153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper introduces the Lannang Corpus (LanCorp), a public 375,000-word collection of raw and transcribed recordings of Lannang languages spoken in metropolitan Manila, which have been annotated with part-of-speech tags and linked to 40 types of sociolinguistic metadata. It begins with an overview of LanCorp (e.g. design, formats, accessibility). It then shows, with various examples, how the corpus can be used for variationist sociolinguistic research, using Lánnang-uè data as a case study. The findings from the exploratory studies indicate that Lannang languages are influenced by sociolinguistic factors, demonstrating the intricate nature of the Sino-Philippine sociolinguistic ecology. Due to its large size, sociolinguistic metadata, and various formats, LanCorp can be used to study Lannang languages in general and how they are used by specific social groups. It enables scholars to investigate multilingual interactions across a wide range of sociolinguistic factors, furthering the field of Sino-Philippine (socio)linguistics.
{"title":"Advancing Sino-Philippine linguistics and sociolinguistics using the Lannang Corpus (LanCorp)","authors":"Wilkinson Daniel Wong Gonzales","doi":"10.1075/ijcl.22096.gon","DOIUrl":"https://doi.org/10.1075/ijcl.22096.gon","url":null,"abstract":"Abstract This paper introduces the Lannang Corpus (LanCorp), a public 375,000-word collection of raw and transcribed recordings of Lannang languages spoken in metropolitan Manila, which have been annotated with part-of-speech tags and linked to 40 types of sociolinguistic metadata. It begins by providing an overview of the LanCorp (e.g. design, formats, accessibility). Then, it goes on to show various examples of how the corpus can be used for variationist sociolinguistic research, using Lánnang-uè data as a case study. The findings from the exploratory studies indicate that Lannang languages are influenced by sociolinguistic factors, demonstrating the intricate nature of the Sino-Philippine sociolinguistic ecology. Due to its large size, sociolinguistic metadata, and various formats, LanCorp can be used to study Lannang languages in general and how they are used by specific social groups. It enables scholars to investigate multilingual interactions in a wide range of sociolinguistic factors, furthering the field of Sino-Philippine (socio)linguistics.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":"157 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136381392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rare syntactic constructions show an especially strong tendency to be repeated, but some rare constructions exhibit this tendency much more strongly than others. The reasons for this variation are not well understood. This exploratory study examines five rare noun-phrase (NP) expansions in English: (the rich), (a Bob Gates), (architect Julia Morgan), (the jobs data), and (home electronic equipment). Repetition tendencies are very strong in the first and second of these and somewhat strong in the third; in the fourth and fifth they are much weaker, only slightly higher than those of common NP expansions such as (the black dog). To explain this variation, we suggest that constructions may be associated with different types of discourse: constructions with high repetition tendencies tend to occur in persuasive rather than informative discourse.
{"title":"The inverse frequency effect","authors":"David Temperley","doi":"10.1075/ijcl.22080.tem","DOIUrl":"https://doi.org/10.1075/ijcl.22080.tem","url":null,"abstract":"Rare syntactic constructions show an especially strong tendency to be repeated, but some rare constructions exhibit this tendency much more strongly than others. The reasons for this variation are not well understood. This exploratory study examines five rare noun-phrase (NP) expansions in English: (the rich), (a Bob Gates), (architect Julia Morgan), (the jobs data), and (home electronic equipment). Repetition tendencies are very strong in the first and second of these and somewhat strong in the third; in the fourth and fifth they are much weaker, only slightly higher than those of common NP expansions such as (the black dog). To explain this variation, we suggest that constructions may be associated with different types of discourse: constructions with high repetition tendencies tend to occur in persuasive rather than informative discourse.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135095995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a single-author case study which demonstrates that the statistical modelling technique change point analysis (CPA) can provide compelling evidence of prescriptive impact at an idiolectal level. It has been hypothesized that Late Modern English review periodicals consistently pushed a prescriptive agenda, and that this impacted language use (McIntosh, 1998; Percy, 2009). A lack of empirical research has, however, left these claims unsubstantiated, partly because evaluating prescriptivist endeavours has proven challenging. Using a purpose-built 3-million-token idiolectal corpus spanning seven decades, this paper reports that it is possible to discern a striking change in usage. Use of CPA enables this change to be located precisely and correlated with the author’s exposure to a prescriptive review of her work. In demonstrating how effectively CPA can provide a sophisticated correlation indicative of causality, this paper showcases the suitability of this technique to the study of prescriptivism.
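To make the idea of locating a change precisely more concrete, the following is a minimal sketch of one simple form of change point analysis: a least-squares search for a single shift in the mean of a series. The abstract does not say which CPA algorithm the study uses, so this is only an illustration of the general technique. The function name `single_change_point` and the numbers in `rates` are invented for the example and are not taken from the paper.

```python
def single_change_point(values):
    """Return the index k that best splits `values` into two segments,
    values[:k] and values[k:], each modelled by its own mean.

    "Best" means the split that minimises the total squared deviation of
    the observations from their segment means (a least-squares criterion).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")

    best_k, best_cost = None, float("inf")
    for k in range(1, n):  # every split leaves both segments non-empty
        left, right = values[:k], values[k:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        cost = (sum((v - mean_left) ** 2 for v in left)
                + sum((v - mean_right) ** 2 for v in right))
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Hypothetical usage rates of some variant per time slice (invented data).
rates = [9.8, 10.1, 10.3, 9.9, 4.2, 3.9, 4.1]
change_index = single_change_point(rates)
print(change_index)
```

On this toy series the estimated change falls between the fourth and fifth observations. Relating such a point to an external event, as the study does with a prescriptive review, is a separate interpretive step that the code does not perform.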
{"title":"Pinpointing prescriptive impact","authors":"Beth Malory","doi":"10.1075/ijcl.22001.mal","DOIUrl":"https://doi.org/10.1075/ijcl.22001.mal","url":null,"abstract":"Abstract This paper presents a single-author case study which demonstrates that the statistical modelling technique change point analysis (CPA) can provide compelling evidence of prescriptive impact at an idiolectal level. It has been hypothesized that Late Modern English review periodicals consistently pushed a prescriptive agenda, and that this impacted language use ( McIntosh, 1998 ; Percy, 2009 ). A lack of empirical research has, however, left these claims unsubstantiated, partly because evaluating prescriptivist endeavours has proven challenging. Using a purpose-built 3-million-token idiolectal corpus spanning 7 decades, this paper reports that it is possible to discern a striking change in usage. Use of CPA enables this change to be located precisely, and correlated to the author’s exposure to a prescriptive review of her work. In demonstrating how effectively CPA can provide a sophisticated correlation indicative of causality, this paper showcases the suitability of this technique to the study of prescriptivism.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134957679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper examines language used in five of the largest manosphere communities on Reddit (r/TheRedPill, r/braincels, r/MensRights, r/seduction, and r/MGTOW) to identify idiosyncratic language use within these communities. To do so, a novel methodology which combines key-key-word analysis with notions from set theory was used to identify and compare keywords between corpora and to find keywords that are used uniquely within – and thus are distinctive to – these five separate communities. The paper achieves the following: it (i) presents a novel method for identifying what we term ‘complement keywords’ (keywords that are not shared between multiple different corpora when compared against the same reference corpus), and (ii) explores idiosyncratic language use in five separate manosphere communities. The analysis first examines interdiscursive relationships between communities emerging from the complement keywords identified before discussing community-specific preoccupations emergent in the idiosyncratic language use found in these five communities.
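The set-theoretic step described above can be sketched as follows. The sketch assumes that keyword lists have already been computed for each corpus against one shared reference corpus, and it does not reproduce the key-key-word stage. The function name `complement_keywords`, the corpus labels, and the placeholder words are all hypothetical.

```python
def complement_keywords(keywords_by_corpus):
    """For each corpus, keep only the keywords found in no other corpus.

    `keywords_by_corpus` maps a corpus label to the set of keywords that
    corpus yielded against a shared reference corpus.
    """
    result = {}
    for name, keywords in keywords_by_corpus.items():
        # Union of the keyword sets of every *other* corpus.
        others = set().union(
            *(kws for other, kws in keywords_by_corpus.items() if other != name)
        )
        # Set difference: what is distinctive to this corpus.
        result[name] = set(keywords) - others
    return result


# Placeholder keyword sets (invented for illustration).
keywords = {
    "corpus_a": {"alpha", "beta", "gamma"},
    "corpus_b": {"beta", "delta"},
    "corpus_c": {"gamma", "delta", "epsilon"},
}
unique = complement_keywords(keywords)
print(unique)
```

Here `corpus_b` ends up with no complement keywords because each of its keywords also appears in another corpus, while `corpus_a` and `corpus_c` each retain one.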
{"title":"Keywords of the manosphere","authors":"M. McGlashan, A. Krendel","doi":"10.1075/ijcl.22053.mcg","DOIUrl":"https://doi.org/10.1075/ijcl.22053.mcg","url":null,"abstract":"This paper examines language used in five of the largest manosphere communities on Reddit (r/TheRedPill, r/braincels, r/MensRights, r/seduction, and r/MGTOW) to identify idiosyncratic language use within these communities. To do so, a novel methodology which combines key-key-word analysis with notions from set theory was used to identify and compare keywords between corpora and to find keywords that are used uniquely within – and thus are distinctive to – these five separate communities. The paper achieves the following: it (i) presents a novel method for identifying what we term ‘complement keywords’ (keywords that are not shared between multiple different corpora when compared against the same reference corpus), and (ii) explores idiosyncratic language use in five separate manosphere communities. The analysis first examines interdiscursive relationships between communities emerging from the complement keywords identified before discussing community-specific preoccupations emergent in the idiosyncratic language use found in these five communities.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":" ","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43241876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this study, we propose a new evaluation scheme to assess the strengths and limitations of collocation extraction measures and explore type-sensitive methods for extracting collocations. We adopt the pooling strategy widely used in Information Retrieval and automate the evaluation process using online dictionaries. Sixteen well-known metrics are evaluated for effectiveness and then compared distributionally and linguistically. The results show that Group A methods (e.g. z-score, Dice, PMI) are more effective in extracting low-frequency collocations at relatively small extraction scales. In contrast, Group B methods (e.g. t-test, LMI, LLR) perform better at finding high-frequency collocations, and most of them outperform Group A methods as the extraction scale increases. Moreover, Group A prefers NN collocations, while Group B identifies collocations with a wide range of syntactic structures. This study offers suggestions for future work on hybrid extraction methods, as well as for language educators and dictionary compilers.
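The contrast between the two groups can be illustrated with common textbook formulations of six of the measures named above, computed from a 2×2 contingency table. This is a sketch under stated assumptions: the abstract does not give the exact formulas or corpus settings used in the study, and the function name `association_scores` and all frequencies below are invented for the example.

```python
import math


def association_scores(o11, f1, f2, n):
    """Score one word pair with several association measures.

    o11: observed co-occurrence frequency of the pair
    f1, f2: overall frequencies of the first and second word
    n: total number of pair tokens in the corpus
    """
    if o11 <= 0:
        raise ValueError("the pair must co-occur at least once")

    # Observed cells of the 2x2 contingency table.
    o12, o21 = f1 - o11, f2 - o11
    o22 = n - f1 - f2 + o11

    # Expected cell frequencies under independence.
    e11 = f1 * f2 / n
    e12 = f1 * (n - f2) / n
    e21 = (n - f1) * f2 / n
    e22 = (n - f1) * (n - f2) / n

    def llr_term(observed, expected):
        # An empty cell contributes nothing to the log-likelihood ratio.
        return observed * math.log(observed / expected) if observed > 0 else 0.0

    pmi = math.log2(o11 / e11)
    return {
        "pmi": pmi,
        "dice": 2 * o11 / (f1 + f2),
        "z_score": (o11 - e11) / math.sqrt(e11),
        "t_score": (o11 - e11) / math.sqrt(o11),
        "lmi": o11 * pmi,
        "llr": 2 * (llr_term(o11, e11) + llr_term(o12, e12)
                    + llr_term(o21, e21) + llr_term(o22, e22)),
    }


# Invented counts: a rare pair whose words only occur together,
# and a frequent pair made of common words.
rare = association_scores(o11=5, f1=5, f2=5, n=10_000)
frequent = association_scores(o11=200, f1=2_000, f2=1_000, n=100_000)
print(rare["pmi"], frequent["pmi"])
print(rare["t_score"], frequent["t_score"])
```

With these invented counts, PMI and Dice score the rare pair higher, whereas the t-score favours the frequent pair, which is consistent with the frequency sensitivity reported for the two groups.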
{"title":"Association measures for collocation extraction","authors":"Qi Su, Chen Gu, Pengyuan Liu","doi":"10.1075/ijcl.21056.su","DOIUrl":"https://doi.org/10.1075/ijcl.21056.su","url":null,"abstract":"In this study, we propose a new evaluation scheme to assess the strengths and limitations of collocation extraction measures and explore type-sensitive methods for extracting collocations. We introduced the pooling strategy widely used in Information Retrieval and automated the evaluation process using online dictionaries. Sixteen well-known metrics are evaluated based on their effectiveness and then distributional and linguistic compared. The results show that Group A methods (e.g. z-score, Dice, PMI) are more effective in extracting low-frequency collocations with relatively small extraction scales. In contrast, Group B methods (e.g. t-test, LMI, LLR) perform better at finding high-frequency collocations, most of which outperform Group A methods as the extraction scale increases. Moreover, Group A prefers NN collocations, while Group B identifies collocations with a wide range of syntactic structures. This study provides suggestions for studies to identify hybrid extraction methods as well as for language educators and dictionary compilers.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":" ","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44703465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}