In this paper, I use methods from corpus linguistics to examine patterns pertaining to the representation of women in online Arabic- and English-language political corpora. I highlight the discursi...
{"title":"Sketching women: a corpus-based approach to representations of women's agency in political Internet corpora in Arabic and English","authors":"K. Karimullah","doi":"10.3366/COR.2020.0184","DOIUrl":"https://doi.org/10.3366/COR.2020.0184","url":null,"abstract":"In this paper, I use methods from corpus linguistics to examine patterns pertaining to the representation of women in online Arabic- and English-language political corpora. I highlight the discursi...","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"15 1","pages":"21-53"},"PeriodicalIF":0.5,"publicationDate":"2020-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43224672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper illustrates the advantages of combining corpus linguistic methods and correspondence analysis when investigating sub-varieties within written languages that have codified variation. Thro...
{"title":"Writing practice in a society with codified variation: a correspondence analysis of writing practice in New Norwegian/Nynorsk","authors":"S. J. Helset","doi":"10.3366/cor.2020.0183","DOIUrl":"https://doi.org/10.3366/cor.2020.0183","url":null,"abstract":"This paper illustrates the advantages of combining corpus linguistic methods and correspondence analysis when investigating sub-varieties within written languages that have codified variation. Thro...","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"15 1","pages":"1-20"},"PeriodicalIF":0.5,"publicationDate":"2020-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48901570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents Lexical Explorer, a tool that allows interactive browsing and filtering of quantitative corpus information. It further describes how this tool can be used to support linguistic work on corpora of spoken German. By using Lexical Explorer, users can analyse quantitative corpus data by interacting with frequency tables and obtaining customised word profiles of word distribution across word form variation, co-occurrences and metadata. Interaction with corpus examples of particular corpus counts is also enabled. Lexical Explorer was developed as a prototype for user-specific corpus access and is aimed at researchers of German lexicon in spoken interaction. Although Lexical Explorer was developed on the basis of two small speech corpora of the German language, the underlying principle of this tool can be easily adapted to other corpora and other user groups. Moreover, the tool can be used to gain insights into the corpus structure as well as to study and verify corpus content in a transparent and user-friendly way.
{"title":"Lexical Explorer: extending access to the Database for Spoken German for user-specific purposes","authors":"Dolores Lemmenmeier-Batinić","doi":"10.3366/cor.2020.0185","DOIUrl":"https://doi.org/10.3366/cor.2020.0185","url":null,"abstract":"This paper presents Lexical Explorer, 2 a tool that allows interactive browsing and filtering of quantitative corpus information. It further describes how this tool can be used to support linguistic work on corpora of spoken German. By using Lexical Explorer, users can analyse quantitative corpus data by interacting with frequency tables and obtaining customised word profiles of word distribution across word form variation, co-occurrences and metadata. Interaction with corpus examples of particular corpus counts is also enabled. Lexical Explorer was developed as a prototype for user-specific corpus access and is aimed at researchers of German lexicon in spoken interaction. Although Lexical Explorer was developed on the basis of two small speech corpora of the German language, the underlying principle of this tool can be easily adapted to other corpora and other user groups. Moreover, the tool can be used to gain insights into the corpus structure as well as to study and verify corpus content in a transparent and user-friendly way.","PeriodicalId":44933,"journal":{"name":"Corpora","volume":" ","pages":""},"PeriodicalIF":0.5,"publicationDate":"2020-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48576756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper reports on the construction of the Sydney Corpus of Television Dialogue (SydTV). SydTV comprises approximately 275,000 words of dialogue from sixty-six episodes of recent US American fic...
{"title":"The Sydney Corpus of Television Dialogue: designing and building a corpus of dialogue from US TV series","authors":"M. Bednarek","doi":"10.3366/cor.2020.0187","DOIUrl":"https://doi.org/10.3366/cor.2020.0187","url":null,"abstract":"This paper reports on the construction of the Sydney Corpus of Television Dialogue (SydTV). SydTV comprises approximately 275,000-words of dialogue from sixty-six episodes of recent US American fic...","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"15 1","pages":"107-119"},"PeriodicalIF":0.5,"publicationDate":"2020-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49077112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Previous corpus-based research on the progressive (BE+V-ing) investigated it from a diachronic point of view or from the angle of World Englishes (WEs). However, factors such as its propensity to occur with animate subjects or its preference for dynamic verbs have not been studied in relation to the choice between progressive and simple aspect. As the progressive has been extended to stative verbs, we argue that a variationist study of the construction in WEs needs to take simple VPs into account systematically, too, and investigate whether there is interaction between predictor variables underlying the progressive:simple choice. We use a probabilistic grammar approach to study progressives in newspaper writing across a broad range of WEs. We apply a tree and forest analysis to gauge the relative strength of the predictor variables ‘regional variety’, ‘animacy’, ‘tense/modality’, ‘verb type’ and ‘voice’. Our results show that the core grammar for the progressive:simple choice is shared across all Englishes. The extension of progressives to stative verbs, in particular, does not result in statistically detectable effects. We argue that they nevertheless serve to give a very ‘local’ flavour to contact varieties as they are salient against the backdrop of the core grammar.
{"title":"Progressive or simple? A corpus-based study of aspect in World Englishes","authors":"M. Hundt, Paula Rautionaho, C. Strobl","doi":"10.3366/COR.2020.0186","DOIUrl":"https://doi.org/10.3366/COR.2020.0186","url":null,"abstract":"Previous corpus-based research on the progressive (BE+V-ing) investigated it from a diachronic point of view or from the angle of World Englishes (WEs). However, factors such as its propensity to occur with animate subjects or its preference for dynamic verbs have not been studied in relation to the choice between progressive and simple aspect. As the progressive has been extended to stative verbs, we argue that a variationist study of the construction in WEs needs to take simple VPs into account systematically, too, and investigate whether there is interaction between predictor variables underlying the progressive:simple choice. We use a probabilistic grammar approach to study progressives in newspaper writing across a broad range of WEs. We apply a tree and forest analysis to gauge the relative strength of the predictor variables ‘regional variety’, ‘animacy’, ‘tense/modality’, ‘verb type’ and ‘voice’. Our results show that the core grammar for the progressive:simple choice is shared across all Englishes. 
The extension of progressives to stative verbs, in particular, does not result in statistically detectable effects. We argue that they nevertheless serve to give a very ‘local’ flavour to contact varieties as they are salient against the backdrop of the core grammar.","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"15 1","pages":"77-106"},"PeriodicalIF":0.5,"publicationDate":"2020-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.3366/COR.2020.0186","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41904542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Many education professionals in Britain believe that school pupils have difficulty accessing academic texts because of inadequate knowledge of vocabulary. Previous research has suggested that some high frequency words used in non-specialised contexts have academic meanings that can cause problems for school pupils. We take corpus techniques used in the study of higher education texts and apply them to a corpus of texts designed for school pupils aged 11 to 14, attempting to identify such words automatically. We use the Spoken BNC2014 as a reference corpus. We identify a list of semi-technical words (Baker, 1988), many of which are polysemous, having everyday meanings and related school subject meanings that may not be familiar to pupils. We investigate how semi-technical vocabulary can be identified and distinguished from both specialised and general vocabulary. Some supplementary qualitative analysis was needed, using collocation and concordance analysis. While time-consuming, the potential benefits for pupils struggling with school language make this a worthwhile exercise.
{"title":"Using corpus methods to identify subject specific uses of polysemous words in English secondary school science materials","authors":"A. Deignan, Robbie Love","doi":"10.3366/cor.2021.0216","DOIUrl":"https://doi.org/10.3366/cor.2021.0216","url":null,"abstract":"Many education professionals in Britain believe that school pupils have difficulty accessing academic texts because of inadequate knowledge of vocabulary. Previous research has suggested that some high frequency words used in non-specialised contexts have academic meanings that can cause problems for school pupils. We take corpus techniques used in the study of higher education texts and apply them to a corpus of texts designed for school pupils aged 11 to 14, attempting to identify such words automatically. We use the Spoken BNC2014 as a reference corpus. We identify a list of semi-technical words ( Baker, 1988 ), many of which are polysemous, having everyday meanings and related school subject meanings that may not be familiar to pupils. We investigate how semi-technical vocabulary can be identified and distinguished from both specialised and general vocabulary. Some supplementary qualitative analysis was needed, using collocation and concordance analysis. While time-consuming, the potential benefits for pupils struggling with school language make this a worthwhile exercise.","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"16 1","pages":""},"PeriodicalIF":0.5,"publicationDate":"2019-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44022241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we use a corpus stylistic methodology to investigate whether serious (i.e., ‘literary’) fiction is syntactically more complex than popular (i.e., ‘genre’) fiction. This is on the basis of literary critical claims that the structural complexity of serious fiction is one of the features that distinguishes it from popular literature (which, by contrast, is seen as easier to read). We compare the serious and popular fiction sections of the Lancaster Speech, Writing and Thought Presentation corpus (see Semino and Short, 2004) against various samples of the British National Corpus available in Wmatrix (Rayson, 2009), focussing particularly (though not exclusively) on the identification of subordinating conjunctions. We find that, on this measure, there is no basis for claiming that serious fiction is any more complex syntactically than popular fiction. We then investigate the issue in relation to a specific genre of popular fiction, Chick Lit. Here we find that while syntactic simplicity exists, this is at a phrasal rather than a clausal level. We argue that by using a corpus stylistic approach we are able to qualify accurately certain literary critical claims about syntactic complexity as a distinguishing feature of serious and popular fiction, and to propose a refined hypothesis which might be used in further studies of the syntactic structures used in these two text types.
{"title":"Subordination as a potential marker of complexity in serious and popular fiction: a corpus stylistic approach to the testing of literary critical claims","authors":"Rocío Montoro, D. McIntyre","doi":"10.3366/COR.2019.0175","DOIUrl":"https://doi.org/10.3366/COR.2019.0175","url":null,"abstract":"In this paper, we use a corpus stylistic methodology to investigate whether serious (i.e., ‘literary’) fiction is syntactically more complex than popular (i.e., ‘genre’) fiction. This is on the basis of literary critical claims that the structural complexity of serious fiction is one of the features that distinguishes it from popular literature (which, by contrast, is seen as easier to read). We compare the serious and popular fiction sections of the Lancaster Speech, Writing and Thought Presentation corpus (see Semino and Short, 2004 ) against various samples of the British National Corpus available in Wmatrix ( Rayson, 2009 ), focussing particularly (though not exclusively) on the identification of subordinating conjunctions. We find that, on this measure, there is no basis for claiming that serious fiction is any more complex syntactically than popular fiction. We then investigate the issue in relation to a specific genre of popular fiction, Chick Lit. Here we find that while syntactic simplicity exists, this is at a phrasal rather than a clausal level. 
We argue that by using a corpus stylistic approach we are able to qualify accurately certain literary critical claims about syntactic complexity as a distinguishing feature of serious and popular fiction, and to propose a refined hypothesis which might be used in further studies of the syntactic structures used in these two text types.","PeriodicalId":44933,"journal":{"name":"Corpora","volume":" ","pages":""},"PeriodicalIF":0.5,"publicationDate":"2019-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42870275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}