
Corpora: Latest Articles

An evaluation of MorphInd's morphological annotation scheme for Indonesian
IF 0.5 Q3 LINGUISTICS Pub Date : 2021-08-19 DOI: 10.3366/cor.2021.0221
Prihantoro
MorphInd 2 (Larasati et al., 2011) is a state-of-the-art morphological analyser for Indonesian. To date, there has not been any comprehensive evaluation of the morphological annotation scheme which MorphInd implements. My evaluation of this annotation scheme reveals a number of significant drawbacks. Some analytical features encoded in MorphInd's tagset seem not to reflect features actually present in Indonesian morphology, while certain common features in the analysis of Indonesian are absent. Likewise, the Part of Speech (POS) hierarchy in the MorphInd tagset does not reflect the usual POS hierarchy used by Indonesian reference grammars. Moreover, the MorphInd output does not link morphological tags to the corresponding morpheme. Finally, a number of issues which might problematise text/corpus querying in the annotation's layout are observable, particularly relating to affixes, reduplication, and the affix–reduplication interface.
Citations: 1
Catchy and conversational? A register analysis of pop lyrics
IF 0.5 Q3 LINGUISTICS Pub Date : 2021-08-19 DOI: 10.3366/cor.2021.0219
Valentin Werner
This study presents a register analysis of pop lyrics. To this end, it applies multi-dimensional register analysis to empirically test claims regarding the allegedly conversational nature of pop lyrics. It thus follows broader calls for the linguistic exploration of performed language as represented in non-canonical pop culture registers. This text-linguistic investigation relies on a corpus of contemporary pop lyrics and uses the Multidimensional Analysis Tagger (Nini, 2018), software that replicates Biber's (1988) tagger, to identify register features to contrast lyrics with other varieties of text. In addition, the n-gram and keyword functionalities of a concordancer are used for establishing register markers and style features to identify characteristic properties of pop lyrics. In line with earlier claims, it becomes apparent that pop lyrics indeed carry some conversational force despite situational factors being indicative of planned and performed production. Furthermore, this analysis identifies additional features that are highly distinctive of pop lyrics (versus general conversation), and is suggestive of the special status of this register on the speech-writing continuum.
Citations: 6
Back matter
IF 0.5 Q3 LINGUISTICS Pub Date : 2021-08-01 DOI: 10.3366/cor.2021.0223
Citations: 0
Front matter
IF 0.5 Q3 LINGUISTICS Pub Date : 2021-08-01 DOI: 10.3366/cor.2021.0215
Citations: 0
The force dynamics of adjectival deontic modality in the mediatised register of the fatwa: a corpus cognitive–semantic analysis
IF 0.5 Q3 LINGUISTICS Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0207
A. Youssef
The present study offers new insights in how the cognitive-semantic analysis of adjectival deontic modality in the mediatized register of fatwa can be methodologically enhanced at both quantitative and qualitative levels. Drawing on the force-dynamics model originated by Talmy (1981, 1988) and developed by Sweetser (1990), the adjectivally modal expressions of obligation and permission have been investigated in an electronic corpus of fatwas (353,293 words falling in 1440 texts). The research data is manipulated by the corpus tool of Wmatrix (Rayson, 2003) with a view to calculating the relevant modal keywords and generating their concordances; further, the interactive register analysis of the tenor in the fatwa discourse is provided in a way that (i) facilitates the concordance reading of the adjectival keywords of deontic modality and (ii) examines the force dynamics underlying these adjectival keywords in terms of their modally interactive meanings. The study has reached three main findings. First, in the specialized corpus of fatwa there are five keywords of adjectival deontic modality: obligatory, obliged, permissible, impermissible, and forbidden. Second, the force dynamics of obligatory, obliged and permissible reveals enacting positive-compulsion force with attitudinal variations of objective and subjective meanings towards real-world content (themes) and participants (questioner and questionee) in the mediatized register of fatwa. Third, complementary to second, the force dynamics of impermissible and forbidden reveals a set of debarring negative-restriction barriers of various forms, viz. personal, collective, generic, and topical, in the same fatwa register.
Corpora, vol. 16, no. 1, pp. 1-30.
Citations: 0
Review: Römer, Cortes and Friginal (eds). 2020. Advances in Corpus-Based Research on Academic Writing: Effects of Discipline, Register, and Writing Expertise. Amsterdam and Philadelphia: John Benjamins
IF 0.5 Q3 LINGUISTICS Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0212
Larissa Goulart
Corpora, vol. 16, no. 1, pp. 157-159.
Citations: 0
Automatic coherence analysis of Dutch: testing the subjectivity hypothesis on a larger scale
IF 0.5 Q3 LINGUISTICS Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0211
J. Hoek, T. Sanders, W. Spooren
With the increasing availability of large corpora, quantitative corpus analysis is becoming more and more popular as a method for doing linguistic research. This paper uses a new research tool that makes it possible to search syntactically annotated corpora without extensive programming knowledge (CESAR) to study the subjectivity patterns of four Dutch causal connectives. Analyzing a large set of causal relations marked by four of the most frequent Dutch causal connectives (daarom, dus, omdat, and want), the case study aims to corroborate the subjectivity hypothesis established on the basis of smaller scale studies that used manual annotation. The automatic analysis of the subjectivity patterns of Dutch causal connectives illustrates the usability of CESAR in particular and the feasibility of automatic coherence analysis in general. In addition, it generates new insights into the subjectivity patterns of daarom, dus, omdat, and want.
Corpora, vol. 16, no. 1, pp. 129-155.
Citations: 1
Capturing Herder: a three-step approach to the identification of language ideologies using corpus linguistics and critical discourse analysis
IF 0.5 Q3 LINGUISTICS Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0209
Adnan Ajšić
Recent lexical approaches to the identification of language ideologies focus on the application of quantitative corpus-linguistic techniques to large data sets as a way to minimise researcher inference and ensure more objective sampling methods, replicability of analytical procedures, and a higher degree of generalisability (Fitzsimmons-Doolan, 2014; Subtirelu, 2015; Vessey, 2017; Wright and Brooks, 2019; and McEntee-Atalianis and Vessey, 2020). Based on two comprehensive, specialised research (11.6 million words) and comparator (22.4 million words) newspaper corpora, this study offers an examination of the effectiveness of the multivariate and univariate statistical techniques, and proposes a three-step approach whereby corpus linguistics and critical discourse analysis are combined to identify (1) thematic and (2) ideological discourses (cf. ‘d’/‘D’ discourses; Gee, 2010), and (3) language ideologies. In contrast to recent contributions, it is argued that item frequency is not necessarily a reliable or effective indicator of language ideologies but, rather, of language-related discourses which can be examined for implicit and explicit language-ideological content. A combination of multivariate and univariate statistical techniques, and the three-step approach are shown to be a highly effective methodological solution for synchronic and diachronic language ideology and discourse research based on topically/discursively heterogeneous corpora.
Corpora, vol. 16, no. 1, pp. 63-95.
Citations: 2
‘Eww wtf, what a dumb bitch’: a case study of similitudes inside gender-specific swearing patterns on Twitter
IF 0.5 Q3 LINGUISTICS Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0208
Michael Gauthier
Contrary to the idea which has been widespread for at least a hundred years that women differ substantially from men when they express themselves in English-speaking contexts (e.g., Jespersen, 1922; and Steadman, 1935), empirical studies have shown that these differences are often minimal and are not due to gender alone (e.g., Eckert, 2008; and Baker, 2014). This also frequently applies to the way they swear, despite certain preferences which have been documented in empirical studies. With the growing impact that social media now has in our everyday lives, these represent a unique opportunity to study vast quantities of written data. This paper is based on a corpus of about one-million tweets and is an attempt to delve deeper into the analysis of gendered swearword habits. First, the goal is to show that even if there are certain gendered preferences in terms of the choice of swearwords, women and men frequently display similar patterns in using them, thus reinforcing the idea that they are not so linguistically different. Secondly, this paper provides insights into how collocational networks can be used to achieve this, and thus how focussing on differences can be one way to spot similarities across two sub-corpora.
Corpora, vol. 16, no. 1, pp. 31-61.
Citations: 2
A comparative study of lexical bundles across paradigms and disciplines
IF 0.5 Q3 LINGUISTICS Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0210
Fenglong Cao
Research on lexical bundles has shed much light on disciplinary influences on the employment of these multi-word expressions in academic discourse, particularly in research articles. Little work, however, has been done on how research paradigms may impact on lexical bundles in academic discourse. This study aims to investigate the extent to which lexical bundles vary in quantitative, qualitative and mixed methods research articles across two disciplines. All four-word lexical bundles were extracted from a specially built corpus of research articles and were analysed for their linguistic structures and discourse functions. The data analyses revealed marked structural and functional variation between different research paradigms and disciplines. Across paradigms, the quantitative articles differed from the qualitative articles by employing significantly more verb phrase bundles and participant-orientated functions whereas the qualitative articles employed significantly more prepositional phrase bundles and text-orientated functions. Across disciplines, the mixed methods articles in education employed significantly more noun phrase bundles and research-orientated functions, whereas the mixed methods articles in psychology used more prepositional bundles and text-orientated functions. These paradigmatic and disciplinary differences in lexical bundles are explained by examining the underlying perceptions of knowledge and knowledge-making practices in different research paradigms and disciplines.
Corpora, vol. 16, no. 1, pp. 97-128.
Citations: 0