Corpora: latest articles

Front matter
IF 0.5 Q1 Arts and Humanities Pub Date : 2021-08-01 DOI: 10.3366/cor.2021.0215
Citations: 0
The force dynamics of adjectival deontic modality in the mediatised register of the fatwa: a corpus cognitive–semantic analysis
IF 0.5 Q1 Arts and Humanities Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0207
A. Youssef
The present study offers new insights into how the cognitive-semantic analysis of adjectival deontic modality in the mediatized register of fatwa can be methodologically enhanced at both quantitative and qualitative levels. Drawing on the force-dynamics model originated by Talmy (1981, 1988) and developed by Sweetser (1990), the study investigates the adjectival modal expressions of obligation and permission in an electronic corpus of fatwas (353,293 words in 1,440 texts). The research data are processed with the corpus tool Wmatrix (Rayson, 2003) in order to calculate the relevant modal keywords and generate their concordances; further, an interactive register analysis of the tenor in the fatwa discourse is provided in a way that (i) facilitates the concordance reading of the adjectival keywords of deontic modality and (ii) examines the force dynamics underlying these adjectival keywords in terms of their modally interactive meanings. The study has reached three main findings. First, in the specialized corpus of fatwa there are five keywords of adjectival deontic modality: obligatory, obliged, permissible, impermissible, and forbidden. Second, the force dynamics of obligatory, obliged and permissible reveals an enacting positive-compulsion force with attitudinal variations of objective and subjective meanings towards real-world content (themes) and participants (questioner and questionee) in the mediatized register of fatwa. Third, complementary to the second, the force dynamics of impermissible and forbidden reveals a set of debarring negative-restriction barriers of various forms, viz. personal, collective, generic, and topical, in the same fatwa register.
Citations: 0
Review: Römer, Cortes and Friginal (eds). 2020. Advances in Corpus-Based Research on Academic Writing: Effects of Discipline, Register, and Writing Expertise. Amsterdam and Philadelphia: John Benjamins
IF 0.5 Q1 Arts and Humanities Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0212
Larissa Goulart
Citations: 0
Automatic coherence analysis of Dutch: testing the subjectivity hypothesis on a larger scale
IF 0.5 Q1 Arts and Humanities Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0211
J. Hoek, T. Sanders, W. Spooren
With the increasing availability of large corpora, quantitative corpus analysis is becoming more and more popular as a method for doing linguistic research. This paper uses a new research tool that makes it possible to search syntactically annotated corpora without extensive programming knowledge (CESAR) to study the subjectivity patterns of four Dutch causal connectives. Analyzing a large set of causal relations marked by four of the most frequent Dutch causal connectives (daarom, dus, omdat, and want), the case study aims to corroborate the subjectivity hypothesis established on the basis of smaller scale studies that used manual annotation. The automatic analysis of the subjectivity patterns of Dutch causal connectives illustrates the usability of CESAR in particular and the feasibility of automatic coherence analysis in general. In addition, it generates new insights into the subjectivity patterns of daarom, dus, omdat, and want.
Citations: 1
‘Eww wtf, what a dumb bitch’: a case study of similitudes inside gender-specific swearing patterns on Twitter
IF 0.5 Q1 Arts and Humanities Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0208
Michael Gauthier
Contrary to the idea which has been widespread for at least a hundred years that women differ substantially from men when they express themselves in English-speaking contexts (e.g., Jespersen, 1922; and Steadman, 1935), empirical studies have shown that these differences are often minimal and are not due to gender alone (e.g., Eckert, 2008; and Baker, 2014). This also frequently applies to the way they swear, despite certain preferences which have been documented in empirical studies. With the growing impact that social media now have in our everyday lives, these represent a unique opportunity to study vast quantities of written data. This paper is based on a corpus of about one million tweets and is an attempt to delve deeper into the analysis of gendered swearword habits. First, the goal is to show that even if there are certain gendered preferences in terms of the choice of swearwords, women and men frequently display similar patterns in using them, thus reinforcing the idea that they are not so linguistically different. Secondly, this paper provides insights into how collocational networks can be used to achieve this, and thus how focussing on differences can be one way to spot similarities across two sub-corpora.
Citations: 2
Capturing Herder: a three-step approach to the identification of language ideologies using corpus linguistics and critical discourse analysis
IF 0.5 Q1 Arts and Humanities Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0209
Adnan Ajšić
Recent lexical approaches to the identification of language ideologies focus on the application of quantitative corpus-linguistic techniques to large data sets as a way to minimise researcher inference and ensure more objective sampling methods, replicability of analytical procedures, and a higher degree of generalisability (Fitzsimmons-Doolan, 2014; Subtirelu, 2015; Vessey, 2017; Wright and Brooks, 2019; and McEntee-Atalianis and Vessey, 2020). Based on two comprehensive, specialised research (11.6 million words) and comparator (22.4 million words) newspaper corpora, this study offers an examination of the effectiveness of the multivariate and univariate statistical techniques, and proposes a three-step approach whereby corpus linguistics and critical discourse analysis are combined to identify (1) thematic and (2) ideological discourses (cf. ‘d’/‘D’ discourses; Gee, 2010), and (3) language ideologies. In contrast to recent contributions, it is argued that item frequency is not necessarily a reliable or effective indicator of language ideologies but, rather, of language-related discourses which can be examined for implicit and explicit language-ideological content. A combination of multivariate and univariate statistical techniques, and the three-step approach are shown to be a highly effective methodological solution for synchronic and diachronic language ideology and discourse research based on topically/discursively heterogeneous corpora.
Citations: 2
A comparative study of lexical bundles across paradigms and disciplines
IF 0.5 Q1 Arts and Humanities Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0210
Fenglong Cao
Research on lexical bundles has shed much light on disciplinary influences on the employment of these multi-word expressions in academic discourse, particularly in research articles. Little work, however, has been done on how research paradigms may impact on lexical bundles in academic discourse. This study aims to investigate the extent to which lexical bundles vary in quantitative, qualitative and mixed methods research articles across two disciplines. All four-word lexical bundles were extracted from a specially built corpus of research articles and were analysed for their linguistic structures and discourse functions. The data analyses revealed marked structural and functional variation between different research paradigms and disciplines. Across paradigms, the quantitative articles differed from the qualitative articles by employing significantly more verb phrase bundles and participant-orientated functions whereas the qualitative articles employed significantly more prepositional phrase bundles and text-orientated functions. Across disciplines, the mixed methods articles in education employed significantly more noun phrase bundles and research-orientated functions, whereas the mixed methods articles in psychology used more prepositional bundles and text-orientated functions. These paradigmatic and disciplinary differences in lexical bundles are explained by examining the underlying perceptions of knowledge and knowledge-making practices in different research paradigms and disciplines.
Citations: 0
Review: Tao. 2018. Russian–Chinese Parallel Corpus-based Research on Translational Texts about Humanities and Social Sciences. Beijing: Science Press
IF 0.5 Q1 Arts and Humanities Pub Date : 2021-04-01 DOI: 10.3366/COR.2021.0213
Zhanhao Jiang
Citations: 0
Operation Heron: latent topic changes in an abusive letter series
IF 0.5 Q1 Arts and Humanities Pub Date : 2021-03-26 DOI: 10.3366/cor.2022.0255
Lucia Busso, Márton Petykó, S. Atkins, Tim D. Grant
The paper presents a two-part forensic linguistic analysis of an historic collection of abuse letters, sent to individuals in the public eye and individuals’ private homes between 2007 and 2009. We employ the technique of structural topic modelling (STM) to identify distinctions in the core topics of the letters, gauging the value of this relatively under-used methodology in forensic linguistics. Four key topics were identified in the letters, ‘Politics A’ and ‘B’, ‘Healthcare’ and ‘Immigration’, and their coherence, correlation and shifts in topic were evaluated. Following the STM, a qualitative corpus linguistic analysis was undertaken, coding concordance lines according to topic, with the reliability between coders tested. This coding demonstrated that various connected statements within the same topic tend to gain or lose prevalence over time, and ultimately confirmed the consistency of content within the four topics identified through STM throughout the letter series. The discussion and conclusions to the paper reflect on the findings and also consider the utility of these methodologies for linguistics and forensic linguistics in particular. The study demonstrates real value in revisiting a forensic linguistic dataset such as this to test and develop methodologies for the field.
Citations: 0
Late Latin Charter Treebank: contents and annotation
IF 0.5 Q1 Arts and Humanities Pub Date : 2021-01-01 DOI: 10.3366/cor.2021.0217
Timo Korkiakangas
This paper describes the construction and annotation of the Late Latin Charter Treebank, a set of three dependency treebanks (LLCT1, LLCT2 and LLCT3) which together contain 1,261 Early Medieval Latin documentary texts (i.e., original charters) written in Italy between AD 714 and 1000 (about 594,000 tokens). The paper focusses on matters which a linguistically or philologically inclined user of LLCT needs to know: the criteria on which the charters were selected, the special characteristics of the annotation types utilised, and the geographical and chronological distribution of the data. In addition to normal queries on forms, lemmas, morphology and syntax, complex philological research settings are enabled by the textual annotation layer of LLCT, which indicates abbreviated and damaged words, as well as the formulaic and non-formulaic passages of each charter.
Citations: 5