{"title":"Review: Tao. 2018. Russian–Chinese Parallel Corpus-based Research on Translational Texts about Humanities and Social Sciences. Beijing: Science Press","authors":"Zhanhao Jiang","doi":"10.3366/COR.2021.0213","DOIUrl":"https://doi.org/10.3366/COR.2021.0213","url":null,"abstract":"","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"16 1","pages":"161-164"},"PeriodicalIF":0.5,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43556429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lucia Busso, Márton Petykó, S. Atkins, Tim D. Grant
The paper presents a two-part forensic linguistic analysis of an historic collection of abuse letters, sent to individuals in the public eye and individuals’ private homes between 2007 and 2009. We employ the technique of structural topic modelling (STM) to identify distinctions in the core topics of the letters, gauging the value of this relatively under-used methodology in forensic linguistics. Four key topics were identified in the letters, ‘Politics A’ and ‘B’, ‘Healthcare’ and ‘Immigration’, and their coherence, correlation and shifts in topic were evaluated. Following the STM, a qualitative corpus linguistic analysis was undertaken, coding concordance lines according to topic, with the reliability between coders tested. This coding demonstrated that various connected statements within the same topic tend to gain or lose prevalence over time, and ultimately confirmed the consistency of content within the four topics identified through STM throughout the letter series. The discussion and conclusions to the paper reflect on the findings and also consider the utility of these methodologies for linguistics and forensic linguistics in particular. The study demonstrates real value in revisiting a forensic linguistic dataset such as this to test and develop methodologies for the field.
{"title":"Operation Heron: latent topic changes in an abusive letter series","authors":"Lucia Busso, Márton Petykó, S. Atkins, Tim D. Grant","doi":"10.3366/cor.2022.0255","DOIUrl":"https://doi.org/10.3366/cor.2022.0255","url":null,"abstract":"The paper presents a two-part forensic linguistic analysis of an historic collection of abuse letters, sent to individuals in the public eye and individuals’ private homes between 2007 and 2009. We employ the technique of structural topic modelling (STM) to identify distinctions in the core topics of the letters, gauging the value of this relatively under-used methodology in forensic linguistics. Four key topics were identified in the letters, ‘Politics A’ and ‘B’, ‘Healthcare’ and ‘Immigration’, and their coherence, correlation and shifts in topic were evaluated. Following the STM, a qualitative corpus linguistic analysis was undertaken, coding concordance lines according to topic, with the reliability between coders tested. This coding demonstrated that various connected statements within the same topic tend to gain or lose prevalence over time, and ultimately confirmed the consistency of content within the four topics identified through STM throughout the letter series. The discussion and conclusions to the paper reflect on the findings and also consider the utility of these methodologies for linguistics and forensic linguistics in particular. The study demonstrates real value in revisiting a forensic linguistic dataset such as this to test and develop methodologies for the field.","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"1 1","pages":""},"PeriodicalIF":0.5,"publicationDate":"2021-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42546142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper describes the construction and annotation of the Late Latin Charter Treebank, a set of three dependency treebanks (LLCT1, LLCT2 and LLCT3) which together contain 1,261 Early Medieval Latin documentary texts (i.e., original charters) written in Italy between AD 714 and 1000 (about 594,000 tokens). The paper focusses on matters which a linguistically or philologically inclined user of LLCT needs to know: the criteria on which the charters were selected, the special characteristics of the annotation types utilised, and the geographical and chronological distribution of the data. In addition to normal queries on forms, lemmas, morphology and syntax, complex philological research settings are enabled by the textual annotation layer of LLCT, which indicates abbreviated and damaged words, as well as the formulaic and non-formulaic passages of each charter.
{"title":"Late Latin Charter Treebank: contents and annotation","authors":"Timo Korkiakangas","doi":"10.3366/cor.2021.0217","DOIUrl":"https://doi.org/10.3366/cor.2021.0217","url":null,"abstract":"This paper describes the construction and annotation of the Late Latin Charter Treebank, a set of three dependency treebanks (LLCT1, LLCT2 and LLCT3) which together contain 1,261 Early Medieval Latin documentary texts (i.e., original charters) written in Italy between AD 714 and 1000 (about 594,000 tokens). The paper focusses on matters which a linguistically or philologically inclined user of LLCT needs to know: the criteria on which the charters were selected, the special characteristics of the annotation types utilised, and the geographical and chronological distribution of the data. In addition to normal queries on forms, lemmas, morphology and syntax, complex philological research settings are enabled by the textual annotation layer of LLCT, which indicates abbreviated and damaged words, as well as the formulaic and non-formulaic passages of each charter.","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"28 1","pages":""},"PeriodicalIF":0.5,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"69516683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Patterning in frames (i.e., discontinuous word sequences with at least one variable slot) involves both syntagmatic co-occurrences and paradigmatic variations, and this has received considerable at...
{"title":"Exploring recurrent frames in written Chinese","authors":"Chan-Chia Hsu","doi":"10.3366/cor.2020.0201","DOIUrl":"https://doi.org/10.3366/cor.2020.0201","url":null,"abstract":"Patterning in frames (i.e., discontinuous word sequences with at least one variable slot) involves both syntagmatic co-occurrences and paradigmatic variations, and this has received considerable at...","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"15 1","pages":"291-315"},"PeriodicalIF":0.5,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46210201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper applies a quantitative model developed for measuring grammatical status, using data from the Lancaster Corpus of Mandarin Chinese (LCMC). The model takes into account four quantitative f...
{"title":"Measuring grammatical status in Chinese through quantitative corpus analysis","authors":"Linlin Sun, D. Saavedra","doi":"10.3366/cor.2020.0202","DOIUrl":"https://doi.org/10.3366/cor.2020.0202","url":null,"abstract":"This paper applies a quantitative model developed for measuring grammatical status, using data from the Lancaster Corpus of Mandarin Chinese (LCMC). The model takes into account four quantitative f...","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"15 1","pages":"317-342"},"PeriodicalIF":0.5,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45748115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Review: Doval and Sánchez Nieto. 2019. Parallel Corpora for Contrastive and Translation Studies: New Resources and Applications","authors":"Yi Li","doi":"10.3366/cor.2020.0204","DOIUrl":"https://doi.org/10.3366/cor.2020.0204","url":null,"abstract":"","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"15 1","pages":"355-358"},"PeriodicalIF":0.5,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43992963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Followed by a temporal noun, past can be synonymous with last, but not with non-deictically anchored previous (e.g., ‘I've not been feeling very well for the past/last/*previous few days’). Most di...
{"title":"The interaction of various temporal devices in the use of past followed by temporal nouns","authors":"I. Yoo","doi":"10.3366/cor.2020.0199","DOIUrl":"https://doi.org/10.3366/cor.2020.0199","url":null,"abstract":"Followed by a temporal noun, past can be synonymous with last, but not with non-deictically anchored previous (e.g., ‘I've not been feeling very well for the past/last/*previous few days’). Most di...","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"15 1","pages":"247-272"},"PeriodicalIF":0.5,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47185221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As stock trading became a popular topic on Twitter, many researchers have proposed different approaches to make predictions on it, relying on the emotions found in messages. However, detailed studi...
{"title":"Stock market tweets annotated with emotions","authors":"F. J. V. D. Silva, N. T. Roman, Ariadne Carvalho","doi":"10.3366/cor.2020.0203","DOIUrl":"https://doi.org/10.3366/cor.2020.0203","url":null,"abstract":"As stock trading became a popular topic on Twitter, many researchers have proposed different approaches to make predictions on it, relying on the emotions found in messages. However, detailed studi...","PeriodicalId":44933,"journal":{"name":"Corpora","volume":"15 1","pages":"343-354"},"PeriodicalIF":0.5,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48621145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
While the role of corpus linguistics (CL) in language teaching and learning continues to evolve, its use in the language teaching industry remains somewhat unclear. The specific ways in which ELT publishers use CL research to inform materials development are under-studied, meaning that it is not known whether CL is being used by publishers to its full potential. This study investigates the use of CL research by a major international ELT publisher by conducting research into recent change in adverbs in casual spoken British English and sharing the findings with editors from the publisher. Through our analysis, we find evidence of major recent changes in the use of frequent adverbs. Following the corpus analysis, we conducted in-depth interviews with the editors and a review of the materials they subsequently produced using the corpus findings. In so doing, we find some evidence of effective use of corpora in materials development but reveal limitations in current corpus research which prevent editors from employing CL research more effectively.
{"title":"Adverbs on the move: investigating publisher application of corpus research on recent language change to ELT coursebook development","authors":"Niall Curry, Robbie Love, Olivia Goodman","doi":"10.3366/cor.2022.0233","DOIUrl":"https://doi.org/10.3366/cor.2022.0233","url":null,"abstract":"While the role of corpus linguistics (CL) in language teaching and learning continues to evolve, its use in the language teaching industry remains somewhat unclear. The specific ways in which ELT publishers use CL research to inform materials development are under-studied, meaning that it is not known whether CL is being used by publishers to its full potential. This study investigates the use of CL research by a major international ELT publisher by conducting research into recent change in adverbs in casual spoken British English and sharing the findings with editors from the publisher. Through our analysis, we find evidence of major recent changes in the use of frequent adverbs. Following the corpus analysis, we conducted in-depth interviews with the editors and a review of the materials they subsequently produced using the corpus findings. In so doing, we find some evidence of effective use of corpora in materials development but reveal limitations in current corpus research which prevent editors from employing CL research more effectively.","PeriodicalId":44933,"journal":{"name":"Corpora","volume":" ","pages":""},"PeriodicalIF":0.5,"publicationDate":"2020-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48347239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}