Abstract Bridging relations are used when the identity of a discourse-new entity can be inferred via lexical relations from an antecedent (e.g. a cake … the slice) or non-lexically via reference to world knowledge or discourse structure (e.g. a war … the survivors). Such relations are marked in English via the definite article, which is considered a difficult feature of the English language for L2 learners to acquire, particularly for L1 speakers of article-less languages. This paper provides an Integrated Contrastive Model (e.g. Granger 1996) of the L1 and L2 production of definite article bridging relations using L2 English learner corpus data produced by native Mandarin and Korean speakers at four L2 proficiency levels, alongside comparative native English data. The data is taken from the International Corpus Network of Asian Learners of English (ICNALE, Ishikawa 2011, 2013), totalling just under 400,000 words with over 1,500 bridging NPs identified. Results suggest subtle but significant differences between L1-L2 and L2-L2 groupings in terms of the frequency of particular bridging relation types and lemmatised wordings identified in the data, although there was little evidence of pseudo-longitudinal development. Such differences may suggest an effect of L1-L2 linguistic relativity, influencing the selection of relational links between given/new discourse entities during L2 production.
{"title":"Definite article bridging relations in L2: A learner corpus study","authors":"P. Crosthwaite","doi":"10.1515/cllt-2015-0058","DOIUrl":"https://doi.org/10.1515/cllt-2015-0058","journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"15 1","pages":"297 - 319"},"PeriodicalIF":1.6,"publicationDate":"2019-10-25","publicationTypes":"Journal Article"}
Abstract This paper argues that Arabic function words (FWs) vary in usage between old and modern Arabic, thus prompting an experimental investigation into their changeability. This investigation is carried out by testing classical Arabic (CA) in Arabic heritage language (AHL) texts – those labeled as archistratum – and the modern standard Arabic (MSA) of Arabic newspaper texts (ANT), each group comprising 5 million (M) words of randomly collected texts. The linguistic theory of the grammar of Arabic FWs is explained through the differences between CA and MSA, despite Arabic FW changes and the unlearnability and/or unusability of some FW constructions between these two eras of Arabic usage. The dispersion/distribution of the construction grammar (CxG) of FWs and the number (n) of word attractions/repulsions between the two distinct eras are explored using current Arabic corpus processing tools and Sketch Engine’s SkeEn gramrels operators. The analysis of a 5 M word corpus from each era of Arabic serves to demonstrate the absence of a rigorous Arabic CxG. The approach in this study contrasts AHL with ANT by analyzing the frequency distributions of FWs, the co-occurrences of FWs within a span of 2n-gram collocational patterning, and some cases of FW usage changes in terms of lexical cognition (FW grammatical relationships). The results show that the frequencies of FWs, in addition to the case studies, are not the same, which implies that FWs and their associations with the main part-of-speech classes in a fusional language like Arabic have changed grammatically in MSA. These constructional changes are neglected in Arabic grammar.
{"title":"Grammatical construction of function words between old and modern written Arabic: A corpus-based analysis","authors":"Sultan Almujaiwel","doi":"10.1515/cllt-2016-0069","DOIUrl":"https://doi.org/10.1515/cllt-2016-0069","journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"15 1","pages":"267 - 296"},"PeriodicalIF":1.6,"publicationDate":"2019-10-25","publicationTypes":"Journal Article"}
Abstract This paper discusses the debatable hypotheses of “Translation Universals”, i.e. the recurring common features of translated texts in relation to original utterances. We propose that, if translational language does have some distinctive linguistic features in contrast to non-translated writings in the same language, those differences should be statistically significant, consistently distributed and systematically co-occurring across registers and genres. Based on the balanced Corpus of Translational English (COTE) and its non-translated English counterpart, the Freiburg-LOB corpus of British English (FLOB), and by deploying a multi-feature statistical analysis on 96 lexical, syntactic and textual features, we try to pinpoint those distinctive features in translated English texts. We also propose that the stylo-statistical model developed in this study will not only be effective in analysing the translational variation of English but also be capable of clustering those variational features into a “translational” dimension, which will facilitate a crosslinguistic comparison of translational languages (e.g. translational Chinese) to test the Translation Universals hypotheses.
{"title":"How do English translations differ from non-translated English writings? A multi-feature statistical model for linguistic variation analysis","authors":"Xianyao Hu, R. Xiao, A. Hardie","doi":"10.1515/cllt-2014-0047","DOIUrl":"https://doi.org/10.1515/cllt-2014-0047","journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"15 1","pages":"347 - 382"},"PeriodicalIF":1.6,"publicationDate":"2019-10-01","publicationTypes":"Journal Article"}
Pub Date: 2019-09-27. DOI: 10.1515/cllt-2019-frontmatter2
{"title":"Frontmatter","authors":"","doi":"10.1515/cllt-2019-frontmatter2","DOIUrl":"https://doi.org/10.1515/cllt-2019-frontmatter2","journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":" ","pages":""},"PeriodicalIF":1.6,"publicationDate":"2019-09-27","publicationTypes":"Journal Article"}
Abstract This study sheds light on the vocabulary complexity of various physics genres and how it affects reading and listening comprehension of the science of physics. We analysed the vocabulary frequency profile of seven physics genres: research articles, textbooks, lectures, magazines, popular books, TV documentaries and TED talks, to determine the presence of general-purpose, academic and technical vocabulary in them, as well as their vocabulary level and variation. The main research question was whether the vocabulary level of these genres could pose an impediment to typical native and non-native speakers of English in terms of their reading/listening comprehension, and, in general, how accessible these genres are vocabulary-wise. The results suggest that typical native speakers will struggle reading physics research and magazine articles, whereas typical non-native speakers will not read/listen to any of the genres at an optimal level, but will be able to read/listen to four of them at an acceptable level.
{"title":"Vocabulary complexity and reading and listening comprehension of various physics genres","authors":"Milica Vuković Stamatović","doi":"10.1515/cllt-2019-0022","DOIUrl":"https://doi.org/10.1515/cllt-2019-0022","journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"16 1","pages":"487 - 514"},"PeriodicalIF":1.6,"publicationDate":"2019-09-27","publicationTypes":"Journal Article"}
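As a rough illustration of what a vocabulary frequency profile measures, the sketch below computes the cumulative share of running words in a text that is covered as successive word-list bands are added. The band names, the tiny word lists and the sample sentence are invented for the example; they are not the lists or data used in the physics-genre study, and the function name `coverage_profile` is hypothetical.

```python
def coverage_profile(tokens, bands):
    """Return (band name, cumulative coverage) pairs.

    tokens: list of lower-cased word tokens from one text.
    bands:  ordered list of (name, set_of_words), e.g. general,
            then academic, then technical vocabulary.
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    known = set()
    profile = []
    for name, words in bands:
        known |= set(words)  # each band adds to what the reader knows
        covered = sum(1 for t in tokens if t in known)
        profile.append((name, covered / len(tokens)))
    return profile


# Invented toy data (assumption): one sentence and three tiny word lists.
tokens = "the energy of the photon depends on the frequency of the wave".split()
bands = [
    ("general", {"the", "of", "on", "depends"}),
    ("academic", {"energy", "frequency"}),
    ("technical", {"photon", "wave"}),
]

profile = coverage_profile(tokens, bands)
for name, share in profile:
    print(f"{name:10s} {share:.1%}")
```

Comparing such profiles across genres shows how much of a text remains unknown to a reader who commands only the more frequent bands, which is the kind of accessibility question the abstract raises.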
Abstract This corpus-based study of pluralized non-count nouns (informations, advices, etc.) uses collocation-derived measures (determiners vs. bare noun and mass quantifiers) to extract potential candidates of non-count nouns in a bottom-up approach from the British National Corpus (BNC), allowing the detection of grammatical categories from distributional features. We then use this token list to retrieve data on pluralization of non-counts from nine annotated components of the International Corpus of English (ICE). While the distinction between count and non-count nouns is gradient rather than categorical, it is still possible to distinguish between standard and non-standard pluralization of non-counts. Qualitative analyses of our data show that non-standard pluralization of non-count nouns is regularly attested in second-language varieties, including previously unrecorded types; however, it is also occasionally found in first-language varieties. We discuss implications of our corpus results for common explanations of pluralized non-count nouns, such as substrate influence, language learning effects and historical input. By combining a bottom-up corpus-based approach with fine-grained qualitative analyses we can provide a more nuanced view of pluralization of non-counts across ENL and ESL for the investigation of World Englishes.
{"title":"Pluralized non-count nouns across Englishes: A corpus-linguistic approach to variety types","authors":"G. Schneider, M. Hundt, D. Schreier","doi":"10.1515/CLLT-2018-0068","DOIUrl":"https://doi.org/10.1515/CLLT-2018-0068","journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"16 1","pages":"515 - 546"},"PeriodicalIF":1.6,"publicationDate":"2019-07-24","publicationTypes":"Journal Article"}
Abstract This article describes a study of shell nouns (SNs) complemented by appositive that-clauses observed in a two-million-word corpus of media English by British and Chinese writers. The grammatical metaphor theory was applied to the data in the light of a novel proposal that the metaphorical forms of SN+that constructions, in their contextual semantic settings, serve to re-construe various transitivity processes. The study produced significant findings, including: (1) the two writer groups demonstrate significantly different preferences for SN types but the British and the Chinese uses are instantiated from a common core set; (2) the Chinese group prefers the re-construal of Identifying Relational processes of facts and evidence as markers of neutral and impersonal discourse; (3) British writers favour the re-construal of Verbal processes of assertion and stance and tend to re-construe Attributive Relational processes with varying degrees of commitment to the encapsulated propositional truth; (4) both groups are inclined towards the re-construal of Mental processes of cognition with a common preference for the re-construal of the experience of knowing, believing and thinking. The findings above lend important empirical support to systemic functional theories and suggest further research in the future regarding SNs as indicators of disparate construals in discourse.
{"title":"Shell nouns as grammatical metaphor revealing disparate construals: Investigating the differences between British English and China English based on a comparable corpus","authors":"Min Dong, A. Fang","doi":"10.1515/CLLT-2018-0047","DOIUrl":"https://doi.org/10.1515/CLLT-2018-0047","journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"17 1","pages":"743 - 779"},"PeriodicalIF":1.6,"publicationDate":"2019-06-14","publicationTypes":"Journal Article"}
Abstract We present a model of the linguistic development of scientific English from the mid-seventeenth to the late-nineteenth century, a period that witnessed significant political and social changes, including the evolution of modern science. There is a wealth of descriptive accounts of scientific English, both from a synchronic and a diachronic perspective, but only a few attempts at a unified explanation of its evolution. The explanation we offer here is a communicative one: while external pressures (specialization, diversification) push for an increase in expressivity, communicative concerns pull toward convergence on particular options (conventionalization). What emerges over time is a code which is optimized for written, specialist communication, relying on specific linguistic means to modulate information content. As we show, this is achieved by the systematic interplay between lexis and grammar. The corpora we employ are the Royal Society Corpus (RSC) and, for comparative purposes, the Corpus of Late Modern English (CLMET). We build various diachronic, computational n-gram language models of these corpora and then apply formal measures of information content (here: relative entropy and surprisal) to detect the linguistic features significantly contributing to diachronic change, estimate the (changing) level of information of features and capture the time course of change.
{"title":"Toward an optimal code for communication: The case of scientific English","authors":"Stefania Degaetano-Ortlieb, E. Teich","doi":"10.1515/CLLT-2018-0088","DOIUrl":"https://doi.org/10.1515/CLLT-2018-0088","journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"18 1","pages":"175 - 207"},"PeriodicalIF":1.6,"publicationDate":"2019-06-13","publicationTypes":"Journal Article"}
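The two information measures named in the scientific English abstract, surprisal and relative entropy, can be sketched on toy data as follows. This is a minimal illustration that assumes smoothed unigram models over two invented "periods". The study itself reports n-gram models of the RSC and CLMET, and its exact modelling and smoothing choices are not given in the abstract, so everything below (function names, the smoothing constant `alpha`, the sample sentences) is an assumption for illustration only.

```python
import math
from collections import Counter


def unigram_probs(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram probabilities over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def surprisal(p):
    """Surprisal in bits of an event with probability p."""
    return -math.log2(p)


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(P || Q) in bits; p and q share keys."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p)


# Invented toy "periods" (assumption): not taken from any corpus.
early = "i observed the air and i weighed the air".split()
late = "the oxidation of the compound yields the oxide".split()
vocab = set(early) | set(late)

p_early = unigram_probs(early, vocab)
p_late = unigram_probs(late, vocab)

# How many extra bits are needed to encode the later period
# with a model of the earlier one.
kl_late_vs_early = relative_entropy(p_late, p_early)

# Per-word contributions show which items drive the divergence.
contributions = {
    w: p_late[w] * math.log2(p_late[w] / p_early[w]) for w in vocab
}
top_word = max(contributions, key=contributions.get)

print(f"D(late || early) = {kl_late_vs_early:.3f} bits")
print(f"largest contributor: {top_word}")
print(f"surprisal of 'oxide' under the early model: "
      f"{surprisal(p_early['oxide']):.3f} bits")
```

Ranking features by their individual contribution to the divergence is one simple way to see which words distinguish one period from another; a word that is frequent later but rare earlier contributes most.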
Abstract This paper addresses creativity as inhibition of repetitive behaviour. We argue that entrenchment and constructional change can be in competition with large-scale creative attempts at recomposing constructions’ internal constituency. After undergoing chunking, the recurrent usage of a construction may be significantly counterbalanced by new attempts at entrenchment inhibition (viz. inhibition of entrenchment). These are cases where speakers opt for more compositional and less predictable ways to express a meaning similar to that of a conventionalised form. We focus on the constructionalisation of noun–participle compounds (e.g. snow-covered) in the Historical Corpus of American English. During the second part of the twentieth century, speakers increasingly inhibit the usage of conventionalised noun phrase–past participle forms in favour of more compositional strategies involving the same internal constituents. This entails that constructional change not only affects the meaning of the chunk that undergoes constructionalisation but also the way speakers creatively rediscover its internal constituency. These results additionally aim to inform research in cognitive architectures and artificial intelligence, where creativity is often merely considered as a problem-solving mechanism rather than a potential process of inhibition of automatised behaviour.
{"title":"Entrenchment inhibition: Constructional change and repetitive behaviour can be in competition with large-scale “recompositional” creativity","authors":"Vittorio Tantucci, Matteo Di Cristofaro","doi":"10.1515/CLLT-2019-0017","DOIUrl":"https://doi.org/10.1515/CLLT-2019-0017","journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"16 1","pages":"547 - 579"},"PeriodicalIF":1.6,"publicationDate":"2019-06-07","publicationTypes":"Journal Article"}
Abstract We report on the results of an annotation experiment comparing naïve and expert coders in a sense disambiguation task consisting in the assignment of function labels to discourse markers (e.g. well, but, I mean) in spoken French and English using a taxonomy specifically designed for speech. Our qualitative-quantitative assessment of its reliability led us to suggest fundamental revisions of the structure of the taxonomy, striving to find a better balance between reliability and granularity. The resulting model articulates two independent levels of annotation (domains and functions) which, once combined, provide a robust tool for the analysis of discourse markers and relate them to more general functions of spoken language.
{"title":"Reliability vs. granularity in discourse annotation: What is the trade-off?","authors":"Ludivine Crible, Liesbeth Degand","doi":"10.1515/cllt-2016-0046","DOIUrl":"https://doi.org/10.1515/cllt-2016-0046","journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"15 1","pages":"71 - 99"},"PeriodicalIF":1.6,"publicationDate":"2019-05-27","publicationTypes":"Journal Article"}
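The trade-off between reliability and granularity discussed in the annotation abstract can be made concrete with a chance-corrected agreement score. The sketch below uses Cohen's kappa, which is an assumption: the abstract does not say which agreement statistic the authors report. The fine-grained labels, the coarse mapping and the two coders' annotations are all invented for the example and do not reproduce the taxonomy under study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance from each coder's label distribution.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented labels (assumption): six fine-grained functions, two coarse classes.
COARSE = {
    "cause": "relational", "condition": "relational",
    "contrast": "relational", "concession": "relational",
    "hesitation": "management", "topic": "management",
}

coder_a = ["cause", "contrast", "hesitation", "topic",
           "cause", "contrast", "hesitation", "topic"]
coder_b = ["cause", "concession", "hesitation", "topic",
           "condition", "contrast", "hesitation", "topic"]

kappa_fine = cohen_kappa(coder_a, coder_b)
kappa_coarse = cohen_kappa([COARSE[x] for x in coder_a],
                           [COARSE[x] for x in coder_b])

print(f"fine-grained kappa: {kappa_fine:.3f}")
print(f"coarse kappa:       {kappa_coarse:.3f}")
```

In this toy case the coders disagree only between closely related fine-grained labels, so collapsing to the coarse classes raises agreement while discarding detail, which is the tension the abstract describes.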