Over the past decade, learner corpora have gained recognition as valuable data sources in Second Language Acquisition (SLA) research. This development can be attributed to significant progress in Learner Corpus Research (LCR). However, there is still substantial work to be done. This article highlights key issues essential for sustaining the relevance of learner corpora in SLA. More particularly, I focus on the need for more diverse types of learner corpora, stress the importance of detailed metadata, and advocate for multifactorial study designs. I then revisit ongoing debates regarding the role of the native speaker in LCR and propose a practical solution to address this thorny issue. Finally, I also readdress the need for improvement in the quantitative methods and statistics, arguing that the importance of robust quantitative analysis cannot be overstated. In conclusion, I envision an ambitious learner corpus compilation project that adheres to the FAIR principles, with the goal of further elevating study quality in LCR.
Magali Paquot. "Learner corpus research: a critical appraisal and roadmap for contributing (more) to SLA research agendas." Corpus Linguistics and Linguistic Theory. Published 2024-05-13. https://doi.org/10.1515/cllt-2024-0014
Corpus linguistics, with its methodological orientation towards the empirical analysis of language based on large text collections, has the potential to offer significant tools for addressing real-world problems across various social science domains, including climate change, criminology, healthcare and policy making. Despite this potential, the integration of corpus linguistics into social science disciplines (beyond linguistics) remains hampered by fundamental differences in epistemology, definitions and methodological approaches. This article explores the relationship between corpus linguistics and the social sciences. It is argued that epistemology, or the theory of knowledge, represents a primary barrier to integration, with much corpus linguistics research aligning with positivist and naturalist epistemologies. By contrast, many social science disciplines embrace more interpretive, conventionalist approaches that account for the dynamic nature of social phenomena. Considering the role of naturalism and conventionalism within both corpus linguistics and the social sciences, this article illustrates how these epistemological stances are likely to influence the acceptance and use of corpus methods in social science research. Despite the challenges, areas of convergence (e.g. shared use of data processing tools and the acknowledgement of the central role of language in social processes) provide opportunities for cross-disciplinary collaboration. As a means to bridge the epistemological divide, this article advocates for a critical realist approach and concludes by calling on users of corpus linguistic methods to be reflexive and transparent about their epistemological stances when reporting their research.
Tony McEnery, Gavin Brookes. "Corpus linguistics and the social sciences." Corpus Linguistics and Linguistic Theory. Published 2024-04-25. https://doi.org/10.1515/cllt-2024-0036
John Gamboa, Kristina Braun, Juhani Järvikivi, Shanley E. M. Allen
Nominal compounds are a structure commonly used in scientific texts. Despite their commonality, very little is known about how they are distributed in scientific articles. Based on the Uniform Information Density hypothesis, which states that speakers communicate information at a constant rate, avoiding peaks and troughs of information transmission, we predict that nominal compounds should cluster toward the end of scientific texts, be preceded by supporting text that facilitates their understanding, and be repeated often after their first use. In this paper, we examine these predictions through a quantitative and a qualitative analysis of a corpus of scientific papers from the fields of Biology, Economics and Linguistics. While our investigation did not reveal definitive findings for the first and third predictions above, it did produce supporting evidence in favor of our second prediction, thus advancing our understanding of nominal compound (NC) use and the choices speakers make when transmitting information.
John Gamboa, Kristina Braun, Juhani Järvikivi, Shanley E. M. Allen. "The distributional properties of long nominal compounds in scientific articles: an investigation based on the uniform information density hypothesis." Corpus Linguistics and Linguistic Theory. Published 2024-04-16. https://doi.org/10.1515/cllt-2023-0028
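The Uniform Information Density hypothesis referred to in the abstract above is usually stated in terms of surprisal, the negative log probability of a unit. The sketch below is only an illustration of that idea and not the method used in the article: it assumes a crude unigram estimate computed from the text itself, whereas work on information density normally conditions each word on its context with a language model. The function names are hypothetical.

```python
import math
from collections import Counter


def unigram_surprisal(tokens):
    """Surprisal in bits of each token, using relative frequency in `tokens`.

    Assumption: unigram probabilities estimated from the same text.
    This ignores context and is only meant to illustrate the formula
    surprisal(w) = -log2 P(w).
    """
    counts = Counter(tokens)
    total = len(tokens)
    return [-math.log2(counts[tok] / total) for tok in tokens]


def surprisal_variance(tokens):
    """Population variance of per-token surprisal.

    A lower value means information is spread more evenly over the
    tokens, which is the pattern the hypothesis predicts.
    """
    values = unigram_surprisal(tokens)
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


if __name__ == "__main__":
    # Toy token list, invented for illustration only.
    sample = ["a", "a", "b", "c"]
    print(unigram_surprisal(sample))
    print(surprisal_variance(sample))
```

A positional analysis like the one the abstract describes would go further and compare such values across document sections, which this sketch does not attempt.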
Monika Bednarek, Martin Schweinberger, Kelvin K. H. Lee
Recent years have seen an increase in data and method reflection in corpus-based discourse analysis. In this article, we first take stock of some of the issues arising from such reflection (covering concepts such as triangulation, objectivity/subjectivity, replication, transparency, reflexivity, consistency). We then introduce a new ‘accountability’ framework for use in corpus-based discourse analysis (and perhaps beyond). We conceptualise such accountability as a multi-faceted phenomenon, covering various aspects of the research process. In the second part of this article, we then link this framework to a new cross-institutional initiative – the Australian Text Analytics Platform (ATAP) – which aims to address a small part of the framework, namely the transparency of analyses through Jupyter notebooks. We introduce the Quotation Tool as an example ATAP notebook of particular relevance to corpus-based discourse analysis. We reflect on how this notebook fosters accountability in relation to transparency of analysis and illustrate key applications using a set of different corpora.
Monika Bednarek, Martin Schweinberger, Kelvin K. H. Lee. "Corpus-based discourse analysis: from meta-reflection to accountability." Corpus Linguistics and Linguistic Theory. Published 2024-04-16. https://doi.org/10.1515/cllt-2023-0104
Following Ariel (2021. Why it’s hard to construct ad hoc number concepts. In Caterina Mauri, Ilaria Fiorentini, & Eugenio Goria (eds.), Building categories in interaction: Linguistic resources at work, 439–462. Amsterdam: John Benjamins), we argue that number words manifest distinct distributional patterns from open-class lexical items. When modified, open-class words typically take selectors (as in kinda table), which select a subset of their potential denotations (e.g., “nonprototypical table”). They are typically not modified by loosening operators (e.g., approximately), since even if bare, typical lexemes can broaden their interpretation (e.g., table referring to a rock used as a table). Number words, on the other hand, have a single, precise meaning and denotation and cannot take a selector, which would need to select a subset of their (single) denotation (??kinda seven). However, they are often overtly broadened (approximately seven), creating a range of values around N. First, we extend Ariel’s empirical examination to the larger COCA and to Hebrew (HeTenTen). Second, we propose that open-class and number words belong to sparse versus dense lexical domains, respectively, because the former exhibit prototypicality effects, but the latter do not. Third, we further support the contrast between sparse and dense domains by reference to: synchronic word2vec models of sparse and dense lexemes, which testify to their differential distributions, numeral use in noncounting communities, and different renewal rates for the two lexical types.
Mira Ariel, Natalia Levshina. "The counting principle makes number words unique." Corpus Linguistics and Linguistic Theory. Published 2024-03-29. https://doi.org/10.1515/cllt-2023-0105
Japanese features a general noun-modifying clause construction (NMCC) with a more versatile range of semantic and pragmatic interpretations than equivalent constructions in other languages. Motivated by the learning challenge NMCCs pose to Japanese as a foreign language (JFL) learners, this article examines speech data from the International Corpus of Japanese as a Second Language (I-JAS) to compare learner use of NMCCs against a large L1 Japanese corpus. Instances of the construction from both corpora were analyzed to identify high-frequency part-of-speech categories and subcategories in the modifying clause predicate and head noun slots. A simple collexeme analysis was then employed to identify strongly attracted and repelled lexical items among those identified in realizations of the construction. Taken together, findings from these analyses revealed an important connection between the semantic weight of head nouns in NMCCs and the idiomaticity of the construction, with learner productions demonstrating a tendency toward heavy head nouns. This study lays the groundwork for future research seeking to explore the NMCC at different levels of granularity and to improve its treatment in JFL pedagogical materials.
Nicole C. De Los Reyes, Ute Römer-Barron. "A collostructional approach to Japanese noun-modifying clause construction use and acquisition: a learner corpus study." Corpus Linguistics and Linguistic Theory. Published 2024-03-22. https://doi.org/10.1515/cllt-2024-0020
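The simple collexeme analysis mentioned in the abstract above rests on a 2×2 contingency table: a word's frequency inside the construction slot versus elsewhere, set against all other words. The sketch below shows one way to score such a table. It is an assumption-laden illustration, not the article's procedure: it uses the log-likelihood ratio (G²) as the association measure, whereas collostructional studies often report the Fisher-Yates exact test instead, and the function name and all counts are hypothetical.

```python
import math


def collexeme_g2(word_in_cxn, word_total, cxn_total, corpus_total):
    """Score one word against one construction slot.

    Returns (g2, direction), where direction is "attracted" if the word
    occurs in the slot more often than expected under independence,
    "repelled" if less often, and "neutral" if exactly as expected.
    Assumption: G2 is used here in place of an exact test.
    """
    a = word_in_cxn                                    # word in construction
    b = word_total - word_in_cxn                       # word elsewhere
    c = cxn_total - word_in_cxn                        # other words in construction
    d = corpus_total - word_total - cxn_total + word_in_cxn  # everything else
    observed = [a, b, c, d]

    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    n = corpus_total
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]

    # Cells with an observed count of zero contribute nothing to G2.
    g2 = 2 * sum(o * math.log(o / e)
                 for o, e in zip(observed, expected) if o > 0)

    if a > expected[0]:
        direction = "attracted"
    elif a < expected[0]:
        direction = "repelled"
    else:
        direction = "neutral"
    return g2, direction


if __name__ == "__main__":
    # Hypothetical counts: a head noun seen 30 times in the slot,
    # 100 times overall, with 1,000 slot tokens in a 100,000-token corpus.
    print(collexeme_g2(30, 100, 1000, 100000))
```

Ranking every slot filler by such a score, separately for attracted and repelled items, gives the kind of collexeme list the abstract refers to. With small counts an exact test is generally the safer choice.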
This paper aims to give an overview of corpus-based research that investigates processes of language change from the theoretical perspective of Construction Grammar. Starting in the early 2000s, a dynamic community of researchers has come together in order to contribute to this effort. Among the different lines of work that have characterized this enterprise, this paper discusses the respective roles of qualitative approaches, diachronic collostructional analysis, multivariate techniques, distributional semantic models, and analyses of network structure. The paper tries to contextualize these approaches and to offer pointers for future research.
Martin Hilpert. "Corpus linguistics meets historical linguistics and construction grammar: how far have we come, and where do we go from here?" Corpus Linguistics and Linguistic Theory. Published 2024-03-22. https://doi.org/10.1515/cllt-2024-0009
In an attempt to identify possible cases of collostructional transfer in the use of the causative construction [X make Y Vinf] by French-speaking learners of English, two types of analyses are combined in this study. First, a contrastive collostructional analysis compares the verbs occurring in the [Vinf] slot of the English construction and its French equivalent, [X faire Vinf Y]. Second, a contrastive interlanguage collostructional analysis compares the verbs used in the [Vinf] slot of [X make Y Vinf] by native speakers of English, French-speaking learners of English and learners of English from other mother tongue backgrounds. The aim is to identify verbs that are more distinctive of [X faire Vinf Y] than of [X make Y Vinf] and that are also more likely to be used by French-speaking learners of English than by other populations, as these verbs could be potential cases of collostructional preferences transferred by learners from French to English. The results suggest that learners might transfer verbs expressing a change of state or location and some individual verbs like discover from the French to the English causative construction. Their dispreference for copular verbs (other than be) could also be the result of transfer effects.
Gaëtanelle Gilquin. "Transfer of collostructions: the case of causative constructions." Corpus Linguistics and Linguistic Theory. Published 2024-03-19. https://doi.org/10.1515/cllt-2024-0023
In undertaking any collostructional analysis, a researcher must make decisions concerning the properties of words, constructions, and corpora. Each of these crucial aspects of the analysis can be dealt with in alternative ways: words can be investigated as either lemmas or inflected forms; a construction can be characterized in alternative ways (reliance on semantics or syntax or some combination thereof, the span of the construction, etc.); the choice of corpus (or corpora) will be influenced by whether a researcher has an interest in different genres and varieties, whether the study is synchronic or diachronic, etc. I review various ways in which a researcher’s decisions about words, constructions, and corpora are relevant to a corpus-based study of N waiting to happen, referencing throughout the collostructional analysis of this construction by Stefanowitsch and Gries. The approach adopted here can be seen as supplementing Stefanowitsch and Gries’ original collostructional analysis. It illustrates how multifarious the results of a corpus-based study of constructions can be and serves as a reminder that no one corpus-based measure can possibly answer all the questions linguists might reasonably ask about a construction.
John Newman. "Revisiting N waiting to happen: word, construction, and corpus choices in a collostructional analysis." Corpus Linguistics and Linguistic Theory. Published 2024-03-11. https://doi.org/10.1515/cllt-2024-0019
This article presents a corpus-based study of the go (a)round Ving- and go (a)round and V-constructions in American English. More specifically, it addresses the possibility of the constructions serving as pragmatic markers of stance through the collocational phenomenon of semantic prosody. It is argued that the notions of internal and external constructional properties from the early days of construction grammar as well as the corpus-linguistic idea of association patterns would be beneficial to usage-based construction grammatical descriptions of phenomena such as semantic prosody. Drawing on a 248,145,425-word portion of the Corpus of Contemporary American English, both simple collexeme analysis and distinctive collexeme analysis are applied to generate output that feeds into semantic-prosodic analysis. Moreover, standard distinctive collexeme analysis and multiple distinctive collexeme analysis are applied at the level of semantic prosodies in the collexemic fields (i.e., distinctive semantic-prosodic analysis), at the level of verbal category colligations (i.e., distinctive colligational analysis), and at the level of speech act functions of usage-events of the two constructions (i.e., distinctive speech act analysis) as a type of trial balloon. The purpose is to expand semantic-prosodic analysis from focusing merely on lexemes to exploring how other linguistic and pragmatic phenomena may be at play.
Kim Ebensgaard Jensen. "Well, maybe you shouldn’t go around shaving poodles: collostructional semantic and discursive prosody in the go (a)round Ving and go (a)round and V constructions." Corpus Linguistics and Linguistic Theory. Published 2024-03-07. https://doi.org/10.1515/cllt-2024-0018