International Journal of Corpus Linguistics: Latest Articles

Review of Durrant (2023): Corpus linguistics for writing development
IF 1, Zone 2 (Literature), N/A LANGUAGE & LINGUISTICS, Pub Date: 2024-06-13, DOI: 10.1075/ijcl.00059.lim
Joyce Lim
This article reviews Durrant (2023), Corpus linguistics for writing development.
Citations: 0
Assessing the potential of LLM-assisted annotation for corpus-based pragmatics and discourse analysis
IF 1, Zone 2 (Literature), Q1 Arts and Humanities, Pub Date: 2024-06-03, DOI: 10.1075/ijcl.23087.yu
Danni Yu, Luyang Li, Hang Su, Matteo Fuoli
Certain forms of linguistic annotation, like part of speech and semantic tagging, can be automated with high accuracy. However, manual annotation is still necessary for complex pragmatic and discursive features that lack a direct mapping to lexical forms. This manual process is time-consuming and error-prone, limiting the scalability of function-to-form approaches in corpus linguistics. To address this, our study explores the possibility of using large language models (LLMs) to automate pragma-discursive corpus annotation. We compare GPT-3.5 (the model behind the free-to-use version of ChatGPT), GPT-4 (the model underpinning the precise mode of Bing chatbot), and a human coder in annotating apology components in English based on the local grammar framework. We find that GPT-4 outperformed GPT-3.5, with accuracy approaching that of a human coder. These results suggest that LLMs can be successfully deployed to aid pragma-discursive corpus annotation, making the process more efficient, scalable, and accessible.
Citations: 1
Case and agreement variation in contact
IF 1, Zone 2 (Literature), Q1 Arts and Humanities, Pub Date: 2024-04-25, DOI: 10.1075/ijcl.22119.zha
Yi Zhang, Ming Yue
This study investigates the influence of language contact on morphosyntactic variation in World Englishes, specifically focusing on the joint variation of case and agreement in it-clefts with pronominal clefted constituents. Employing a multifactorial approach within the framework of probabilistic grammar, we examine the distribution of the four relevant it-cleft variants in the GloWbE corpus. We find that language contact, as a language-external factor, impacts the strengths and rankings of language-internal factors but not their directions. Additionally, we observe an intricate interplay between language contact and language-internal factors in shaping morphosyntactic patterns: low-contact varieties tend to display feature-based case and agreement with a high degree of variability, while high-contact varieties tend to exhibit position-based case and agreement with a low degree of variability. These findings shed light on the mechanisms underlying the development of language diversity and structural simplification in World Englishes.
Citations: 0
A user-friendly corpus tool for disciplinary data-driven learning
IF 1, Zone 2 (Literature), Q1 Arts and Humanities, Pub Date: 2024-04-16, DOI: 10.1075/ijcl.23056.cro
Peter Crosthwaite, V. Baisa
Most corpus tools commonly used for corpus-based data-driven learning (DDL) are designed for research rather than teaching purposes, with much DDL research suggesting learners and their teachers often stop DDL after initial training due to tool-related issues like complex user interfaces and system settings. Based on feedback from secondary-age language learners and their teachers in the Australian context, we present CorpusMate (https://corpusmate.com), a new, user-friendly corpus tool that incorporates several publicly available written and spoken corpora across 20 disciplinary subjects. It offers a range of flexible concordancing, n-gram and data visualisation options to ensure a fast, smooth and simple DDL experience for end users.
Citations: 0
Review of Flach & Hilpert (2022): Broadening the spectrum of corpus linguistics: New approaches to variability and change
IF 1, Zone 2 (Literature), Q1 Arts and Humanities, Pub Date: 2024-04-04, DOI: 10.1075/ijcl.00058.fle
Kristen Fleckenstein
Citations: 0
Down-sampling from hierarchically structured corpus data
IF 1, Zone 2 (Literature), Q1 Arts and Humanities, Pub Date: 2024-03-25, DOI: 10.1075/ijcl.23079.son
Lukas Sönning
Resource constraints often force researchers to downsize the list of tokens returned by a corpus query. This paper sketches a methodology for down-sampling and offers a survey of current practices. We build on earlier work and extend the evaluation of down-sampling designs to settings where tokens are clustered by text file and lexeme. Our case study deals with third-person present-tense verb inflection in Early Modern English and focuses on five predictors: year, gender, genre, frequency, and phonological context. We evaluate two strategies for selecting 2,000 (out of 11,645) tokens: simple down-sampling, where each hit has the same selection probability; and structured down-sampling, where this probability is inversely proportional to the author- and verb-specific token count. We form 500 subsamples using each scheme and compare regression results to a reference model fit to the full set of cases. We observe that structured down-sampling shows better performance on several evaluation criteria.
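The two selection schemes named in this abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. The token fields `author` and `verb`, the helper `make_toy_tokens`, and the choice to combine the two counts as a product are all assumptions made for illustration; the abstract only says the selection probability is inversely proportional to the author- and verb-specific token count. Weighted sampling without replacement is done here with exponential keys (a variant of the Efraimidis-Spirakis method), so inclusion probabilities follow the weights only approximately.

```python
import random
from collections import Counter


def simple_downsample(tokens, n, seed=0):
    """Simple down-sampling: every hit has the same selection probability."""
    rng = random.Random(seed)
    return rng.sample(tokens, n)


def structured_downsample(tokens, n, seed=0):
    """Structured down-sampling (illustrative): hits from heavily represented
    authors and verbs are less likely to be selected."""
    rng = random.Random(seed)
    author_counts = Counter(t["author"] for t in tokens)
    verb_counts = Counter(t["verb"] for t in tokens)
    keyed = []
    for index, t in enumerate(tokens):
        # Assumption: the weight is the inverse of the product of both counts.
        weight = 1.0 / (author_counts[t["author"]] * verb_counts[t["verb"]])
        # An Exp(1) draw divided by the weight is exponential with rate
        # `weight`; keeping the n smallest keys gives a weighted sample
        # without replacement.
        key = rng.expovariate(1.0) / weight
        keyed.append((key, index))
    keyed.sort()
    return [tokens[index] for _, index in keyed[:n]]


def make_toy_tokens():
    """Hypothetical data: one prolific author (900 hits), 50 minor authors (2 hits each)."""
    tokens = [{"author": "prolific", "verb": f"v{i % 5}"} for i in range(900)]
    tokens += [{"author": f"minor{i // 2}", "verb": f"v{i % 5}"} for i in range(100)]
    return tokens
```

On the toy data, a simple sample of 100 hits is dominated by the prolific author, while the structured sample spreads the selection across the minor authors.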
Citations: 0
“People should get their booster”
IF 1, Zone 2 (Literature), Q1 Arts and Humanities, Pub Date: 2024-02-13, DOI: 10.1075/ijcl.22110.zou
H. Zou, Ken Hyland
Debates around the efficacy and dangers of vaccination have taken on critical importance with the Covid pandemic and WHO naming vaccine hesitancy as a major global health threat. We explore how writers use two types of blog, academic and journalistic, to promote key public health messages around the effectiveness and necessity of Covid-19 vaccinations to a broad, heterogeneous audience. Examining 120 Covid-19 vaccination themed posts from reputable news and academic blog sites, we compare the different ways writers present a stance and take a position towards vaccines and vaccinations in these different interactional contexts. Findings show that both types of bloggers are clearly aware of the need to convey a stance towards their topic and audiences feel entitled to position themselves in relation to vaccination issues, but with different emphases. The study has important implications for how healthcare information is disseminated and persuasion accomplished in these public arenas of discourse.
Citations: 0
Modeling the locative alternation in Mandarin Chinese
IF 1, Zone 2 (Literature), Q1 Arts and Humanities, Pub Date: 2024-01-29, DOI: 10.1075/ijcl.22072.xu
Mengmin Xu, F. Li, Benedikt Szmrecsanyi
The current study investigates the probabilistic conditioning of the Mandarin locative alternation. We adopt a corpus-based multivariate approach to analyze 2,836 observations of locative variants from a large Chinese corpus and annotated manually for various language-internal and language-external constraints. Multivariate modeling reveals that the Mandarin locative alternation is not only influenced by semantic predictors like affectedness and telicity, but also by previously unexplored syntactic and language-external constraints, such as complexity and animacy of locatum and location, accessibility of locatum, pronominality, definiteness of location, length ratio and register. Notably, the effects of affectedness, definiteness and pronominality are broadly parallel in both the Mandarin locative alternation and its English counterpart. We thus contribute to theorizing in corpus-based variationist linguistics by uncovering the probabilistic grammar of the locative alternation in Mandarin Chinese, and by identifying the constraints that may be universal across languages.
Citations: 0
Review of Dunn (2022): Natural Language Processing for Corpus Linguistics
IF 1, Zone 2 (Literature), Q1 Arts and Humanities, Pub Date: 2023-12-22, DOI: 10.1075/ijcl.00057.sch
Hanna Schmück
Citations: 0