This paper demonstrates the value of studying co-occurrence ‘quads’ – constellations of four non-adjacent lemmas that consistently co-occur across spans of up to 100 tokens – for understanding discursive change. We map meaning onto quads as ‘discursive concepts’, which encompass encyclopaedic semantics, pragmatics, and context. We investigate a high-frequency quad with high co-occurrence strength in EEBO-TCP: world-heaven-earth-power. We conduct semantic and pragmatic analysis to generate hypotheses regarding discursive change. The quad’s components are semantically underspecified; thus, although the quad indicates a discursive concept, each instantiation of the quad is variable, contingent, and dependent upon context and pragmatic processes for interpretation. We observe how the vague lexemes that constitute building blocks of religious discourse are employed to generate new, timely secular discourses; and we argue that semantic underspecification is the site and source of discursive change. Indeed, the volatile, unstable nature of the component lexical meanings renders them indispensable to early modern debate.
{"title":"Volatile concepts","authors":"S. Fitzmaurice, Seth Mehl","doi":"10.1075/ijcl.22005.fit","DOIUrl":"https://doi.org/10.1075/ijcl.22005.fit","url":null,"abstract":"\u0000This paper demonstrates the value of studying co-occurrence ‘quads’ – constellations of four non-adjacent lemmas that consistently co-occur across spans of up to 100 tokens – for understanding discursive change. We map meaning onto quads as ‘discursive concepts’, which encompass encyclopaedic semantics, pragmatics, and context. We investigate a high-frequency quad with high co-occurrence strength in EEBO-TCP: world-heaven-earth-power. We conduct semantic and pragmatic analysis to generate hypotheses regarding discursive change. The quad’s components are semantically underspecified; thus, although the quad indicates a discursive concept, each instantiation of the quad is variable, contingent, and dependent upon context and pragmatic processes for interpretation. We observe how the vague lexemes that constitute building blocks of religious discourse are employed to generate new, timely secular discourses; and we argue that semantic underspecification is the site and source of discursive change. Indeed, the volatile, unstable nature of the component lexical meanings renders them indispensable to early modern debate.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46980749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study examines usage changes of English-based loanwords and Korean replacement words promoted by the National Institute of Korean Language in a six-year span, using two corpora. It focuses on 18 Korean and anglicized word pairs appearing on the National Institute of Korean Language’s website that purportedly showcase the Institute’s successful efforts to curtail the usage of English words by promoting Korean replacement words. The results indicate that promoting Korean does not necessarily decrease the usage of English, and that the usage of English-based words seems to increase in conjunction with the Korean words. Several Korean words promoted by the National Institute of Korean Language have extremely low frequencies, and some loanwords are being used with various meanings. Commentaries are provided to explain various patterns of observed usage change.
{"title":"A corpus-based study of anglicized neologisms in Korea","authors":"E. Kim","doi":"10.1075/ijcl.20055.kim","DOIUrl":"https://doi.org/10.1075/ijcl.20055.kim","url":null,"abstract":"\u0000 This study examines usage changes of English-based loanwords and Korean replacement words promoted by the National\u0000 Institute of Korean Language in a six-year span, using two corpora. It focuses on 18 Korean and anglicized word pairs appearing on\u0000 the National Institute of Korean Language’s website that purportedly showcase the Institute’s successful efforts to curtail the\u0000 usage of English words by promoting Korean replacement words. The results indicate that promoting Korean does not necessarily\u0000 decrease the usage of English, and that the usage of English-based words seems to increase in conjunction with the Korean words.\u0000 Several Korean words promoted by the National Institute of Korean Language have extremely low frequencies, and some loanwords are\u0000 being used with various meanings. Commentaries are provided to explain various patterns of observed usage change.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43175169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper applies a new approach to the identification of discourses, based on Multiple Correspondence Analysis (MCA), to the study of discourse variation over time. The MCA approach to keywords deals with a major issue with the use of keywords to identify discourses: the allocation of individual keywords to multiple discourses. Yet, as this paper demonstrates, the approach also allows us to observe variation in the prevalence of discourses over time. The MCA approach to keywords allows the allocation of individual texts to multiple discourses based on patterns of keyword co-occurrence. Metadata in the corpus data analysed (here, UK newspaper articles about Islam) can then be used to map those discourses over time, resulting in a clear view of how the discourses vary relative to one another as time progresses. The paper argues that the drivers for these fluctuations are language-external: the real-world events reported on in the newspapers.
{"title":"Keywords through time","authors":"Isobelle Clarke, Gavin Brookes, Tony McEnery","doi":"10.1075/ijcl.22011.cla","DOIUrl":"https://doi.org/10.1075/ijcl.22011.cla","url":null,"abstract":"\u0000This paper applies a new approach to the identification of discourses, based on Multiple Correspondence Analysis (MCA), to the study of discourse variation over time. The MCA approach to keywords deals with a major issue with the use of keywords to identify discourses: the allocation of individual keywords to multiple discourses. Yet, as this paper demonstrates, the approach also allows us to observe variation in the prevalence of discourses over time. The MCA approach to keywords allows the allocation of individual texts to multiple discourses based on patterns of keyword co-occurrence. Metadata in the corpus data analysed (here, UK newspaper articles about Islam) can then be used to map those discourses over time, resulting in a clear view of how the discourses vary relative to one another as time progresses. The paper argues that the drivers for these fluctuations are language external; the real-world events reported on in the newspapers.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47439187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper explores variation in lexico-grammatical register features across text lengths in a large-scale sample of Reddit comments. Very short texts are known to be problematic for many statistical methods, so understanding their nature is important for the corpus-linguistic study of social media, where most contributions are short. I show that the frequencies of linguistic features change with comment length, even between longer comments, although longer texts are often considered similar in statistical terms. Moreover, I classify the variation found between short comments of different lengths into two main patterns, although other patterns can also be found, and there is variation even within these patterns. Furthermore, I interpret the observed differences in terms of register variation. For example, shorter comments appear to be more casual and less edited in terms of their feature makeup, whereas narrative and informational registers seem to favor longer comments.
{"title":"Register variation across text lengths","authors":"A. Liimatta","doi":"10.1075/ijcl.20177.lii","DOIUrl":"https://doi.org/10.1075/ijcl.20177.lii","url":null,"abstract":"\u0000This paper explores variation in lexico-grammatical register features across text lengths in a large-scale sample of Reddit comments. Very short texts are known to be problematic for many statistical methods, so understanding their nature is important for the corpus-linguistic study of social media, where most contributions are short. I show that the frequencies of linguistic features change with comment length, even between longer comments, although longer texts are often considered similar in statistical terms. Moreover, I classify the variation found between short comments of different lengths into two main patterns, although other patterns can also be found, and there is variation even within these patterns. Furthermore, I interpret the observed differences in terms of register variation. For example, shorter comments appear to be more casual and less edited in terms of their feature makeup, whereas narrative and informational registers seem to favor longer comments.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44427278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper tracks stylistic variation in the use of two roughly synonymous suffixes, the Romance -ity and the native -ness, during the Early Modern English period. We seek to verify from a statistical viewpoint the claims of Rodríguez-Puente (2020), who reports on a decrease of -ness in favour of -ity in registers representative of the speech-written and formal-informal continua at that time. To this end, we develop new methods of statistical and visual analysis that enable diachronic comparisons of competing processes across subcorpora, building upon an earlier method by Säily and Suomela (2009). Our results confirm that -ity gained ground first in written registers and then spread towards speech-related registers, and we are able to time this change more accurately thanks to a novel periodisation. We also provide strong statistical support indicating that the proportion of -ity was significantly higher in legal registers than in other registers.
{"title":"New methods for analysing diachronic suffix competition across registers","authors":"Paula Rodríguez-Puente, Tanja Säily, J. Suomela","doi":"10.1075/ijcl.22014.rod","DOIUrl":"https://doi.org/10.1075/ijcl.22014.rod","url":null,"abstract":"\u0000This paper tracks stylistic variation in the use of two roughly synonymous suffixes, the Romance -ity and the native -ness, during the Early Modern English period. We seek to verify from a statistical viewpoint the claims of Rodríguez-Puente (2020), who reports on a decrease of -ness in favour of -ity in registers representative of the speech-written and formal-informal continua at that time. To this end, we develop new methods of statistical and visual analysis that enable diachronic comparisons of competing processes across subcorpora, building upon an earlier method by Säily and Suomela (2009). Our results confirm that -ity gained ground first in written registers and then spread towards speech-related registers, and we are able to time this change more accurately thanks to a novel periodisation. We also provide strong statistical support indicating that the proportion of -ity was significantly higher in legal registers than in other registers.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47926633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The aims of this paper are to detect the most problematic issues related to dialogue act annotation in speech corpora and to define basic categories of dialogue acts. I critically examine and test generic schemes that represent different lines of dialogue act annotation: AMI, DART, ISO 24617-2 and SWBD-DAMSL. It is found that the most problematic issues regarding dialogue act annotation are related to the distinction between the semantic and pragmatic meanings of utterances, the annotation of metadiscourse, and the adequacy and informativeness of the tagset. The identified basic dialogue act categories are information providing, information seeking, actions, social acts and metadiscourse. The findings help improve dialogue act annotation.
{"title":"Annotating dialogue acts in speech data","authors":"D. Verdonik","doi":"10.1075/ijcl.20165.ver","DOIUrl":"https://doi.org/10.1075/ijcl.20165.ver","url":null,"abstract":"\u0000 The aims of this paper are to detect the most problematic issues related to dialogue act annotation in speech\u0000 corpora and to define basic categories of dialogue acts. I critically examine and test generic schemes that represent different\u0000 lines of dialogue act annotation: AMI, DART, ISO 24617–2 and SWBD-DAMSL. It is found that the most problematic issues regarding\u0000 dialogue act annotation are related to the distinction between the semantic and pragmatic meanings of utterances, the annotation\u0000 of metadiscourse, and the adequacy and informativeness of the tagset. The identified basic dialogue act categories are information\u0000 providing, information seeking, actions, social acts and metadiscourse. The findings help improve dialogue act annotation.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47202283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The article focuses on the polysemy and usage patterns of the Polish lexeme głowa "head" and its diminutive główka. Based on corpus methodology and cognitive linguistic analysis, it is argued that the two lexemes are more autonomous in their meanings than their morphological relatedness would predict. As the two words cover different semantic domains, we observe that the diminutive suffix has developed a new function which signals lexicalization of meaning toward a non-human semantic domain, for example, material objects, plants, etc. Our research contributes to studies on Polish morphology and lexical semantics and to theoretical research on the polysemy of body part terms.
{"title":"Derivation and semantic autonomy","authors":"Iwona Kraska-Szlenk, Beata Wójtowicz","doi":"10.1075/ijcl.20074.kra","DOIUrl":"https://doi.org/10.1075/ijcl.20074.kra","url":null,"abstract":"\u0000 The article focuses on the polysemy and usage patterns of the Polish lexeme głowa “head” and its\u0000 diminutive główka. Based on corpus methodology and cognitive linguistics analysis, it is argued that the two\u0000 lexemes are too autonomous in their meanings than predicted by their morphological relatedness. As the two words cover different\u0000 semantic domains, we observe that the diminutive suffix has developed a new function which signals lexicalization of meaning\u0000 toward a non-human semantic domain, for example, material objects, plants, etc. Our research contributes to studies on Polish\u0000 morphology and lexical semantics and to theoretical research on the polysemy of body part terms.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46948133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Corpus research on questions as reader engagement markers in academic writing typically focuses on direct questions. Such questions are signalled by question marks and are relatively easily searchable in a corpus. However, indirect questions can be more challenging to identify, as they can be introduced by a range of forms. Based on a contrastive analysis of a corpus of English, French, and Spanish economics research articles, this paper provides pertinent evidence on direct and indirect questions as reader engagement markers. Firstly, it shows that direct and indirect questions as reader engagement markers are a rhetorical and generic feature of academic writing in the economics research article and, secondly, it presents a comprehensive list of indirect question illocutionary force indicating devices, valuable for future studies of indirect questions. Methodologically, this paper illustrates a replicable process for functional analysis and discusses the value of theoretically merging corpus and contrastive linguistic approaches.
{"title":"Question illocutionary force indicating devices in academic writing","authors":"Niall Curry","doi":"10.1075/ijcl.20065.cur","DOIUrl":"https://doi.org/10.1075/ijcl.20065.cur","url":null,"abstract":"\u0000Corpus research on questions as reader engagement markers in academic writing typically focuses on direct questions. Such questions are signalled by question marks and are relatively easily searchable in a corpus. However, indirect questions can be more challenging to identify, as they can be introduced by a range of forms. Based on a contrastive analysis of a corpus of English, French, and Spanish economics research articles, this paper provides pertinent evidence on direct and indirect questions as reader engagement markers. Firstly, it shows that direct and indirect questions as reader engagement markers are a rhetorical and generic feature of academic writing in the economics research article and, secondly, it presents a comprehensive list of indirect question illocutionary force indicating devices, valuable for future studies of indirect questions. Methodologically, this paper illustrates a replicable process for functional analysis and discusses the value of theoretically merging corpus and contrastive linguistic approaches.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49168052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This article examines the lexically parallel English and German constructions can’t stand somebody/something and jemanden/etwas nicht ausstehen können “not tolerate (someone or something)”, from synchronic, diachronic, and quantitative perspectives. Syntactic and semantic restrictions suggest that the usage of stand and ausstehen in the relevant sense is older than other semantically similar verbs (e.g. English tolerate, German leiden), while quantitative evidence from corpora shows that the can’t stand and nicht ausstehen können constructions are both colligationally stronger than lexical competitors. Evidence from the history of stand indicates that the lexeme stand in the Germanic and other Indo-European languages has a long history of being employed in the relevant sense. The restrictions on usage and the colligational strength of the respective English and German constructions are thus argued to result from the antiquity of the construction and functional competition from other lexemes.
{"title":"The rise of colligations","authors":"Olav Hackstein, Ryan Sandell","doi":"10.1075/ijcl.20022.hac","DOIUrl":"https://doi.org/10.1075/ijcl.20022.hac","url":null,"abstract":"\u0000This article examines the lexically parallel English and German constructions can’t stand somebody/something and jemanden/etwas nicht ausstehen können “not tolerate (someone or something)”, from synchronic, diachronic, and quantitative perspectives. Syntactic and semantic restrictions suggest that the usage of stand and ausstehen in the relevant sense is older than other semantically similar verbs (e.g. English tolerate, German leiden), while quantitative evidence from corpora shows that the can’t stand and nicht ausstehen können constructions are both colligationally stronger than lexical competitors. Evidence from the history of stand indicates that the lexeme stand in the Germanic and other Indo-European languages has a long history of being employed in the relevant sense. The restrictions on usage and the colligational strength of the respective English and German constructions are thus argued to result from the antiquity of the construction and functional competition from other lexemes.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45803461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vocabulary lists of high-frequency lexical items are an important resource in language education and a key product of corpus research. However, no single vocabulary list will be useful for every learning context, with the appropriateness of such lists affected by the corpora on which they are based. This paper investigates the impact of corpus selection on one measure of lexical sophistication, Advanced Guiraud, focusing on two frequency lists originating from an in-house learner corpus (PELIC) and a global learner corpus (Cambridge Learner Corpus). This analysis shows that frequency lists derived from both types of learner corpus can effectively serve as the basis for measuring the development of lexical sophistication, regardless of the specific program of the learners. Therefore, publicly available learner corpus frequency lists can be a reliable resource for stakeholders interested in the lexical gains of language learners.
{"title":"Handle it in-house?","authors":"Ben Naismith, Alan Juffs, Na-Rae Han, Daniel Zheng","doi":"10.1075/ijcl.20024.nai","DOIUrl":"https://doi.org/10.1075/ijcl.20024.nai","url":null,"abstract":"Vocabulary lists of high-frequency lexical items are an important resource in language education and a key product of corpus research. However, no single vocabulary list will be useful for every learning context, with the appropriateness of such lists affected by the corpora on which they are based. This paper investigates the impact of corpus selection on one measure of lexical sophistication, Advanced Guiraud, focusing on two frequency lists originating from an in-house learner corpus (PELIC) and a global learner corpus (Cambridge Learner Corpus). This analysis shows that frequency lists derived from both types of learner corpus can effectively serve as the basis for measuring the development of lexical sophistication, regardless of the specific program of the learners. Therefore, publicly available learner corpus frequency lists can be a reliable resource for stakeholders interested in the lexical gains of language learners.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138518757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}