International Journal of Corpus Linguistics: Latest Articles

Review of Stefanowitsch (2020): Corpus Linguistics: A Guide to the Methodology

IF 1 Zone 2 (Literature) 0 LANGUAGE & LINGUISTICS

International Journal of Corpus Linguistics

Pub Date : 2022-04-26 DOI: 10.1075/ijcl.00046.ger

Kevin F Gerigk

Citations: 0

Verb form error detection in written English of Chinese EFL learners

Pub Date : 2022-04-13 DOI: 10.1075/ijcl.19107.che

Gong Chen, Maocheng Liang

In the past few decades, researchers have paid increasing attention to automatic error detection in natural languages, but few have focused on developing an error-checking tool for EFL learners in China. Based on the theory of Pattern Grammar, this study formalizes verb patterns through Link Grammar, a formal grammatical system developed by Sleator and Temperley (1991), and reconstructs a Link Grammar verb dictionary to create an automatic checking tool for verb form errors in Chinese learners’ written English. The test results show that by importing more detailed pattern information of verbs into the Link Grammar dictionary, the Link Grammar parser can identify verb form errors more accurately and effectively than the original, and that the parsing capability of the Link Grammar parser is improved. The article shows that Pattern Grammar and Link Grammar can work together and be applied to the construction of error-checking tools for EFL learners with promising results.

Citations: 0

The affordances of metaphor for diachronic corpora & discourse analysis

Pub Date : 2022-03-14 DOI: 10.1075/ijcl.22004.tay

Charlotte Taylor

This paper examines the utility of metaphor as an investigative tool in “long-distance” corpora and discourse studies. I show that metaphor is both important for understanding discourses and useful for diachronic analysis because it allows us to abstract above the purely lexical level, enabling comparison across contexts where the same concept could be lexicalised differently. The case study is concerned with the oft-discussed metaphor MIGRANTS ARE WATER in the UK-based Times newspaper from 1800 to 2018, and presents its conventionalisation and evaluative patterns. The findings confirm that the water metaphor has an extensive discourse history regarding how migration is represented in the UK press, but also that evaluations may differ significantly. The paper shows how metaphor can provide a way to find discourse evaluations and framings across different time periods. The use of second-order collocates illustrates how corpus tools can help re-contextualise data to ensure interpretation heeds contemporary framings.

Citations: 5

The hapax / type ratio

Pub Date : 2022-03-09 DOI: 10.1075/ijcl.19114.van

Niek Van Wettere

This article addresses one of the lesser-known productivity measures, namely the hapax / type ratio (HTR). Through a case study involving the Dutch semi-copula raken (“attain”), it is shown that the HTR more or less stabilizes from a certain sample size onwards. Moreover, this point of stabilization seems to coincide with an increased permanency of the hapaxes, i.e. the share of hapaxes that convert quickly to non-hapaxes is not as large as was the case at the beginning of the sampling process. Therefore, the stabilization of the HTR might be a good indicator of minimally required sample size in productivity studies, suggesting that the hapaxes are ‘non-incidental’ from this sample size onwards. However, I did not find a clear link between the onset of the stabilization of the HTR and the extent to which the inventory of types accounted for at the top of the frequency distribution is (quasi-)complete.
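As a rough illustration of the measure named in this abstract, the sketch below computes the hapax / type ratio (number of types occurring exactly once, divided by the number of types) at increasing sample sizes, so that one can see whether the value levels off. The function names, the `step` parameter and the toy data are assumptions made for this example; this is not the author's procedure or data.

```python
from collections import Counter


def hapax_type_ratio(tokens):
    """Return (#types occurring once) / (#types); 0.0 for an empty sample."""
    counts = Counter(tokens)
    if not counts:
        return 0.0
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(counts)


def htr_curve(tokens, step):
    """HTR for growing prefixes of the sample: [(sample_size, htr), ...]."""
    sizes = list(range(step, len(tokens) + 1, step))
    # Always include the full sample as the last measurement point.
    if not sizes or sizes[-1] != len(tokens):
        sizes.append(len(tokens))
    return [(n, hapax_type_ratio(tokens[:n])) for n in sizes]


# Toy data, invented for illustration only (hypothetical item labels).
sample = ["a", "b", "a", "c", "d", "b", "e", "a"]
for n, htr in htr_curve(sample, step=4):
    print(n, round(htr, 3))
```

With real data, `tokens` would be the sequence of attested items in sampling order, and a flattening curve would correspond to the stabilization the abstract describes.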

Citations: 1

Universals in machine translation?

Pub Date : 2022-02-14 DOI: 10.1075/ijcl.19127.luo

Jinru Luo, Dechao Li

By examining and comparing the linguistic patterns in a self-built corpus of Chinese-English translations produced by WeChat Translate, the latest online machine translation app from the most popular social media platform (WeChat) in China, this study explores such questions as whether or not and to what extent simplification and normalization (hypothesized Translation Universals) exhibit themselves in these translations. The results show that, whereas simplification cannot be substantiated, the tendency of normalization to occur in the WeChat translations can be confirmed. The research finds that these results are caused by the operating mechanism of machine translation (MT) systems. Certain salient words tend to prime WeChat’s MT system to repetitively resort to typical language patterns, which leads to a significant overuse of lexical chunks. It is hoped that the present study can shed new light on the development of MT systems and encourage more corpus-based product-oriented research on MT.

Citations: 10

The Sociolinguistic Speech Corpus of Chilean Spanish (COSCACH)

Pub Date : 2022-01-31 DOI: 10.1075/ijcl.19103.sad

Scott Sadowsky

This paper presents the Sociolinguistic Speech Corpus of Chilean Spanish (COSCACH) v1.0, a 9.3-million-word corpus containing transcribed, lemmatized and morphologically tagged text, audio recordings and videos from 1,237 L1 speakers of Chilean Spanish, as well as a control sample of 21 non-Chilean L1 Spanish speakers. The COSCACH is the first freely available corpus of spoken Chilean Spanish of substantial size, as well as one of the largest speech corpora of any variety of Spanish. Following a review of other Chilean speech corpora, I describe how the COSCACH was constructed, covering corpus design, speaker recruitment and metadata collection, speech elicitation and recording, transcription, lemmatization and morphological tagging, and corpus compilation. I thereby aim to provide a blueprint for creating modern, large-scale speech corpora suitable for phonetic, sociophonetic and sociolinguistic research, in addition to traditional inquiry into semantics, lexis, grammar, pragmatics and discourse.

Citations: 3

The syntax and semantics of coherence relations

Pub Date : 2022-01-21 DOI: 10.1075/ijcl.19109.cri

Ludivine Crible

This corpus-based study investigates the inter-relation between discourse markers (DMs) and other contextual signals that contribute to the interpretation of coherence relations. The objectives are three-fold: (i) to provide a comprehensive and systematic portrait of the syntax and semantics of a set of coherence relations in English; (ii) to draw a distinction between mere tendencies of co-occurrence and strong predictive signals; (iii) to identify factors that account for the variation of these signals, focusing on relation complexity, DM strength and genre preferences. The methodology combines systematic coding (description) and multivariate statistical modelling (prediction). While the effect of genre and relation complexity was found to be null or moderate, the presence of discourse signals systematically varies with the ambiguity of the DM in the relation: signals co-occur more with ambiguous DMs than with more informative ones.

Citations: 1

(The) fact is … /(Die) Tatsache ist … focaliser constructions in English and German are similar but subject to different constraints

Pub Date : 2022-01-21 DOI: 10.1075/ijcl.17073.hun

M. Hundt, R. Oppliger

N-is/ist constructions are elements in the left periphery of English/German sentences that have developed pragmatic meaning: they can be used as discourse markers with various functions, depending on the nominal element that is used in the construction. We use evidence from parallel and comparable corpora of English and German to investigate variable article use in these focaliser constructions and model factors that may play a role in article omission/retention (such as modification, choice of head noun, degree of syntactic integration of the focaliser). Our evidence shows that article use largely depends on the lexical head in German but is constrained by different factors in English (notably modification). We interpret our results against the backdrop of construction grammar, arguing that article omission plays a different role in the two languages. From a contrastive point of view, formal syntactic separation in English is easier to achieve than in German and thus facilitates use of English N-is constructions as focalisers.

Citations: 2

Review of Čermáková & Malá (2021): Variation in Time and Space. Observing the World through Corpora

Pub Date : 2022-01-18 DOI: 10.1075/ijcl.00045.nur

A. Nurmi

Citations: 0

Review of Rüdiger & Dayter (2020): Corpus Approaches to Social Media

Pub Date : 2022-01-13 DOI: 10.1075/ijcl.00044.lef

Elen Le Foll

Citations: 1
