International Journal of Corpus Linguistics: latest articles

“You betcha I’m a ’Merican”

IF 1 | Zone 2, Literature | 0 LANGUAGE & LINGUISTICS

International Journal of Corpus Linguistics

Pub Date : 2023-02-23 DOI: 10.1075/ijcl.21060.hir

Tomoharu Hirota, Laurel J. Brinton

This article studies you bet and related phrases when they are used as a parenthetical and as a free-standing response. Drawing on a range of corpora, we provide both contemporary and historical perspectives on the set of pragmatic expressions that has largely escaped scholars’ attention. Synchronically, we demonstrate that they are colloquial American pragmatic markers to express speaker certainty/affirmation or to respond to thanks. Diachronically, these markers are hypothesized to have developed out of main clause usage with a clausal complement (‘the matrix clause hypothesis’); however, our historical corpus evidence does not straightforwardly support this hypothesis. Instead, we suggest that multiple constructions might have been involved in the emergence of the pragmatic markers, namely, wh-interrogatives (e.g. what will you bet (that) …?), modal constructions (e.g. you may/can bet (that) …), and main clauses with a reduced complement (e.g. You bet I do).

Citations: 2

A proposal for the inductive categorisation of parenthetical discourse markers in Spanish using parallel corpora

IF 1 | Zone 2, Literature | 0 LANGUAGE & LINGUISTICS

International Journal of Corpus Linguistics

Pub Date : 2023-02-23 DOI: 10.1075/ijcl.20017.rob

Hernán Robledo, Rogelio Nazar

We propose a method for the automatic induction of categories of Spanish discourse markers using parallel corpora, based on a quantitative and empirical approach that minimises explicit linguistic knowledge. We conducted the analysis using a large Spanish-English parallel corpus. First, we used this corpus to obtain a list of parenthetical discourse markers in each language. Then, we used it as a “semantic mirror”, inspecting the English equivalences and assessing which Spanish discourse markers fulfil a similar function in discourse and vice versa. The result of this procedure is an emerging categorisation of discourse markers. The main contribution is to offer empirical evidence for the adequacy of existing manually-compiled taxonomies and the potential for discovery of new, unaccounted-for categories. In this article we focus on units pertaining to the Spanish language but, since the method is purely quantitative, it is possible to apply it to different languages as well.

Citations: 1

When loanwords are not lone words

IF 1 | Zone 2, Literature | 0 LANGUAGE & LINGUISTICS

International Journal of Corpus Linguistics

Pub Date : 2023-01-09 DOI: 10.1075/ijcl.21124.try

David Trye, Andreea S. Calude, T. Keegan, Julia R. Falconer

Networks are being used to model an increasingly diverse range of real-world phenomena. This paper introduces an exploratory approach to studying loanwords in relation to one another, using networks of co-occurrence. While traditional studies treat individual loanwords as discrete items, we show that insights can be gained by focusing on the various loanwords that co-occur within each text in a corpus, especially when leveraging the notion of a hypergraph. Our research involves a case-study of New Zealand English (NZE), which borrows Indigenous Māori words on a large scale. We use a topic-constrained corpus to show that: (i) Māori loanword types tend not to occur by themselves in a text; (ii) infrequent loanwords are nearly always accompanied by frequent loanwords; and (iii) it is not uncommon for texts to contain a mixture of listed and unlisted loanwords, suggesting that NZE is still riding a wave of borrowing importation from Māori.

Citations: 0

Corpus linguistics and clinical psychology: Investigating personification in first-person accounts of voice-hearing.

IF 1.6 | Zone 2, Literature | 0 LANGUAGE & LINGUISTICS

International Journal of Corpus Linguistics

Pub Date : 2023-01-01 Epub Date: 2022-04-29 DOI: 10.1075/ijcl.21019.col

Luke Collins, Vaclav Brezina, Zsófia Demjén, Elena Semino, Angela Woods

Triangulating corpus linguistic approaches with other (linguistic and non-linguistic) approaches enhances "both the rigour of corpus linguistics and its incorporation into all kinds of research" (McEnery & Hardie, 2012:227). Our study investigates an important area of mental health research: the experiences of those who hear voices that others cannot hear, and particularly the ways in which those voices are described as person-like. We apply corpus methods to augment the findings of a qualitative approach to 40 interviews with voice-hearers, whereby each interview was coded as involving 'minimal' or 'complex' personification of voices. Our analysis provides linguistic evidence in support of the qualitative coding of the interviews, but also goes beyond a binary approach by revealing different types and degrees of personification of voices, based on how they are referred to and described by voice-hearers. We relate these findings to concepts that inform therapeutic interventions in clinical psychology.

Vol. 28, No. 1, pp. 28-59. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7614468/pdf/

Citations: 0

Review of Le Bruyn & Paquot (2021): Learner Corpus Research Meets Second Language Acquisition

IF 1 | Zone 2, Literature | 0 LANGUAGE & LINGUISTICS

International Journal of Corpus Linguistics

Pub Date : 2022-12-02 DOI: 10.1075/ijcl.00051.ngu

Li Nguyen

Citations: 0

Assessing word commonness

IF 1 | Zone 2, Literature | 0 LANGUAGE & LINGUISTICS

International Journal of Corpus Linguistics

Pub Date : 2022-11-25 DOI: 10.1075/ijcl.21037.eke

Mikkel Ekeland Paulsen

The article investigates the two main corpus indicators of word commonness, frequency and dispersion, through a cross-validation analysis of frequency and four dispersion measures (‘Range’, ‘Chi-squared’, ‘Deviation of Proportions’ and ‘Juilland’s D’). The approach provides an estimation of the capacity of the named measures to predict the distribution of corpus items in an extracted language sample. Based on a dataset of 273 Norwegian compounds, the results show that especially Deviation of Proportions is a robust measure of dispersion that can be used in conjunction with frequency to substantiate assertions of word commonness based on corpus data. In addition, dispersion measures do not only reflect what sort of distribution the frequency statistic is generated from, but also how reliable the frequency estimation in the corpus sample is in terms of giving an accurate representation of frequency in the language variety that the corpus is sampled from.

Citations: 0

Things we smell and things they smell like

IF 1 | Zone 2, Literature | 0 LANGUAGE & LINGUISTICS

International Journal of Corpus Linguistics

Pub Date : 2022-11-25 DOI: 10.1075/ijcl.21028.pou

Thomas Poulton

The sense of smell has been relatively neglected in Western research. It is not regarded as particularly useful compared to the perceived importance of senses like sight, sound, and touch. Correspondingly, English speakers are ill-equipped to describe qualities of smells, instead invoking entities that share similar olfactory qualities, e.g. like roses. This raises the question: which odours do English speakers frequently refer to, and which terms describe them? This corpus-driven study looks at nouns in olfactory contexts, and the conceptual domains they fall into. Results show that speakers invoke different smells according to context: when talking about a smell they perceive, when describing a smell, or in a description of another smell, which demonstrates the differential communicative functions of smells. Further analysis shows that smells that are described are more variable than those used as descriptors and that, according to psychometric norming data, smells used as descriptors are more emotional.

Citations: 1

Research trends in corpus linguistics

IF 1 | Zone 2, Literature | 0 LANGUAGE & LINGUISTICS

International Journal of Corpus Linguistics

Pub Date : 2022-11-14 DOI: 10.1075/ijcl.21072.cro

P. Crosthwaite, Sulistya Ningrum, M. Schweinberger

This paper uses a bibliometric analysis to map the field of Corpus Linguistics (CL) research in arts and humanities over the last 20 years, tracking changes in popular CL research topics, outlets, highly cited authors, and geographical origins based on the metadata of 5,829 CL-related articles from 429 Scopus-indexed journals. Results reveal an increase in corpus-assisted discourse studies, lexical bundles and academic writing, alongside newer topics including multilingualism and social media. CL studies span 193 languages/dialects with a significant rise in Chinese, Russian, Spanish, and Italian CL research over the past decade. Clusters of highly cited CL researchers are identified spanning (inter)disciplinary research areas. An increase of CL researchers in China, Poland, South Korea, Japan, and more is evidence of the now global reach of CL research. These findings mirror diachronic socio-cultural developments in applied linguistics and society more generally and provide insights into what CL research might come next.

Citations: 1

A comparison of automated and manual analyses of syntactic complexity in L2 English writing

IF 1 | Zone 2, Literature | 0 LANGUAGE & LINGUISTICS

International Journal of Corpus Linguistics

Pub Date : 2022-10-17 DOI: 10.1075/ijcl.20181.cha

Quang Hồng Châu, Bram Bulté

Automated tools for syntactic complexity measurement are increasingly used for analyzing various kinds of second language corpora, even though these tools were originally developed and tested for texts produced by advanced learners. This study investigates the reliability of automated complexity measurement for beginner and lower-intermediate L2 English data by comparing manual and automated analyses of a corpus of 80 texts written by Dutch-speaking learners. Our quantitative and qualitative analyses reveal that the reliability of automated complexity measurement is substantially affected by learner errors, parser errors, and Tregex pattern undergeneration. We also demonstrate the importance of aligning the definitions of analytical units between the computational tool and human annotators. In order to enhance the reliability of automated analyses, it is recommended that certain modifications are made to the system, and non-advanced L2 English data are preprocessed prior to automated analyses.

Citations: 2

Towards a corpus-based description of speech-gesture units of meaning

IF 1 | Zone 2, Literature | 0 LANGUAGE & LINGUISTICS

International Journal of Corpus Linguistics

Pub Date : 2022-10-13 DOI: 10.1075/ijcl.20174.che

Yaoyao Chen, S. Adolphs

The theories and methods in corpus linguistics (CL) have had an impact on numerous areas in applied linguistics. However, the interface between CL and multimodal speech-gesture studies remains underexplored. One fundamental question is whether it is possible, and even appropriate, to apply the theories and paradigms established based on textual data to multimodal data. To explore this, we examine how CL can assist investigating lexico-grammatical patterns of speech co-occurring with a recurrent gesture (i.e. the circular gesture). Sinclair’s (1996) unit of meaning model is used to describe the co-gestural speech patterns. The study draws on a subset of the Nottingham Multimodal Corpus, in which 570 instances of circular gestures and their co-occurring speech are identified and analysed. We argue that Sinclair’s unit of meaning model can be extended to include speech-gesture patterns, and that those descriptions enable a more nuanced understanding of meaning in context.

Citations: 1
