ICAME journal : computers in English linguistics: latest articles

Guidelines for normalising Early Modern English corpora: Decisions and justifications

ICAME journal : computers in English linguistics

Pub Date : 2015-03-01 DOI: 10.1515/icame-2015-0001

D. Archer, Merja Kytö, Alistair Baron, Paul Rayson

Abstract Corpora of Early Modern English have been collected and released for research for a number of years. With large scale digitisation activities gathering pace in the last decade, much more historical textual data is now available for research on numerous topics including historical linguistics and conceptual history. We summarise previous research which has shown that it is necessary to map historical spelling variants to modern equivalents in order to successfully apply natural language processing and corpus linguistics methods. Manual and semiautomatic methods have been devised to support this normalisation and standardisation process. We argue that it is important to develop a linguistically meaningful rationale to achieve good results from this process. In order to do so, we propose a number of guidelines for normalising corpora and show how these guidelines have been applied in the Corpus of English Dialogues.
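As an illustration of the mapping step the abstract describes, the following is a minimal sketch of dictionary-based normalisation. It is not the authors' method or tool: the variant list `VARIANTS` and the function names are hypothetical, and a real project would draw its mappings from a curated, linguistically justified resource. The sketch keeps each original form alongside its normalised form, so the historical spelling is never lost.

```python
import re

# Hypothetical variant-to-modern mapping, for illustration only.
VARIANTS = {"haue": "have", "loue": "love", "vertue": "virtue"}


def normalise(text, variants=VARIANTS):
    """Return (original, normalised) pairs for every token in text."""
    pairs = []
    # Split into runs of word characters and runs of everything else,
    # so that spacing and punctuation survive unchanged.
    for tok in re.findall(r"\w+|\W+", text):
        modern = variants.get(tok.lower())
        if modern is None:
            pairs.append((tok, tok))
        else:
            # Preserve an initial capital from the original spelling.
            if tok[0].isupper():
                modern = modern.capitalize()
            pairs.append((tok, modern))
    return pairs


def normalised_text(text, variants=VARIANTS):
    """Join the normalised forms back into a running text."""
    return "".join(modern for _, modern in normalise(text, variants))
```

A plain lookup like this cannot decide ambiguous cases (a form that is a variant in one context and a valid word in another), which is one reason a documented rationale for each decision matters.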

Citations: 45

David L. Hoover, Jonathan Culpeper and Kieran O’Halloran. Digital literary studies

ICAME journal : computers in English linguistics

Pub Date : 2015-03-01 DOI: 10.1515/icame-2015-0009

Jesse Egbert

Citations: 0

Lieven Vandelanotte, Kristin Davidse, Caroline Gentens and Ditte Kimps (eds.). Recent advances in corpus linguistics. Developing and exploiting corpora

ICAME journal : computers in English linguistics

Pub Date : 2015-03-01 DOI: 10.1515/icame-2015-0012

Leonie Wiemeyer

Citations: 1

Susan Nacey. Metaphors in learner English

ICAME journal : computers in English linguistics

Pub Date : 2015-03-01 DOI: 10.1515/icame-2015-0010

R. Kreyer

Citations: 1

Modest XPath and XQuery for corpora: Exploiting deep XML annotation

ICAME journal : computers in English linguistics

Pub Date : 2015-03-01 DOI: 10.1515/icame-2015-0003

Christoph Rühlemann, Andrej Bagoutdinov, M. O'Donnell

Abstract This paper outlines a modest approach to XPath and XQuery, tools allowing the navigation and exploitation of XML-encoded texts. The paper starts off from where Andrew Hardie’s paper “Modest XML for corpora: Not a standard, but a suggestion” (Hardie 2014) left the reader, namely wondering how one’s corpus can be usefully analyzed once its XML-encoding is finished, a question the paper did not address. Hardie argued persuasively that “there is a clear benefit to be had from a set of recommendations (not a standard) that outlines general best practices in the use of XML in corpora without going into any of the more technical aspects of XML or the full weight of TEI encoding” (Hardie 2014: 73). In a similar vein, this paper argues that even a basic understanding of XPath and XQuery can bring great benefits to corpus linguists. To make this point, we present not only a modest introduction to basic structures underlying the XPath and XQuery syntax but demonstrate their analytical potential using Obama’s 2009 Inaugural Address as a test bed. The speech was encoded in XML, automatically PoS-tagged and manually annotated on additional layers that target two rhetorical figures, anaphora and isocola. We refer to this resource as the Inaugural Rhetorical Corpus (IRC). Further, we created a companion website hosting not only the Inaugural Rhetorical Corpus, but also the Inaugural Training Corpus (a training corpus in the form of an abbreviated version of the IRC to allow manual checks of query results) as well as an extensive list of tried and tested queries for use with either corpus. All of the queries presented in this paper are at beginner to lower-intermediate levels of XPath/XQuery expertise. Nonetheless, they yield fruitful results: they show how Obama uses the inclusive pronouns we and our as a discursive strategy to advance his political strategy to re-focus American politics on economic and domestic matters. Further, they demonstrate how sentence length contributes to the build-up of climactic tension. Finally, they suggest that Obama’s signature rhetorical figure is the isocolon and that the overwhelming majority of isocola in the speech instantiate the crescens type, where the cola gradually increase in length over the sequence.
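To give a sense of the kind of query the abstract has in mind, here is a minimal sketch that counts pronoun tokens and measures sentence length over word-level XML, using the limited XPath subset in Python's standard library. It does not reproduce the IRC or the paper's queries: the element names (`s`, `w`), the attributes (`n`, `lemma`) and the two toy sentences are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Toy sample with a hypothetical annotation scheme; not taken from the IRC.
SAMPLE = """<speech>
  <s n="1"><w lemma="we">We</w><w lemma="build">build</w><w lemma="our">our</w><w lemma="future">future</w></s>
  <s n="2"><w lemma="they">They</w><w lemma="wait">wait</w></s>
</speech>"""

ROOT = ET.fromstring(SAMPLE)


def count_lemma(root, lemma):
    """Count <w> elements anywhere in the tree whose lemma attribute matches."""
    return len(root.findall(f".//w[@lemma='{lemma}']"))


def sentence_lengths(root):
    """Map each sentence's n attribute to the number of <w> children it has."""
    return {s.get("n"): len(s.findall("w")) for s in root.iter("s")}
```

With this sample, `count_lemma(ROOT, "we")` counts the single token lemmatised as "we", and `sentence_lengths(ROOT)` gives the word count per sentence. A full XPath or XQuery processor would allow richer expressions than the standard-library subset used here.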

Citations: 13

Tony Berber Sardinha and Marcia Veirano Pinto (eds.). Multi-Dimensional analysis, 25 years on – a tribute to Douglas Biber

ICAME journal : computers in English linguistics

Pub Date : 2015-03-01 DOI: 10.1515/icame-2015-0006

Marco Schilk

Citations: 1

Word frequency and collocation: Using children’s literature in adult learning

ICAME journal : computers in English linguistics

Pub Date : 2015-03-01 DOI: 10.1515/icame-2015-0004

E. Thomas

Abstract This study involved the creation of a corpus of children’s literature spanning 5.5 million words. Using concordance software, the corpus was able to show the most frequent words and collocations. These will be of interest both to literary researchers in the genre of children’s literature and also teachers and applied linguists working with adult students of English.
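The two measures named in the abstract can be sketched in a few lines. The abstract does not say which concordance software or collocation measure was used, so the following is only an assumption-laden illustration: it uses adjacent word pairs (bigrams) as the simplest proxy for collocation, and the tokeniser and function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens):
    """Count how often each word form occurs."""
    return Counter(tokens)


def bigram_counts(tokens):
    """Count adjacent word pairs, a crude stand-in for collocation."""
    return Counter(zip(tokens, tokens[1:]))
```

Calling `word_frequencies(tokens).most_common(10)` or `bigram_counts(tokens).most_common(10)` lists the top items. Raw pair counts favour frequent function words, so dedicated tools usually rank pairs with an association measure instead.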

Citations: 1

Claudia Claridge. Hyperbole in English: A corpus-based study of exaggeration

ICAME journal : computers in English linguistics

Pub Date : 2015-03-01 DOI: 10.1515/icame-2015-0007

M. Levin

Citations: 1

Christoph Rühlemann. Narrative in English conversation

ICAME journal : computers in English linguistics

Pub Date : 2015-01-01 DOI: 10.1515/icame-2015-0011

A. Partington

Citations: 1

Review: Sandra Götz. Fluency in native and nonnative English speech

ICAME journal : computers in English linguistics

Pub Date : 2014-04-28 DOI: 10.2478/icame-2014-0013

P. Pérez-Paredes

Citations: 0
