Romance Parsed Corpora: Latest Articles

Analyzing the structure of code-switched written texts

Romance Parsed Corpora

Pub Date : 2018-07-13 DOI: 10.1075/LV.00007.EST

Bruno Estigarribia, Zachary Wilkins

As more written language data become available, interest in written language mixing/codeswitching (LM/CS) is increasing (Sebba, Mahootian & Jonsson 2012; Sebba 2013). LM/CS in non-naturalistic (e.g., literary) texts raises issues related to (1) gauging the authenticity and representativity of a textual corpus, and (2) deciding whether categories/mechanisms of spoken LM/CS apply to written LM/CS. We focus on Guarani-Spanish LM/CS (Jopara) as represented in the Paraguayan novel Ramona Quebranto (RQ). We apply the framework of Muysken (1997; 2000; 2013), developed as a taxonomy of spoken LM/CS. Our contribution extends its applicability to written LM/CS. We show that Jopara has a mix of insertional and backflagging strategies, with infrequent alternations.

Citations: 1

The challenges and benefits of annotating oral bilingual corpora

Romance Parsed Corpora

Pub Date : 2018-07-13 DOI: 10.1075/LV.00006.BUL

Barbara E. Bullock, Jacqueline Serigos, Almeida Jacqueline Toribio, Arthur Wendorf

This article describes efforts to collect, process, and automatically annotate a corpus of Spanish as spoken in Texas. It elaborates the protocols for the development of the corpus and the procedures for automatic annotation, illustrating the common pitfalls in language identification in bilingual corpora and potential methods for circumventing them. The benefits of a comparative corpus approach to contact varieties are illustrated by a case study of a putative verbal calque from the Spanish in Texas data. It is demonstrated that the relative frequency of the verb is much higher than in its source Mexican variety and that the verb selects different complements in Texas than it does in other varieties. The article concludes with a discussion of how computational tools might be fruitfully exploited to resolve long-standing debates about language variation in contact settings.
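
As a rough illustration of the frequency comparison described in this abstract, the following Python sketch normalizes a verb's raw count to occurrences per million tokens so that corpora of different sizes can be compared. The function name, the counts, and the corpus sizes are hypothetical and chosen only for illustration; they are not figures from the article, and the article's own procedure may differ.

```python
def per_million(count: int, corpus_tokens: int) -> float:
    """Return occurrences per million tokens (a normalized relative frequency)."""
    if corpus_tokens <= 0:
        raise ValueError("corpus_tokens must be positive")
    # Multiply before dividing to keep the arithmetic exact for round numbers.
    return count * 1_000_000 / corpus_tokens


# Hypothetical counts and corpus sizes, NOT data from the article.
texas = per_million(count=120, corpus_tokens=500_000)
mexico = per_million(count=90, corpus_tokens=1_500_000)

# How many times more frequent the verb is in the first corpus.
ratio = texas / mexico
```

Normalizing per million tokens is one common convention; any fixed base works as long as both corpora use the same one.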

Citations: 5

Eguren, Luis, Olga Fernández-Soriano & Amaya Mendikoetxea (eds.). Rethinking Parameters.

Romance Parsed Corpora

Pub Date : 2018-07-13 DOI: 10.1075/LV.17009.RAN

Rodrigo Ranero

Citations: 0

The Tycho Brahe Corpus of Historical Portuguese

Romance Parsed Corpora

Pub Date : 2018-07-13 DOI: 10.1075/LV.00004.GAL

Charlotte Galves

This article introduces the Tycho Brahe Corpus (TBC), a parsed corpus of Historical Portuguese built on the model of the Penn-York Corpora of English. As an illustration of the usefulness of the TBC, the article presents research on the evolution of the position and interpretation of subjects in Portuguese from the 16th to the 19th century. Two main claims emerge, in response to questions that have largely remained unanswered until now, due to the paucity of available data. One is that the texts of the classical period instantiate verb-movement to Comp in matrix clauses, reflecting a V2 grammar. The other is that quantitative and qualitative changes appearing in the texts of the authors born from the beginning of the 18th century on indicate that, at this period, verb-movement to Comp was lost and the modern SVO grammar emerged.

Citations: 13

The variable use of determiners in Old French and the argument DP hypothesis

Romance Parsed Corpora

Pub Date : 2018-07-13 DOI: 10.1075/LV.00003.DUF

Monique Dufresne, Mireille Tremblay, R. Déchaîne

The argument DP hypothesis, adopted by many syntactic analyses, claims that nominal arguments are introduced by a determiner (D), which may be covert or overt. While overt D is obligatory in Modern French (consistent with the argument DP hypothesis), it was not obligatory in earlier stages of French. We explore the factors that contributed to this change – including semantic class, syntactic function, number, and definiteness – focusing on a shift that occurred in the D-paradigm in two Anglo-Norman texts of the 12th century. Quantitative analysis (Goldvarb) yields two major findings. First, the effect of syntactic function remains constant: subject position favours overt D, but object position inhibits it. Second, there is a change in the effect of semantic class: count nouns increasingly favour overt D, but non-count (mass and abstract) nouns increasingly inhibit it. More generally, the gradual disappearance of bare Ns in French reflects the emergence of paradigmatically conditioned D.

Citations: 1

Diachronic syntax based on constituency and dependency annotated corpora

Romance Parsed Corpora

Pub Date : 2018-07-13 DOI: 10.1075/LV.00005.STE

A. Stein

This contribution presents two syntactically annotated corpora of Old French, Modéliser le changement: les voies du français (MCVF) and the Syntactic Reference Corpus of Medieval French (SRCMF). The focus is on how the underlying syntactic theory (constituency vs. dependency) influences the grammar model and how this choice is reflected in the syntactic annotations of the corpora. The comparison relates to the most relevant general properties of the corpora as well as to two phenomena, null subjects and cleft constructions. Null subjects highlight possible conflicts between syntactic annotation models and syntactic theory, and the information-structural properties of cleft constructions pose a particular problem for the interpretation and annotation of historical corpora. Both phenomena are major instances of diachronic variation in French. The study is relevant for corpus users working on diachronic syntax, as well as for corpus builders wishing to design a grammar model for annotation.

Citations: 0

Merging verb cluster variation 合并动词簇变异

Romance Parsed Corpora

Pub Date : 2018-07-13 DOI: 10.1075/LV.00008.BAR

S. Barbiers, H. J. Bennis, L. Dros-Hendriks

In this paper we argue that verb clusters in Dutch varieties are merged and linearized in fully ascending (1-2-3) or fully descending (3-2-1) orders. We argue that verb clusters that deviate from these orders involve non-verbal material: adjectival participles, or nominal infinitives. As a result, our approach does not involve any unmotivated movements that are specific for verb clusters. Support for our analysis comes from (i) the interpretation of verb clusters; (ii) the fact that order variation depends on the types of verbs involved, which can be explained by selectional requirements of the verbs; and (iii) the geographic co-occurrence patterns of various orders. First, the 1-3-2 and 3-1-2 orders are argued to be ascending orders with a non-verbal 3. Indeed these orders occur in grammars that have ascending, rather than descending, verb clusters. Secondly, the 1-3-2 order is argued to be an interrupted V1-V2 cluster with a non-verbal 3. Indeed, this order is most common in the region where non-verbal material can interrupt the verb cluster. Our analysis of word order variation in verb clusters in terms of principles of grammar is further supported by an experiment in which we asked a large number of speakers distributed over the Dutch language area to rank all logically possible orders, including orders that are not common in their own variety of Dutch. The results demonstrate that speakers apply their syntactic knowledge to rank verb cluster orders that they do not use themselves. We argue that this knowledge cannot be due to familiarity with the various orders.
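
To make "all logically possible orders" concrete, here is a minimal Python sketch that enumerates the orders of a three-verb cluster using the abstract's numeric labels. It assumes a cluster of exactly three verbs; the function and variable names are hypothetical, and the sketch illustrates only the combinatorics, not the authors' analysis or experimental design.

```python
from itertools import permutations


def cluster_orders(n_verbs: int = 3) -> list[str]:
    """List every logically possible order of an n-verb cluster, e.g. '1-2-3'."""
    return ["-".join(map(str, p)) for p in permutations(range(1, n_verbs + 1))]


orders = cluster_orders()

# The two orders the abstract describes as fully ascending and fully descending.
fully_ordered = {"1-2-3", "3-2-1"}

# Every remaining order deviates from those two.
deviating = [o for o in orders if o not in fully_ordered]
```

With three verbs there are 3! = 6 orders in total, so four of them deviate from the fully ascending and fully descending patterns.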

Citations: 12
