Language Dynamics and Change: Latest Articles

Prehistoric languages and human self-domestication
IF 0.7 Q1 Arts and Humanities Pub Date : 2018-12-10 DOI: 10.31234/osf.io/v6m2b
A. Benítez‐Burraco
The comparative method has enabled us to trace distant phylogenetic relationships among languages and reconstruct extinct languages from the past. Nonetheless, it has limitations, mostly resulting from the circumstance that languages also change by contact with unrelated languages and in response to external factors, particularly, aspects of human cognition and features of our physical and cultural environments. In this paper, it is argued that the limitations of historical linguistics can be partially alleviated by the consideration of the links between language structure and the biological underpinnings of human language, human cognition, and human behaviour, and specifically, of human self-domestication (that is, the existence in humans of features of domesticated mammals). Overall, we can expect that the languages spoken in remote prehistory exhibited most of the features of the so-called esoteric languages, which are used by present-day, close-knit, small human communities that share a great deal of knowledge about their environment.
Citations: 6
Modeling change in contact settings
IF 0.7 Q1 Arts and Humanities Pub Date : 2018-11-16 DOI: 10.1163/22105832-00802006
Katia Chirkova, T. Gong
Convergence is an oft-used notion in contact linguistics and historical linguistics. Yet it is problematic as an explanatory account for the changes it represents. In this study, we model one specific case of convergence (Duoxu, an endangered Tibeto-Burman language with 9 remaining speakers) to contribute to a more systematic understanding of the mechanisms underlying this phenomenon. The goals are (1) to address the role of some linguistic and social factors assumed to have an effect on the process of convergence, and (2) to test the following explanations of empirical observations related to phonological convergence: (a) the loss of phonological segments in a language that has undergone convergence is correlated with the relative frequency and markedness of these segments in the combined bilingual repertoire, and (b) widespread bilingualism is a prerequisite for convergence. The results of our agent-based simulation affirm the importance of frequency and markedness of phonological segments in the process of convergence. At the same time, they suggest that the explanation related to widespread bilingualism may not be valid. Our study suggests computer simulations as a promising tool for investigation of complex cases of language change in contact settings.
Citations: 2
Linguistic diversification as a long-term effect of asymmetric priming
IF 0.7 Q1 Arts and Humanities Pub Date : 2018-10-01 DOI: 10.1163/22105832-00802002
Andreas Baumann, Lotte Sommerer
This paper tries to narrow the gap between diachronic linguistics and research on population dynamics by presenting a mathematical model corroborating the notion that the cognitive mechanism of asymmetric priming can account for observable tendencies in language change. The asymmetric-priming hypothesis asserts that items with more substance are more likely to prime items with less substance than the reverse. Although these effects operate on a very short time scale (e.g. within an utterance) it has been argued that their long-term effect might be reductionist, unidirectional processes in language change. In this paper, we study a mathematical model of the interaction of linguistic items that differ in their formal substance, showing that, in addition to reductionist effects, asymmetric priming also results in diversification and stable coexistence of two formally related variants. The model will be applied to phenomena in the sublexical as well as the lexical domain.
Citations: 3
Testing an agent-based model of language choice on sociolinguistic survey data
IF 0.7 Q1 Arts and Humanities Pub Date : 2018-10-01 DOI: 10.1163/22105832-00802004
Andres Karjus, Martin Ehala
The paper outlines an agent-based model for language choice in multilingual communities and tests its performance on samples of data drawn from a large-scale sociolinguistic survey carried out in Estonia. While previous research in the field of language competition has focused on diachronic applications, utilizing rather abstract models of uniform speakers, we aim to model synchronic language competition among more realistic, data-based agents. We hypothesized that a reasonably parametrized simulation of interactions between agents endowed with interaction principles grounded in sociolinguistic research would give rise to a network structure resembling real-world social networks, and that the distribution of languages used in the model would resemble their actual usage distribution. The simulation was reasonably successful in replicating the real-world scenarios, while further analysis revealed that the model parameters differ in importance between samples. We conclude that such variation should be considered in parametrizing future language choice and competition models.
Citations: 8
A new approach to concept basicness and stability as a window to the robustness of concept list rankings
IF 0.7 Q1 Arts and Humanities Pub Date : 2018-10-01 DOI: 10.1163/22105832-00802001
Johannes Dellert, Armin Buch
Based on a recently published large-scale lexicostatistical database, we rank 1,016 concepts by their suitability for inclusion in Swadesh-style lists of basic stable concepts. For this, we define separate measures of basicness and stability. Basicness in the sense of morphological simplicity is measured based on information content, a generalization of word length which corrects for distorting effects of phoneme inventory sizes, phonotactics and non-stem morphemes in dictionary forms. Stability against replacement by semantic shift or borrowing is measured by sampling independent language pairs, and correlating the distances between the forms for the concept with the overall language distances. In order to determine the relative importance of basicness and stability, we optimize our combination of the two partial measures towards similarity with existing lists. A comparison with and among existing rankings suggests that concept rankings are highly data-dependent and therefore less well-grounded than previously assumed. To explore this issue, we evaluate the robustness of our ranking against language pair resampling, allowing us to assess how much volatility can be expected, and showing that only about half of the concepts on a list based on our ranking can safely be assumed to belong on the list independently of the data.
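The stability measure described above, correlating per-concept form distances with overall language distances across sampled language pairs, can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which correlation coefficient or distance metrics are used, so a plain Pearson correlation stands in, and all numbers and variable names below are invented toy values rather than data from the paper.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Plain Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical overall distances for five sampled language pairs.
language_distances = [0.1, 0.3, 0.5, 0.7, 0.9]

# Hypothetical form distances for two concepts across the same pairs.
# A stable concept's forms diverge in step with the languages themselves;
# a frequently replaced or borrowed concept shows no such relationship.
form_distances_stable = [0.15, 0.35, 0.45, 0.75, 0.85]
form_distances_unstable = [0.8, 0.2, 0.9, 0.1, 0.5]

stable_r = pearson(language_distances, form_distances_stable)
unstable_r = pearson(language_distances, form_distances_unstable)
print(f"stable concept:   r = {stable_r:.2f}")
print(f"unstable concept: r = {unstable_r:.2f}")
```

Under this reading, a concept whose form distances track the language distances closely gets a high score and ranks as more stable; rerunning the calculation on resampled language pairs gives a sense of how volatile such a ranking is.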
Citations: 14
Using ancestral state reconstruction methods for onomasiological reconstruction in multilingual word lists
IF 0.7 Q1 Arts and Humanities Pub Date : 2018-06-22 DOI: 10.1163/22105832-00801002
Gerhard Jäger, Johann-Mattis List
Current efforts in computational historical linguistics are predominantly concerned with phylogenetic inference. Methods for ancestral state reconstruction have only been applied sporadically. In contrast to phylogenetic algorithms, automatic reconstruction methods presuppose phylogenetic information in order to explain what has evolved when and where. Here we report a pilot study exploring how well automatic methods for ancestral state reconstruction perform in the task of onomasiological reconstruction in multilingual word lists, where algorithms are used to infer how the words evolved along a given phylogeny, and reconstruct which cognate classes were used to express a given meaning in the ancestral languages. Comparing three different methods, Maximum Parsimony, Minimal Lateral Networks, and Maximum Likelihood on three different test sets (Indo-European, Austronesian, Chinese) using binary and multi-state coding of the data as well as single and sampled phylogenies, we find that Maximum Likelihood largely outperforms the other methods. At the same time, however, the general performance was disappointingly low, ranging between 0.66 (Chinese) and 0.79 (Austronesian) for the F-Scores. A closer linguistic evaluation of the reconstructions proposed by the best method and the reconstructions given in the gold standards revealed that the majority of the cases where the algorithms failed can be attributed to problems of independent semantic shift (homoplasy), to morphological processes in lexical change, and to wrong reconstructions in the independently created test sets that we employed.
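To make the task concrete, here is a minimal sketch of the bottom-up pass of Fitch's small-parsimony algorithm, one standard way of doing Maximum Parsimony reconstruction, applied to a single meaning with multi-state coding. The tree, the language labels A to D, and the cognate class labels are invented for illustration; this is not the authors' implementation, and their coding and software may differ.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name (str) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_sets(tree: Tree, leaf_states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Bottom-up Fitch pass.

    Returns the set of most-parsimonious candidate states at the root of
    `tree` and the minimum number of state changes needed below it.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_sets(left, leaf_states)
    right_set, right_cost = fitch_sets(right, leaf_states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical phylogeny ((A,B),(C,D)) and the cognate class each language
# uses for one meaning.
toy_tree: Tree = (("A", "B"), ("C", "D"))
toy_states = {"A": "cog1", "B": "cog1", "C": "cog1", "D": "cog2"}

root_set, changes = fitch_sets(toy_tree, toy_states)
print(root_set, changes)
```

In this toy case the ancestral language is reconstructed as using `cog1`, with a single replacement on the branch leading to D. Independent parallel shifts (homoplasy), which the abstract names as a major source of error, are exactly the cases where such a count-minimising reconstruction can go wrong.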
Citations: 13
Making genealogical language classifications available for phylogenetic analysis
IF 0.7 Q1 Arts and Humanities Pub Date : 2018-06-22 DOI: 10.1163/22105832-00801001
D. Dediu
One of the best-known types of non-independence between languages is caused by genealogical relationships due to descent from a common ancestor. These can be represented by (more or less resolved and controversial) language family trees. In theory, one can argue that language families should be built through the strict application of the comparative method of historical linguistics, but in practice this is not always the case, and there are several proposed classifications of languages into language families, each with its own advantages and disadvantages. A major stumbling block shared by most of them is that they are relatively difficult to use with computational methods, and in particular with phylogenetics. This is due to their lack of standardization, coupled with the general non-availability of branch length information, which encapsulates the amount of evolution taking place on the family tree. In this paper I introduce a method (and its implementation in R) that converts the language classifications provided by four widely-used databases (Ethnologue, WALS, AUTOTYP and Glottolog) into the de facto Newick standard generally used in phylogenetics, aligns the four most used conventions for unique identifiers of linguistic entities (ISO 639-3, WALS, AUTOTYP and Glottocode), and adds branch length information from a variety of sources (the tree’s own topology, an externally given numeric constant, or a distance matrix). The R scripts, input data and resulting Newick trees are available under liberal open-source licenses in a GitHub repository (https://github.com/ddediu/lgfam-newick), to encourage and promote the use of phylogenetic methods to investigate linguistic diversity and its temporal dynamics.
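As a rough illustration of the conversion described above, the sketch below serialises a nested classification into Newick with a constant branch length, which is one of the three branch-length sources the abstract lists. The author's implementation is in R and lives in the linked repository; this Python version is an independent toy, and the function name `to_newick`, the dictionary layout, and the `ToyFamily` root label are assumptions made for the example.

```python
from typing import Dict

# A classification node maps each child name to that child's own children;
# a language (leaf) has an empty dict.
Node = Dict[str, "Node"]


def to_newick(name: str, children: Node, branch_length: float = 1.0) -> str:
    """Serialise one node and its subtree, using a constant branch length."""
    label = f"{name}:{branch_length:g}"
    if not children:
        return label
    inner = ",".join(
        to_newick(child, grandchildren, branch_length)
        for child, grandchildren in children.items()
    )
    return f"({inner}){label}"


# Toy classification; real databases are far larger and less resolved.
toy_family: Node = {
    "Germanic": {"English": {}, "German": {}},
    "Romance": {"French": {}},
}

# The root carries no branch length; a Newick string ends with a semicolon.
newick = "(" + ",".join(to_newick(n, c) for n, c in toy_family.items()) + ")ToyFamily;"
print(newick)
# → ((English:1,German:1)Germanic:1,(French:1)Romance:1)ToyFamily;
```

Node names containing spaces, commas, colons or parentheses would need quoting or replacement before being written to Newick; the sketch ignores that, along with the alignment of identifiers across databases.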
Citations: 7
The effect of dictionary omissions on phylogenies computationally inferred from lexical data
IF 0.7 Q1 Arts and Humanities Pub Date : 2018-06-22 DOI: 10.1163/22105832-00801007
I. Yanovich
Lexical datasets used for computational phylogenetic inference suffer a unique type of data error. Some words actually present in a language may be absent from the dataset at no fault of its curators: especially for lesser-studied languages, a word may be missing from all available sources such as dictionaries. It is thus important to be able to (i) check how robust one’s inferences are to dictionary omission errors, and (ii) incorporate the knowledge that such errors may be present into one’s inference. I introduce two simple techniques that work towards those goals, and study the possible effects of dictionary omission errors in two real-life case studies on the Lezgian and Uralic datasets from Kassian (2015) and Syrjänen et al. (2013), respectively. The effects of dictionary omission turn out to be moderate (Lezgian) to negligible (Uralic), and certainly far less significant than the possible effects of modeling choices, including priors, on the inferred phylogeny, as demonstrated in the Uralic case study. Assessing the possible effects of dictionary omissions is advisable, but severe problems are unlikely. Collecting significantly larger lexical datasets, in order to overcome sensitivity to priors, is likely more important than expending resources on verifying data against dictionary omissions.
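The abstract does not spell out the two techniques, so the following is only a generic perturbation check in the spirit of goal (i), not the author's method: randomly delete attested entries at an assumed omission rate and see how far a simple pairwise distance moves. The toy lexicon, the 10% rate, the Jaccard distance, and all function names are assumptions chosen for illustration.

```python
import random
from typing import Dict, Set

# Toy data: each language maps to the cognate classes attested in its sources.
Lexicon = Dict[str, Set[str]]


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |a & b| / |a | b|, defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def omit_entries(lexicon: Lexicon, rate: float, rng: random.Random) -> Lexicon:
    """Drop each attested cognate class independently with probability `rate`."""
    return {
        # sorted() makes the order of random draws independent of set ordering
        lang: {c for c in sorted(classes) if rng.random() >= rate}
        for lang, classes in lexicon.items()
    }


toy_lexicon: Lexicon = {
    "L1": {"hand_1", "water_1", "stone_1", "fire_1"},
    "L2": {"hand_1", "water_1", "stone_2", "fire_1"},
}
baseline = jaccard_distance(toy_lexicon["L1"], toy_lexicon["L2"])

rng = random.Random(0)  # fixed seed so the run is repeatable
perturbed = [
    jaccard_distance(*omit_entries(toy_lexicon, 0.1, rng).values())
    for _ in range(1000)
]
mean_shift = sum(perturbed) / len(perturbed) - baseline
print(f"baseline distance: {baseline:.2f}, mean shift under omission: {mean_shift:+.3f}")
```

A full analysis would rerun the phylogenetic inference on each perturbed dataset rather than compare one distance, but the logic is the same: if the results barely move under plausible omission rates, dictionary omissions are unlikely to be driving the inferred tree.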
Citations: 1
Halfway up the mountain
IF 0.7 Q1 Arts and Humanities Pub Date : 2018-06-22 DOI: 10.1163/22105832-00801006
J. Heath
Citations: 0
Armenian prosody in typology and diachrony
IF 0.7 Q1 Arts and Humanities Pub Date : 2018-06-22 DOI: 10.1163/22105832-00801005
J. DeLisi
This paper examines the relationship between typology and historical linguistics through a case study from the history of Armenian, where two different stress systems are found in the modern language. The first is a penult system with no associated secondary stress ([… σ́σ]ω). The other, the so-called hammock pattern, has primary stress on the final syllable and secondary stress on the initial syllable of the prosodic word ([σ̀ … σ́]ω). Although penult stress patterns are by far more typologically common than the hammock pattern in the world’s languages, I will argue that the hammock pattern must be reconstructed for the period of shared innovation, the Proto-Armenian period.
Citations: 5