Paul Van Eecke, Katrien Beuls, Jérôme Botoko Ekila, Roxana Rădulescu
Today, computational models of emergent communication in populations of autonomous agents are studied through two main methodological paradigms: multi-agent reinforcement learning (MARL) and the language game paradigm. While the two paradigms share their main objectives and employ strikingly similar methods, interaction between the two communities has so far been surprisingly limited. This can to a large extent be ascribed to the use of different terminologies and experimental designs, which sometimes hinder the detection and interpretation of one another’s results and progress. Through this paper, we aim to remedy this situation by (1) formulating the challenge of re-conceptualising the language game experimental paradigm in the framework of MARL, and by (2) providing both an alignment between their terminologies and a MARL-based reformulation of the canonical naming game experiment. Tackling this challenge will enable future language game experiments to benefit from the rapid and promising methodological advances in the MARL community, and will enable future MARL experiments on learning emergent communication to benefit from the insights and results gained through language game experiments. We strongly believe that this cross-pollination has the potential to lead to major breakthroughs in the modelling of how human-like languages can emerge and evolve in multi-agent systems.
Language games meet multi-agent reinforcement learning: A case study for the naming game. Paul Van Eecke, Katrien Beuls, Jérôme Botoko Ekila, Roxana Rădulescu. Journal of Language Evolution, published 2023-04-18. https://doi.org/10.1093/jole/lzad001
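The naming game named in this abstract can be illustrated with a short simulation. The following is a minimal sketch and not the authors' MARL-based reformulation: it assumes the simplest single-object variant of the game, in which a speaker utters one word from its inventory and both agents collapse their inventories to that word on success. The function name `play_naming_game`, its parameters, and the 0/1 reward recorded per interaction are hypothetical choices made for illustration.

```python
import random


def play_naming_game(n_agents=20, n_rounds=5000, seed=0):
    """Minimal single-object naming game (illustrative assumption, not the paper's setup)."""
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]  # words each agent knows for the object
    rewards = []  # 1 for a successful interaction, 0 for a failed one
    next_word = 0
    for _ in range(n_rounds):
        speaker, hearer = rng.sample(range(n_agents), 2)
        # A speaker with no word for the object invents a new one.
        if not inventories[speaker]:
            inventories[speaker].add(f"w{next_word}")
            next_word += 1
        word = rng.choice(sorted(inventories[speaker]))
        if word in inventories[hearer]:
            # Success: both agents keep only the word that worked.
            inventories[speaker] = {word}
            inventories[hearer] = {word}
            rewards.append(1)
        else:
            # Failure: the hearer adopts the unknown word.
            inventories[hearer].add(word)
            rewards.append(0)
    return inventories, rewards


if __name__ == "__main__":
    inventories, rewards = play_naming_game()
    print("distinct inventories at the end:", len({frozenset(i) for i in inventories}))
    print("success rate over the last 100 games:", sum(rewards[-100:]) / 100)
```

Recording a per-interaction reward is one assumed way of exposing the game's feedback in a form a reinforcement learning agent could consume; richer variants with several objects or scored word associations would need a different update rule.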
Sławomir Wacewicz, M. Pleyer, A. Szczepańska, Aleksandra Ewa Poniewierska, Przemysław Żywiczyński
The last three decades have brought a wealth of new empirical data and methods that have transformed investigations of language evolution into a fast-growing field of scientific research. In this paper, we investigate how the results of this research are represented in the content of the most popular introductory linguistics textbooks. We carried out a comprehensive computer-assisted qualitative study, in which we inspected eighteen English-language textbooks for all content related to the evolutionary emergence of language and its uniqueness in nature, in order to evaluate its thematic scope, selection of topics, theories covered, researchers cited, structural soundness, currency, and factual accuracy. Overall, we found that the content of interest lacks a defined canonical representation across the textbooks. The coverage of animal communication was relatively broad, with some recurring classic examples, such as vervet monkeys or honeybees; this content was mostly structured around the ‘design features’ approach. In contrast, the coverage of topics related to language origins and evolution was much less extensive and systematic, and tended to include a relatively large proportion of content of historical value (i.e. creation myths, ‘bow-wow’ theories). We conclude by making recommendations for future editions of textbooks, in particular a better representation of important frameworks such as signalling theory, and of current research results in this fast-paced field.
The representation of animal communication and language evolution in introductory linguistics textbooks. Sławomir Wacewicz, M. Pleyer, A. Szczepańska, Aleksandra Ewa Poniewierska, Przemysław Żywiczyński. Journal of Language Evolution, published 2023-04-18. https://doi.org/10.1093/jole/lzac010
The so-called ‘Altaic’ languages have been the subject of debate for over 200 years. An array of different data sets has been used to investigate the genealogical relationships between them, but the controversy persists. Structural features are a new type of data with high potential for such cases in historical linguistics; they are sometimes declared to be prone to borrowing and discarded from the outset, and at other times considered to carry an especially precise historical signal that reaches further back in time than other types of linguistic data. We investigate the performance of typological features across different domains of language by using an admixture model from genetics. As implemented in the software STRUCTURE, this model allows us to account for both a genealogical and an areal signal in the data. Our analysis shows that morphological features have the strongest genealogical signal and that syntactic features diffuse most easily. When using only morphological structural data, the model is able to correctly identify three language families (Turkic, Mongolic, and Tungusic), whereas Japonic and Koreanic languages are assigned the same ancestry.
Modelling admixture across language levels to evaluate deep history claims. Nataliia Hübler, Simon J. Greenhill. Journal of Language Evolution, published 2023-03-29. https://doi.org/10.1093/jole/lzad002
Sławomir Wacewicz, Marta Sibierska, Placiński Marek, A. Szczepańska, Aleksandra Poniewierska, Yen Ng, Przemysław Żywiczyński
Language evolution is a modern incarnation of a long intellectual tradition that addresses the fundamental question of how language began. Such a formulation is intuitively obvious, but a more precise characterisation of this area of research, with its central notions of language and evolution, has proved surprisingly elusive. In this paper, we show how conceptual analysis can be complemented with scientometric analysis in describing language evolution. To this end, we built a database containing information on the contributions and contributors to the proceedings of the nine most recent iterations (years 2004–20) of the Evolang conference, which, given its long history (1996–) and attendance rates, gives a good reflection of the thematic scope and research trends in the field of language evolution as a whole. We present several analyses of these data, concerning the geographical distribution of the researchers contributing to the conference, a set of ‘classic’ references most frequently cited in Evolang proceedings, researcher profiles self-associated with the most popular tags for this area of research (such as ‘evolution of language’ vs. ‘language evolution’), and the changes to the profile of the conference as represented in the proportions of topics and author networks over the most recent Evolang iterations. While our resource is intended primarily as a source of insight into the Evolang conference, and by extension into the entire field of language evolution, it holds potential for comparisons with other fields and for addressing questions on the production of scientific knowledge.
The scientometric landscape of Evolang: A comprehensive database of the Evolang conference. Sławomir Wacewicz, Marta Sibierska, Placiński Marek, A. Szczepańska, Aleksandra Poniewierska, Yen Ng, Przemysław Żywiczyński. Journal of Language Evolution, published 2023-03-21. https://doi.org/10.1093/jole/lzad003
Valentin Thouzeau, Antonin Affholder, Philippe Mennecier, Paul Verdu, Frédéric Austerlitz
Historical linguistics has benefited strongly from recent methodological advances inspired by phylogenetics. Nevertheless, no available method uses contemporaneous within-population linguistic diversity to reconstruct the history of human populations. Here, we developed an approach inspired by population genetics to perform historical linguistic inferences from linguistic data sampled at the individual scale, within a population. We built four within-population demographic models of linguistic transmission over generations, each differing in the number of teachers involved during language acquisition and in the relative roles of the teachers. Using approximate Bayesian computation methods, we then compared the simulated data obtained with these models with real contemporaneous linguistic data sampled from Tajik speakers from Central Asia, an area known for its large within-population linguistic diversity. Under this statistical framework, we were able to select the models that best explained the data, and infer the best-fitting parameters under the selected models. The selected model assumes that the lexicon of individuals is the result of a vertical transmission by two teachers, with a specific lexicon for each teacher. This demonstrates the feasibility of using contemporaneous within-population linguistic diversity to infer historical features of human cultural evolution.
Inferring linguistic transmission between generations at the scale of individuals. Valentin Thouzeau, Antonin Affholder, Philippe Mennecier, Paul Verdu, Frédéric Austerlitz. Journal of Language Evolution, published 2023-03-21. https://doi.org/10.1093/jole/lzac009
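The model comparison step described in this abstract relies on approximate Bayesian computation (ABC). The sketch below shows the general idea of ABC rejection sampling for choosing between two competing simulators. It is a toy under loudly stated assumptions: the two simulators, their parameter priors, the summary statistic (how many lexical items a learner shares with one teacher), and all names are hypothetical stand-ins, not the four demographic models or the data of the study.

```python
import random


def simulate(model, n_items, rng):
    """Toy simulator: number of items (out of n_items) a learner shares with teacher A.

    Both models and their priors are hypothetical illustrations.
    """
    if model == "one_teacher":
        p = rng.uniform(0.8, 1.0)  # learner copies teacher A almost entirely
    else:  # "two_teachers"
        p = rng.uniform(0.3, 0.7)  # items are split between two teachers
    return sum(rng.random() < p for _ in range(n_items))


def abc_model_choice(observed, n_items, n_sims=20000, tol=2, seed=0):
    """ABC rejection: keep simulations whose summary statistic is within tol of the data."""
    rng = random.Random(seed)
    accepted = {"one_teacher": 0, "two_teachers": 0}
    for _ in range(n_sims):
        model = rng.choice(sorted(accepted))  # uniform prior over the two models
        if abs(simulate(model, n_items, rng) - observed) <= tol:
            accepted[model] += 1
    total = sum(accepted.values())
    # Approximate posterior model probabilities are the shares of accepted simulations.
    return {m: (c / total if total else float("nan")) for m, c in accepted.items()}


if __name__ == "__main__":
    # Hypothetical observation: the learner shares 52 of 100 items with teacher A.
    print(abc_model_choice(observed=52, n_items=100))
```

The tolerance `tol` trades accuracy against the number of accepted simulations; a real analysis would also use richer summary statistics and estimate model parameters from the accepted draws.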
Isabeau De Smet, Laura Rosseel, Freek Van de Velde
It has often been suggested that there is an inverse correlation between the number of adult non-native speakers of a language and its morphological complexity. Secluded languages often show more complex morphology, while high-contact languages go through more severe simplifications throughout the ages. One such simplification linked to language contact is the regularization of the Germanic past tense. Yet, a Wug task on the English past tense system by Cuskley et al. (2015) showed that non-native speakers tend to use the irregular past tense even more than native speakers. In this article, we replicate the Wug experiment for Dutch. Our results show similar evidence for a higher rate of irregularization among non-native speakers. Furthermore, we do not find any other simplification strategies among non-native speakers. Though caution is warranted, these converging results may suggest that non-native speakers are not the drivers of morphological simplification.
Are non-native speakers the drivers of morphological simplification? A Wug experiment on the Dutch past tense system. Isabeau De Smet, Laura Rosseel, Freek Van de Velde. Journal of Language Evolution, published 2022-08-12. https://doi.org/10.1093/jole/lzac008
There is growing evidence that cognitive biases play a role in shaping language structure. Here, we ask whether such biases could contribute to the prevalence of Zipfian word-frequency distributions in language, one of the striking commonalities between languages. Recent theoretical accounts and experimental findings suggest that such distributions provide a facilitative environment for word learning and segmentation. However, it remains unclear whether the advantage found in the laboratory reflects prior linguistic experience with such distributions or a cognitive preference for them. To explore this, we used an iterated learning paradigm (which can be used to reveal weak individual biases that are amplified over time) to see if learners change a uniform input distribution to make it more skewed via cultural transmission. In the first study, we show that speakers are biased to produce skewed word distributions when telling a novel story. In the second study, we ask if this bias leads to a shift from uniform distributions towards more skewed ones using an iterated learning design. We exposed the first learner to a story where six nonce words appeared equally often, and asked them to re-tell it. Their output served as input for the next learner, and so on for a chain of ten learners (or ‘generations’). Over time, word distributions became more skewed (as measured by lower levels of word entropy). The third study asked if the shift would be less pronounced when lexical access was made easier (by reminding participants of the novel word forms), but this did not have a significant effect on entropy reduction. These findings are consistent with a cognitive bias for skewed distributions that gets amplified over time and support the role of entropy minimization in the emergence of Zipfian distributions.
A Cognitive Bias for Zipfian Distributions? Uniform Distributions Become More Skewed via Cultural Transmission. Amir Shufaniya, Inbal Arnon. Journal of Language Evolution, published 2022-07-09. https://doi.org/10.1093/jole/lzac005
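The abstract measures skew as a drop in word entropy. As a minimal sketch of that measure, assuming it is the Shannon entropy of the word-frequency distribution in bits, the code below compares a uniform distribution over six nonce words with a skewed one. The word forms and token counts are invented for illustration and are not the study's materials or results.

```python
import math
from collections import Counter


def word_entropy(tokens):
    """Shannon entropy (in bits) of the word-frequency distribution of a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical nonce words; each appears equally often in the uniform story.
words = ["dax", "wug", "blick", "toma", "fep", "zib"]
uniform_story = [w for w in words for _ in range(5)]

# Hypothetical re-telling in which a few words dominate (same number of tokens).
skewed_counts = [15, 7, 4, 2, 1, 1]
skewed_story = [w for w, n in zip(words, skewed_counts) for _ in range(n)]

if __name__ == "__main__":
    print("uniform:", word_entropy(uniform_story))
    print("skewed: ", word_entropy(skewed_story))
```

For six equally frequent words the entropy reaches its maximum of log2(6) bits, so any re-telling that favours some words over others yields a lower value, which is the direction of change the abstract reports across generations.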
Teaching is widely understood to have an important role in cultural transmission. However, cultural transmission experiments typically do not document or analyse what happens during teaching. Here, we examine the content of teaching during skill transmission under two conditions: in the presence of the artefact (no-displacement condition) and in the absence of the artefact (displacement condition). Participants built baskets from various materials to carry as much rice as possible before teaching the next participant in line. The efficacy of baskets increased over generations in both conditions, and higher performing baskets were more frequently copied; however, the weight of rice transported did not differ between conditions. Displacement affected the choice of strategy by increasing innovation. To discuss non-routine events (those departing from expectations), teachers shared personal experience more than they used other types of teaching, especially in the presence of the artefact. Exposure to non-routine experience sharing during teaching increased subsequent innovation, supporting the idea that sharing experience through activities such as storytelling serves a sensemaking function in teaching. This study thus provides experimental evidence that sharing experience is a useful teaching method in the context of manual skill transmission.
Teaching, sharing experience, and innovation in cultural transmission. Ottilie Tilston, Adrian Bangerter, K. Tylén. Journal of Language Evolution, published 2022-07-06. https://doi.org/10.1093/jole/lzac007
This article attempts to revise the phylogenetic structure of the Turkic family using a computational lexicostatistical approach. The methodological framework of the present research is characterized by the following features: (1) wordlists with strictly controlled semantics; (2) step-by-step reconstruction using Swadesh wordlists for proto-languages; (3) three stages of post-processing of the input data (analysis of root cognacy, elimination of derivational drift, and optimization of homoplasy); (4) application of several computational algorithms (Starling neighbor-joining, Bayesian MCMC, and maximum parsimony). The analysis confirms the status of Chuvash as the first outlier and suggests a subsequent multifurcation of Proto-Nuclear-Turkic into eight branches. The Siberian Turkic group is a purely areal unity, that is, Yakut-Dolgan, Tofa-Tuvinian, Khakas-Mrassu, Sarygh Yugur and Altai do not form a clade. Altai is grouped together with the Kipchak languages as a separate taxon; it does not show a particularly close relationship with Kirghiz, which belongs to another Kipchak subgroup. Karluk is a low-level taxon inside the Kipchak clade.
{"title":"Phylogeny of the Turkic Languages Inferred from Basic Vocabulary: Limitations of the Lexicostatistical Methods in an Intensive Contact Situation","authors":"Ilya M Egorov, Anna V Dybo, Alexei S Kassian","doi":"10.1093/jole/lzac006","DOIUrl":"https://doi.org/10.1093/jole/lzac006","url":null,"PeriodicalId":37118,"journal":{"name":"Journal of Language Evolution","volume":"20 1","pages":""},"PeriodicalIF":2.6,"publicationDate":"2022-07-06","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"138519790"}
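The lexicostatistical approach named in the Turkic abstract rests on comparing wordlists slot by slot for shared cognates. The following is a minimal sketch of that first step only, assuming each language is coded as a list of cognate-class IDs per meaning slot; the language labels, the data, and the function names are hypothetical and are not taken from the article, and the sketch does not reproduce the authors' post-processing or their Starling, Bayesian, or parsimony analyses.

```python
from itertools import combinations

# Hypothetical toy data: one cognate-class ID per meaning slot.
# None marks a slot with no attested form in that language.
WORDLISTS = {
    "A": [1, 1, 1, 1],
    "B": [1, 1, 2, 1],
    "C": [1, 2, 3, None],
}


def lexicostat_distance(a, b):
    """Return 1 minus the share of matching cognate classes.

    Only slots attested in both wordlists are compared.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("the two wordlists share no attested slots")
    shared = sum(1 for x, y in pairs if x == y)
    return 1.0 - shared / len(pairs)


def distance_matrix(wordlists):
    """Compute pairwise distances, keyed by sorted language-label pairs."""
    return {
        (p, q): lexicostat_distance(wordlists[p], wordlists[q])
        for p, q in combinations(sorted(wordlists), 2)
    }


DISTANCES = distance_matrix(WORDLISTS)
for pair, dist in DISTANCES.items():
    print(pair, round(dist, 3))
```

A matrix of this kind is the usual input to a distance-based tree-building method such as neighbor-joining; character-based methods (Bayesian MCMC, maximum parsimony) would instead work from the cognate codings directly.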
Bayesian phylogenetic methods have been gaining traction in historical linguistics, as their potential for uncovering elements of language change is increasingly understood. Here, we demonstrate a proof of concept for using ancestral state reconstruction methods to reconstruct changes in morphology. We use a simple Brownian motion model of character evolution to test how splits in ergative marking evolve across Pama-Nyungan, a large family of Australian languages. We are able to recover linguistically plausible paths of change and to reject implausible ones. The results of these analyses elucidate constraints on changes that have led to extensive synchronic variation in an interlocking morphological system. They further provide evidence of an ergative–accusative split traceable to Proto-Pama-Nyungan.
{"title":"Bayesian methods for ancestral state reconstruction in morphosyntax: Exploring the history of argument marking strategies in a large language family","authors":"Joshua L. Phillips, Claire Bowern","doi":"10.1093/jole/lzac002","DOIUrl":"https://doi.org/10.1093/jole/lzac002","url":null,"PeriodicalId":37118,"journal":{"name":"Journal of Language Evolution","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2022-05-28","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"44792758"}
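To make the idea of ancestral state reconstruction under Brownian motion concrete, here is a minimal sketch that estimates the root value of a continuous trait on a fixed tree, using the precision-weighted averaging step from Felsenstein's independent-contrasts algorithm. It is an illustration under stated assumptions, not the authors' Bayesian analysis: the tree shape, branch lengths, tip values, and the `Node` class are all hypothetical, and the abstract does not say how ergative marking was encoded as a character.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A tree node; `length` is the branch leading to it from its parent."""
    length: float
    value: Optional[float] = None          # observed trait value (tips only)
    children: List["Node"] = field(default_factory=list)


def bm_estimate(node: Node) -> Tuple[float, float]:
    """Estimate the trait value at `node` from the tips below it.

    Returns (estimate, extra_variance). Under Brownian motion, variance
    grows linearly with branch length, so each child is weighted by the
    inverse of its branch length plus the uncertainty of its own estimate.
    """
    if not node.children:
        if node.value is None:
            raise ValueError("tip without an observed value")
        return node.value, 0.0
    total_weight = 0.0
    weighted_sum = 0.0
    for child in node.children:
        estimate, extra = bm_estimate(child)
        weight = 1.0 / (child.length + extra)
        total_weight += weight
        weighted_sum += weight * estimate
    return weighted_sum / total_weight, 1.0 / total_weight


# Hypothetical toy tree ((A:1, B:1):1, C:2) with made-up tip values.
tip_a = Node(length=1.0, value=0.2)
tip_b = Node(length=1.0, value=0.4)
tip_c = Node(length=2.0, value=0.9)
inner = Node(length=1.0, children=[tip_a, tip_b])
root = Node(length=0.0, children=[inner, tip_c])

ROOT_ESTIMATE, ROOT_EXTRA_VARIANCE = bm_estimate(root)
print(ROOT_ESTIMATE)
```

At the root, this recursion gives the maximum-likelihood estimate of the ancestral value under Brownian motion. The values computed at other internal nodes use only the tips below them, so a full reconstruction would need a second pass from the root downward; a Bayesian treatment would additionally place priors on the rate and integrate over tree uncertainty.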