The halfway similarity avoidance rule replicated using phonetic data from European language varieties
Matías Guzmán Naranjo, Søren Wichmann
Language Dynamics and Change, published 2023-11-29
DOI: https://doi.org/10.1163/22105832-bja10029
Abstract
Previous work using lexical data from around the world has suggested that distances between language varieties are distributed such that varieties are typically either rather similar, qualifying as dialects of the same language, or rather dissimilar, qualifying as different languages, with a scarcity of varieties that are around halfway similar. Using a potentially biased sample, Wichmann (2019) observed that there is a bimodal distribution of distances with two roughly normal distributions separated by a valley. Here we test whether a similar distribution is found when using another source of data and an unbiased sample drawn from the cells of a geographical grid (of central Europe). The data consists of 18 lexemes from 274 doculects. Using Bayesian beta regression and leave-one-out cross-validation, we show that the data follows a bimodal distribution which is robust to sampling, and also to at least some aspects of the data (coarse- vs. fine-grained phonetic transcriptions).
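To make the notion of a distance between two varieties concrete, the following is a minimal sketch. It assumes a length-normalized Levenshtein distance averaged over shared lexemes; the abstract does not specify the distance measure used in the study, so this is an illustration and not the authors' method. The function names and the three toy doculects (`A`, `B`, `C`) with their transcriptions are hypothetical.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer form's length, giving a value in [0, 1]."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def variety_distance(words_a: dict, words_b: dict) -> float:
    """Mean normalized distance over the lexemes attested in both varieties."""
    shared = [concept for concept in words_a if concept in words_b]
    if not shared:
        raise ValueError("the two varieties share no lexemes")
    total = sum(normalized_distance(words_a[c], words_b[c]) for c in shared)
    return total / len(shared)


# Hypothetical toy data: concept -> phonetic transcription (not from the paper).
doculects = {
    "A": {"water": "vasər", "hand": "hant"},
    "B": {"water": "vasa", "hand": "hand"},
    "C": {"water": "voda", "hand": "ruka"},
}

# One distance per unordered pair of doculects.
pairwise = {
    (x, y): variety_distance(doculects[x], doculects[y])
    for x, y in combinations(sorted(doculects), 2)
}

for pair, dist in pairwise.items():
    print(pair, round(dist, 3))
```

With many doculects, the collection of such pairwise values is the kind of distribution whose shape (one mode or two) the abstract is concerned with; dialect-like pairs such as A and B would fall near the low end and unrelated-looking pairs such as A and C near the high end.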
About the journal:
Language Dynamics and Change (LDC) is an international peer-reviewed journal that covers both new and traditional aspects of the study of language change. Work on any language or language family is welcomed, as long as it bears on topics that are also of theoretical interest. A particular focus is on new developments in the field arising from the accumulation of extensive databases of dialect variation and typological distributions, spoken corpora, parallel texts, and comparative lexicons, which allow for the application of new types of quantitative approaches to diachronic linguistics. Moreover, the journal will serve as an outlet for increasingly important interdisciplinary work on such topics as the evolution of language, archaeology and linguistics (‘archaeolinguistics’), human genetic and linguistic prehistory, and the computational modeling of language dynamics.