不同粒度级别上的意义分布

IF 0.8 Q4 LINGUISTICS Glottometrics Pub Date : 2023-01-01 DOI:10.53482/2023_54_405

T. Yih, Haitao Liu

{"title":"不同粒度级别上的意义分布","authors":"T. Yih, Haitao Liu","doi":"10.53482/2023_54_405","DOIUrl":null,"url":null,"abstract":"The meaning distributions of certain linguistic forms generally follow a Zipfian distribution. However, since the meanings can be observed and classified on different levels of granularity, it is thus interesting to ask whether their distributions on different levels can be fitted by the same model and whether the parameters are the same. In this study, we investigate three quasi-prepositions in Shanghainese, a dialect of Wu Chinese, and test whether the meaning distributions on two levels of granularity can be fitted by the same model and whether the parameters are close. The results first show that the three models proposed by modern quantitative linguists can both achieve a good fit for all cases, while both the exponential (EXP) model and the right-truncated negative binomial (RTBN) models behave better than the modified right-truncated Zipf-Alekseev distribution (MRTZA), in terms of the consistency of the goodness of fit, parameter change, rationality, and simplicity. Second, the parameters of the distributions on the two levels and the curves are not exactly the same or even close to each other. This has supported a weak view of the concept of ‘scaling’ in complex sciences. Finally, differences are found to lie between the distributions on the two levels. The fine-grained meaning distributions are more right-skewed and more non-linear. This is attributed to the openness of the categories of systems. The finer semantic differentiation behaves like systems with open set of categories, while the coarse-grained meaning distribution resembles those having a close set of few categories.","PeriodicalId":51918,"journal":{"name":"Glottometrics","volume":"93 1","pages":"13-38"},"PeriodicalIF":0.8000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"The meaning distributions on different levels of granularity\",\"authors\":\"T. Yih, Haitao Liu\",\"doi\":\"10.53482/2023_54_405\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The meaning distributions of certain linguistic forms generally follow a Zipfian distribution. However, since the meanings can be observed and classified on different levels of granularity, it is thus interesting to ask whether their distributions on different levels can be fitted by the same model and whether the parameters are the same. In this study, we investigate three quasi-prepositions in Shanghainese, a dialect of Wu Chinese, and test whether the meaning distributions on two levels of granularity can be fitted by the same model and whether the parameters are close. The results first show that the three models proposed by modern quantitative linguists can both achieve a good fit for all cases, while both the exponential (EXP) model and the right-truncated negative binomial (RTBN) models behave better than the modified right-truncated Zipf-Alekseev distribution (MRTZA), in terms of the consistency of the goodness of fit, parameter change, rationality, and simplicity. Second, the parameters of the distributions on the two levels and the curves are not exactly the same or even close to each other. This has supported a weak view of the concept of ‘scaling’ in complex sciences. Finally, differences are found to lie between the distributions on the two levels. The fine-grained meaning distributions are more right-skewed and more non-linear. This is attributed to the openness of the categories of systems. The finer semantic differentiation behaves like systems with open set of categories, while the coarse-grained meaning distribution resembles those having a close set of few categories.\",\"PeriodicalId\":51918,\"journal\":{\"name\":\"Glottometrics\",\"volume\":\"93 1\",\"pages\":\"13-38\"},\"PeriodicalIF\":0.8000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Glottometrics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.53482/2023_54_405\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"LINGUISTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Glottometrics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.53482/2023_54_405","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"LINGUISTICS","Score":null,"Total":0}

引用次数: 0

摘要

某些语言形式的意义分布一般遵循齐夫分布。然而，由于意义可以在不同的粒度级别上观察和分类，因此，它们在不同级别上的分布是否可以用同一个模型来拟合，参数是否相同，这是一个有趣的问题。本研究以吴语上海话中的三个准介词为研究对象，检验了两个粒度水平上的意义分布是否可以用同一模型拟合，参数是否接近。结果首先表明，现代定量语言学家提出的三种模型都能很好地拟合所有情况，而指数模型(EXP)和右截断负二项模型(RTBN)在拟合优度的一致性、参数变化、合理性和简单性方面都优于修正的右截断Zipf-Alekseev分布(MRTZA)。二是两层分布和曲线的参数不完全相同，甚至不接近。这支持了复杂科学中“尺度”概念的薄弱观点。最后，发现两个层次的分布之间存在差异。细粒度的意义分布更右偏，更非线性。这要归功于系统类别的开放性。更精细的语义区分表现为具有开放类别集的系统，而粗粒度的含义分布类似于具有少数类别的紧密集的系统。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

The meaning distributions on different levels of granularity

The meaning distributions of certain linguistic forms generally follow a Zipfian distribution. However, since the meanings can be observed and classified on different levels of granularity, it is thus interesting to ask whether their distributions on different levels can be fitted by the same model and whether the parameters are the same. In this study, we investigate three quasi-prepositions in Shanghainese, a dialect of Wu Chinese, and test whether the meaning distributions on two levels of granularity can be fitted by the same model and whether the parameters are close. The results first show that the three models proposed by modern quantitative linguists can both achieve a good fit for all cases, while both the exponential (EXP) model and the right-truncated negative binomial (RTBN) models behave better than the modified right-truncated Zipf-Alekseev distribution (MRTZA), in terms of the consistency of the goodness of fit, parameter change, rationality, and simplicity. Second, the parameters of the distributions on the two levels and the curves are not exactly the same or even close to each other. This has supported a weak view of the concept of ‘scaling’ in complex sciences. Finally, differences are found to lie between the distributions on the two levels. The fine-grained meaning distributions are more right-skewed and more non-linear. This is attributed to the openness of the categories of systems. The finer semantic differentiation behaves like systems with open set of categories, while the coarse-grained meaning distribution resembles those having a close set of few categories.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Glottometrics LINGUISTICS-

CiteScore

0.50

自引率

0.00%

发文量

期刊介绍： The aim of Glottometrics is quantification, measurement and mathematical modeling of any kind of language phenomena. We invite contributions on probabilistic or other mathematical models (e.g. graph theoretic or optimization approaches) which enable to establish language laws that can be validated by testing statistical hypotheses.