Recursive Numeral Systems Optimize the Trade-off Between Lexicon Size and Average Morphosyntactic Complexity

Pub Date: 2024-03-18 DOI: 10.1111/cogs.13424
Milica Denić, Jakub Szymanik

Abstract

Human languages vary in which meanings they lexicalize, but this variation is constrained. It has been argued that languages are under two competing pressures: the pressure to be simple (e.g., to have a small lexicon) and the pressure to allow for informative (i.e., precise) communication, and that which meanings get lexicalized may be explained by languages finding a good trade-off between these two pressures. However, in certain semantic domains, languages can reach very high levels of informativeness even if they lexicalize very few meanings in that domain. This is due to productive morphosyntax and compositional semantics, which may allow for the construction of meanings that are not lexicalized. Consider the semantic domain of natural numbers: many languages lexicalize few natural number meanings as monomorphemic expressions, but can precisely convey very many natural number meanings using morphosyntactically complex numerals. In such semantic domains, lexicon size is not in direct competition with informativeness. What explains which meanings are lexicalized in such semantic domains? We propose that in such cases, languages need to solve a different kind of trade-off problem: the trade-off between the pressure to lexicalize as few meanings as possible (i.e., to minimize lexicon size) and the pressure to produce utterances that are as morphosyntactically simple as possible (i.e., to minimize the average morphosyntactic complexity of utterances). To support this claim, we present a case study of 128 natural languages' numeral systems, and show computationally that they achieve a near-optimal trade-off between lexicon size and average morphosyntactic complexity of numerals. This study, in conjunction with previous work on communicative efficiency, suggests that languages' lexicons are shaped by a trade-off between not two but three pressures: be simple, be informative, and minimize the average morphosyntactic complexity of utterances.
