Title: Computing word meanings by aggregating individualized distributional models: Wisdom of the crowds in lexical semantic memory
Author: Brendan T. Johns
Journal: Cognitive Systems Research
Publication date: 2023-08-01
Publication type: Journal Article
DOI: 10.1016/j.cogsys.2023.02.009
URL: https://www.sciencedirect.com/science/article/pii/S1389041723000244
Impact factor: 2.1
JCR: Q3 (Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Linguistic experience varies across individuals and is impacted by both demography and personal preferences, leading to differences in word meanings across languages (Thompson et al., 2020) and people (Johns, 2022). An active area of study in the cognitive sciences that examines the impact of varied knowledge across individuals is the wisdom of the crowd effect, in which the aggregate judgement of a group of individuals is often better than the judgement of the best individual in the group (Surowiecki, 2004). The goal of this article was to determine whether there is a wisdom of the crowd effect in lexical semantic memory, such that the aggregated word similarity values from many individual language users exceed the fit of the best-fitting individual. This was accomplished by training 500 different distributional models from 500 high-level commenters on the internet forum Reddit. By deriving aggregated word similarity values from these individuals, a strong wisdom of the crowd effect was found, where the aggregated similarity values far exceeded the performance of the best-fitting individual for each dataset tested. Additionally, it was found that aggregating even a small number of users provided a large increase in fit relative to the individual corpora, although the best-fitting measure included word similarity values from all possible users. The results of this article provide an avenue for future distributional model development by demonstrating that the best pathway towards better distributional models may lie in the aggregation of multiple representations attained from individual users of a language.
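The aggregation idea described in the abstract can be illustrated with a small simulation. The sketch below is not the article's method or data: it assumes each "user" produces a noisy copy of a synthetic gold-standard similarity score with independent noise, assumes aggregation means a simple per-pair average (the abstract does not say how values were combined), and measures fit with a Pearson correlation. All function names and parameters (`simulate`, `aggregate`, `n_users`, `noise`) are hypothetical.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_users=500, n_pairs=200, noise=1.0, seed=0):
    """Return synthetic gold scores and one noisy score list per user.

    Assumption: user noise is Gaussian and independent across users,
    which real individualized models are unlikely to satisfy exactly.
    """
    rng = random.Random(seed)
    gold = [rng.random() for _ in range(n_pairs)]  # stand-in for human ratings
    users = [[g + rng.gauss(0, noise) for g in gold] for _ in range(n_users)]
    return gold, users


def aggregate(users):
    """Average the similarity values of all given users, word pair by word pair."""
    return [mean(column) for column in zip(*users)]


gold, users = simulate()

# Fit of each individual user's similarity values to the gold scores.
individual_fits = [pearson(u, gold) for u in users]
best_individual = max(individual_fits)

# Fit of the aggregate over a small subset and over all users.
small_crowd_fit = pearson(aggregate(users[:10]), gold)
crowd_fit = pearson(aggregate(users), gold)

print(f"mean individual fit: {mean(individual_fits):.3f}")
print(f"best individual fit: {best_individual:.3f}")
print(f"10-user aggregate:   {small_crowd_fit:.3f}")
print(f"500-user aggregate:  {crowd_fit:.3f}")
```

Under these assumptions, averaging cancels independent noise, so the aggregate tracks the gold scores more closely than any single simulated user. Shared biases across users would weaken the effect, which is why the simulation is only a toy illustration of the pattern the abstract reports.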
About the journal:
Cognitive Systems Research is dedicated to the study of human-level cognition. As such, it welcomes papers which advance the understanding, design and applications of cognitive and intelligent systems, both natural and artificial.
The journal brings together a broad community studying cognition in its many facets in vivo and in silico, across the developmental spectrum, focusing on individual capacities or on entire architectures. It aims to foster debate and integrate ideas, concepts, constructs, theories, models and techniques from across different disciplines and different perspectives on human-level cognition. The scope of interest includes the study of cognitive capacities and architectures - both brain-inspired and non-brain-inspired - and the application of cognitive systems to real-world problems as far as it offers insights relevant for the understanding of cognition.
Cognitive Systems Research therefore welcomes mature and cutting-edge research approaching cognition from a systems-oriented perspective, both theoretical and empirically-informed, in the form of original manuscripts, short communications, opinion articles, systematic reviews, and topical survey articles from the fields of Cognitive Science (including Philosophy of Cognitive Science), Artificial Intelligence/Computer Science, Cognitive Robotics, Developmental Science, Psychology, and Neuroscience and Neuromorphic Engineering. Empirical studies will be considered if they are supplemented by theoretical analyses and contributions to theory development and/or computational modelling studies.