Investigating lay evaluations of models

IF 2.5 · CAS Tier 3 (Psychology) · Q2 (Psychology, Experimental) · Thinking & Reasoning · Pub Date: 2021-11-09 · DOI: 10.1080/13546783.2021.1999327
P. Kane, S. Broomell
{"title":"Investigating lay evaluations of models","authors":"P. Kane, S. Broomell","doi":"10.1080/13546783.2021.1999327","DOIUrl":null,"url":null,"abstract":"Abstract Many important decisions depend on unknown states of the world. Society is increasingly relying on statistical predictive models to make decisions in these cases. While predictive models are useful, previous research has documented that (a) individual decision makers distrust models and (b) people’s predictions are often worse than those of models. These findings indicate a lack of awareness of how to evaluate predictions generally. This includes concepts like the loss function used to aggregate errors or whether error is training error or generalisation error. To address this gap, we present three studies testing how lay people visually evaluate the predictive accuracy of models. We found that (a) participant judgements of prediction errors were more similar to absolute error than squared error (Study 1), (b) we did not detect a difference in participant reactions to training error versus generalisation error (Study 2), and (c) participants rated complex models as more accurate when comparing two models, but rated simple models as more accurate when shown single models in isolation (Study 3). When communicating about models, researchers should be aware that the public’s visual evaluation of models may disagree with their method of measuring errors and that many may fail to recognise overfitting.","PeriodicalId":47270,"journal":{"name":"Thinking & Reasoning","volume":"67 1","pages":"569 - 604"},"PeriodicalIF":2.5000,"publicationDate":"2021-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Thinking & Reasoning","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.1080/13546783.2021.1999327","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
Citations: 0

Abstract

Many important decisions depend on unknown states of the world. Society is increasingly relying on statistical predictive models to make decisions in these cases. While predictive models are useful, previous research has documented that (a) individual decision makers distrust models and (b) people’s predictions are often worse than those of models. These findings indicate a lack of awareness of how to evaluate predictions generally. This includes concepts like the loss function used to aggregate errors or whether error is training error or generalisation error. To address this gap, we present three studies testing how lay people visually evaluate the predictive accuracy of models. We found that (a) participant judgements of prediction errors were more similar to absolute error than squared error (Study 1), (b) we did not detect a difference in participant reactions to training error versus generalisation error (Study 2), and (c) participants rated complex models as more accurate when comparing two models, but rated simple models as more accurate when shown single models in isolation (Study 3). When communicating about models, researchers should be aware that the public’s visual evaluation of models may disagree with their method of measuring errors and that many may fail to recognise overfitting.
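The abstract turns on two distinctions: how errors are aggregated (absolute vs. squared loss) and which errors are measured (training vs. generalisation error). The sketch below is not from the paper; it is a minimal Python illustration of both ideas on synthetic data, with all numbers and the degree-1 vs. degree-9 comparison chosen purely for demonstration.

```python
# A minimal sketch (not from the paper) of the two distinctions in the
# abstract: (1) the choice of loss function changes which predictions look
# more accurate, and (2) training error can understate generalisation error
# for an overfit model. All data here are synthetic.
import numpy as np

rng = np.random.default_rng(0)

# (1) Absolute vs. squared error: two residual patterns with the same MAE
# can have very different MSEs, because squaring penalises large misses more.
errors_a = np.array([1.0, 1.0, 1.0, 1.0])   # four moderate misses
errors_b = np.array([0.0, 0.0, 0.0, 4.0])   # one large miss
for name, e in [("A", errors_a), ("B", errors_b)]:
    print(f"model {name}: MAE={np.mean(np.abs(e)):.2f}  MSE={np.mean(e**2):.2f}")
# Both have MAE = 1.00, but B's MSE (4.00) is four times A's (1.00).

# (2) Training vs. generalisation error: fit a simple (degree-1) and a
# complex (degree-9) polynomial to noisy data, then score both on held-out data.
true_f = lambda x: 2 * x                      # true underlying relationship
x_train = rng.uniform(0, 1, 15)
x_test = rng.uniform(0, 1, 200)
y_train = true_f(x_train) + rng.normal(0, 0.3, x_train.size)
y_test = true_f(x_test) + rng.normal(0, 0.3, x_test.size)

for degree in (1, 9):
    coefs = np.polyfit(x_train, y_train, degree)
    train_mae = np.mean(np.abs(np.polyval(coefs, x_train) - y_train))
    test_mae = np.mean(np.abs(np.polyval(coefs, x_test) - y_test))
    print(f"degree {degree}: training MAE={train_mae:.2f}  test MAE={test_mae:.2f}")
# The degree-9 fit typically shows lower training error but higher
# generalisation error than the degree-1 fit -- the overfitting pattern the
# abstract says lay viewers may fail to recognise.
```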
Source journal
Thinking & Reasoning (Psychology, Experimental)
CiteScore: 6.50
Self-citation rate: 11.50%
Articles published: 25
Latest articles from this journal
The skeptical import of motivated reasoning: a closer look at the evidence
When word frequency meets word order: factors determining multiply-constrained creative association
Mindset effects on the regulation of thinking time in problem-solving
Elementary probabilistic operations: a framework for probabilistic reasoning
Testing the underlying structure of unfounded beliefs about COVID-19 around the world