Assessing novelty, feasibility and value of creative ideas with an unsupervised approach using GPT-4.

IF 3.2 | Zone 2 (Psychology) | Q1 PSYCHOLOGY, MULTIDISCIPLINARY | British journal of psychology | Pub Date: 2024-07-22 | DOI: 10.1111/bjop.12720
Felix B Kern, Chien-Te Wu, Zenas C Chao
Citations: 0

Abstract

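Creativity is defined by three key factors: novelty, feasibility and value. While many creativity tests focus primarily on novelty, they often neglect feasibility and value, thereby limiting their reflection of real-world creativity. In this study, we employ GPT-4, a large language model, to assess these three dimensions in a Japanese-language Alternative Uses Test (AUT). Using a crowdsourced evaluation method, we acquire ground truth data for 30 question items and test various GPT prompt designs. Our findings show that asking for multiple responses in a single prompt, using an 'explain first, rate later' design, is both cost-effective and accurate (r = .62, .59 and .33 for novelty, feasibility and value, respectively). Moreover, our method offers comparable accuracy to existing methods in assessing novelty, without the need for training data. We also evaluate additional models such as GPT-4 Turbo, GPT-4 Omni and Claude 3.5 Sonnet. Comparable performance across these models demonstrates the universal applicability of our prompt design. Our results contribute a straightforward platform for instant AUT evaluation and provide valuable ground truth data for future methodological research.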
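The two ideas named in the abstract, batching several responses into one 'explain first, rate later' prompt and scoring agreement with human raters by a correlation coefficient, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the prompt wording, the 1 to 5 scale, the English-language example item, and the function names `build_batch_prompt` and `pearson_r` are hypothetical and are not taken from the paper, and the ratings are made-up numbers rather than study data.

```python
import math

# The three dimensions named in the abstract.
DIMENSIONS = ("novelty", "feasibility", "value")


def build_batch_prompt(item, responses, scale=(1, 5)):
    """Build one prompt covering several responses at once.

    The wording and the rating scale are illustrative assumptions,
    not the prompt used in the study.
    """
    low, high = scale
    lines = [
        f"The following are proposed alternative uses for: {item}.",
        "For each idea, first explain your reasoning briefly,",
        f"then rate {', '.join(DIMENSIONS)} from {low} to {high}.",
        "",
    ]
    # Number the responses so ratings can be matched back to them.
    lines += [f"{i}. {text}" for i, text in enumerate(responses, start=1)]
    return "\n".join(lines)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / math.sqrt(var_x * var_y)


prompt = build_batch_prompt("a brick", ["doorstop", "paperweight", "garden border"])
print(prompt)

# Made-up ratings, for illustration only (not data from the study).
human_ratings = [1.0, 2.0, 3.0, 4.0]
model_ratings = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(human_ratings, model_ratings))
```

Sending the prompt to a model and parsing its explanations and ratings are left out on purpose, since the abstract does not describe those steps.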
Source journal
British journal of psychology (PSYCHOLOGY, MULTIDISCIPLINARY)
CiteScore: 7.60
Self-citation rate: 2.50%
Articles published: 67
Journal description: The British Journal of Psychology publishes original research on all aspects of general psychology including cognition; health and clinical psychology; developmental, social and occupational psychology. For information on specific requirements, please view Notes for Contributors. We attract a large number of international submissions each year which make major contributions across the range of psychology.