Title: Assessing novelty, feasibility and value of creative ideas with an unsupervised approach using GPT-4
Authors: Felix B Kern, Chien-Te Wu, Zenas C Chao
Journal: British Journal of Psychology (impact factor 3.2; JCR Q1, Psychology, Multidisciplinary)
Publication type: Journal Article
Published: 2024-07-22
DOI: 10.1111/bjop.12720 (https://doi.org/10.1111/bjop.12720)
Citations: 0
Abstract
Creativity is defined by three key factors: novelty, feasibility and value. While many creativity tests focus primarily on novelty, they often neglect feasibility and value, thereby limiting their reflection of real-world creativity. In this study, we employ GPT-4, a large language model, to assess these three dimensions in a Japanese-language Alternative Uses Test (AUT). Using a crowdsourced evaluation method, we acquire ground truth data for 30 question items and test various GPT prompt designs. Our findings show that asking for multiple responses in a single prompt, using an 'explain first, rate later' design, is both cost-effective and accurate (r = .62, .59 and .33 for novelty, feasibility and value, respectively). Moreover, our method offers comparable accuracy to existing methods in assessing novelty, without the need for training data. We also evaluate additional models such as GPT-4 Turbo, GPT-4 Omni and Claude 3.5 Sonnet. Comparable performance across these models demonstrates the universal applicability of our prompt design. Our results contribute a straightforward platform for instant AUT evaluation and provide valuable ground truth data for future methodological research.
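The abstract names two mechanics that are easy to illustrate: a single prompt covering several responses with an 'explain first, rate later' layout, and accuracy reported as a correlation (r) against crowdsourced ratings. The following is a minimal sketch under stated assumptions, not the authors' implementation. The prompt wording, the 1 to 5 scale, the `RATING` line format and all function names are hypothetical, since the abstract does not give them, and no model API is called.

```python
import math
import re

DIMENSIONS = ("novelty", "feasibility", "value")


def build_batch_prompt(item, responses):
    """Build ONE prompt that covers several AUT responses.

    Hypothetical wording: the model is asked to explain first and
    only then emit ratings ('explain first, rate later').
    """
    lines = [
        f"Object: {item}",
        "For each numbered use below, first explain your reasoning in one sentence.",
        "Then, on a separate line, give ratings from 1 to 5 in this form:",
        "RATING <number>: novelty=<n>, feasibility=<n>, value=<n>",
        "",
    ]
    lines += [f"{i}. {text}" for i, text in enumerate(responses, start=1)]
    return "\n".join(lines)


# Matches only filled-in rating lines (digits), not the template line above.
RATING_RE = re.compile(
    r"RATING\s+(\d+):\s*novelty=(\d+),\s*feasibility=(\d+),\s*value=(\d+)"
)


def parse_ratings(reply):
    """Extract {response_number: {dimension: score}} from a model reply."""
    parsed = {}
    for match in RATING_RE.finditer(reply):
        idx, *scores = (int(group) for group in match.groups())
        parsed[idx] = dict(zip(DIMENSIONS, scores))
    return parsed


def pearson_r(xs, ys):
    """Pearson correlation between model scores and human ground truth."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)
```

In use, `build_batch_prompt("brick", ["doorstop", "paperweight"])` would be sent to the model once, `parse_ratings` would be applied to the reply, and `pearson_r` would be computed per dimension between the parsed scores and the crowdsourced ratings for the same responses. Putting the explanation before the rating line means the score is generated after the reasoning text rather than before it.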
About the journal:
The British Journal of Psychology publishes original research on all aspects of general psychology including cognition; health and clinical psychology; developmental, social and occupational psychology. For information on specific requirements, please view Notes for Contributors. We attract a large number of international submissions each year which make major contributions across the range of psychology.