Philipp Schoenegger, Spencer Greenberg, Alexander Grishin, Joshua Lewis, Lucius Caviola
{"title":"AI can outperform humans in predicting correlations between personality items.","authors":"Philipp Schoenegger, Spencer Greenberg, Alexander Grishin, Joshua Lewis, Lucius Caviola","doi":"10.1038/s44271-025-00205-w","DOIUrl":null,"url":null,"abstract":"<p><p>We assess the abilities of both specialized deep neural networks, such as PersonalityMap, and general LLMs, including GPT-4o and Claude 3 Opus, in understanding human personality by predicting correlations between personality questionnaire items. All AI models outperform the vast majority of laypeople and academic experts. However, we can improve the accuracy of individual correlation predictions by taking the median prediction per group to produce a \"wisdom of the crowds\" estimate. Thus, we also compare the median predictions from laypeople, academic experts, GPT-4o/Claude 3 Opus, and PersonalityMap. Based on medians, PersonalityMap and academic experts surpass both LLMs and laypeople on most measures. These results suggest that while advanced LLMs make superior predictions compared to most individual humans, specialized models like PersonalityMap can match even expert group-level performance in domain-specific tasks. This underscores the capabilities of large language models while emphasizing the continued relevance of specialized systems as well as human experts for personality research.</p>","PeriodicalId":501698,"journal":{"name":"Communications Psychology","volume":"3 1","pages":"23"},"PeriodicalIF":0.0000,"publicationDate":"2025-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11822111/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Communications Psychology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1038/s44271-025-00205-w","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
We assess the abilities of both specialized deep neural networks, such as PersonalityMap, and general LLMs, including GPT-4o and Claude 3 Opus, in understanding human personality by predicting correlations between personality questionnaire items. All AI models outperform the vast majority of laypeople and academic experts. However, we can improve the accuracy of individual correlation predictions by taking the median prediction per group to produce a "wisdom of the crowds" estimate. Thus, we also compare the median predictions from laypeople, academic experts, GPT-4o/Claude 3 Opus, and PersonalityMap. Based on medians, PersonalityMap and academic experts surpass both LLMs and laypeople on most measures. These results suggest that while advanced LLMs make superior predictions compared to most individual humans, specialized models like PersonalityMap can match even expert group-level performance in domain-specific tasks. This underscores the capabilities of large language models while emphasizing the continued relevance of specialized systems as well as human experts for personality research.