Personality testing of large language models: limited temporal stability, but highlighted prosociality.

Royal Society Open Science (IF 2.9; Zone 3, multidisciplinary journal; JCR Q1, Multidisciplinary Sciences). Pub Date: 2024-10-09. eCollection Date: 2024-10-01. DOI: 10.1098/rsos.240180
Bojana Bodroža, Bojana M Dinić, Ljubiša Bojić
Citations: 0

Abstract

Personality testing of large language models: limited temporal stability, but highlighted prosociality.

As large language models (LLMs) continue to gain popularity due to their human-like traits and the intimacy they offer to users, their societal impact inevitably expands. This leads to the rising necessity for comprehensive studies to fully understand LLMs and reveal their potential opportunities, drawbacks and overall societal impact. With that in mind, this research conducted an extensive investigation into seven LLMs, aiming to assess the temporal stability and inter-rater agreement on their responses on personality instruments in two time points. In addition, LLMs' personality profile was analysed and compared with human normative data. The findings revealed varying levels of inter-rater agreement in the LLMs' responses over a short time, with some LLMs showing higher agreement (e.g. Llama3 and GPT-4o) compared with others (e.g. GPT-4 and Gemini). Furthermore, agreement depended on used instruments as well as on domain or trait. This implies the variable robustness in LLMs' ability to reliably simulate stable personality characteristics. In the case of scales which showed at least fair agreement, LLMs displayed mostly a socially desirable profile in both agentic and communal domains, as well as a prosocial personality profile reflected in higher agreeableness and conscientiousness and lower Machiavellianism. Exhibiting temporal stability and coherent responses on personality traits is crucial for AI systems due to their societal impact and AI safety concerns.
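
The abstract does not name the agreement statistic behind phrases such as "at least fair agreement". As a minimal sketch, assuming Cohen's kappa and the conventional Landis and Koch labels, agreement between one model's item responses at two time points could be computed as below. The function names and the response data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(time1, time2):
    """Unweighted Cohen's kappa between two equally long response lists."""
    if len(time1) != len(time2) or not time1:
        raise ValueError("response lists must be non-empty and equally long")
    n = len(time1)
    # Observed agreement: share of items answered identically at both times.
    observed = sum(a == b for a, b in zip(time1, time2)) / n
    # Chance agreement: product of the marginal proportions per category.
    counts1, counts2 = Counter(time1), Counter(time2)
    categories = counts1.keys() | counts2.keys()
    expected = sum(counts1[c] * counts2[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both lists use one single category; kappa is undefined, so
        # report full agreement by convention (an assumption of this sketch).
        return 1.0
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Map a kappa value to the conventional Landis and Koch label."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical 1-5 Likert responses of one model to ten items, asked twice.
responses_t1 = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5]
responses_t2 = [4, 5, 3, 3, 2, 5, 4, 4, 4, 4]

kappa = cohen_kappa(responses_t1, responses_t2)
print(f"kappa = {kappa:.3f} ({landis_koch_label(kappa)})")
```

Unweighted kappa treats Likert categories as unordered, so a one-point shift counts the same as a four-point shift. A weighted kappa or an intraclass correlation would be a reasonable alternative for ordinal scales.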

Source journal
Royal Society Open Science (Multidisciplinary)
CiteScore: 6.00
Self-citation rate: 0.00%
Articles published: 508
Review time: 14 weeks
Journal description: Royal Society Open Science is a new open journal publishing high-quality original research across the entire range of science on the basis of objective peer-review. The journal covers the entire range of science and mathematics and will allow the Society to publish all the high-quality work it receives without the usual restrictions on scope, length or impact.