思维与机器相遇:破解GPT-4的认知心理学

BenchCouncil Transactions on Benchmarks, Standards and Evaluations Pub Date : 2023-09-01 DOI:10.1016/j.tbench.2023.100139

Sifatkaur Dhingra , Manmeet Singh , Vaisakh S.B. , Neetiraj Malviya , Sukhpal Singh Gill

{"title":"思维与机器相遇:破解GPT-4的认知心理学","authors":"Sifatkaur Dhingra , Manmeet Singh , Vaisakh S.B. , Neetiraj Malviya , Sukhpal Singh Gill","doi":"10.1016/j.tbench.2023.100139","DOIUrl":null,"url":null,"abstract":"<div><p>Cognitive psychology delves on understanding perception, attention, memory, language, problem-solving, decision-making, and reasoning. Large Language Models (LLMs) are emerging as potent tools increasingly capable of performing human-level tasks. The recent development in the form of Generative Pre-trained Transformer 4 (GPT-4) and its demonstrated success in tasks complex to humans exam and complex problems has led to an increased confidence in the LLMs to become perfect instruments of intelligence. Although GPT-4 report has shown performance on some cognitive psychology tasks, a comprehensive assessment of GPT-4, via the existing well-established datasets is required. In this study, we focus on the evaluation of GPT-4’s performance on a set of cognitive psychology datasets such as CommonsenseQA, SuperGLUE, MATH and HANS. In doing so, we understand how GPT-4 processes and integrates cognitive psychology with contextual information, providing insight into the underlying cognitive processes that enable its ability to generate the responses. We show that GPT-4 exhibits a high level of accuracy in cognitive psychology tasks relative to the prior state-of-the-art models. Our results strengthen the already available assessments and confidence on GPT-4’s cognitive psychology abilities. It has significant potential to revolutionise the field of Artificial Intelligence (AI), by enabling machines to bridge the gap between human and machine reasoning.</p></div>","PeriodicalId":100155,"journal":{"name":"BenchCouncil Transactions on Benchmarks, Standards and Evaluations","volume":"3 3","pages":"Article 100139"},"PeriodicalIF":0.0000,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Mind meets machine: Unravelling GPT-4’s cognitive psychology\",\"authors\":\"Sifatkaur Dhingra , Manmeet Singh , Vaisakh S.B. , Neetiraj Malviya , Sukhpal Singh Gill\",\"doi\":\"10.1016/j.tbench.2023.100139\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Cognitive psychology delves on understanding perception, attention, memory, language, problem-solving, decision-making, and reasoning. Large Language Models (LLMs) are emerging as potent tools increasingly capable of performing human-level tasks. The recent development in the form of Generative Pre-trained Transformer 4 (GPT-4) and its demonstrated success in tasks complex to humans exam and complex problems has led to an increased confidence in the LLMs to become perfect instruments of intelligence. Although GPT-4 report has shown performance on some cognitive psychology tasks, a comprehensive assessment of GPT-4, via the existing well-established datasets is required. In this study, we focus on the evaluation of GPT-4’s performance on a set of cognitive psychology datasets such as CommonsenseQA, SuperGLUE, MATH and HANS. In doing so, we understand how GPT-4 processes and integrates cognitive psychology with contextual information, providing insight into the underlying cognitive processes that enable its ability to generate the responses. We show that GPT-4 exhibits a high level of accuracy in cognitive psychology tasks relative to the prior state-of-the-art models. Our results strengthen the already available assessments and confidence on GPT-4’s cognitive psychology abilities. It has significant potential to revolutionise the field of Artificial Intelligence (AI), by enabling machines to bridge the gap between human and machine reasoning.</p></div>\",\"PeriodicalId\":100155,\"journal\":{\"name\":\"BenchCouncil Transactions on Benchmarks, Standards and Evaluations\",\"volume\":\"3 3\",\"pages\":\"Article 100139\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"BenchCouncil Transactions on Benchmarks, Standards and Evaluations\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S277248592300056X\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"BenchCouncil Transactions on Benchmarks, Standards and Evaluations","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S277248592300056X","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

认知心理学研究理解感知、注意力、记忆、语言、解决问题、决策和推理。大型语言模型（LLM）正在成为一种强大的工具，越来越能够执行人类级别的任务。Generative Pre-trained Transformer 4（GPT-4）形式的最新发展及其在人类复杂任务、考试和复杂问题方面的成功证明，增强了人们对LLM成为完美智能工具的信心。尽管GPT-4报告显示了一些认知心理学任务的表现，但需要通过现有的成熟数据集对GPT-4进行全面评估。在本研究中，我们重点评估了GPT-4在一组认知心理学数据集上的表现，如CommonsenseQA、SuperGLUE、MATH和HANS。通过这样做，我们了解了GPT-4是如何处理认知心理学并将其与上下文信息相结合的，从而深入了解其产生反应的潜在认知过程。我们发现，与先前最先进的模型相比，GPT-4在认知心理学任务中表现出较高的准确性。我们的研究结果加强了对GPT-4认知心理能力的现有评估和信心。它具有巨大的潜力，可以通过使机器弥合人类和机器推理之间的差距，彻底改变人工智能领域。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Mind meets machine: Unravelling GPT-4’s cognitive psychology

Cognitive psychology delves on understanding perception, attention, memory, language, problem-solving, decision-making, and reasoning. Large Language Models (LLMs) are emerging as potent tools increasingly capable of performing human-level tasks. The recent development in the form of Generative Pre-trained Transformer 4 (GPT-4) and its demonstrated success in tasks complex to humans exam and complex problems has led to an increased confidence in the LLMs to become perfect instruments of intelligence. Although GPT-4 report has shown performance on some cognitive psychology tasks, a comprehensive assessment of GPT-4, via the existing well-established datasets is required. In this study, we focus on the evaluation of GPT-4’s performance on a set of cognitive psychology datasets such as CommonsenseQA, SuperGLUE, MATH and HANS. In doing so, we understand how GPT-4 processes and integrates cognitive psychology with contextual information, providing insight into the underlying cognitive processes that enable its ability to generate the responses. We show that GPT-4 exhibits a high level of accuracy in cognitive psychology tasks relative to the prior state-of-the-art models. Our results strengthen the already available assessments and confidence on GPT-4’s cognitive psychology abilities. It has significant potential to revolutionise the field of Artificial Intelligence (AI), by enabling machines to bridge the gap between human and machine reasoning.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

BenchCouncil Transactions on Benchmarks, Standards and Evaluations

CiteScore

4.80

自引率

0.00%

发文量