{"title":"ChatGPT and CLT: Investigating differences in multimodal processing","authors":"Michael Cahalane, Samuel N. Kirshner","doi":"10.1016/j.ject.2024.11.008","DOIUrl":null,"url":null,"abstract":"<div><div>Drawing on construal level theory, recent studies have demonstrated that ChatGPT interprets text inputs from an abstract perspective. However, as ChatGPT has evolved into a multimodal tool, this research examines whether ChatGPT's abstraction bias extends to image-based prompts. In a pre-registered study utilising hierarchical letters, ChatGPT predominantly associated these images with local rather than global letters, suggesting a concrete bias when analysing images. This starkly contrasts human participants who predominantly identified the same images with the global letters, indicating that humans and ChatGPT significantly diverge in image interpretations. Furthermore, while humans generally perceive ChatGPT to be more concrete in image processing, there is a notable discrepancy between this perception and the actual level of concreteness exhibited by ChatGPT in handling image-based tasks. These findings provide insights into the distinct cognitive behaviours of LLMs compared to humans, contributing to an emerging understanding of LLM cognition in the context of multimodal inputs.</div></div>","PeriodicalId":100776,"journal":{"name":"Journal of Economy and Technology","volume":"3 ","pages":"Pages 10-21"},"PeriodicalIF":0.0000,"publicationDate":"2024-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Economy and Technology","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2949948824000611","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Drawing on construal level theory, recent studies have demonstrated that ChatGPT interprets text inputs from an abstract perspective. However, as ChatGPT has evolved into a multimodal tool, this research examines whether ChatGPT's abstraction bias extends to image-based prompts. In a pre-registered study utilising hierarchical letters, ChatGPT predominantly associated these images with the local rather than the global letters, suggesting a concrete bias when analysing images. This starkly contrasts with human participants, who predominantly identified the same images with the global letters, indicating that humans and ChatGPT diverge significantly in their image interpretations. Furthermore, while humans generally perceive ChatGPT to be more concrete in image processing, there is a notable discrepancy between this perception and the actual level of concreteness exhibited by ChatGPT in handling image-based tasks. These findings provide insights into the distinct cognitive behaviours of LLMs compared to humans, contributing to an emerging understanding of LLM cognition in the context of multimodal inputs.
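To clarify the stimulus type referenced in the abstract, the sketch below illustrates a "hierarchical letter" (Navon figure): a large global letter composed of many small local letters. This is a minimal, hypothetical example in Python, not the paper's actual stimuli; the grid pattern, letter choices, and function names are assumptions made for illustration only.

```python
# Hypothetical illustration of a hierarchical (Navon) letter stimulus:
# a large "global" letter drawn out of repeated copies of a small
# "local" letter. Not taken from the study's materials.

# 5x5 grid pattern for a global letter "H" (illustrative only).
GLOBAL_H = [
    "X...X",
    "X...X",
    "XXXXX",
    "X...X",
    "X...X",
]

def hierarchical_letter(pattern, local_letter):
    """Render a global letter pattern using a local letter for each filled cell."""
    return "\n".join(
        "".join(local_letter if cell == "X" else " " for cell in row)
        for row in pattern
    )

if __name__ == "__main__":
    # Global letter: H, built from the local letter S.
    # A "global" (abstract) reading reports H; a "local" (concrete) reading reports S.
    print(hierarchical_letter(GLOBAL_H, "S"))
```

In terms the abstract uses, a respondent who associates this image with "S" exhibits the local, concrete reading the study attributes to ChatGPT, whereas one who reports "H" exhibits the global reading attributed to the human participants.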