{"title":"Assessing the Strengths and Weaknesses of Large Language Models","authors":"Shalom Lappin","doi":"10.1007/s10849-023-09409-x","DOIUrl":null,"url":null,"abstract":"Abstract The transformers that drive chatbots and other AI systems constitute large language models (LLMs). These are currently the focus of a lively discussion in both the scientific literature and the popular media. This discussion ranges from hyperbolic claims that attribute general intelligence and sentience to LLMs, to the skeptical view that these devices are no more than “stochastic parrots”. I present an overview of some of the weak arguments that have been presented against LLMs, and I consider several of the more compelling criticisms of these devices. The former significantly underestimate the capacity of transformers to achieve subtle inductive inferences required for high levels of performance on complex, cognitively significant tasks. In some instances, these arguments misconstrue the nature of deep learning. The latter criticisms identify significant limitations in the way in which transformers learn and represent patterns in data. They also point out important differences between the procedures through which deep neural networks and humans acquire knowledge of natural language. 
It is necessary to look carefully at both sets of arguments in order to achieve a balanced assessment of the potential and the limitations of LLMs.","PeriodicalId":48732,"journal":{"name":"Journal of Logic Language and Information","volume":"6 2","pages":"0"},"PeriodicalIF":0.7000,"publicationDate":"2023-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Logic Language and Information","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s10849-023-09409-x","RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
The transformers that drive chatbots and other AI systems constitute large language models (LLMs). These are currently the focus of a lively discussion in both the scientific literature and the popular media. This discussion ranges from hyperbolic claims that attribute general intelligence and sentience to LLMs, to the skeptical view that these devices are no more than “stochastic parrots”. I present an overview of some of the weak arguments that have been presented against LLMs, and I consider several of the more compelling criticisms of these devices. The former significantly underestimate the capacity of transformers to achieve subtle inductive inferences required for high levels of performance on complex, cognitively significant tasks. In some instances, these arguments misconstrue the nature of deep learning. The latter criticisms identify significant limitations in the way in which transformers learn and represent patterns in data. They also point out important differences between the procedures through which deep neural networks and humans acquire knowledge of natural language. It is necessary to look carefully at both sets of arguments in order to achieve a balanced assessment of the potential and the limitations of LLMs.
Journal Description:
The scope of the journal is the logical and computational foundations of natural, formal, and programming languages, as well as the different forms of human and mechanized inference. It covers the logical, linguistic, and information-theoretic parts of the cognitive sciences.
Examples of main subareas are Intensional Logics including Dynamic Logic; Nonmonotonic Logic and Belief Revision; Constructive Logics; Complexity Issues in Logic and Linguistics; Theoretical Problems of Logic Programming and Resolution; Categorial Grammar and Type Theory; Generalized Quantification; Information-Oriented Theories of Semantic Structure like Situation Semantics, Discourse Representation Theory, and Dynamic Semantics; Connectionist Models of Logical and Linguistic Structures. The emphasis is on the theoretical aspects of these areas.