Title: The development, application, and future of LLM similar to ChatGPT
Authors: Yan Hao, Liu Yuliang, Jin Lianwen, Bai Xiang
Journal: 中国图象图形学报 (Journal of Image and Graphics)
Publication date: 2023-01-01
DOI: 10.11834/jig.230536
URL: https://doi.org/10.11834/jig.230536
JCR: Q3 (Computer Science)
Citations: 1
Abstract
Since the release of ChatGPT, generative artificial intelligence has repeatedly broken through bottlenecks, attracting large-scale capital investment, transformation across many fields, and close government attention. This paper first analyzes the development trends, current applications, and prospects of large models, and then briefly introduces the related technologies from three aspects: 1) an overview of large-model construction techniques, including the construction pipeline, the state of research, and optimization techniques; 2) a summary of three types of current mainstream image-text multimodal techniques for large models; 3) an introduction to three types of large-model evaluation benchmarks, classified by evaluation method. Parameter optimization and dataset construction are the core issues for the popularization of large-model products and for technical iteration; multimodal capability is one of the important development directions for large models; establishing evaluation benchmarks is the key method for comparing and constraining large models. In addition, the paper discusses the challenges facing existing technologies and possible future directions. Current large-model products already have strong comprehension and creative abilities and have shown broad application prospects in education, healthcare, finance, and other fields. At the same time, they suffer from difficult training and deployment, insufficient domain expertise, and security risks. Therefore, improving parameter optimization, high-quality dataset construction, and multimodal techniques, and establishing unified, comprehensive, and convenient evaluation benchmarks, will be key to overcoming the current limitations of large models.

Generative artificial intelligence (AI) technology has achieved remarkable advances since the release of ChatGPT several months ago, especially in scope, automation, and intelligence. The rising popularity of generative AI attracts capital inflows and promotes innovation in various fields. Moreover, governments worldwide pay considerable attention to generative AI and hold different attitudes toward it. The US government maintains a relatively relaxed attitude to stay ahead in the global technological arena, while European countries are conservative and are concerned about data privacy in large language models (LLMs). The Chinese government attaches great importance to AI and LLMs but also emphasizes regulatory issues. With the growing influence of ChatGPT and its competitors and the rapid development of generative AI technology, a deep analysis of them becomes necessary.

This paper first provides an in-depth analysis of the development, application, and prospects of generative AI. Various types of LLMs have emerged as a series of remarkable technological products that have demonstrated versatile capabilities across multiple domains, such as education, medicine, finance, law, programming, and paper writing. These models are usually fine-tuned on the basis of general LLMs, with the aim of endowing the large models with additional domain-specific knowledge and enhanced adaptability to a specific domain. LLMs (e.g., GPT-4) have achieved rapid improvements in the past few months in terms of professional knowledge, reasoning, coding, credibility, security, transferability, and multimodality.

Then, the technical contribution of generative AI technology is briefly introduced from four aspects: 1) We review the related work on LLMs, such as GPT-4, PaLM2, and ERNIE Bot, and their construction pipeline, which involves the training of base and assistant models. The base models store a large amount of linguistic knowledge, while the assistant models acquire stronger comprehension and generation capabilities after a series of fine-tuning steps. 2) We outline a series of public LLMs based on LLaMA, a framework for building lightweight and memory-efficient LLMs, including Alpaca, Vicuna, Koala, and Baize, as well as the key technologies for building LLMs with low memory and computation requirements, consisting of low-rank adaptation, Self-Instruct, and Automatic Prompt Engineer. 3) We summarize three types of existing mainstream image-text multimodal techniques: training additional adaptation layers to align visual modules and language models, multimodal instruction fine-tuning, and using the LLM as the center of understanding. 4) We introduce three types of LLM evaluation benchmarks based on different implementation methods, namely, manual evaluation, automatic evaluation, and LLM evaluation.

Parameter optimization and fine-tuning dataset construction are crucial for the popularization and innovation of generative AI products because they can significantly reduce the training cost and computational resource consumption of LLMs while enhancing their diversity and generalization ability. Multimodal capability is the future trend of generative AI because multimodal models can integrate information from multiple perceptual dimensions, which is consistent with human cognition. Evaluation benchmarks are the key methods to compare and constrain generative AI models, given that they can efficiently measure and optimize the performance and generalization ability of LLMs and reveal their strengths and limitations. In conclusion, improving parameter optimization, high-quality dataset construction, multimodal techniques, and other technologies, and establishing a unified, comprehensive, and convenient evaluation benchmark, will be the key to achieving further development in generative AI.

Furthermore, the current challenges and possible future directions of the related technologies are discussed in this paper. Existing generative AI products have considerable creativity, understanding, and intelligence and have shown broad application prospects in various fields, such as empowering content creation, innovating interactive experience, creating digital life, serving as smart home and family assistants, and realizing autonomous driving and intelligent car interaction. However, LLMs still exhibit some limitations, such as a lack of high-quality training data, susceptibility to hallucinations, factual errors in output, uninterpretability, high training and deployment costs, and security and privacy issues. Therefore, the potential research directions can be divided into three aspects: 1) The data aspect focuses on the input and output of LLMs, including the construction of general instruction-tuning datasets and domain-specific knowledge datasets. 2) The technical aspect improves the internal structure and function of LLMs, including the training, multimodality, principle innovation, and structure pruning of LLMs. 3) The application aspect enhances the practical effect and application value of LLMs, including security enhancement, evaluation system development, and LLM application engineering implementation.

The advancement of generative AI has provided remarkable benefits for economic development. However, it also entails new opportunities and challenges for various stakeholders, especially the industry and the general public. On the one hand, the industry needs to foster a large pool of researchers who can conduct systematic and cutting-edge research on generative AI technologies, which are constantly improving and innovating. On the other hand, the general public needs to acquire and apply the skills of prompt engineering, which can enable them to utilize existing LLMs effectively and efficiently.
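The abstract names low-rank adaptation as one way to build LLMs with low memory and computation requirements. The following is a minimal sketch of why that can help, assuming the commonly used LoRA formulation h = W x + (alpha / r) B (A x) with the pretrained weight W frozen; it is not taken from the paper itself, and the function names, matrix sizes, and the `alpha` scaling value are illustrative assumptions.

```python
# Minimal low-rank adaptation (LoRA) sketch in pure Python, no external libraries.
# Assumption: standard LoRA form h = W x + (alpha / rank) * B (A x), W frozen.
# All function names here are hypothetical and chosen for illustration only.


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_param_count(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # fine-tuning W (d_out x d_in) directly
    lora = rank * (d_in + d_out)   # training only B (d_out x rank) and A (rank x d_in)
    return full, lora


def lora_forward(W, A, B, x, alpha, rank):
    """Compute h = W x + (alpha / rank) * B (A x); only A and B would be trained."""
    base = matvec(W, x)                 # frozen pretrained path
    update = matvec(B, matvec(A, x))    # low-rank correction path
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


if __name__ == "__main__":
    full, lora = lora_param_count(d_out=4096, d_in=4096, rank=8)
    print(full, lora, full // lora)  # prints: 16777216 65536 256
```

With a hypothetical 4096 x 4096 projection and rank 8, the adapter trains 256 times fewer parameters than updating the full matrix. Initializing B to zeros makes the adapted layer start out identical to the frozen one, which is the usual choice in this formulation.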
Journal: 中国图象图形学报 (Journal of Image and Graphics)
Category: Computer Science, Computer Graphics and Computer-Aided Design
CiteScore: 1.20
Self-citation rate: 0.00%
Articles published: 6776
About the journal:
Journal of Image and Graphics (ISSN 1006-8961, CN 11-3758/TB, CODEN ZTTXFZ) is an authoritative academic journal supervised by the Chinese Academy of Sciences and co-sponsored by the Institute of Space and Astronautical Information Innovation of the Chinese Academy of Sciences (ISIAS), the Chinese Society of Image and Graphics (CSIG), and the Beijing Institute of Applied Physics and Computational Mathematics (BIAPM). The journal brings together theory, technical methods, and the industrial application of research results in image and graphics, and mainly publishes innovative, high-level research papers on basic and applied research in image and graphics science and closely related fields. Paper types include reviews, technical reports, project progress, academic news, new technology reviews, new product introductions, and industrialisation research. Coverage spans image analysis and recognition, image understanding and computer vision, computer graphics, virtual reality and augmented reality, system simulation, animation, and related areas, and themed columns are opened according to research hotspots and cutting-edge topics.
Journal of Image and Graphics reaches a wide readership, including scientific and technical personnel, enterprise managers, and postgraduate and undergraduate students working in national defence, military, aviation, aerospace, communications, electronics, automotive, agriculture, meteorology, environmental protection, remote sensing, mapping, oil fields, construction, transportation, finance, telecommunications, education, medical care, film and television, and art.
Journal of Image and Graphics is included in many important domestic and international scientific literature database systems, including the EBSCO database in the United States, the JST database in Japan, the Scopus database in the Netherlands, China Science and Technology Thesis Statistics and Analysis (Annual Research Report), China Science Citation Database (CSCD), China Academic Journal Network Publishing Database (CAJD), China Academic Journal Abstracts, Chinese Science Abstracts (Series A), China Electronic Science Abstracts, Chinese Core Journals Abstracts, Chinese Academic Journals on CD-ROM, and China Academic Journals Comprehensive Evaluation Database.