Title: Mobile Edge Intelligence for Large Language Models: A Contemporary Survey
Authors: Guanqiao Qu; Qiyuan Chen; Wei Wei; Zheng Lin; Xianhao Chen; Kaibin Huang
Journal: IEEE Communications Surveys and Tutorials, vol. 27, no. 6, pp. 3820-3860
DOI: 10.1109/COMST.2025.3527641
Publication date: 2025-01-09
URL: https://ieeexplore.ieee.org/document/10835069/
Citations: 0
Abstract
On-device large language models (LLMs), i.e., LLMs running on edge devices, have attracted considerable interest because they are more cost-effective, latency-efficient, and privacy-preserving than the cloud LLM paradigm. Nonetheless, unlike cloud LLMs, the performance of on-device LLMs is intrinsically constrained by resource limitations on edge devices. Sitting between cloud and on-device AI, mobile edge intelligence (MEI) may address this dilemma by provisioning AI capabilities at the edge of mobile networks, e.g., on base stations. This article provides a contemporary survey on harnessing MEI for LLM deployment. We begin by illustrating several killer applications to demonstrate the urgent need for deploying LLMs at the network edge. Next, we present the preliminaries of LLMs, MEI, and resource-efficient LLM techniques. We then provide an architectural overview of MEI for LLMs (MEI4LLM), outlining its core components and how it supports LLM deployment. Subsequently, we delve into various aspects of MEI4LLM, extensively covering edge LLM caching and delivery, edge LLM training, and edge LLM inference. Finally, we identify future research opportunities. We hope this article inspires researchers in the field to leverage mobile edge computing to facilitate LLM deployment, thereby unleashing the potential of LLMs across various privacy- and delay-sensitive applications.
Journal Introduction:
IEEE Communications Surveys & Tutorials is an online journal published by the IEEE Communications Society, offering tutorials and surveys covering all aspects of the communications field. Telecommunications technology is progressing at a rapid pace, and the IEEE Communications Society is committed to providing researchers and other professionals with the information and tools to stay abreast. IEEE Communications Surveys and Tutorials focuses on integrating and adding understanding to the existing literature on communications, putting results in context. Whether readers are searching for in-depth information about a familiar area or an introduction to a new one, IEEE Communications Surveys & Tutorials aims to be the premier source of peer-reviewed, comprehensive tutorials and surveys, along with pointers to further sources. IEEE Communications Surveys & Tutorials publishes only articles written exclusively for the journal, which go through a rigorous review process before publication in the quarterly issues.
A tutorial article in IEEE Communications Surveys & Tutorials should be designed to help the reader become familiar with, and learn something specific about, a chosen topic. In contrast, the term survey, as applied here, is defined to mean a survey of the literature. A survey article in IEEE Communications Surveys & Tutorials should provide a comprehensive review of developments in a selected area, covering its development from its inception to its current state and beyond, and illustrating that development through liberal citations from the literature. Both tutorials and surveys should be tutorial in nature and should be written in a style comprehensible to readers outside the specialty of the article.