FlexGen: Efficient On-Demand Generative AI Service With Flexible Diffusion Model in Mobile Edge Networks

Impact factor: 7.0 | Zone 1 (Computer Science) | JCR Q1 (Telecommunications) | IEEE Transactions on Cognitive Communications and Networking | Pub Date: 2024-12-23 | DOI: 10.1109/TCCN.2024.3522084
Peichun Li;Huanyu Dong;Liping Qian;Sheng Zhou;Yuan Wu
IEEE Transactions on Cognitive Communications and Networking, vol. 11, no. 2, pp. 961-973. Journal article, not open access. https://ieeexplore.ieee.org/document/10813021/
Citation count: 0

Abstract

Generative artificial intelligence (AI) in edge networks has excelled in delivering human-level creative services close to end users. However, providing customized intelligence services to a wide range of end clients remains challenging due to the diverse demands of edge applications. In this paper, we present FlexGen, an efficient generative AI framework with flexible diffusion models, to tailor the intelligence service for different client-side requests under diverse quality and efficiency constraints. To this end, we first design and train a flexible diffusion model to support quality-and-cost adjustable image synthesis. After that, we focus on the server-side energy minimization problem subject to the quality level of the generative service and the client-side latency constraint. We further theoretically characterize the relationship between the width of the diffusion model and the expected quality of the synthetic image. Following that, a decomposition solution is applied to optimize the generative service, where the image synthesis strategy and resource allocation policy are personalized for different client-side requests. Experiments indicate that, compared to existing image generation schemes, our framework can reduce energy consumption by up to a factor of two without sacrificing the quality of the service.
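To make the shape of the problem in the abstract concrete, the following is a minimal toy sketch of picking a model width and a server clock frequency that minimize energy while meeting a quality target and a latency deadline. It is not the paper's method: it uses brute-force enumeration rather than the decomposition solution, and the quality, latency, and energy models, the constants, and every name (`Request`, `quality`, `latency_s`, `energy_j`, `best_config`) are hypothetical assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional, Tuple

# Every model and constant here is an illustrative assumption, not taken from the paper.
WIDTHS = (0.25, 0.5, 0.75, 1.0)    # hypothetical fractions of the full model width
FREQS_GHZ = (0.5, 1.0, 1.5, 2.0)   # hypothetical server clock settings


@dataclass(frozen=True)
class Request:
    min_quality: float    # required quality score in [0, 1]
    max_latency_s: float  # client-side deadline in seconds


def quality(width: float) -> float:
    """Toy monotone quality model: a wider model gives a better image."""
    return width ** 0.5


def latency_s(width: float, freq_ghz: float) -> float:
    """Toy latency: work grows with width squared, speed grows with frequency."""
    return 4.0 * width ** 2 / freq_ghz


def energy_j(width: float, freq_ghz: float) -> float:
    """Toy energy: work times frequency squared (a commonly used approximation)."""
    return 4.0 * width ** 2 * freq_ghz ** 2


def best_config(req: Request) -> Optional[Tuple[float, float]]:
    """Return the feasible (width, freq) pair with the least energy, or None."""
    feasible = [
        (w, f)
        for w, f in product(WIDTHS, FREQS_GHZ)
        if quality(w) >= req.min_quality and latency_s(w, f) <= req.max_latency_s
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda wf: energy_j(*wf))
```

Under these toy models, a request with a loose quality target is served by a narrow model at a low clock, while a stricter target forces the full width and a higher clock, which is the per-request personalization the abstract describes in general terms.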
Source journal
IEEE Transactions on Cognitive Communications and Networking (Computer Science, Artificial Intelligence)
CiteScore: 15.50
Self-citation rate: 7.00%
Articles published: 108
About the journal: The IEEE Transactions on Cognitive Communications and Networking (TCCN) aims to publish high-quality manuscripts that push the boundaries of cognitive communications and networking research. Cognitive, in this context, refers to the application of perception, learning, reasoning, memory, and adaptive approaches in communication system design. The journal welcomes submissions that explore various aspects of cognitive communications and networks, focusing on innovative and holistic approaches to complex system design. Key topics include architecture, protocols, cross-layer design, and cognition cycle design for cognitive networks. Research on machine learning, artificial intelligence, end-to-end and distributed intelligence, software-defined networking, cognitive radios, spectrum sharing, and security and privacy issues in cognitive networks is also of interest. The publication also encourages papers addressing novel services and applications enabled by these cognitive concepts.