Agile Optimization Framework: A framework for tensor operator optimization in neural network

IF 6.2 · CAS Tier 2 (Computer Science) · JCR Q1 (Computer Science, Theory & Methods) · Future Generation Computer Systems: The International Journal of eScience · Pub Date: 2024-07-16 · DOI: 10.1016/j.future.2024.07.019
{"title":"敏捷优化框架:神经网络中的张量算子优化框架","authors":"","doi":"10.1016/j.future.2024.07.019","DOIUrl":null,"url":null,"abstract":"<div><p>In recent years, with the gradual slowing of Moore’s Law and the development of deep learning, the demand for hardware performance of executing deep learning based applications has significantly increased. In this case, deep learning compilers have been proven to maximize hardware performance while keeping computational power constant, especially the end-to-end compiler Tensor Virtual Machine (TVM). TVM optimizes tensors by finding excellent parallel computing schemes, thereby achieving the goal of improving the performance of neural network inference. However, there is still untapped potential in current optimization methods. However, existing optimization methods based on the TVM, such as Genetic Algorithms Tuner (GA-Tuner), have failed to achieve a balance between optimization performance and optimization time. The intolerable duration of optimization detracts from TVM’s usability, rendering it challenging to extend into the scientific community. This paper introduces a novel deep learning compilation optimization framework base on TVM called Agile Optimization Framework (AOF), which incorporates a tuner based on the latest Beluga Whale Optimization Algorithm (BWO). The BWO is adept at tackling complex problems characterized by numerous local optima, making it particularly suitable for hardware compilation optimization scenarios. We further propose an Evolving Epsilon Strategy (EES), a search strategy that adaptively adjusts the balance between exploration and exploitation, thereby enhancing the effectiveness of the algorithm. Additionally, we developed a supervised Tuning Accelerator (TA) aimed at reducing the time required for optimization and enhancing efficiency. Comparative experiments demonstrate that AOF achieves 11.36%–66.20% improvement in performance and 30.30%–54.60% reduction in optimization time, significantly outperforming the control group.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2000,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Agile Optimization Framework: A framework for tensor operator optimization in neural network\",\"authors\":\"\",\"doi\":\"10.1016/j.future.2024.07.019\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>In recent years, with the gradual slowing of Moore’s Law and the development of deep learning, the demand for hardware performance of executing deep learning based applications has significantly increased. In this case, deep learning compilers have been proven to maximize hardware performance while keeping computational power constant, especially the end-to-end compiler Tensor Virtual Machine (TVM). TVM optimizes tensors by finding excellent parallel computing schemes, thereby achieving the goal of improving the performance of neural network inference. However, there is still untapped potential in current optimization methods. However, existing optimization methods based on the TVM, such as Genetic Algorithms Tuner (GA-Tuner), have failed to achieve a balance between optimization performance and optimization time. The intolerable duration of optimization detracts from TVM’s usability, rendering it challenging to extend into the scientific community. 
This paper introduces a novel deep learning compilation optimization framework base on TVM called Agile Optimization Framework (AOF), which incorporates a tuner based on the latest Beluga Whale Optimization Algorithm (BWO). The BWO is adept at tackling complex problems characterized by numerous local optima, making it particularly suitable for hardware compilation optimization scenarios. We further propose an Evolving Epsilon Strategy (EES), a search strategy that adaptively adjusts the balance between exploration and exploitation, thereby enhancing the effectiveness of the algorithm. Additionally, we developed a supervised Tuning Accelerator (TA) aimed at reducing the time required for optimization and enhancing efficiency. Comparative experiments demonstrate that AOF achieves 11.36%–66.20% improvement in performance and 30.30%–54.60% reduction in optimization time, significantly outperforming the control group.</p></div>\",\"PeriodicalId\":55132,\"journal\":{\"name\":\"Future Generation Computer Systems-The International Journal of Escience\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":6.2000,\"publicationDate\":\"2024-07-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Future Generation Computer Systems-The International Journal of Escience\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0167739X24003856\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X24003856","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Citations: 0

Abstract

In recent years, with the gradual slowing of Moore's Law and the growth of deep learning, the hardware performance demanded by deep-learning-based applications has increased significantly. In this context, deep learning compilers, particularly the end-to-end compiler Tensor Virtual Machine (TVM), have been shown to extract maximum performance from fixed hardware resources. TVM optimizes tensor operators by searching for efficient parallel computing schedules, thereby improving the performance of neural network inference. However, current optimization methods still leave potential untapped: existing TVM-based methods, such as the Genetic Algorithm Tuner (GA-Tuner), fail to balance optimization quality against optimization time, and their long tuning runs detract from TVM's usability, hindering its adoption in the scientific community. This paper introduces a novel deep learning compilation optimization framework based on TVM, called the Agile Optimization Framework (AOF), which incorporates a tuner built on the recent Beluga Whale Optimization algorithm (BWO). BWO is adept at tackling complex problems characterized by numerous local optima, making it particularly suitable for hardware compilation optimization scenarios. We further propose an Evolving Epsilon Strategy (EES), a search strategy that adaptively adjusts the balance between exploration and exploitation, enhancing the effectiveness of the algorithm. Additionally, we develop a supervised Tuning Accelerator (TA) that reduces the time required for optimization. Comparative experiments demonstrate that AOF achieves an 11.36%–66.20% improvement in performance and a 30.30%–54.60% reduction in optimization time, significantly outperforming the control group.
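The abstract names the Evolving Epsilon Strategy but does not give its update rule, so the sketch below is only a hypothetical illustration of the general idea: an exploration probability epsilon that evolves over the tuning run, trading random exploration of the schedule space against exploitation of the best configurations found so far. The linear decay schedule and all function names here are assumptions, not the paper's method.

```python
import random

def evolving_epsilon(step, total_steps, eps_start=0.9, eps_end=0.1):
    # Hypothetical schedule: start exploration-heavy, end exploitation-heavy.
    # The paper's EES adapts this balance adaptively; its exact rule is not
    # in the abstract, so this linear decay is purely illustrative.
    frac = step / max(1, total_steps)
    return eps_start + (eps_end - eps_start) * frac

def pick_candidate(population, score, step, total_steps):
    # With probability epsilon, explore a random schedule candidate;
    # otherwise exploit the best-scoring candidate seen so far.
    eps = evolving_epsilon(step, total_steps)
    if random.random() < eps:
        return random.choice(population)
    return max(population, key=score)

# Toy usage: "candidates" are integers and the score prefers larger values.
best = pick_candidate(list(range(100)), lambda c: c, step=800, total_steps=1000)
print(best)
```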
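For context on where such a tuner plugs in, here is a minimal sketch of TVM's autotvm tuning loop using the GA-Tuner that the paper treats as a baseline, following the pattern of TVM's matmul autotuning tutorial. AOF's BWO-based tuner would replace the `GATuner` line; the template name, problem size, and tuning budget are illustrative choices, not values from the paper.

```python
import tvm
from tvm import te, autotvm

@autotvm.template("example/matmul")  # illustrative template name
def matmul(N, L, M, dtype):
    A = te.placeholder((N, L), name="A", dtype=dtype)
    B = te.placeholder((L, M), name="B", dtype=dtype)
    k = te.reduce_axis((0, L), name="k")
    C = te.compute((N, M), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")
    s = te.create_schedule(C.op)
    y, x = s[C].op.axis
    (k,) = s[C].op.reduce_axis
    # Expose tiling factors as tunable knobs; the tuner searches this space.
    cfg = autotvm.get_config()
    cfg.define_split("tile_y", y, num_outputs=2)
    cfg.define_split("tile_x", x, num_outputs=2)
    yo, yi = cfg["tile_y"].apply(s, C, y)
    xo, xi = cfg["tile_x"].apply(s, C, x)
    s[C].reorder(yo, xo, k, yi, xi)
    return s, [A, B, C]

task = autotvm.task.create("example/matmul", args=(512, 512, 512, "float32"), target="llvm")
measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(),
    runner=autotvm.LocalRunner(number=5),
)
# GA-Tuner is the paper's baseline; AOF's BWO-based tuner would go here instead.
tuner = autotvm.tuner.GATuner(task, pop_size=50)
tuner.tune(
    n_trial=200,
    measure_option=measure_option,
    callbacks=[autotvm.callback.log_to_file("matmul_tuning.log")],
)
```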

Source journal: Future Generation Computer Systems: The International Journal of eScience
CiteScore: 19.90 · Self-citation rate: 2.70% · Articles per year: 376 · Review time: 10.6 months
About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.
Latest articles in this journal:
- Analyzing inference workloads for spatiotemporal modeling
- An efficient federated learning solution for the artificial intelligence of things
- Generative adversarial networks to detect intrusion and anomaly in IP flow-based networks
- Blockchain-based conditional privacy-preserving authentication scheme using PUF for vehicular ad hoc networks
- UAV-IRS-assisted energy harvesting for edge computing based on deep reinforcement learning