Genetic Multi-Armed Bandits: A Reinforcement Learning Inspired Approach for Simulation Optimization

IF 11.7 · CAS Tier 1 (Computer Science) · Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
IEEE Transactions on Evolutionary Computation · Pub Date: 2024-12-31 · DOI: 10.1109/TEVC.2024.3524505
Deniz Preil;Michael Krapp
IEEE Transactions on Evolutionary Computation, vol. 29, no. 2, pp. 360-374.
Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10818791
Article page: https://ieeexplore.ieee.org/document/10818791/
Citations: 0

Abstract

Many real-world problems are inherently stochastic, which complicates or even precludes the use of analytical methods. These problems are often characterized by high dimensionality, large solution spaces, and numerous local optima, making optimal solutions hard to find. Simulation optimization is therefore frequently employed. This article focuses on the discrete case, also known as discrete optimization via simulation. Despite their adaptations for stochastic problems, previous evolutionary algorithms share a major limitation: they discard all information about solutions that are not part of the most recent population. This is wasteful, as each simulation observation gathered over the course of the iterations provides valuable information that should guide the selection of subsequent solutions. Inspired by the domain of reinforcement learning (RL), we propose a novel memory concept for evolutionary algorithms that ensures global convergence and significantly improves their finite-time performance. Unlike previous evolutionary algorithms, our approach permanently preserves simulation observations to progressively improve the accuracy of sample means when revisiting solutions in later iterations. Moreover, the selection of new solutions is based on the entire memory rather than just the last population. Numerical experiments demonstrate that this novel approach, which combines a genetic algorithm (GA) with such a memory, consistently outperforms popular convergent state-of-the-art benchmark algorithms on a large variety of established test problems while requiring considerably less computational effort. This marks the so-called genetic multi-armed bandit (MAB) as one of the currently most powerful algorithms for solving stochastic problems.
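As a rough illustration of the memory concept the abstract describes, the sketch below pairs a minimal genetic algorithm with a permanent observation store: every noisy evaluation is recorded, sample means are updated incrementally when a solution is revisited, and parents are drawn from the entire memory rather than only the last population. This is not the paper's algorithm; the objective function, domain, and all parameter values are invented for illustration only.

```python
import random

def noisy_eval(x, rng):
    # Hypothetical noisy objective (not from the paper): a concave
    # function with its maximum at x = (3, ..., 3), plus Gaussian noise.
    true_value = -sum((xi - 3) ** 2 for xi in x)
    return true_value + rng.gauss(0.0, 1.0)

class ObservationMemory:
    """Permanently stores every simulation observation per solution,
    keeping a running count and an incrementally updated sample mean."""
    def __init__(self):
        self.count = {}
        self.mean = {}

    def record(self, x, y):
        key = tuple(x)
        n = self.count.get(key, 0) + 1
        m = self.mean.get(key, 0.0)
        self.count[key] = n
        self.mean[key] = m + (y - m) / n  # incremental sample-mean update

    def best(self, k):
        # Top-k solutions ranked by sample mean over the ENTIRE memory,
        # not just the most recent population.
        return sorted(self.mean, key=self.mean.get, reverse=True)[:k]

def memory_ga(dim=4, pop_size=10, generations=50, seed=0):
    rng = random.Random(seed)
    memory = ObservationMemory()
    population = [[rng.randint(0, 6) for _ in range(dim)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate current population; observations accumulate forever,
        # so revisited solutions get progressively more accurate means.
        for x in population:
            memory.record(x, noisy_eval(x, rng))
        parents = [list(p) for p in memory.best(pop_size // 2)]
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, dim)
            child = a[:cut] + b[cut:]           # one-point crossover
            if rng.random() < 0.2:              # mutation
                child[rng.randrange(dim)] = rng.randint(0, 6)
            children.append(child)
        population = children
    return memory.best(1)[0]
```

A conventional GA would instead re-estimate fitness from scratch each generation and select only within the current population; the memory makes every past evaluation reusable, which is the core idea the abstract attributes to the RL-inspired design.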
Source Journal

IEEE Transactions on Evolutionary Computation (Engineering & Technology · Computer Science: Theory & Methods)
CiteScore: 21.90
Self-citation rate: 9.80%
Articles per year: 196
Review time: 3.6 months
Aims & Scope: The IEEE Transactions on Evolutionary Computation is published by the IEEE Computational Intelligence Society on behalf of 13 societies: Circuits and Systems; Computer; Control Systems; Engineering in Medicine and Biology; Industrial Electronics; Industry Applications; Lasers and Electro-Optics; Oceanic Engineering; Power Engineering; Robotics and Automation; Signal Processing; Social Implications of Technology; and Systems, Man, and Cybernetics. The journal publishes original papers in evolutionary computation and related areas such as nature-inspired algorithms, population-based methods, optimization, and hybrid systems. It welcomes both purely theoretical papers and application papers that provide general insights into these areas of computation.