Towards optimal control of HPV model using safe reinforcement learning with actor–critic neural networks

Expert Systems with Applications (IF 7.5; Zone 1, Computer Science; Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) Pub Date: 2024-11-27 DOI: 10.1016/j.eswa.2024.125783

Roya Khalili Amirabadi, Omid S. Fard, Mohsen Jalaeian Farimani

Expert Systems with Applications, volume 264, Article 125783. Full text: https://www.sciencedirect.com/science/article/pii/S0957417424026502


Abstract

This paper proposes a novel approach that applies state-of-the-art concepts in reinforcement learning (RL) to the optimal control of human papillomavirus (HPV) infection. The methodology transforms the nonlinear optimal control problem into a constrained nonlinear programming problem, allowing RL algorithms to be applied effectively. The approach combines Hamilton–Jacobi–Bellman (HJB) equations with actor–critic neural networks and control barrier functions to obtain an adaptive strategy for optimal vaccination and screening against HPV infection. A key innovation is the use of the Sophia optimizer with experience replay, which addresses the critical need to use data online in infectious disease control. Unlike traditional methods that rely on accumulating extensive data, this approach uses experience replay to learn and adapt continuously, giving practical solutions for diseases like HPV where waiting for data is neither practical nor desirable. Experience replay stores and reuses past experience, improving the learning efficiency and stability of the system. This is an important feature for online applications, ensuring that the RL model responds quickly enough to changing epidemiological conditions. Numerical simulations demonstrate the effectiveness of this approach in minimizing HPV prevalence and optimizing resource allocation. This research offers significant insights into the application of advanced control strategies in infectious disease management, highlighting the potential of RL to address complex epidemiological challenges. The ability to apply these techniques online underscores the importance of adaptive and responsive strategies in public health.
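The abstract describes experience replay as storing and reusing past experience so that learning can proceed online. The following is a minimal sketch of that general idea, not the authors' implementation: the class name `ReplayBuffer`, the transition layout `(state, action, cost, next_state)`, and the example values are all assumptions made for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of past transitions; the oldest entries are evicted first."""

    def __init__(self, capacity, seed=None):
        # deque with maxlen drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, cost, next_state):
        # Hypothetical transition layout; the paper's actual data format is not given.
        self.buffer.append((state, action, cost, next_state))

    def sample(self, batch_size):
        # Sample without replacement, capped at the number of stored transitions.
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


# Illustrative usage with made-up numbers: states are (susceptible, infected)
# fractions and actions are (vaccination, screening) rates.
buf = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    state = (0.9 - 0.01 * t, 0.1 + 0.01 * t)
    next_state = (0.9 - 0.01 * (t + 1), 0.1 + 0.01 * (t + 1))
    buf.add(state, (0.2, 0.1), float(t), next_state)

batch = buf.sample(2)  # a mini-batch an actor-critic update could reuse
```

In an online setting, each newly observed transition would be added to the buffer and a mini-batch drawn from it at every update step, so recent and older data are both reused rather than waiting for a large dataset to accumulate.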


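Source journal: Expert Systems with Applications (Engineering and Technology; Engineering: Electrical and Electronic)

CiteScore: 13.80

Self-citation rate: 10.60%

Articles published: 2045

Review time: 8.7 months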

About the journal: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.