A Comparative Tutorial of Bayesian Sequential Design and Reinforcement Learning

M. Tec, Yunshan Duan, P. Müller
{"title":"贝叶斯序列设计与强化学习的比较教程","authors":"M. Tec, Yunshan Duan, P. Müller","doi":"10.1080/00031305.2022.2129787","DOIUrl":null,"url":null,"abstract":"Abstract Reinforcement learning (RL) is a computational approach to reward-driven learning in sequential decision problems. It implements the discovery of optimal actions by learning from an agent interacting with an environment rather than from supervised data. We contrast and compare RL with traditional sequential design, focusing on simulation-based Bayesian sequential design (BSD). Recently, there has been an increasing interest in RL techniques for healthcare applications. We introduce two related applications as motivating examples. In both applications, the sequential nature of the decisions is restricted to sequential stopping. Rather than a comprehensive survey, the focus of the discussion is on solutions using standard tools for these two relatively simple sequential stopping problems. Both problems are inspired by adaptive clinical trial design. We use examples to explain the terminology and mathematical background that underlie each framework and map one to the other. The implementations and results illustrate the many similarities between RL and BSD. The results motivate the discussion of the potential strengths and limitations of each approach.","PeriodicalId":342642,"journal":{"name":"The American Statistician","volume":"100 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A Comparative Tutorial of Bayesian Sequential Design and Reinforcement Learning\",\"authors\":\"M. Tec, Yunshan Duan, P. Müller\",\"doi\":\"10.1080/00031305.2022.2129787\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract Reinforcement learning (RL) is a computational approach to reward-driven learning in sequential decision problems. It implements the discovery of optimal actions by learning from an agent interacting with an environment rather than from supervised data. We contrast and compare RL with traditional sequential design, focusing on simulation-based Bayesian sequential design (BSD). Recently, there has been an increasing interest in RL techniques for healthcare applications. We introduce two related applications as motivating examples. In both applications, the sequential nature of the decisions is restricted to sequential stopping. Rather than a comprehensive survey, the focus of the discussion is on solutions using standard tools for these two relatively simple sequential stopping problems. Both problems are inspired by adaptive clinical trial design. We use examples to explain the terminology and mathematical background that underlie each framework and map one to the other. The implementations and results illustrate the many similarities between RL and BSD. 
The results motivate the discussion of the potential strengths and limitations of each approach.\",\"PeriodicalId\":342642,\"journal\":{\"name\":\"The American Statistician\",\"volume\":\"100 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The American Statistician\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1080/00031305.2022.2129787\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The American Statistician","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1080/00031305.2022.2129787","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Reinforcement learning (RL) is a computational approach to reward-driven learning in sequential decision problems. It implements the discovery of optimal actions by learning from an agent interacting with an environment rather than from supervised data. We contrast and compare RL with traditional sequential design, focusing on simulation-based Bayesian sequential design (BSD). Recently, there has been an increasing interest in RL techniques for healthcare applications. We introduce two related applications as motivating examples. In both applications, the sequential nature of the decisions is restricted to sequential stopping. Rather than a comprehensive survey, the focus of the discussion is on solutions using standard tools for these two relatively simple sequential stopping problems. Both problems are inspired by adaptive clinical trial design. We use examples to explain the terminology and mathematical background that underlie each framework and map one to the other. The implementations and results illustrate the many similarities between RL and BSD. The results motivate the discussion of the potential strengths and limitations of each approach.
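To make the comparison concrete, the following is a minimal, self-contained Python sketch (not the authors' code) of a toy sequential stopping problem of the kind the abstract describes: a single-arm trial with binary responses and an unknown success probability. It contrasts a simulation-based Bayesian stopping rule with tabular Q-learning on the same problem. All specifics here (the Beta(1, 1) prior, the 0.95 posterior threshold, the per-patient cost, the utility at stopping, and constants such as N_MAX and THETA_0) are illustrative assumptions, not values taken from the paper.

```python
# Toy sequential stopping problem: enroll up to N_MAX patients one at a
# time; at each step, either stop and make a decision or continue.
# "Success" means the true response rate theta exceeds THETA_0.
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(0)

N_MAX = 30      # maximum number of patients (illustrative)
THETA_0 = 0.5   # null response rate (illustrative)
COST = 0.01     # per-patient sampling cost (illustrative)

def posterior_prob(successes, n):
    """P(theta > THETA_0 | data) under a Beta(1, 1) prior."""
    return 1.0 - beta.cdf(THETA_0, 1 + successes, 1 + n - successes)

def terminal_utility(successes, n):
    """Utility of stopping: reward a decisive posterior, charge for patients."""
    p = posterior_prob(successes, n)
    return max(p, 1.0 - p) - COST * n

# --- Bayesian sequential design: stop when the posterior is decisive. ---
def bsd_episode(theta, threshold=0.95):
    s = 0
    for n in range(1, N_MAX + 1):
        s += rng.random() < theta                # observe one patient
        p = posterior_prob(s, n)
        if p > threshold or p < 1 - threshold:   # decisive either way
            return terminal_utility(s, n)
    return terminal_utility(s, N_MAX)            # forced stop at the horizon

# --- Reinforcement learning: tabular Q-learning on state (n, successes). ---
# Action 0 = continue, action 1 = stop; stopping yields the same utility.
Q = np.zeros((N_MAX + 1, N_MAX + 1, 2))

def rl_train(episodes=20000, eps=0.1, alpha=0.1):
    for _ in range(episodes):
        theta = rng.beta(1, 1)                   # draw the truth from the prior
        n, s = 0, 0
        while True:
            a = rng.integers(2) if rng.random() < eps else Q[n, s].argmax()
            if a == 1 or n == N_MAX:             # stop (forced at the horizon)
                Q[n, s, a] += alpha * (terminal_utility(s, n) - Q[n, s, a])
                break
            s2 = s + (rng.random() < theta)      # observe one more patient
            target = Q[n + 1, s2].max()          # no intermediate reward
            Q[n, s, a] += alpha * (target - Q[n, s, a])
            n, s = n + 1, s2

def rl_episode(theta):
    """Run the learned greedy policy on one simulated trial."""
    n, s = 0, 0
    while Q[n, s].argmax() == 0 and n < N_MAX:
        s += rng.random() < theta
        n += 1
    return terminal_utility(s, n)

rl_train()
print("BSD mean utility:", np.mean([bsd_episode(rng.beta(1, 1)) for _ in range(2000)]))
print("RL  mean utility:", np.mean([rl_episode(rng.beta(1, 1)) for _ in range(2000)]))
```

Both halves target the same expected utility: BSD encodes the stopping rule directly through posterior probabilities computed by forward simulation, while Q-learning recovers a comparable rule purely from simulated agent-environment episodes. This correspondence between the two frameworks is the kind of mapping the tutorial develops.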