Reinforcement Learning in Education: A Multi-Armed Bandit Approach

H. Combrink, Vukosi Marivate, Benjamin Rosman
DOI: 10.48550/arXiv.2211.00779 (https://doi.org/10.48550/arXiv.2211.00779)
Venue: International Conference on Emerging Technologies for Developing Countries
Published: 2022-11-01
Citations: 1

Abstract

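Advances in reinforcement learning research have demonstrated the ways in which different agent-based models can learn how to optimally perform a task within a given environment. Reinforcement learning solves unsupervised problems where agents move through a state-action-reward loop to maximize the overall reward for the agent, which in turn optimizes the solving of a specific problem in a given environment. However, these algorithms are designed based on our understanding of actions that should be taken in a real-world environment to solve a specific problem. One such problem is the ability to identify, recommend and execute an action within a system where the users are the subject, such as in education. In recent years, the use of blended learning approaches, which integrate face-to-face learning with online learning, has increased in the education context. Additionally, online platforms used for education require the automation of certain functions such as the identification, recommendation or execution of actions that can benefit the user, in this case the student or learner. As promising as these scientific advances are, there is still a need to conduct research in a variety of different areas to ensure the successful deployment of these agents within education systems. Therefore, the aim of this study was to contextualise and simulate the cumulative reward within an environment for an intervention recommendation problem in the education context.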
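To make the idea of simulating cumulative reward for an intervention recommendation problem concrete, here is a minimal sketch of a multi-armed bandit simulation. The abstract does not say which bandit algorithm or reward model the authors used, so everything below is an assumption made for illustration: the epsilon-greedy strategy, the Bernoulli (success/failure) rewards, the function name `simulate_bandit`, and the three success probabilities standing in for hypothetical interventions.

```python
import random


def simulate_bandit(success_probs, n_rounds=1000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy bandit and track cumulative reward.

    success_probs: assumed probability that each intervention (arm) helps.
    Returns (cumulative reward after each round, pull count per arm).
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    n_arms = len(success_probs)
    counts = [0] * n_arms              # times each arm was recommended
    values = [0.0] * n_arms            # running mean reward per arm
    cumulative = []
    total = 0.0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: recommend a random intervention.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: recommend the intervention with the best estimate.
            arm = max(range(n_arms), key=lambda a: values[a])

        # Bernoulli reward: 1.0 if the intervention helped, else 0.0.
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0

        # Incremental update of the mean reward estimate for this arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]

        total += reward
        cumulative.append(total)

    return cumulative, counts


if __name__ == "__main__":
    # Hypothetical interventions, e.g. reminder, tutoring, extra material.
    probs = [0.2, 0.5, 0.7]
    cumulative, counts = simulate_bandit(probs)
    print("final cumulative reward:", cumulative[-1])
    print("recommendations per intervention:", counts)
```

Plotting `cumulative` against the round index gives the kind of cumulative reward curve such a simulation would compare across strategies. A smaller `epsilon` exploits the current best estimate more often, while a larger one spends more rounds exploring.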