Using Q-Learning to Personalize Pedagogical Policies for Addition Problems

Danyating Shen, Takara E. Truong, C. Weintz
Published in: 2021 International Conference on Signal Processing and Machine Learning (CONF-SPML), 2021-11-01
DOI: https://doi.org/10.1109/CONF-SPML54095.2021.00043
Citations: 1

Abstract

Over the past year, the COVID-19 pandemic has highlighted the need for practical digital education tools. With students studying from home, teachers have struggled to provide them with adequately challenging coursework. Our project aims to address this issue in the context of math. More specifically, our goal is to encourage thoughtful learning by supplying students with personalized two-number addition problems that take time to solve but that the student is still expected to answer correctly. Our solution is to model the process of selecting a math problem for a student as a Markov Decision Process (MDP) and then use Q-learning to determine the best policy for arriving at the optimally challenging two-number addition problem for that student. The project creates three student simulators based on group member data. We show that, for student one, it took $(162 \pm 134)$ iterations to give problems of an appropriate level, where the first entry is the mean and the second is the standard deviation. Student two took $(230 \pm 205)$ iterations, and student three took $(247 \pm 236)$ iterations. Lastly, we demonstrate that pre-training our model on students two and three and testing on student one yielded a significant improvement, from $(162 \pm 134)$ iterations to $(35 \pm 44)$ iterations.
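The abstract does not spell out the state space, actions, reward, or student simulators, so the following is only a minimal sketch of how tabular Q-learning could drive problem selection. Everything in it is an assumption made for illustration: the six difficulty levels, the three actions (easier, same, harder), the toy `simulated_student`, the `reward` that favors slow but correct answers, and all hyperparameters. It is not the authors' implementation.

```python
import random

N_LEVELS = 6          # hypothetical difficulty levels, e.g. buckets of operand size
ACTIONS = (-1, 0, 1)  # next problem is easier, equally hard, or harder


def simulated_student(level, skill, rng):
    """Toy student (not the paper's simulator): returns (correct, seconds)."""
    # Always correct up to `skill`; accuracy drops by 0.5 per level beyond it.
    p_correct = 1.0 if level <= skill else max(0.0, 1.0 - 0.5 * (level - skill))
    correct = rng.random() < p_correct
    seconds = 2.0 * (level + 1) + rng.uniform(-0.5, 0.5)
    return correct, seconds


def reward(correct, seconds):
    """Assumed reward: slower correct answers score higher, errors are penalized."""
    return seconds / 10.0 if correct else -1.0


def train(skill, episodes=4000, steps=20, alpha=0.05, gamma=0.5, epsilon=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration over difficulty levels."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_LEVELS)]
    for _ in range(episodes):
        level = rng.randrange(N_LEVELS)  # each episode starts at a random level
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[level][i])
            # Apply the action, clipping to the valid range of levels.
            nxt = min(N_LEVELS - 1, max(0, level + ACTIONS[a]))
            r = reward(*simulated_student(nxt, skill, rng))
            # Standard Q-learning update toward r + gamma * max_a' Q(s', a').
            q[level][a] += alpha * (r + gamma * max(q[nxt]) - q[level][a])
            level = nxt
    return q


def greedy_rollout(q, start=0, steps=10):
    """Follow the learned greedy policy and return the level it ends on."""
    level = start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[level][i])
        level = min(N_LEVELS - 1, max(0, level + ACTIONS[a]))
    return level
```

Calling `greedy_rollout(train(skill=3))` trains on one toy student and then follows the learned policy from the easiest level. In this toy setup the hardest level the student still answers reliably is the one with the highest expected reward, so the policy is expected to climb to that level and stay there. Pre-training, as described in the abstract, would correspond to initializing `q` from tables learned on other students instead of zeros; that variant is not shown here.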