A Comparative Analysis of Multiple Biasing Techniques for $Q_{biased}$ Softmax Regression Algorithm

Muhammad Moiz, Hazique Malik, Muhammad Bilal, Noman Naseer
{"title":"A Comparative Analysis of Multiple Biasing Techniques for $Q_{biased}$ Softmax Regression Algorithm","authors":"Muhammad Moiz, Hazique Malik, Muhammad Bilal, Noman Naseer","doi":"10.1109/AIMS52415.2021.9466049","DOIUrl":null,"url":null,"abstract":"Over the past many years the popularity of robotic workers has seen a tremendous surge. Several tasks which were previously considered insurmountable are able to be performed by robots efficiently, with much ease. This is mainly due to the advances made in the field of control systems and artificial intelligence in recent years. Lately, we have seen Reinforcement Learning (RL) capture the spotlight, in the field of robotics. Instead of explicitly specifying the solution of a particular task, RL enables the robot (agent) to explore its environment and through trial and error choose the appropriate response. In this paper, a comparative analysis of biasing techniques for the Q-biased softmax regression (QBIASSR) algorithm has been presented. In QBIASSR, decision-making for un-explored states depends upon the set of previously explored states. This algorithm improves the learning process when the robot reaches unexplored states. A vector bias(s) is calculated on the basis of variable values of experienced states and added to the Q-value function for action selection. To obtain the optimized reward, different techniques to calculate bias(s) are adopted. The performance of all the techniques has been evaluated and compared for obstacle avoidance in the case of a mobile robot. In the end, we have demonstrated that the cumulative reward generated by the technique proposed in our paper is at least 2 times greater than the baseline.","PeriodicalId":299121,"journal":{"name":"2021 International Conference on Artificial Intelligence and Mechatronics Systems (AIMS)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Artificial Intelligence and Mechatronics Systems (AIMS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AIMS52415.2021.9466049","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

In recent years, the popularity of robotic workers has surged. Tasks that were previously considered insurmountable can now be performed by robots efficiently and with ease, largely owing to recent advances in control systems and artificial intelligence. Lately, Reinforcement Learning (RL) has captured the spotlight in robotics: instead of explicitly specifying the solution to a particular task, RL lets the robot (agent) explore its environment and choose appropriate responses through trial and error. In this paper, a comparative analysis of biasing techniques for the Q-biased softmax regression (QBIASSR) algorithm is presented. In QBIASSR, decision-making for unexplored states depends on the set of previously explored states, which improves the learning process when the robot reaches states it has not yet visited. A vector bias(s), calculated from the variable values of experienced states, is added to the Q-value function for action selection. Different techniques for calculating bias(s) are adopted to maximize the obtained reward. The performance of all the techniques is evaluated and compared for obstacle avoidance with a mobile robot. We demonstrate that the cumulative reward generated by the technique proposed in this paper is at least twice that of the baseline.
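As a concrete illustration of the action-selection step described above, the sketch below shows one way bias(s) can be folded into a softmax policy. This is a minimal, hypothetical Python rendering, not the paper's implementation: QBIASSR conditions bias(s) on the variable values shared with experienced states, whereas this sketch substitutes a plain average of experienced Q-rows, and all names (q_table, experienced_states, temperature) are illustrative.

import numpy as np

rng = np.random.default_rng(0)

def bias_vector(q_table, experienced_states):
    # QBIASSR derives bias(s) from the variable values of experienced
    # states; this sketch simplifies that to the mean Q-row over all of them.
    if not experienced_states:
        return np.zeros(q_table.shape[1])
    return q_table[list(experienced_states)].mean(axis=0)

def select_action(state, q_table, experienced_states, temperature=1.0):
    # Softmax over Q(s, a) + bias(s): an unexplored state (all-zero Q-row)
    # inherits action preferences from previously explored states.
    prefs = q_table[state] + bias_vector(q_table, experienced_states)
    prefs = prefs - prefs.max()              # numerical stability
    probs = np.exp(prefs / temperature)
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Hypothetical usage: 10 states, 4 actions, states 0-2 already explored.
q_table = np.zeros((10, 4))
q_table[0:3] = rng.normal(size=(3, 4))
action = select_action(5, q_table, experienced_states={0, 1, 2})

Under this scheme, a state visited for the first time starts from the averaged preferences of its predecessors rather than a flat Q-row, which is the mechanism the abstract credits for faster learning in unexplored states.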