Route optimization for autonomous bulldozer by distributed deep reinforcement learning

Yasuhiro Osaka, Naoya Odajima, Y. Uchimura
Published in: 2021 IEEE International Conference on Mechatronics (ICM), 2021-03-07
DOI: 10.1109/ICM46511.2021.9385686 (https://doi.org/10.1109/ICM46511.2021.9385686)
Citations: 3

Abstract

Since DQN-based reinforcement learning was shown to exceed human scores in Atari 2600 video games, various deep reinforcement learning methods have been studied. This paper proposes a method for controlling a bulldozer autonomously by learning the sediment leveling route with PPO, which supports distributed deep reinforcement learning. An original simulator was developed that reproduces the behavior of small, uniform sediment. By incorporating into the agent network an LSTM that processes the input state as time-series data, the method achieved, on average, leveling of more than 95% of the sediment in the target area. In addition, generalization performance under unknown conditions was evaluated by giving conditions not seen during training as initial setups.
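The abstract names PPO but does not state the loss or its hyperparameters. As a rough illustration only, the standard PPO clipped surrogate objective can be sketched as follows. The function name `ppo_clipped_objective`, its arguments, and the clip range of 0.2 are assumptions made for this sketch (0.2 is a common default), not values taken from the paper.

```python
import math


def ppo_clipped_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logps / old_logps: log-probabilities of the taken actions under the
    current policy and under the policy that collected the samples.
    advantages: advantage estimates for the same samples.
    clip_eps: clip range; 0.2 is an assumed default, not from the paper.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In a distributed setup, samples gathered by several simulator workers would typically be pooled into one batch before an objective like this is evaluated; how the authors organize their workers is not described in the abstract.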