Comparing Physics Effects through Reinforcement Learning in the ARORA Simulator

Troyle Thomas, Armando Fandango, D. Reed, C. Hoayun, J. Hurter, Alexander Gutierrez, K. Brawner
Proceedings of the 33rd European Modeling & Simulation Symposium
DOI: 10.46354/i3m.2021.emss.015 (https://doi.org/10.46354/i3m.2021.emss.015)
Citations: 1

Abstract

This study tests several physics levels for training autonomous-vehicle navigation with a deep deterministic policy gradient algorithm, addressing a gap in research on how physics levels affect vehicle behaviour, specifically for reinforcement-learning algorithms. Three measures from a PointGoal Navigation task were investigated: simulator run-time, training steps, and agent effectiveness, the last assessed with the Success weighted by (normalised inverse) Path Length (SPL) measure. Training and testing took place in the novel simulator ARORA (A Realistic Open environment for Rapid Agent training). The goal of ARORA is to provide a high-fidelity, open-source simulation platform with physics-based movement, vehicle modelling, and a continuous action space within a large-scale geospecific city environment. Four physics levels, or models, were used to create four curriculum conditions for training. SPL was highest for the condition that used all physics levels defined for the experiment, and two conditions returned zero values. Future researchers should consider providing adequate support when training complex-physics vehicle models. The run-time results revealed a benefit for experimental machines with a better CPU, at least for the vector-only observations we employed.
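
For readers unfamiliar with the SPL measure named in the abstract, the following is a minimal sketch of how it is commonly computed in the PointGoal Navigation literature: the mean over episodes of S_i * l_i / max(p_i, l_i), where S_i is a binary success flag, l_i is the shortest-path length from start to goal, and p_i is the length of the path the agent actually took. The function name, argument names, and sample numbers are illustrative assumptions and are not taken from the paper or from ARORA.

```python
from typing import Sequence


def success_weighted_path_length(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Mean of S_i * l_i / max(p_i, l_i) over all episodes.

    All names here are hypothetical; this follows the commonly used
    SPL definition, not code from the ARORA simulator.
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if not (n == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        denominator = max(taken, shortest)
        # Failed episodes contribute zero; so do degenerate zero-length ones.
        if success and denominator > 0:
            total += shortest / denominator
    return total / n


# Illustrative episodes (made-up numbers): one optimal success,
# one success with a path twice the optimal length, one failure.
example = success_weighted_path_length(
    successes=[True, True, False],
    shortest_lengths=[10.0, 10.0, 10.0],
    path_lengths=[10.0, 20.0, 5.0],
)
print(example)
```

Under this definition, a condition in which the agent never reaches the goal scores exactly zero regardless of path length, which is one way a training condition can return a zero SPL value.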