{"title":"DASP: Hierarchical Offline Reinforcement Learning via Diffusion Autodecoder and Skill Primitive","authors":"Sicheng Liu;Yunchuan Zhang;Wenbai Chen;Peiliang Wu","doi":"10.1109/LRA.2024.3522842","DOIUrl":null,"url":null,"abstract":"Offline reinforcement learning strives to enable agents to effectively utilize pre-collected offline datasets for learning. Such an offline setup tremendously mitigates the problems of online reinforcement learning algorithms in real-world applications, particularly in scenarios where interactions are constrained or exploration is costly. The learned strategy, on the other hand, has a distributional bias with respect to the behavioral strategy, which consequently leads to the problem of extrapolation error for out-of-distribution actions. To mitigate this problem, in this paper, we adopt a hierarchical offline reinforcement learning framework that extracts recurrent and spatio-temporally extended primitive skills from offline data before using them for downstream task learning. Besides, we introduce an autodecoder conditional diffusion model to characterize low-level strategy decoding. Such a deep learning generative model enables the reduction of action primitives for the strategy space, which is then used to learn high-level task strategy-guided primitives via the offline learning algorithm IQL. Experimental results and ablation studies on D4RL benchmark tasks (Antmaze, Adroit and Kitchen) demonstrate that our approach achieves SOTA performance in most tasks.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 2","pages":"1649-1655"},"PeriodicalIF":4.6000,"publicationDate":"2024-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Robotics and Automation Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10816163/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ROBOTICS","Score":null,"Total":0}
Abstract
Offline reinforcement learning strives to enable agents to learn effectively from pre-collected offline datasets. Such an offline setup greatly mitigates the problems that online reinforcement learning algorithms face in real-world applications, particularly in scenarios where interaction is constrained or exploration is costly. The learned policy, however, exhibits a distributional shift with respect to the behavior policy, which leads to extrapolation error on out-of-distribution actions. To mitigate this problem, we adopt a hierarchical offline reinforcement learning framework that extracts recurrent, spatio-temporally extended primitive skills from offline data before using them for downstream task learning. In addition, we introduce an autodecoder conditional diffusion model to decode the low-level policy. This deep generative model reduces the policy space to a compact set of action primitives, over which a high-level task policy that selects primitives is learned with the offline learning algorithm IQL. Experimental results and ablation studies on D4RL benchmark tasks (Antmaze, Adroit, and Kitchen) demonstrate that our approach achieves state-of-the-art performance on most tasks.
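To make the two components described above concrete, the sketch below illustrates (i) a conditional diffusion decoder that denoises a short action primitive given the current state and a latent skill code, and (ii) the expectile regression loss used by IQL for the high-level policy's value function. This is a minimal illustration, not the authors' implementation: all network sizes, dimension names (STATE_DIM, LATENT_DIM, HORIZON), and the stand-in data are assumptions.

```python
# Hedged sketch of a skill-conditioned diffusion decoder plus an IQL-style
# expectile loss. Dimensions and architecture are illustrative assumptions,
# not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, ACTION_DIM, LATENT_DIM, HORIZON, T_DIFFUSION = 17, 6, 16, 10, 20


class SkillDiffusionDecoder(nn.Module):
    """Predicts the noise added to an action primitive a_{0:H},
    conditioned on state s, latent skill code z, and diffusion step t."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + LATENT_DIM + HORIZON * ACTION_DIM + 1, 256),
            nn.ReLU(),
            nn.Linear(256, 256),
            nn.ReLU(),
            nn.Linear(256, HORIZON * ACTION_DIM),  # predicted noise
        )

    def forward(self, state, z, noisy_actions, t):
        # The diffusion step t enters as a scalar condition scaled to [0, 1].
        t_embed = t.float().unsqueeze(-1) / T_DIFFUSION
        x = torch.cat([state, z, noisy_actions, t_embed], dim=-1)
        return self.net(x)


def diffusion_loss(decoder, state, z, actions, betas):
    """Standard denoising objective: corrupt the primitive with Gaussian
    noise at a random step and regress the network output onto that noise."""
    b = state.shape[0]
    t = torch.randint(0, T_DIFFUSION, (b,))
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)[t].unsqueeze(-1)
    noise = torch.randn_like(actions)
    noisy = alphas_bar.sqrt() * actions + (1.0 - alphas_bar).sqrt() * noise
    return F.mse_loss(decoder(state, z, noisy, t), noise)


def expectile_loss(diff, tau=0.7):
    """Asymmetric L2 loss used by IQL to push V toward an upper expectile of Q."""
    weight = torch.where(diff > 0, tau, 1.0 - tau)
    return (weight * diff.pow(2)).mean()


# Example: one gradient step on random stand-in data.
decoder = SkillDiffusionDecoder()
betas = torch.linspace(1e-4, 0.02, T_DIFFUSION)
s = torch.randn(32, STATE_DIM)
z = torch.randn(32, LATENT_DIM)            # per-trajectory latent, autodecoder-style
a = torch.randn(32, HORIZON * ACTION_DIM)  # flattened action primitive
diffusion_loss(decoder, s, z, a, betas).backward()

q, v = torch.randn(32), torch.randn(32)    # stand-in critic outputs
v_loss = expectile_loss(q - v)
```

In a hierarchical setup of this kind, the high-level policy outputs a latent skill z for each decision step, and the frozen diffusion decoder unrolls it into a primitive of HORIZON low-level actions; IQL then needs only to evaluate and improve the policy over the compact skill space rather than the raw action space.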
About the journal:
The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.