Guidance-As-Progressive in Human Skill Training Based on Deep Reinforcement Learning

IF 3.1 | Zone 4 (Computer Science) | JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) | Journal of Intelligent & Robotic Systems | Pub Date: 2024-08-03 | DOI: 10.1007/s10846-024-02147-7

Yang Yang, Haifei Chen, Xing Liu, Panfeng Huang


Citations: 0

Guidance-As-Progressive in Human Skill Training Based on Deep Reinforcement Learning

To achieve psychological inclusion and a skill-development orientation in human skill training, this paper proposes a haptic-guided training strategy generation method with a Deep Reinforcement Learning (DRL)-based agent as its core and Zone of Proximal Development (ZPD) tuning as an auxiliary. The expert's and trainee's information is first stored in a purpose-built database that can be accessed in real time, which establishes the data foundation. Then, under the DRL framework, a strategy generation agent is designed, consisting of an actor network and two Q-networks. The actor network generates the agent's decision policy, while the two Q-networks approximate the state-action value function; the parameters of all three networks are updated by the Soft Actor-Critic (SAC) algorithm. In addition, for the first time, the psychological ZPD evaluation method is integrated into the strategy generation of the DRL-based agent, where it is used to describe the relationship between a trainee's intrinsic skills and the guidance provided. With it, the problem of transitional guidance or insufficient guidance can be handled well. Finally, simulation experiments validate the proposed method, demonstrating its efficiency in regulating the trainee under favorable training conditions.
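The abstract does not give the paper's loss functions, so the following is only a minimal sketch of how an actor and two Q-networks usually interact in the standard SAC formulation. The function names, the scalar per-sample form, and the values of `gamma` and `alpha` are illustrative assumptions, not details taken from the paper.

```python
def soft_q_target(reward, done, q1_next, q2_next, log_prob_next,
                  gamma=0.99, alpha=0.2):
    """Soft Bellman target shared by both Q-networks in standard SAC.

    q1_next, q2_next: the two critics' estimates for the next state and
                      an action sampled from the current actor.
    log_prob_next:    log-probability of that sampled action under the actor.
    gamma, alpha:     discount factor and entropy temperature (assumed values).
    """
    # Taking the smaller of the two estimates limits value overestimation.
    min_q_next = min(q1_next, q2_next)
    # The entropy bonus (-alpha * log pi) rewards keeping the policy stochastic.
    soft_value_next = min_q_next - alpha * log_prob_next
    # No bootstrapping past a terminal transition (done == 1.0).
    return reward + gamma * (1.0 - done) * soft_value_next


def actor_loss(q1, q2, log_prob, alpha=0.2):
    """Per-sample actor objective: minimise alpha * log pi(a|s) - min(Q1, Q2)."""
    return alpha * log_prob - min(q1, q2)


if __name__ == "__main__":
    # One hypothetical transition, with made-up numbers for illustration.
    print(soft_q_target(reward=1.0, done=0.0, q1_next=2.0, q2_next=3.0,
                        log_prob_next=-1.0, gamma=0.5, alpha=0.2))
```

In a full implementation both Q-networks would be regressed toward this shared target and the actor would be updated by gradient descent on the averaged actor loss; how the ZPD evaluation enters the reward or the policy is not specified in the abstract and is left out here.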

Source journal

Journal of Intelligent & Robotic Systems (Engineering Technology: Robotics)

CiteScore: 7.00

Self-citation rate: 9.10%

Articles published: 219

Review time: 6 months

About the journal: The Journal of Intelligent and Robotic Systems bridges the gap between theory and practice in all areas of intelligent systems and robotics. It publishes original, peer-reviewed contributions from initial concept and theory to prototyping to final product development and commercialization. On the theoretical side, the journal features papers focusing on intelligent systems engineering, distributed intelligence systems, multi-level systems, intelligent control, multi-robot systems, cooperation and coordination of unmanned vehicle systems, etc. On the application side, the journal emphasizes autonomous systems, industrial robotic systems, multi-robot systems, aerial vehicles, mobile robot platforms, underwater robots, sensors, sensor fusion, and sensor-based control. Readers will also find papers on real applications of intelligent and robotic systems (e.g., mechatronics, manufacturing, biomedical, underwater, humanoid, mobile/legged robot, and space applications).