Adaptive Inverse Optimal Control for Linear Human-in-the-Loop Systems With Completely Unknown Dynamics

IF 7.9 | Zone 2 (Computer Science) | JCR Q1 (Automation & Control Systems) | IEEE Transactions on Automation Science and Engineering | Pub Date: 2024-11-05 | DOI: 10.1109/TASE.2024.3487857
Mi Wang;Huai-Ning Wu
IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 8683-8694. Source: https://ieeexplore.ieee.org/document/10744033/
Citations: 0

Abstract

To improve machine intelligence, machines need to learn human behavior. In this paper, we make the reasonable hypothesis that a human performing a task behaves like a linear quadratic regulator whose cost function is unknown to the machine. In addition, the system dynamics in many real applications are completely unknown. Our purpose is therefore to find a cost function equivalent to the human's, using only control input and system state data, for continuous-time linear human-in-the-loop (HiTL) systems with completely unknown dynamics. An adaptive inverse optimal control (IOC) method is proposed for this purpose; it helps the machine better understand human behavior and makes it possible to reproduce a similar optimal controller in other environments. Because the weighting matrix is difficult to obtain directly, an adaptive integral concurrent learning (ICL) algorithm is developed to identify the system matrices and the human feedback gain matrix online, which removes the persistent excitation (PE) condition. The weighting matrix is then determined by solving a convex programming problem. Finally, simulation results on the lane-keeping assist system of an intelligent vehicle are presented to demonstrate the validity of the proposed adaptive IOC algorithm.

Note to Practitioners: In practice, it is hoped that a machine can work like a human so that it can replace the human in completing certain tasks. However, designing the corresponding algorithms for the machine is not easy, because many tests must be carried out to select appropriate parameters. An effective alternative is to teach the machine to learn the human's demonstrated behavior. Notably, the environment (system dynamics) may not be known a priori, and only the system state and control input are measurable. To this end, an adaptive IOC method is developed for imitation learning of the human's behavior; it is implemented online but requires only limited data. The proposed approach can be used in autonomous vehicles, service robots, medical rehabilitation, and similar applications. In future research, we will extend the proposed method to more complex environments.
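The idea of an "equivalent" cost function can be illustrated on a scalar system. The following is a minimal sketch, not the authors' algorithm: the paper treats the matrix case, identifies the dynamics and gain online, and solves a convex program, whereas here the plant parameters `a`, `b` and the human gain `k` are assumed to be already known, and the weight is recovered in closed form from the scalar Riccati equation. The function names and numerical values are hypothetical.

```python
import math


def lqr_gain(a, b, q, r):
    """Optimal gain k (u = -k*x) for dx/dt = a*x + b*u and cost
    integral of q*x**2 + r*u**2, with q >= 0 and r > 0."""
    # Stabilizing root of the scalar Riccati equation
    # 2*a*p - (b*p)**2 / r + q = 0.
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r


def recover_state_weight(a, b, k, r=1.0):
    """State weight q that makes the gain k optimal for a chosen input weight r."""
    # From k = b*p/r we get p = r*k/b; substituting into the
    # Riccati equation and solving for q gives the line below.
    q = r * k * k - 2.0 * a * r * k / b
    if q < 0:
        raise ValueError("k is not LQR-optimal for any q >= 0 with this (a, b, r)")
    return q


# Hypothetical "human" with hidden weights (q, r) = (4, 2) on an unstable plant.
a, b = 0.5, 1.0
k_human = lqr_gain(a, b, q=4.0, r=2.0)

# The machine sees only (a, b, k_human); it fixes r = 1 and solves for q.
q_hat = recover_state_weight(a, b, k_human, r=1.0)
k_machine = lqr_gain(a, b, q_hat, 1.0)

print(k_human, q_hat, k_machine)
```

The recovered pair `(q_hat, 1.0)` differs from the hidden pair `(4.0, 2.0)` yet produces the same feedback gain, because scaling both weights by a common factor leaves the optimal controller unchanged. This is why the goal is stated as finding an equivalent cost function rather than the human's exact one.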
Source journal
IEEE Transactions on Automation Science and Engineering (Engineering & Technology: Automation & Control Systems)
CiteScore: 12.50
Self-citation rate: 14.30%
Articles published: 404
Review time: 3.0 months
About the journal: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.
Latest articles from this journal
Towards Precise Guidance: A Novel Cross-Dimensional Mapping Framework for 3D Cerebrovascular Surgical Navigation
Geometric Regularization for Robust Learning of Neural Autonomous Dynamical Systems from Demonstrations
Inverse Reinforcement Learning for Structured Fault-Tolerant Control in Nonlinear MASs under Switching Topologies
Accelerated Alternating Direction Method of Multipliers via Reinforcement Learning Meta-Optimization for Nonlinear Model Predictive Control
Lightweight Defense Against Data Consistency Attacks in Distributed DC Optimal Power Flow