Observational Learning: Imitation Through an Adaptive Probabilistic Approach

Sheida Nozari, L. Marcenaro, David Martín, C. Regazzoni
{"title":"Observational Learning: Imitation Through an Adaptive Probabilistic Approach","authors":"Sheida Nozari, L. Marcenaro, David Martín, C. Regazzoni","doi":"10.1109/ICAS49788.2021.9551152","DOIUrl":null,"url":null,"abstract":"This paper proposes an adaptive method to enable imitation learning from expert demonstrations in a multi-agent context. The proposed system employs the inverse reinforcement learning method to a coupled Dynamic Bayesian Network to facilitate dynamic learning in an interactive system. This method studies the interaction at both discrete and continuous levels by identifying inter-relationships between the objects to facilitate the prediction of an expert agent. We evaluate the learning procedure in the scene of learner agent based on probabilistic reward function. Our goal is to estimate policies that predict matched trajectories with the observed one by minimizing the Kullback-Leiber divergence. The reward policies provide a probabilistic dynamic structure to minimise the abnormalities.","PeriodicalId":287105,"journal":{"name":"2021 IEEE International Conference on Autonomous Systems (ICAS)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Autonomous Systems (ICAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICAS49788.2021.9551152","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

This paper proposes an adaptive method to enable imitation learning from expert demonstrations in a multi-agent context. The proposed system applies inverse reinforcement learning to a coupled Dynamic Bayesian Network to facilitate dynamic learning in an interactive system. The method studies the interaction at both the discrete and continuous levels by identifying inter-relationships between the objects, facilitating prediction of the expert agent. We evaluate the learning procedure in the learner agent's scene based on a probabilistic reward function. Our goal is to estimate policies whose predicted trajectories match the observed ones by minimizing the Kullback-Leibler divergence. The reward policies provide a probabilistic dynamic structure that minimizes abnormalities.
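To make the trajectory-matching criterion concrete, the sketch below illustrates selecting, among candidate policies, the one whose rollouts minimize the Kullback-Leibler divergence to the expert's observed trajectories. This is a minimal illustration over discrete state-visitation distributions; the function names (`kl_divergence`, `trajectory_distribution`, `select_best_policy`) are hypothetical and the paper's actual method, which operates on a coupled Dynamic Bayesian Network with inverse reinforcement learning at both discrete and continuous levels, is not reproduced here.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D_KL(p || q); eps avoids log(0)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def trajectory_distribution(trajectories, n_states):
    """Empirical state-visitation distribution of a set of discrete trajectories."""
    counts = np.zeros(n_states)
    for traj in trajectories:
        for s in traj:
            counts[s] += 1
    return counts / counts.sum()

def select_best_policy(observed_trajs, candidate_policy_trajs, n_states):
    """Pick the candidate policy whose rollouts best match the expert
    demonstrations, i.e. minimize KL(expert || policy) over visitations."""
    p_expert = trajectory_distribution(observed_trajs, n_states)
    divergences = [
        kl_divergence(p_expert, trajectory_distribution(trajs, n_states))
        for trajs in candidate_policy_trajs
    ]
    best = int(np.argmin(divergences))
    return best, divergences
```

In practice the divergence would be computed between distributions induced by the learned probabilistic reward model rather than raw visitation counts, but the selection rule (keep the policy with the smallest divergence from the demonstrations) is the same idea as above.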