Mohd Rashdan Abdul Kadir, Ali Selamat, Ondrej Krejcar
{"title":"Norm Augmented Reinforcement Learning Agents With Synthesized Normative Rules","authors":"Mohd Rashdan Abdul Kadir, Ali Selamat, Ondrej Krejcar","doi":"10.4018/jcit.345650","DOIUrl":null,"url":null,"abstract":"The dynamic deontic (DD) is a norm synthesis framework that extracts normative rules from reinforcement learning (RL), however it was not designed to be applied in agent coordination. This study proposes a norm augmented reinforcement learning framework (NARLF) that extends said model to include a norm deliberation mechanism for learned norms re-imputation for norm biased decision-making RL agents. This study aims to test the effects of synthesized norms applied on-line and off-line on agent learning performance. The framework consists of the DD framework extended with a pre-processing and deliberation component to allow re-imputation of normative rules. A deliberation model, the Norm Augmented Q-Table (NAugQT), is proposed to map normative rules into RL agents via q-values weight updates. Results show that the framework is able to map and improve RL agent's performance but only when synthesized off-line edited absolute norm salience value norms are used. This shows limitations when unstable salience norms are applied. Improvement in norm extraction and pre-processing are required.","PeriodicalId":0,"journal":{"name":"","volume":"2 12","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4018/jcit.345650","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Dynamic deontic (DD) is a norm synthesis framework that extracts normative rules from reinforcement learning (RL); however, it was not designed for agent coordination. This study proposes a norm augmented reinforcement learning framework (NARLF) that extends the DD model with a norm deliberation mechanism, allowing learned norms to be re-imputed into norm-biased decision-making RL agents. The study tests the effects of synthesized norms, applied on-line and off-line, on agent learning performance. The framework consists of the DD framework extended with pre-processing and deliberation components that enable the re-imputation of normative rules. A deliberation model, the Norm Augmented Q-Table (NAugQT), is proposed to map normative rules into RL agents via Q-value weight updates. Results show that the framework can map norms and improve RL agents' performance, but only when off-line-synthesized norms with edited absolute salience values are used; this reveals a limitation when norms with unstable salience are applied. Improvements in norm extraction and pre-processing are required.
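The Q-value weight update mentioned in the abstract can be illustrated with a minimal sketch. The representation of a norm as a (state, action, salience) triple, the function name `impute_norms`, and the additive update rule are illustrative assumptions here, not the paper's actual NAugQT mechanism:

```python
# Hypothetical sketch of norm re-imputation into a Q-table (not the
# paper's actual NAugQT implementation). Each synthesized normative rule
# is assumed to be a (state, action, salience) triple; its salience is
# added to the corresponding Q-value, so norm-compliant actions are
# favored during greedy action selection.
from collections import defaultdict

def impute_norms(q_table, norms, weight=1.0):
    """Bias Q-values toward normative actions in proportion to salience."""
    for state, action, salience in norms:
        q_table[(state, action)] += weight * salience
    return q_table

# Usage: an empty Q-table and two synthesized norms.
q = defaultdict(float)
norms = [("s0", "a1", 0.8), ("s1", "a0", 0.5)]
impute_norms(q, norms)
print(q[("s0", "a1")])  # → 0.8
```

Under this sketch, the abstract's finding would correspond to the `salience` values: stable (edited, absolute) saliences give a consistent bias, while unstable saliences inject noise into the Q-table and can degrade learning.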