Credit Assignment: Challenges and Opportunities in Developing Human-like Learning Agents
Thuy Ngoc Nguyen, Chase McDonald, Cleotilde Gonzalez
Proceedings of the AAAI Symposium Series, published 2024-05-20
DOI: 10.1609/aaaiss.v3i1.31180 (https://doi.org/10.1609/aaaiss.v3i1.31180)
Abstract
Temporal credit assignment is the process of distributing delayed outcomes to each action in a sequence, which is essential for learning to adapt and make decisions in dynamic environments. While computational methods in reinforcement learning, such as temporal difference (TD), have shown success in tackling this issue, it remains unclear whether these mechanisms accurately reflect how humans handle feedback delays. Furthermore, cognitive science research has not fully explored the credit assignment problem in humans and cognitive models. Our study uses a cognitive model based on Instance-Based Learning Theory (IBLT) to investigate various credit assignment mechanisms, including equal credit, exponential credit, and TD credit, using the IBL decision mechanism in a goal-seeking navigation task with feedback delays and varying levels of decision complexity. We compare the performance and process measures of the different models with human decision-making in two experiments. Our findings indicate that the human learning process cannot be fully explained by any of the mechanisms. We also observe that decision complexity affects human behavior but not model behavior. By examining the similarities and differences between human and model behavior, we summarize the challenges and opportunities for developing learning agents that emulate human decisions in dynamic environments.
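The three credit assignment mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: the function names, the discount factor `gamma`, the learning rate `alpha`, and the use of a plain TD(0)-style update are hypothetical and are not the authors' IBL implementation.

```python
from typing import List


def equal_credit(n_steps: int, outcome: float) -> List[float]:
    """Assumed form: every action in the episode gets the full delayed outcome."""
    return [outcome] * n_steps


def exponential_credit(n_steps: int, outcome: float, gamma: float = 0.9) -> List[float]:
    """Assumed form: credit decays by gamma for each step away from the outcome."""
    return [outcome * gamma ** (n_steps - 1 - t) for t in range(n_steps)]


def td_credit(
    values: List[float],
    rewards: List[float],
    alpha: float = 0.5,
    gamma: float = 0.9,
) -> List[float]:
    """Assumed form: one forward pass of TD(0)-style updates over an episode.

    values[t] is the current estimate for the action taken at step t;
    the value after the final step is treated as 0.
    """
    updated = list(values)
    for t in range(len(values)):
        next_value = updated[t + 1] if t + 1 < len(values) else 0.0
        td_error = rewards[t] + gamma * next_value - updated[t]
        updated[t] += alpha * td_error
    return updated


if __name__ == "__main__":
    # A three-step episode whose only feedback arrives at the final step.
    print(equal_credit(3, 1.0))
    print(exponential_credit(3, 1.0, gamma=0.5))
    first_pass = td_credit([0.0, 0.0, 0.0], [0.0, 0.0, 1.0])
    print(first_pass)
    print(td_credit(first_pass, [0.0, 0.0, 1.0]))
```

Under these assumptions, equal and exponential credit reach every action after a single episode, whereas the TD-style update moves credit back only one step per pass. That difference is one reason the mechanisms could produce distinct learning curves when feedback is delayed.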