{"title":"An Introduction to Centralized Training for Decentralized Execution in Cooperative Multi-Agent Reinforcement Learning","authors":"Christopher Amato","doi":"arxiv-2409.03052","DOIUrl":null,"url":null,"abstract":"Multi-agent reinforcement learning (MARL) has exploded in popularity in\nrecent years. Many approaches have been developed but they can be divided into\nthree main types: centralized training and execution (CTE), centralized\ntraining for decentralized execution (CTDE), and Decentralized training and\nexecution (DTE). CTDE methods are the most common as they can use centralized information\nduring training but execute in a decentralized manner -- using only information\navailable to that agent during execution. CTDE is the only paradigm that\nrequires a separate training phase where any available information (e.g., other\nagent policies, underlying states) can be used. As a result, they can be more\nscalable than CTE methods, do not require communication during execution, and\ncan often perform well. CTDE fits most naturally with the cooperative case, but\ncan be potentially applied in competitive or mixed settings depending on what\ninformation is assumed to be observed. This text is an introduction to CTDE in cooperative MARL. It is meant to\nexplain the setting, basic concepts, and common methods. It does not cover all\nwork in CTDE MARL as the subarea is quite extensive. I have included work that\nI believe is important for understanding the main concepts in the subarea and\napologize to those that I have omitted.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"19 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Multiagent Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.03052","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. Many approaches have been developed, but they can be divided into three main types: centralized training and execution (CTE), centralized training for decentralized execution (CTDE), and decentralized training and execution (DTE). CTDE methods are the most common as they can use centralized information during training but execute in a decentralized manner -- using only information available to that agent during execution. CTDE is the only paradigm that requires a separate training phase, during which any available information (e.g., other agent policies, underlying states) can be used. As a result, CTDE methods can be more scalable than CTE methods, do not require communication during execution, and can often perform well. CTDE fits most naturally with the cooperative case, but can potentially be applied in competitive or mixed settings depending on what information is assumed to be observed.

This text is an introduction to CTDE in cooperative MARL. It is meant to explain the setting, basic concepts, and common methods. It does not cover all work in CTDE MARL as the subarea is quite extensive. I have included work that I believe is important for understanding the main concepts in the subarea and apologize to those that I have omitted.
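To make the training/execution split concrete, the sketch below pairs per-agent actors that condition only on local observations (all that is available at execution time) with a centralized critic that sees the global state and joint action, which is used only during training. This is a minimal illustration of the CTDE idea in general, not the method of the paper; the network sizes, class names, and dummy data are assumptions made for the example.

```python
# Minimal CTDE sketch in PyTorch (illustrative assumptions, not the paper's method).
import torch
import torch.nn as nn
from torch.distributions import Categorical


class DecentralizedActor(nn.Module):
    """Policy used at execution time: input is the agent's local observation only."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs):
        # Returns an action distribution over the agent's own actions.
        return Categorical(logits=self.net(obs))


class CentralizedCritic(nn.Module):
    """Value function used only during training: input is the global state plus
    all agents' actions -- information not available during execution."""
    def __init__(self, state_dim, n_agents, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + n_agents * n_actions, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, joint_actions_onehot):
        return self.net(torch.cat([state, joint_actions_onehot], dim=-1))


# Toy dimensions (assumptions for the example).
obs_dim, state_dim, n_agents, n_actions = 8, 16, 2, 4
actors = [DecentralizedActor(obs_dim, n_actions) for _ in range(n_agents)]
critic = CentralizedCritic(state_dim, n_agents, n_actions)

# Decentralized execution: each agent acts from its own observation only.
local_obs = [torch.randn(1, obs_dim) for _ in range(n_agents)]
actions = [actor(o).sample() for actor, o in zip(actors, local_obs)]

# Centralized training: the critic may additionally use the underlying state
# and the joint action to produce a learning signal for the actors.
state = torch.randn(1, state_dim)
joint = torch.cat([nn.functional.one_hot(a, n_actions).float() for a in actions], dim=-1)
value = critic(state, joint)
```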