Towards Robust Multi-Agent Reinforcement Learning
Aritra Mitra
Proceedings of the AAAI Symposium Series, published 2024-05-20
DOI: 10.1609/aaaiss.v3i1.31222 (https://doi.org/10.1609/aaaiss.v3i1.31222)
Stochastic gradient descent (SGD) is at the heart of large-scale distributed machine learning paradigms such as federated learning (FL). In these applications, the task of training high-dimensional weight vectors is distributed among several workers that exchange information over networks of limited bandwidth. While parallelization at such an immense scale helps to reduce the computational burden, it creates several other challenges: delays, asynchrony, and most importantly, a significant communication bottleneck. The popularity and success of SGD can be attributed in no small part to the fact that it is extremely robust to such deviations from ideal operating conditions. Inspired by these findings, we ask: Are common reinforcement learning (RL) algorithms also robust to similarly structured perturbations? Perhaps surprisingly, despite the recent surge of interest in multi-agent/federated RL, almost nothing is known about the above question. This paper collects some of our recent results in filling this void.