Meta-ATMoS+: A Meta-Reinforcement Learning Framework for Threat Mitigation in Software-Defined Networks
Hauton Tsang, M. A. Salahuddin, Noura Limam, R. Boutaba
2023 IEEE 48th Conference on Local Computer Networks (LCN), October 2, 2023. DOI: 10.1109/lcn58197.2023.10223403
As cyber threats become increasingly common, automated threat mitigation solutions are more necessary than ever. Conventional threat mitigation frameworks are difficult to tune for different network environments, but frameworks utilizing deep reinforcement learning (RL) have proven to be an effective approach that can adapt to different networks automatically. Existing RL-based frameworks have been shown to generalize to different network sizes and threats, and to be robust to false positives. However, training RL agents for these frameworks can be challenging in a production environment, as the training process is time-consuming and disruptive to the production network. Hence, a staging environment is required to train them effectively. In this paper, we propose Meta-ATMoS+, a meta-RL framework for threat mitigation in software-defined networks. We leverage Model-Agnostic Meta-Learning (MAML) to find an initialization for the RL agent that generalizes to a variety of network configurations. We show that an RL agent with a MAML-learned initialization can accomplish few-shot learning on a target network with performance comparable to training in a staging environment. Few-shot learning not only allows the model to be trained directly in the production environment but also enables human-in-the-loop RL for mitigating threats that lack an easily definable reward function.
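The core MAML idea the abstract relies on can be illustrated in miniature. The sketch below is not the paper's implementation: it replaces the SDN threat-mitigation RL environments with hypothetical quadratic "tasks" (one optimum per network configuration) so the inner-loop adaptation and outer meta-update are easy to follow. All names (`centers`, `alpha`, `beta`, `grad`) are assumptions chosen for the toy example.

```python
import numpy as np

# Toy MAML sketch: each "task" stands in for one network configuration,
# with an analytic quadratic loss L(theta) = ||theta - c||^2.
rng = np.random.default_rng(0)
centers = rng.normal(0.0, 1.0, size=(8, 5))  # hypothetical per-network optima

alpha, beta = 0.1, 0.1   # inner (adaptation) and outer (meta) learning rates
theta = np.zeros(5)      # the meta-initialization being learned

def grad(theta, c):
    # Gradient of the quadratic task loss ||theta - c||^2.
    return 2.0 * (theta - c)

for step in range(500):
    meta_grad = np.zeros_like(theta)
    for c in centers:
        # Inner loop: one gradient step of adaptation to this task.
        theta_prime = theta - alpha * grad(theta, c)
        # Outer gradient: chain rule through theta_prime (exact for a quadratic).
        meta_grad += (1 - 2 * alpha) * grad(theta_prime, c)
    theta -= beta * meta_grad / len(centers)  # meta-update of the initialization

# "Few-shot" adaptation: a single inner step on a new, unseen task
# starting from the MAML-learned initialization.
c_new = centers.mean(axis=0) + 0.1
adapted = theta - alpha * grad(theta, c_new)
```

The meta-update drives `theta` toward an initialization from which one adaptation step lands near any task's optimum, which is the property the paper exploits to train directly on a production network instead of a staging environment.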