Optimal data injection attack design for spacecraft systems via a model free Q‐learning approach
Huanhuan Yuan, Mengbi Wang, Chao Xi
IET Control Theory & Applications, published 2024-06-14
DOI: 10.1049/cth2.12685 (https://doi.org/10.1049/cth2.12685)
This paper analyses the dynamic response of a corrupted spacecraft rendezvous system from the attacker's perspective. The optimal data injection attack problem is formulated by constructing a tradeoff cost function in quadratic form. First, the optimal attack strategy and a sufficient condition for its existence are derived, in the same way as an optimal control law, for an attacker that remains undetected. Departing from the assumption made in most existing works, the paper then seeks the optimal attack strategy without knowledge of the system matrices. A model-free Q‐learning approach is designed to solve the attacker's optimization problem. A critic network and an action network adaptively tune the attacker's value and action forward in time. For a more practical setting, a model-free attack strategy is also designed using only measured input/output data. Finally, simulation results on the spacecraft system demonstrate the effectiveness of the proposed model-free attack strategy design.
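To make the idea of learning a quadratic-cost strategy without the system matrices concrete, the following is a minimal sketch, not the paper's algorithm. It replaces the critic and action networks with least-squares policy iteration on a quadratic Q-function, a classic form of Q-learning for linear-quadratic problems. The matrices `A` and `B`, the weights `Qc` and `Rc`, and all function names are hypothetical placeholders: they are not the spacecraft rendezvous model, and a standard positive-definite cost stands in for the paper's tradeoff cost. The learner only calls `step`, so it never reads `A` or `B` directly.

```python
import numpy as np

def quad_features(z):
    """Monomials of z arranged so that quad_features(z) @ theta == z' H z."""
    n = len(z)
    return np.array([z[i] * z[j] if i == j else 2.0 * z[i] * z[j]
                     for i in range(n) for j in range(i, n)])

def theta_to_H(theta, n):
    """Rebuild the symmetric matrix H from its upper-triangular entries."""
    H = np.zeros((n, n))
    idx = 0
    for i in range(n):
        for j in range(i, n):
            H[i, j] = H[j, i] = theta[idx]
            idx += 1
    return H

def q_learning_lq(step, n_x, n_a, Qc, Rc, K0, iters=15, samples=200, seed=0):
    """Policy iteration on Q(x, a) = [x; a]' H [x; a] using sampled data only.

    step(x, a) is a black box returning the next state; K0 must be stabilizing.
    """
    rng = np.random.default_rng(seed)
    K = K0.copy()
    for _ in range(iters):
        Phi, y = [], []
        x = rng.standard_normal(n_x)
        for _ in range(samples):
            # Current policy plus exploration noise (keeps the data exciting).
            a = -K @ x + 0.5 * rng.standard_normal(n_a)
            x_next = step(x, a)
            a_next = -K @ x_next  # on-policy action at the next state
            # Bellman equation: Q(x, a) - Q(x_next, a_next) = stage cost.
            Phi.append(quad_features(np.concatenate([x, a]))
                       - quad_features(np.concatenate([x_next, a_next])))
            y.append(x @ Qc @ x + a @ Rc @ a)
            x = x_next
        # Policy evaluation: least-squares fit of H for the current gain K.
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = theta_to_H(theta, n_x + n_a)
        # Policy improvement: minimize Q over a, giving a = -H_aa^{-1} H_ax x.
        K = np.linalg.solve(H[n_x:, n_x:], H[n_x:, :n_x])
    return K, H

# Hypothetical stable test system, used only to generate data.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qc = np.eye(2)   # assumed state weight
Rc = np.eye(1)   # assumed weight on the injected signal

def step(x, a):
    return A @ x + B @ a

K_learned, H_learned = q_learning_lq(step, 2, 1, Qc, Rc, K0=np.zeros((1, 2)))
print("learned gain:", K_learned)
```

Because the toy system is deterministic, the Bellman equation holds exactly for every sample, so the least-squares fit is unbiased and the learned gain can be checked against the gain from a model-based Riccati iteration. With noisy measurements or input/output data only, as in the paper's more practical setting, a different regressor would be needed.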