Existing transfer-based adversarial attacks transfer poorly, either because the proxy dataset is limited or because the substitute model imitates the target model inaccurately. We therefore propose a model-theft-based black-box adversarial attack in embedding space. The substitute model serves as the discriminator of a generative adversarial network, and we introduce a diversity loss that trains the generator without a proxy dataset, so that the substitute model imitates the target model more closely. We further design a combined attack that integrates a gradient-based attack with the natural evolution strategy (NES) to search the embedding space for adversarial examples, which keeps the adversarial examples effective against both the target and the substitute models. Experimental results show that our method imitates the target model well and transfers well. When using VGG16, our method outperforms TREMBA by 14.71% in untargeted attack success rate and by 13.49% in targeted attack success rate.
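To make the combined search concrete, the following is a minimal sketch and not the authors' implementation. It assumes the attack loss on the target model can only be queried, while the substitute model supplies an analytic gradient, and it mixes the two with a weight `alpha`. The names `nes_gradient`, `combined_search`, `target_loss`, `substitute_grad`, and `alpha`, as well as the simple weighted sum, are hypothetical choices made for illustration; the abstract does not specify how the two signals are combined.

```python
import numpy as np


def nes_gradient(loss_fn, z, sigma=0.1, n_pairs=50, rng=None):
    """Estimate the gradient of loss_fn at z from queries only.

    Uses antithetic Gaussian sampling, the usual natural evolution
    strategy estimator. sigma and n_pairs are illustrative defaults.
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(z)
    for _ in range(n_pairs):
        u = rng.standard_normal(z.shape)
        # Two queries per direction: one at z + sigma*u, one at z - sigma*u.
        grad += (loss_fn(z + sigma * u) - loss_fn(z - sigma * u)) * u
    return grad / (2.0 * sigma * n_pairs)


def combined_search(target_loss, substitute_grad, z0,
                    steps=100, lr=0.05, alpha=0.5, rng=None):
    """Descend in embedding space using both gradient sources.

    ASSUMPTION: the update is a weighted sum of the substitute model's
    white-box gradient and the query-based NES estimate on the target.
    """
    z = np.array(z0, dtype=float)
    for _ in range(steps):
        g_sub = substitute_grad(z)                      # white-box signal
        g_nes = nes_gradient(target_loss, z, rng=rng)   # black-box signal
        z -= lr * (alpha * g_sub + (1.0 - alpha) * g_nes)
    return z
```

In this sketch `z` stands for a low-dimensional embedding that a generator would decode into a perturbation. The decoding step, the perturbation bound, and the actual attack losses are omitted because the abstract does not describe them.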
