{"title":"Spatially-Excited Attention Learning for Fine-Grained Visual Categorization","authors":"Zhaozhi Luo, Min-Hsiang Hung, Yi-Wen Lu, Kuan-Wen Chen","doi":"10.1109/ARIS56205.2022.9910447","DOIUrl":null,"url":null,"PeriodicalId":254572,"journal":{"name":"2022 International Conference on Advanced Robotics and Intelligent Systems (ARIS)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Advanced Robotics and Intelligent Systems (ARIS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ARIS56205.2022.9910447","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Spatially-Excited Attention Learning for Fine-Grained Visual Categorization
Learning distinguishable feature embeddings plays an important role in fine-grained visual categorization. Existing methods either design a complex attention mechanism to boost overall classification performance, or propose a specific training strategy that strengthens the backbone network so that low-cost, backbone-only inference becomes possible. This paper proposes an alternative approach, Spatially-Excited Attention Learning (SEAL). SEAL is trained in much the same way as most existing methods, but it offers two alternative streams at inference time: one requires more computational effort and delivers higher performance; the other is a low-cost, backbone-only inference with lower but still comparable performance. Both streams are trained simultaneously by SEAL. Experiments show that SEAL achieves state-of-the-art performance under both the complex-architecture and the backbone-only inference conditions.
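
The abstract does not specify SEAL's architecture, so the following is only an illustrative sketch of the general idea it describes: two inference streams that share one backbone and are trained with a joint loss. It is not the authors' method. The class name TwoStreamSketch, the mean-activation attention score, the pooling choices, and the sum of two cross-entropy losses are all assumptions made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, vec):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TwoStreamSketch:
    """Hypothetical two-stream classifier with a shared backbone (not SEAL itself)."""

    def __init__(self, in_dim, feat_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.w_backbone = random_matrix(feat_dim, in_dim, rng)
        self.w_cls_backbone = random_matrix(num_classes, feat_dim, rng)
        self.w_cls_attention = random_matrix(num_classes, feat_dim, rng)

    def features(self, image):
        # Stand-in backbone: per-position linear projection followed by ReLU.
        # image is a list of spatial positions, each a vector of length in_dim.
        return [[max(0.0, x) for x in linear(self.w_backbone, pos)] for pos in image]

    def spatial_attention(self, feats):
        # Assumed scoring rule: mean activation per position, normalized over positions.
        return softmax([sum(f) / len(f) for f in feats])

    def backbone_logits(self, feats):
        # Low-cost stream: global average pooling, then a linear classifier.
        n = len(feats)
        pooled = [sum(col) / n for col in zip(*feats)]
        return linear(self.w_cls_backbone, pooled)

    def attention_logits(self, feats):
        # Higher-cost stream: attention-weighted pooling, then its own classifier.
        attn = self.spatial_attention(feats)
        pooled = [sum(a * v for a, v in zip(attn, col)) for col in zip(*feats)]
        return linear(self.w_cls_attention, pooled)

    def joint_loss(self, image, label):
        # Both streams are supervised together: sum of two cross-entropy terms.
        feats = self.features(image)
        loss_b = -math.log(softmax(self.backbone_logits(feats))[label])
        loss_a = -math.log(softmax(self.attention_logits(feats))[label])
        return loss_b + loss_a

    def predict(self, image, use_attention):
        # At inference time the caller picks the stream that fits the compute budget.
        feats = self.features(image)
        logits = self.attention_logits(feats) if use_attention else self.backbone_logits(feats)
        return max(range(len(logits)), key=logits.__getitem__)


rng = random.Random(1)
image = [[rng.random() for _ in range(3)] for _ in range(4)]  # 4 positions, 3 channels
model = TwoStreamSketch(in_dim=3, feat_dim=5, num_classes=2)
cheap_prediction = model.predict(image, use_attention=False)
full_prediction = model.predict(image, use_attention=True)
```

Because both classifiers read the same backbone features and the loss sums their errors, a single training run updates both streams, and the attention branch can simply be skipped when only the low-cost path is needed.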