Ang Li, Jianxin Chen, B. Kang, Wenqin Zhuang, Xuguang Zhang
2019 IEEE Globecom Workshops (GC Wkshps), published 2019-12-01. DOI: 10.1109/GCWkshps45667.2019.9024585 (https://doi.org/10.1109/GCWkshps45667.2019.9024585)
Adaptive Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition
Fine-grained recognition remains a difficult task in pattern recognition applications because discriminative parts are hard to localize accurately. Recent CNN-based methods generally use an attention mechanism to produce attention masks without part labels or annotations, and extract the corresponding image parts from those masks. However, these methods crop the attention parts with fixed-size rectangles regardless of the size of the objects to be recognized, which hinders the feature expression of the subsequent Part-CNNs. In this paper, we propose an adaptive cropping module that uses the information in the attention masks to adjust the size of the cropping rectangles. The training processes of the adaptive cropping module and the Part-CNNs can reinforce each other with the proposed rank loss and the classic softmax loss. To further balance and fuse all attention parts, we propose a part weighting module to evaluate part contributions. Under the optimization of the sort loss, the part weighting module produces part weights in the same order as the prediction scores learned by the attention parts. The backbone of our network is MA-CNN. Unlike MA-CNN, the newly proposed adaptive cropping module and part weighting module can jointly guide the framework to produce a more discriminative fine-grained feature. Experiments show that AMA-CNN outperforms MA-CNN by 1.1% on the CUB200-2011 bird dataset.
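The abstract does not say how the cropping rectangle is derived from an attention mask, so the following is only an illustrative sketch of one plausible reading, not the authors' method. It assumes the mask is a 2D grid of non-negative scores with the same height and width as the image, and it takes the tightest rectangle around all cells whose score reaches a fraction of the peak. The names `adaptive_crop_box`, `crop_part`, and `rel_threshold` are hypothetical.

```python
def adaptive_crop_box(mask, rel_threshold=0.5):
    """Return (top, left, bottom, right) around the strongly attended cells.

    Assumption (not from the paper): a cell counts as attended when its
    score is at least rel_threshold times the mask's peak score.
    """
    if not 0 < rel_threshold <= 1:
        raise ValueError("rel_threshold must be in (0, 1]")
    height, width = len(mask), len(mask[0])
    peak = max(max(row) for row in mask)
    if peak <= 0:
        # No attention anywhere: fall back to the whole image.
        return 0, 0, height, width
    cutoff = rel_threshold * peak
    rows = [r for r in range(height) if any(v >= cutoff for v in mask[r])]
    cols = [c for c in range(width)
            if any(mask[r][c] >= cutoff for r in range(height))]
    # Bottom and right are exclusive, so they can be used directly as slices.
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop_part(image, mask, rel_threshold=0.5):
    """Crop the image (a list of rows) to the box found on the mask."""
    top, left, bottom, right = adaptive_crop_box(mask, rel_threshold)
    return [row[left:right] for row in image[top:bottom]]


if __name__ == "__main__":
    # A 6x6 mask with a 2x3 attended region; the box follows the region's size.
    mask = [[0.0] * 6 for _ in range(6)]
    for r in range(1, 3):
        for c in range(2, 5):
            mask[r][c] = 1.0
    image = [[r * 6 + c for c in range(6)] for r in range(6)]
    print(adaptive_crop_box(mask))  # (1, 2, 3, 5)
    print(crop_part(image, mask))   # [[8, 9, 10], [14, 15, 16]]
```

In contrast to a fixed-size crop, the rectangle here grows or shrinks with the extent of the attended region, which is the behaviour the abstract attributes to the adaptive cropping module. In the paper the module is trained jointly with the Part-CNNs, which a fixed threshold rule like this one does not capture.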