{"title":"Hierarchical AttentionShift for Pointly Supervised Instance Segmentation","authors":"Mingxiang Liao;Fang Wan;Zonghao Guo;Qixiang Ye","doi":"10.1109/TNNLS.2025.3526961","DOIUrl":null,"url":null,"abstract":"Pointly supervised instance segmentation (PSIS) remains a challenging task when appearance variances across object parts cause semantic inconsistency. In this article, we propose a hierarchical AttentionShift approach, to solve the semantic inconsistency issue through exploiting the hierarchical nature of semantics and the flexibility of key-point representation. The estimation of hierarchical attention is defined upon key-point sets. The representative key points are iteratively estimated spatially and in the feature space to capture the fine-grained semantics and cover the full object extent. Hierarchical AttentionShift is performed at instance, part, and fine-grained levels, optimizing object semantics while promoting the conventional self-attention activation to hierarchical activation with local refinement. Experiments on PASCAL VOC 2012 Aug and MS-COCO 2017 benchmarks show that hierarchical AttentionShift improves the state-of-the-art (SOTA) method by 10.4% and 7.0% upon mean average precision (mAP)50, respectively. When applying hierarchical AttentionShift to the segment anything model (SAM), 9.4% AP improvement on the COCO test-dev is achieved. Hierarchical AttentionShift provides a fresh insight to regularize the self-attention mechanism for fine-grained vision tasks. The code is available at github.com/MingXiangL/AttentionShift.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"36 8","pages":"15528-15541"},"PeriodicalIF":8.9000,"publicationDate":"2025-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10879127/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Pointly supervised instance segmentation (PSIS) remains challenging when appearance variations across object parts cause semantic inconsistency. In this article, we propose a hierarchical AttentionShift approach that addresses the semantic inconsistency issue by exploiting the hierarchical nature of semantics and the flexibility of key-point representation. Hierarchical attention estimation is defined upon key-point sets: representative key points are iteratively estimated both spatially and in the feature space so that they capture fine-grained semantics and cover the full object extent. Hierarchical AttentionShift is performed at the instance, part, and fine-grained levels, optimizing object semantics while promoting the conventional self-attention activation to hierarchical activation with local refinement. Experiments on the PASCAL VOC 2012 Aug and MS-COCO 2017 benchmarks show that hierarchical AttentionShift improves on the state-of-the-art (SOTA) method by 10.4% and 7.0% mAP50 (mean average precision at an IoU threshold of 0.5), respectively. When applied to the segment anything model (SAM), hierarchical AttentionShift achieves a 9.4% AP improvement on COCO test-dev. Hierarchical AttentionShift offers fresh insight into regularizing the self-attention mechanism for fine-grained vision tasks. The code is available at github.com/MingXiangL/AttentionShift.
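The iterative key-point estimation described in the abstract is in the spirit of a mean-shift update driven by attention weights: each key point moves toward the similarity-weighted mean of nearby positions in the joint spatial-feature space. Below is a minimal, illustrative PyTorch sketch of one such update step. The function names (attention_shift_step, sample_feat) and the parameters (radius, tau) are our own assumptions for illustration; this is not the authors' implementation, which is available in the linked repository.

```python
import torch
import torch.nn.functional as F


def sample_feat(feats, points):
    """Bilinearly sample per-point features from a (C, H, W) feature map.

    points: (N, 2) float coordinates given as (y, x) in pixel units.
    Returns an (N, C) tensor of sampled features.
    """
    C, H, W = feats.shape
    # grid_sample expects normalized (x, y) coordinates in [-1, 1]
    grid = torch.stack([points[:, 1] / (W - 1), points[:, 0] / (H - 1)], dim=-1) * 2 - 1
    out = F.grid_sample(feats[None], grid.view(1, -1, 1, 2), align_corners=True)
    return out[0, :, :, 0].t()  # (1, C, N, 1) -> (N, C)


def attention_shift_step(feats, points, radius=5, tau=0.1):
    """One mean-shift-style key-point update (hypothetical sketch).

    Each point moves to the attention-weighted mean of positions in a
    local window, with weights from a softmaxed feature dot product.
    """
    C, H, W = feats.shape
    pf = sample_feat(feats, points)                          # (N, C)
    offs = torch.arange(-radius, radius + 1, dtype=points.dtype)
    dy, dx = torch.meshgrid(offs, offs, indexing="ij")
    win = torch.stack([dy.reshape(-1), dx.reshape(-1)], -1)  # (K, 2)
    cand = points[:, None, :] + win[None]                    # (N, K, 2)
    cand[..., 0] = cand[..., 0].clamp(0, H - 1)              # keep windows in bounds
    cand[..., 1] = cand[..., 1].clamp(0, W - 1)
    cf = sample_feat(feats, cand.reshape(-1, 2)).reshape(len(points), -1, C)
    attn = torch.softmax((cf * pf[:, None]).sum(-1) / tau, dim=-1)  # (N, K)
    return (attn[..., None] * cand).sum(1)                   # shifted points, (N, 2)


# Usage: iterate the step so points drift toward local modes of feature similarity.
feats = torch.randn(256, 64, 64)
pts = torch.tensor([[32.0, 32.0], [10.0, 50.0]])
for _ in range(10):
    pts = attention_shift_step(feats, pts)
```

In the paper, such updates are organized hierarchically: coarse shifts at the instance level, then progressively finer shifts at the part and fine-grained levels, which is what promotes the flat self-attention activation to a hierarchical one with local refinement.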
Journal Description:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.