Hierarchical Contrastive Learning for Semantic Segmentation
Jie Jiang, Xingjian He, Weining Wang, Hanqing Lu, Jing Liu
IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 6, pp. 11202-11214, published 2024-11-19
DOI: 10.1109/TNNLS.2024.3491782
URL: https://ieeexplore.ieee.org/document/10758330/
Abstract
Recently, pixel-to-pixel contrastive learning in a single-scale feature space has been widely studied in semantic segmentation as a way to learn a unified feature representation for pixels of the same category. However, such a unified representation is too extreme, and the receptive field of each single-scale pixel is limited, which makes it insufficient to reflect the representative features of the category. To address these problems, this article extends the single-scale feature space to a multiscale one and proposes a hierarchical contrastive learning (Hi-CL) method to explore pixel-to-component semantic relationships. First, we generate multiscale candidate samples by applying several pooling windows of different sizes to a feature map, where different windows may represent different parts of the objects in the image. Then, we prune the sample set using threshold-based criteria to select appropriate samples for feature representation learning. Finally, Hi-CL is performed to learn pixel-to-component consistency with the pruned samples. Our method is easy to apply to existing semantic segmentation models and yields consistent improvements. Furthermore, we achieve state-of-the-art results on three popular benchmarks: Cityscapes, ADE20K, and COCO Stuff.
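The three steps in the abstract (multiscale pooling, threshold-based pruning, pixel-to-component contrast) can be illustrated with a minimal NumPy sketch. The abstract does not specify the pooling type, the pruning criterion, or the loss form, so everything concrete here is an assumption for illustration only: non-overlapping average pooling, a label-purity threshold for pruning, an InfoNCE-style loss, and the hypothetical names `pool_components`, `pixel_to_component_loss`, `hi_cl_loss`, `purity_thresh`, and `tau`. It is not the authors' implementation.

```python
import numpy as np

def pool_components(feats, labels, window, num_classes, purity_thresh=0.7):
    """Average-pool non-overlapping window x window patches (assumed pooling).

    A patch is kept only if its majority class covers at least
    `purity_thresh` of its pixels (assumed pruning criterion).
    feats: (H, W, D) float array; labels: (H, W) non-negative int array.
    """
    H, W, D = feats.shape
    comps, comp_labels = [], []
    for i in range(0, H - window + 1, window):
        for j in range(0, W - window + 1, window):
            patch_labels = labels[i:i + window, j:j + window].ravel()
            counts = np.bincount(patch_labels, minlength=num_classes)
            major = int(counts.argmax())
            if counts[major] / patch_labels.size >= purity_thresh:
                patch = feats[i:i + window, j:j + window].reshape(-1, D)
                comps.append(patch.mean(axis=0))
                comp_labels.append(major)
    if not comps:
        return np.zeros((0, D)), np.zeros((0,), dtype=int)
    return np.stack(comps), np.array(comp_labels)

def l2_normalize(x, eps=1e-12):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def pixel_to_component_loss(pix, pix_labels, comps, comp_labels, tau=0.1):
    """InfoNCE-style loss (assumed form): each pixel is pulled toward
    components of its own class and pushed away from all other components."""
    logits = l2_normalize(pix) @ l2_normalize(comps).T / tau      # (N, M)
    logits = logits - logits.max(axis=1, keepdims=True)           # stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    pos = pix_labels[:, None] == comp_labels[None, :]             # (N, M)
    has_pos = pos.any(axis=1)
    if not has_pos.any():
        return 0.0
    per_pixel = -(log_prob * pos).sum(axis=1)[has_pos] / pos.sum(axis=1)[has_pos]
    return float(per_pixel.mean())

def hi_cl_loss(feats, labels, num_classes, windows=(2, 4),
               purity_thresh=0.7, tau=0.1):
    """Gather pruned components from every window size, then contrast
    each pixel against the combined component set."""
    pooled = [pool_components(feats, labels, w, num_classes, purity_thresh)
              for w in windows]
    comps = np.concatenate([c for c, _ in pooled])
    comp_labels = np.concatenate([l for _, l in pooled])
    if len(comps) == 0:
        return 0.0
    D = feats.shape[-1]
    return pixel_to_component_loss(feats.reshape(-1, D), labels.ravel(),
                                   comps, comp_labels, tau)

# Toy example: left half is class 0, right half is class 1.
labels = np.zeros((4, 4), dtype=int)
labels[:, 2:] = 1
feats = np.eye(2)[labels]          # (4, 4, 2) one-hot features per pixel
loss = hi_cl_loss(feats, labels, num_classes=2)
```

In this toy example, the four 2x2 windows are each single-class and survive pruning, while the 4x4 window mixes both classes equally and is discarded. In a real model, the loss would be computed in a differentiable framework and added to the usual segmentation loss; that combination is likewise an assumption, since the abstract does not describe it.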
About the journal:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.