From spatial to semantic: attribute-aware fashion similarity learning via iterative positioning and attribute diverging
Yongquan Wan, Jianfei Zheng, Cairong Yan, Guobing Zou
Applied Intelligence 55(4), published 2025-01-13. DOI: 10.1007/s10489-024-06173-8
https://link.springer.com/article/10.1007/s10489-024-06173-8
Citations: 0
Abstract
Fashion image retrieval depends on accurately perceiving fine-grained features to meet users’ precise needs. However, existing retrieval methods based on global image features suffer from imprecise attribute positioning, difficulty distinguishing visually similar but semantically different attribute values, and difficulty learning attribute features within specific regions and viewpoints. This paper proposes IPAD (Iterative Positioning and Attribute Diverging), a two-stage hybrid framework for attribute-aware fashion similarity learning that leverages the strengths of Convolutional Neural Networks and Vision Transformers. In the initial stage, an iterative positioning strategy precisely identifies local attribute regions through an iterative attention mechanism with adaptive suppression. Subsequently, an attribute diverging strategy optimizes attribute value aggregation via online clustering with a momentum encoder, improving model stability and representation quality. During inference, a feature reasoning mechanism refines the retrieval results by generating a subgraph similarity matrix and re-ranking, which improves accuracy and robustness. Extensive evaluations on three public datasets show that IPAD outperforms state-of-the-art methods in retrieval accuracy, with an average MAP improvement of 4.22%. The source code is available at https://github.com/h8e9r7/IPAD.
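The abstract reports retrieval accuracy as MAP (mean average precision). As a reading aid, the following is a minimal sketch of how MAP is commonly computed over ranked retrieval lists. It is not taken from the IPAD repository. The function names and the binary relevance encoding are assumptions made for illustration, and the paper's exact evaluation protocol (for example, any rank cutoff) is not stated here.

```python
from typing import Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """Average precision for one query.

    relevance[i] is 1 if the item at rank i + 1 is relevant, else 0.
    Assumption: the full gallery is ranked, so every relevant item
    appears somewhere in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_lists: Sequence[Sequence[int]]) -> float:
    """Mean of the per-query average precision values."""
    if not ranked_lists:
        return 0.0
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


if __name__ == "__main__":
    # Two hypothetical queries with binary relevance per ranked result.
    queries = [[1, 0, 1, 0], [0, 1]]
    print(round(mean_average_precision(queries), 4))
```

For the first hypothetical query, relevant items sit at ranks 1 and 3, so its average precision is (1/1 + 2/3) / 2; the second query has one relevant item at rank 2, giving 1/2.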
About the journal:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.