Shaohua Teng;Tuhong Xu;Zefeng Zheng;NaiQi Wu;Wei Zhang;Luyao Teng
Title: Robust Asymmetric Cross-Modal Hashing Retrieval With Dual Semantic Enhancement
Journal: IEEE Transactions on Computational Social Systems (impact factor 4.5; JCR Q1, Computer Science, Cybernetics)
DOI: 10.1109/TCSS.2024.3352494
Publication date: 2024-03-20
Publication type: Journal Article
URL: https://ieeexplore.ieee.org/document/10476594/
Citation count: 0
Abstract
As social media data grow in volume and become increasingly multimodal, cross-modal hashing (CMH) retrieval has found wide application thanks to its high efficiency and low storage cost. However, two issues hinder the performance of existing semantics-learning-based CMH methods: 1) nonlinear relationships, noise, and outliers in the data may degrade the learning effectiveness of a model; and 2) the complementary relationships between label semantics and sample semantics may be inadequately explored. To address these two problems, a method called robust asymmetric cross-modal hashing retrieval with dual semantic enhancement (RADSE) is proposed. RADSE consists of three parts: 1) cross-modal data alignment (CDA), which applies kernel mapping and establishes a unified linear representation in the neighborhood to capture the nonlinear relationships between cross-modal data; 2) relaxed label semantic learning for robustness (RLSLR), which uses a relaxation strategy to expand label distinctiveness and leverages the $\ell_{2,1}$ norm to enhance the robustness of the model against noise and outliers; and 3) dual semantic enhancement learning (DSEL), which learns more interrelationships between samples under label semantic guidance to ensure the mutual enhancement of semantic information. Extensive experiments and analyses on three popular datasets demonstrate that RADSE outperforms most existing methods in terms of mean average precision (MAP), precision-recall (P–R) curves, and top-N precision curves. In the MAP comparisons, RADSE improves by an average of 2%–3% on two retrieval tasks.
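The abstract names two quantities without defining them: the $\ell_{2,1}$ norm and MAP. The sketch below illustrates both under stated assumptions and is not the authors' implementation. It assumes the common row-wise convention $\|M\|_{2,1} = \sum_i \|M_{i,:}\|_2$ (the abstract does not say which convention the paper uses), hash codes with entries in {-1, +1}, multi-hot label vectors, and the usual rule that a database item is relevant to a query when the two share at least one label. All function names are hypothetical.

```python
import numpy as np


def l21_norm(M):
    """Sum of the Euclidean norms of the rows of M (assumed row-wise convention).

    Each row contributes its unsquared length, so a single outlier row
    grows the total linearly rather than quadratically.
    """
    return float(np.sum(np.linalg.norm(M, axis=1)))


def hamming_distances(query_code, db_codes):
    """Hamming distance from one {-1, +1} code to every database code."""
    n_bits = db_codes.shape[1]
    return 0.5 * (n_bits - db_codes @ query_code)


def mean_average_precision(query_codes, db_codes, query_labels, db_labels):
    """MAP over a Hamming ranking of the database for each query.

    Assumption: an item is relevant if it shares at least one label
    with the query (labels are multi-hot vectors).
    """
    aps = []
    for code, label in zip(query_codes, query_labels):
        # Rank database items from nearest to farthest in Hamming space.
        order = np.argsort(hamming_distances(code, db_codes), kind="stable")
        relevant = (db_labels[order] @ label) > 0
        if not relevant.any():
            continue  # skip queries with no relevant database item
        hits = np.cumsum(relevant)
        ranks = np.arange(1, len(relevant) + 1)
        # Average the precision values at the ranks of relevant items.
        aps.append(np.sum((hits / ranks) * relevant) / relevant.sum())
    return float(np.mean(aps)) if aps else 0.0
```

In a cross-modal setting, the query codes would come from one modality (for example, text) and the database codes from the other (for example, images), giving one MAP value per retrieval direction.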
About the journal:
IEEE Transactions on Computational Social Systems focuses on such topics as modeling, simulation, analysis and understanding of social systems from the quantitative and/or computational perspective. "Systems" include man-man, man-machine and machine-machine organizations and adversarial situations as well as social media structures and their dynamics. More specifically, the proposed transactions publishes articles on modeling the dynamics of social systems, methodologies for incorporating and representing socio-cultural and behavioral aspects in computational modeling, analysis of social system behavior and structure, and paradigms for social systems modeling and simulation. The journal also features articles on social network dynamics, social intelligence and cognition, social systems design and architectures, socio-cultural modeling and representation, and computational behavior modeling, and their applications.