The primary objective of jamming strategy optimization is to ensure that a jammer promptly finds an effective jamming strategy against the multifunction radar (MFR), thereby ensuring the safety of targets. Deep reinforcement learning (DRL) has been widely applied to jamming strategy optimization. However, the process still faces challenges such as low learning efficiency and a heavy memory burden. Therefore, we propose a fast jamming strategy optimization method with imperfect experience. First, we model the radar countermeasure process as a Markov decision process (MDP), and formulate the jamming reward function by combining the jamming effectiveness and the jammer’s operational intent. Second, we design a novel hybrid jamming strategy choice module, which uses imperfect experience to improve the efficiency of jamming strategy optimization. Furthermore, to improve sample efficiency and reduce forgetting caused by a small replay buffer, we employ a mixed replay buffer strategy and a knowledge consolidation technique, respectively. Finally, extensive experiments demonstrate that under the guidance of imperfect experience, our proposed method achieves faster convergence and higher strategy accuracy than existing DRL-based methods.
A Fast Jamming Strategy Optimization Method With Imperfect Experience
Jianxin Li;Tian Tian;Jingjing Cai;Weiwei Fan;Yunan Sun;Feng Zhou
IEEE Transactions on Information Forensics and Security, vol. 21, pp. 992-1005. Pub Date: 2026-01-01 DOI: 10.1109/TIFS.2025.3650410
Existing CNN inference frameworks based on FHE often suffer from reduced efficiency and accuracy due to the polynomial approximation of activation functions, and they lack effective mechanisms to prevent sensitive information leakage during the final classification stage. To address these limitations, we propose FSAT, a fast and secure inference framework enhanced with adversarial training. Specifically, FSAT employs a private CNN model architecture, where linear layers are computed through an optimized homomorphic ciphertext convolution operation, while non-linear layer operations are efficiently realized using a secure searchable index and an encrypted look-up table, which replace polynomial activation approximations and significantly improve inference accuracy and latency performance. To further mitigate information leakage, we introduce a dual-constraint adversarial training scheme that makes it substantially more difficult for an adversary to infer sensitive attributes of the input data. Experimental results demonstrate that FSAT achieves high inference accuracy and efficiency while substantially reducing the risk of sensitive data leakage.
FSAT: A Faster Secure Convolutional Neural Network Inference Framework With Adversarial Training in Resource-Constrained Scenarios
Dong Li;Anupam Chattopadhyay;Qingguo Lü;Jiahui Wu;Tao Xiang;Xiaofeng Liao
IEEE Transactions on Information Forensics and Security, vol. 21, pp. 798-811. Pub Date: 2026-01-01 DOI: 10.1109/TIFS.2025.3650384
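The FSAT abstract above says that non-linear layers are realized with an index and a look-up table instead of a polynomial approximation. The following is a minimal plaintext sketch of that general idea only: the input is quantized to a grid index and the activation value is read from a precomputed table. The grid range, the table size, and all function names are assumptions made for illustration; the abstract does not specify them, and the encrypted index and encrypted table used by FSAT are not modeled here.

```python
import math

# Hypothetical quantization parameters chosen for illustration only.
LOW, HIGH, STEPS = -8.0, 8.0, 1024
STEP = (HIGH - LOW) / (STEPS - 1)


def build_table(fn):
    """Precompute fn on an evenly spaced grid over [LOW, HIGH]."""
    return [fn(LOW + k * STEP) for k in range(STEPS)]


def to_index(x):
    """Map a real input to the nearest grid index, clamping out-of-range values."""
    k = round((x - LOW) / STEP)
    return min(max(k, 0), STEPS - 1)


def lookup(table, x):
    """Evaluate the tabulated function at x with one index computation and one read."""
    return table[to_index(x)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


SIGMOID_TABLE = build_table(sigmoid)
```

With this grid the lookup error is bounded by the grid spacing times the largest slope of the function, so accuracy is controlled by the table size rather than by a polynomial degree.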
Pub Date: 2026-01-01 DOI: 10.1109/TIFS.2025.3650396
Jinsheng Xiao;Hao Ma;Ruidi Chen;Xingyu Gao;Hailong Shi;Zhongyuan Wang
To provide timely warnings and prevent potential damage, it is crucial to detect anomalous actions that threaten public safety through surveillance cameras. Compared to normal actions, anomalous actions often occupy only a small portion of surveillance videos and exhibit more complex manifestations in terms of time and space. Considering that normal action recognition methods fail to highlight crucial information from small-sized patches, we propose the Spatio-temporal Key Patch Selection Network (STKPS-Net). It includes a Spatially Adaptive Key Patch Selection (SAKPS) module to select small but informative patches, and a Long-short Feature Map Spatio-temporal Relation (LFMSR) module to capture dynamic changes in anomalous actions. Additionally, a spatio-temporal refined loss is introduced to enhance fine-grained feature learning. Experimental results on the HMDB51, Kinetics, and UCF-Crime v2 datasets show that our STKPS-Net achieves state-of-the-art performance in few-shot anomalous action recognition, outperforming the most competitive methods by 1.2% on the anomalous action dataset UCF-Crime v2. More details can be found at https://github.com/xiaojs18/STKPS-Net.
STKPS-Net: Spatio-Temporal Key Patch Selection Network for Few Shot Anomalous Action Recognition
IEEE Transactions on Information Forensics and Security, vol. 21, pp. 827-838.
Pub Date: 2026-01-01 DOI: 10.1109/TIFS.2025.3650379
Mengqian Li;Youliang Tian;Junpeng Zhang;Ze Yang;Jinbo Xiong;Jianfeng Ma
Federated learning (FL) allows multiple distributed clients with local datasets to train a global model collaboratively. Due to the potential privacy risk of the training process, differential privacy (DP) is introduced into FL to protect clients’ sensitive information by perturbing the model updates. However, the probability density function of the Laplace mechanism has a long-tail effect, which may generate large noise that causes the model to deviate from the normal result. Moreover, as the cloud is not fully trusted, there is no guarantee that the server follows the aggregation protocol correctly. To address these issues, in this paper, we propose a secure rational delegation FL scheme, namely SRDFL, and analyze its protection and convergence performance. Specifically, we first utilize the zero-determinant strategy to construct an FL rational model. It delegates tasks to multiple servers and encourages them to perform correct aggregation. Then, we design a bounded DP protection mechanism to achieve a fixed universe of perturbation outputs in a threshold-constrained manner. Finally, based on Shamir’s secret sharing, we propose a trusted verification algorithm of DP to validate servers for correct aggregation. Detailed theoretical analysis and extensive performance evaluations demonstrate that our proposed scheme is effective. Compared to existing works, SRDFL improves model accuracy by 2.72%–47.92%.
Secure Rational Delegation Federated Learning
IEEE Transactions on Information Forensics and Security, vol. 21, pp. 839-853.
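The SRDFL abstract above builds its verification algorithm on Shamir's secret sharing. As a reminder of that primitive only, and not of the paper's verification algorithm, here is a minimal sketch of (t, n) sharing over a prime field. The choice of prime and the function names are assumptions made for this illustration.

```python
import secrets

# Hypothetical field choice for illustration: the Mersenne prime 2^127 - 1.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares):
    """Split `secret` into `num_shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, `split_secret(123456789, 3, 5)` yields five shares, and passing any three of them to `recover_secret` returns the original value, while fewer than three reveal nothing about it.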
Pub Date: 2026-01-01 DOI: 10.1109/TIFS.2025.3650373
Willy Susilo;Jianchang Lai;Fuchun Guo;Yudi Zhang
Cloud storage has become the most attractive way to achieve data sharing by setting flexible access policies. Cryptographic tools are considered the most popular approach to protecting the privacy of data stored on the cloud. Dividing data into different classes plays a significant role in cloud storage, making data organization more methodical and data sharing more expressive and efficient. Unfortunately, current data sharing solutions either neglect data classification or suffer from data leakage. Specifically, shared keys can decrypt newly added encrypted data within the same class, and they are open to key abuse because shared keys are untraceable once sold. In this work, we propose a time-bound data sharing system that addresses all these issues simultaneously. In our scheme, data is divided into different classes and encrypted according to its class and associated time period. Decryption keys for a set of chosen data classes can be aggregated into a single key, allowing users to decrypt multiple ciphertexts whose classes are within the set, while other encrypted data with classes outside the set remain confidential. Moreover, the aggregate key is time-bound: it can only decrypt ciphertexts generated before the embedded time period, so it cannot access newly added encrypted data. The key size is independent of the chosen class set size and is only logarithmic in the bit length of the time period used. For each sharing, the shared aggregate key is different. In the event of data leakage or key selling, the data provider can identify the responsible users. We provide formal security analysis of our system and evaluate its performance through experiments. The results demonstrate that our system is highly efficient in terms of shared keys. It provides a practical solution for achieving efficient and dynamic data sharing in cloud storage.
Outsourced Cloud Storage and Dynamic Sharing: Efficient Time-Bound Access Control
IEEE Transactions on Information Forensics and Security, vol. 21, pp. 900-912.
Solana is a rapidly evolving blockchain platform that has attracted an increasing number of users. However, this growth has also drawn the attention of malicious actors, with some phishers extending their reach into the Solana ecosystem. Unlike platforms such as Ethereum, Solana has distinct designs of accounts and transactions, leading to the emergence of new types of phishing transactions that we term SolPhish. We define three types of SolPhish and develop a detection tool called SolPhishHunter. Utilizing SolPhishHunter, we detect a total of 8,058 instances of SolPhish and conduct an empirical analysis of these detected cases. Our analysis explores the distribution and impact of SolPhish, the characteristics of the phishers, and the relationships among phishing gangs. Particularly, the detected SolPhish transactions have resulted in nearly $1.1 million in losses for victims. We report our detection results to the community and construct SolPhishDataset, the first Solana phishing-related dataset in academia.
SolPhishHunter: Toward Detecting and Understanding Phishing on Solana
Ziwei Li;Zigui Jiang;Ming Fang;Jiaxin Chen;Zhiying Wu;Jiajing Wu;Lun Zhang;Zibin Zheng
IEEE Transactions on Information Forensics and Security, vol. 21, pp. 757-771. Pub Date: 2025-12-31 DOI: 10.1109/TIFS.2025.3649957
Pub Date: 2025-12-31 DOI: 10.1109/tifs.2025.3650025
Zixuan Ding, Ding Wang
Hybrid Password Hardening Encryption
IEEE Transactions on Information Forensics and Security, vol. 94 1, pp. 1-1.
In the Segment Routing over IPv6 (SRv6) network, a wide range of network events (e.g., attacks, intrusions, violations, malicious route announcements) may occur. Network management requires real-time monitoring of untrusted and unreliable environments (e.g., unsafe components and devices). Early localization of abnormal links causing violations in the SRv6 network helps minimize the compensation required for service unavailability. However, state-of-the-art methods incur overhead that does not scale efficiently to large-scale SRv6 networks, and they exhibit poor robustness to the various disturbances of unreliable networks. To cope with these challenges, we propose Glint, an in-band network telemetry framework to localize abnormal links in SRv6 networks. The key idea of Glint is sampling part of the information while the overall information is known. Glint provides probabilistic in-band collection to gather segment-level telemetry data, reducing overhead and improving efficiency. Glint also proposes distributed verification-based detection to enhance the trustworthiness of security assessments, further improving robustness against disturbances. In addition, we design selective telemetry that reduces telemetry reports while preserving security-relevant visibility. Our evaluations demonstrate that, compared to the state-of-the-art frameworks, Glint significantly reduces header bandwidth overhead by 75.6% and memory overhead by 48.7% while reducing false positives. We also implement Glint on the Intel Tofino switch, achieving over a 50% reduction in hardware resource consumption compared to existing methods.
Glint: Localization of Gray Violations in Untrusted and Unreliable SRv6 Networks
Kaiyang Zhao;Han Zhang;Yahui Li;Xingang Shi;Zhiliang Wang;Xia Yin;Jiankun Hu;Jianping Wu
IEEE Transactions on Information Forensics and Security, vol. 21, pp. 812-826. Pub Date: 2025-12-31 DOI: 10.1109/TIFS.2025.3649962
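The Glint abstract above mentions probabilistic in-band collection of segment-level telemetry. The abstract does not give the sampling rule, so the sketch below uses a generic technique, reservoir sampling with a single slot, as a stand-in to show how a packet can carry one hop record while every hop on the path remains equally likely to be reported. The function names, the single-slot header, and the segment labels are assumptions for illustration and are not Glint's design.

```python
import random


def forward_packet(path, rng):
    """Simulate one packet crossing `path`; return the single hop record it carries."""
    slot = None
    for hop_index, segment in enumerate(path, start=1):
        # Hop i overwrites the slot with probability 1/i (reservoir sampling),
        # so each hop ends up in the slot with probability 1/len(path).
        if rng.random() < 1.0 / hop_index:
            slot = segment
    return slot


def collect(path, num_packets, seed=0):
    """Count how often each segment is reported across many simulated packets."""
    rng = random.Random(seed)
    counts = {segment: 0 for segment in path}
    for _ in range(num_packets):
        counts[forward_packet(path, rng)] += 1
    return counts
```

The per-packet header cost stays constant regardless of path length, and a collector that sees enough packets of a flow observes every segment of the path.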
With the rapid development of intelligent surveillance technology, the massive amount of multimodal data (e.g., videos, images, and text) has imposed higher demands on efficient information retrieval and security. Traditional single-modal retrieval methods struggle to meet practical requirements, making multimodal image-text retrieval a research hotspot in this field. Existing approaches, however, still face challenges in fine-grained semantic alignment and suffer from rigid matching mechanisms. To address these issues, this paper introduces SeaNcr, a novel framework that integrates cross-modal semantic entity alignment with non-correspondence reasoning. Our method constructs class-level entity representations enhanced by saliency-guided masking to capture discriminative semantic features. A pseudo-frozen asynchronous optimization strategy is introduced to maintain semantic consistency across modalities by associating stable entity representations with dynamically updated encoder features. Moreover, to overcome rigid matching, we design a non-correspondence reasoning module that jointly leverages intra-modal similarity and cross-modal mutual nearest neighbor constraints, optimizing matching flexibility and generalization. Extensive experiments validate that SeaNcr significantly enhances cross-modal feature representation and retrieval robustness, achieving state-of-the-art performance on multiple person re-identification benchmarks.
Semantic Entity Alignment and Non-Corresponding Reasoning for Text-to-Image Person Re-Identification
Wanru Peng;Houjin Chen;Yanfeng Li;Jia Sun;Luyifu Chen
IEEE Transactions on Information Forensics and Security, vol. 21, pp. 772-783. Pub Date: 2025-12-29 DOI: 10.1109/TIFS.2025.3649361
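The SeaNcr abstract above refers to a cross-modal mutual nearest neighbor constraint. A minimal sketch of that constraint in isolation is given below: a text and an image are paired only when each is the other's most similar item under cosine similarity. The function names and the toy feature vectors are assumptions for illustration, feature vectors are assumed to be non-zero, and the paper's full reasoning module (including intra-modal similarity) is not reproduced.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mutual_nearest_pairs(text_feats, image_feats):
    """Return (i, j) pairs where text i and image j are each other's nearest neighbor."""
    sim = [[cosine(t, v) for v in image_feats] for t in text_feats]
    num_texts, num_images = len(text_feats), len(image_feats)
    # Nearest image for every text, and nearest text for every image.
    best_image = [max(range(num_images), key=lambda j: row[j]) for row in sim]
    best_text = [max(range(num_texts), key=lambda i: sim[i][j]) for j in range(num_images)]
    return [(i, j) for i, j in enumerate(best_image) if best_text[j] == i]


# Toy features: text 2 is close to image 0, but image 0 prefers text 0.
texts = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
images = [[1.0, 0.05], [0.1, 1.0]]
pairs = mutual_nearest_pairs(texts, images)
```

In the toy data, text 2 is left unmatched because its nearest image already has a closer text, which is the kind of non-corresponding sample that a one-directional nearest-neighbor rule would force into a pair.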