Pub Date: 2026-05-05 | Epub Date: 2026-01-17 | DOI: 10.1016/j.ins.2026.123125
Lorenzo Mannocci , Stefano Cresci , Matteo Magnani , Anna Monreale , Maurizio Tesconi
Coordinated online behavior, which spans from beneficial collective actions to harmful manipulation such as disinformation campaigns, has become a key focus in digital ecosystem analysis. Traditional methods often rely on monomodal approaches, focusing on single types of interactions like co-retweets or co-hashtags, or consider multiple modalities independently of each other. However, these approaches may overlook the complex dynamics inherent in multimodal coordination. This study compares different ways of operationalizing multimodal coordinated behavior, examining the trade-off between weakly and strongly integrated models and their ability to capture broad versus tightly aligned coordination patterns. By contrasting monomodal, flattened, and multimodal methods, we evaluate the distinct contributions of each modality and the impact of different integration strategies. Our findings show that while not all modalities provide unique insights, multimodal analysis consistently offers a more informative representation of coordinated behavior, preserving structures that monomodal and flattened approaches often lose. This work enhances the ability to detect and analyze coordinated online behavior, offering new perspectives for safeguarding the integrity of digital platforms.
Title: Multimodal coordinated online behavior: Trade-offs and strategies. Journal: Information Sciences, vol. 737, Article 123125.
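The contrast between flattened and multimodal operationalizations can be illustrated with a small sketch. It is not the authors' method: the Jaccard similarity, the 0.5 threshold, the function names, and the toy data are all assumptions made for illustration. It only shows how merging modalities into one action set can dilute a coordination signal that a per-layer analysis keeps.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_edges(actions, threshold):
    """Return {(u, v): similarity} for user pairs at or above the threshold.

    `actions` maps each user to the set of items they acted on
    (for example retweeted post IDs or used hashtags).
    """
    edges = {}
    for u, v in combinations(sorted(actions), 2):
        s = jaccard(actions[u], actions[v])
        if s >= threshold:
            edges[(u, v)] = s
    return edges


def flatten(layers):
    """Merge all modalities into a single action set per user."""
    merged = {}
    for modality, actions in layers.items():
        for user, items in actions.items():
            merged.setdefault(user, set()).update((modality, i) for i in items)
    return merged


# Hypothetical toy data: two modalities, three users.
layers = {
    "retweet": {"a": {1, 2, 3}, "b": {1, 2, 3}, "c": {9}},
    "hashtag": {"a": {"x"}, "b": {"y", "z", "w"}, "c": {"x"}},
}

# Multimodal view: one edge set per layer, kept separate.
multimodal = {m: similarity_edges(acts, 0.5) for m, acts in layers.items()}
# Flattened view: one edge set computed over the merged actions.
flattened = similarity_edges(flatten(layers), 0.5)
```

In this toy case each layer keeps one fully similar pair, while the flattened network has no edge at all because the merged sets fall below the threshold.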
Pub Date: 2026-05-05 | Epub Date: 2026-01-16 | DOI: 10.1016/j.ins.2026.123126
Ping Li , Jiajun Chen , Shaoqi Tian , Ran Wang
Few-shot class-incremental learning requires a model to incrementally learn to recognize novel classes from limited samples while preserving its ability to classify previously learned base and old classes. It presents two main challenges: catastrophic forgetting of old classes, caused by the absence of their samples during incremental phases, and overfitting to the few available samples of novel classes. To address these issues, we propose a Class Semantics guided Knowledge Distillation (CSKD) method. In the base session, CSKD leverages the pre-trained vision-language model CLIP (Contrastive Language-Image Pre-Training) to perform knowledge distillation for enhancing the base model. During each incremental session, the method uses the CLIP-derived class textual semantics to guide the optimization of the classifier, thereby alleviating both overfitting on novel classes and forgetting of prior knowledge. Extensive experiments on three image datasets (mini-ImageNet, CUB200, and CIFAR100) and two video datasets (UCF101 and HMDB51) demonstrate that CSKD outperforms state-of-the-art alternatives, showing particularly strong generalization ability on novel classes. Code is available at https://github.com/mlvccn/CSKD_Fewshot.
Title: Class semantics guided knowledge distillation for few-shot class incremental learning. Journal: Information Sciences, vol. 737, Article 123126.
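For readers unfamiliar with knowledge distillation, the following is a minimal sketch of the generic temperature-softened distillation loss. It is not the CSKD objective, which the abstract does not specify. The teacher logits stand in for outputs of a model such as CLIP, and the function names and the default temperature of 2.0 are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher (reference) distribution
    q = softmax(student_logits, temperature)  # student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge.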
Pub Date: 2026-05-05 | Epub Date: 2026-01-17 | DOI: 10.1016/j.ins.2026.123116
Renjie Fu, Haoyue Yang, Wei Xing, Junfeng Zhang
In this paper, the practical formation consensus problem is addressed for Takagi-Sugeno fuzzy positive multi-agent systems under deception attacks. During the transmission of information, malicious attackers inject incorrect information into the agents to disrupt the formation consensus. A Bernoulli random process is used to model the randomly occurring deception attacks in the controller. To achieve formation consensus, a novel error variable is introduced to control the formation. The main objective of this paper is to ensure the normal operation of Takagi-Sugeno fuzzy positive multi-agent systems and the unchanged formation of the agents when the randomly occurring deception attacks arise. Then, the gain matrices are designed using matrix decomposition techniques and computed via linear programming. Lastly, a numerical example is presented to validate the efficacy and robustness of the proposed controller.
Title: Practical formation control of T-S fuzzy positive multi-agent systems under deception attacks. Journal: Information Sciences, vol. 737, Article 123116.
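A randomly occurring deception attack modeled by a Bernoulli process can be sketched as follows. This is only an illustration under stated assumptions: the attack is taken to be additive, the signals are scalars, and the function name and numeric values are hypothetical. The paper's actual attack model and controller are not given in the abstract.

```python
import random


def attacked_input(u, injected, attack_prob, rng):
    """Return the control input after a randomly occurring deception attack.

    beta ~ Bernoulli(attack_prob); when beta == 1 the false signal
    `injected` is added to the nominal input `u` (additive attack assumed).
    """
    beta = 1 if rng.random() < attack_prob else 0
    return u + beta * injected


# Empirical attack frequency over many transmissions (seeded for repeatability).
rng = random.Random(0)
samples = [attacked_input(1.0, 0.5, 0.3, rng) for _ in range(10_000)]
attack_rate = sum(1 for s in samples if s != 1.0) / len(samples)
```

With an attack probability of 0.3, roughly 30% of the transmitted inputs carry the injected signal.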
Pub Date: 2026-05-05 | Epub Date: 2026-01-22 | DOI: 10.1016/j.ins.2026.123134
Haipeng Cui , Kai Xiao , Hua Wang , Xuxin Zhang
High-precision trajectory prediction can promote safe and efficient autonomous driving decisions. Existing state-of-the-art models, such as Dual-Attention Mechanism (DAM) and Hierarchical Attention Network (HAN), treat all neighboring vehicles as undifferentiated sets, ignoring lane structures when extracting spatial features. In this study, we propose a novel Lane-specific Spatial-Temporal Attention Network (LSTAN) to exploit lane-level traffic information in vehicle trajectory prediction. Specifically, we employ an encoder module based on a Long Short-Term Memory network to extract temporal features for target vehicles and their surrounding vehicles. Meanwhile, a lane attention module (LAM) and a temporal self-attention module (TSAM) are proposed for spatial and temporal feature extraction. The LAM introduces a dual-attention framework to discern spatial relationships between the target vehicle and its surrounding vehicles, taking lane-level effects into account. The TSAM refines the temporal features for target vehicles. Finally, the decoder integrates the learned features with the driving intention to obtain the predicted trajectories. Experiments are conducted on two real-world datasets: Next Generation Simulation (NGSIM) and HighD. Results show that LSTAN reduces the root mean square error (RMSE) by an average of 0.37 m relative to the benchmarks. Ablation studies and component replacement experiments are conducted to evaluate the effectiveness of the components in LSTAN.
Title: Lane-flow-learning based autonomous vehicle trajectory prediction using spatial–temporal fusion attention. Journal: Information Sciences, vol. 737, Article 123134.
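The RMSE metric quoted above can be made concrete with a short sketch. It assumes the common definition for 2-D trajectories, the square root of the mean squared Euclidean distance between predicted and actual positions. The abstract does not state whether the paper averages per prediction horizon, so that detail and the function name are assumptions.

```python
import math


def trajectory_rmse(predicted, actual):
    """RMSE in the units of the coordinates (metres if positions are in metres).

    `predicted` and `actual` are equal-length sequences of (x, y) positions.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

For example, a single predicted point at (0, 0) against a true position of (3, 4) gives an RMSE of 5.0.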
Pub Date: 2026-05-05 | Epub Date: 2026-01-19 | DOI: 10.1016/j.ins.2026.123114
Bo Liu , Qingshan Tang , YanShan Xiao , Weijie Zeng , Xinzhe Jiang , Yunlong Sun , Jiajun Chen , Zongxiong Yang
Over the past years, respiratory diseases have accounted for over 5 million annual fatalities, rendering precise diagnostics imperative. Chest radiography (CXR), which serves as the primary screening modality, exhibits inherent limitations, including anatomical overlap (where ribs obscure lung tissue), low contrast of subtle pathologies, and substantial lesion-scale variability. Contemporary deep learning architectures (e.g., ResNet, EfficientNet) demonstrate inadequacies in addressing these challenges due to fixed receptive fields, constrained global context capture, and deficient spatial-channel feature fusion. To circumvent these limitations, we propose DyFASA: a lightweight (0.17M parameters) attention module integrating three synergistic components. In the proposed method, Dynamic Kernel Selection (DKS) employs a gating network to weight 1×1/3×3/5×5 branches adaptively, thereby adapting receptive fields for multi-scale lesions. Frequency-Domain Adaptive Attention (FAA) leverages FFT to segregate pathological textures from skeletal interference while capturing global context. Spatially Adaptive Channel Attention (SACA) fuses local DKS features with global FAA context to concentrate on diagnostically relevant regions. Upon evaluation using the MUT and BIN datasets, DyFASA elevates U-Net (DyNet) lung segmentation accuracy to 99.34% and enhances EfficientNet-B0’s MUT classification precision by approximately 10%. It presents an efficient solution for computationally constrained clinical environments.
Title: DyFASA: Dynamic frequency-domain-aware spatial-channel attention for efficient lung disease detection from chest x-rays. Journal: Information Sciences, vol. 737, Article 123114.
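The idea of a gating network weighting parallel convolution branches can be sketched in a few lines. This is a simplified illustration, not the DyFASA implementation: the branch outputs are plain feature vectors standing in for the results of the three kernel sizes, the gate is assumed to produce one logit per branch, and softmax normalization is an assumption.

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(branch_outputs, gate_logits):
    """Weighted sum of per-branch feature vectors using softmax gate weights.

    `branch_outputs` holds one equal-length feature vector per branch;
    `gate_logits` holds one score per branch (assumed gate output).
    """
    weights = softmax(gate_logits)
    length = len(branch_outputs[0])
    return [
        sum(w * branch[i] for w, branch in zip(weights, branch_outputs))
        for i in range(length)
    ]
```

Equal gate logits average the branches, while a dominant logit lets a single receptive field take over, which is the adaptive behavior the abstract describes.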
Pub Date: 2026-04-25 | Epub Date: 2025-12-12 | DOI: 10.1016/j.ins.2025.122995
Liangliang Jia , Lingxia Mu , Shihai Wu , Ding Liu
Accurate fault diagnosis of rotating machinery under complex operating conditions is hindered by strongly coupled spatio-temporal dynamics and the limited expressiveness of single-modality representations. To address this challenge, we propose a dual-stream interactive diagnosis framework for heterogeneous spatio-temporal features. In the temporal stream, a multi-scale variable temporal convolutional network is designed to jointly employ multi-scale dilated convolutions and a novel variable ReLU dynamic activation, enabling concurrent capture of short-term transient shocks and long-range periodic attenuation in vibration signals. In the spatial stream, raw one-dimensional signals are first transformed into Gramian angular difference field images; then, a transfer-learning strategy migrates selected layers of a pretrained AlexNet with a hierarchical scheme combining early-layer freezing and layer-wise fine-tuning to extract high-quality spatial descriptors efficiently. A gated fusion module establishes deep correlations between the two modalities and adaptively integrates the branch outputs for precise multi-class fault identification. Experimental results on the Paderborn University bearing dataset, the University of Connecticut gear dataset, and a self-built crystal lifting and rotation mechanism dataset show that the proposed method attains accuracies of 99.58%, 99.54%, and 98.33%, respectively. Comparative and ablation studies further demonstrate that its generalization ability and robustness are significantly superior to those of mainstream diagnostic approaches.
Title: Dual-stream interactive diagnosis of spatio-temporal heterogeneous features: Joint modeling with multi-scale variable temporal convolutions and transfer learning. Journal: Information Sciences, vol. 733, Article 122995.
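The transformation of a one-dimensional signal into a Gramian angular difference field image can be sketched as below, using the standard definition: rescale the series to [-1, 1], map each value to an angle with arccos, and fill the matrix with sin(phi_i - phi_j). The function name is an assumption, and the paper's preprocessing details (windowing, image size) are not given in the abstract.

```python
import math


def gramian_angular_difference_field(series):
    """Return the GADF matrix of a 1-D series as a list of rows."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    # Rescale to [-1, 1] so that arccos is defined; clamp against rounding.
    scaled = [max(-1.0, min(1.0, (2 * x - hi - lo) / (hi - lo))) for x in series]
    phi = [math.acos(x) for x in scaled]
    # Entry (i, j) encodes the angular difference between time steps i and j.
    return [[math.sin(a - b) for b in phi] for a in phi]
```

The resulting matrix has a zero diagonal and is antisymmetric, so it can be rendered as an image and passed to a convolutional network.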
Pub Date: 2026-04-25 | Epub Date: 2025-12-11 | DOI: 10.1016/j.ins.2025.122991
Fan Zhang, Jintao Chen, Hongru Ren
A balanced path planning method for multi-robot systems (MRS) in indoor firefighting scenarios is presented by integrating community detection with Voronoi diagrams. The environment is partitioned into communities via the Louvain algorithm, with centroids serving as proxy nodes. Using these nodes together with the obstacles, the system generates Voronoi-based initial paths, which are then decomposed into robot tasks through spectral clustering. A path duplication and allocation mechanism ensures that the multi-robot system performs a cyclic, cooperative search. Designed for time-sensitive fire rescue operations, the method achieves planning within 1.5–2.5 s across residential, maze-like, and complex interiors, and it demonstrates efficient obstacle avoidance and coverage even in densely obstructed layouts. Experiments confirm notable improvements in search accuracy and robustness, enabling the multi-robot system to be rapidly deployed in large-scale missions. The combination of proxy nodes and Voronoi diagrams effectively addresses vertical complexity and spatial fragmentation in high-rise buildings, enabling coordinated navigation through narrow spaces while minimizing mission time. Comparative results verify that the proposed approach offers a significant advantage in time efficiency.
Title: Path planning and task allocation based on community detection in Voronoi diagrams. Journal: Information Sciences, vol. 733, Article 122991.
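The step in which community centroids serve as proxy nodes can be sketched as follows. The community assignment is assumed to have been produced already (for example by a Louvain implementation, which is not reproduced here), and the node names, coordinates, and function name are hypothetical.

```python
def community_centroids(positions, membership):
    """Return {community: (x, y)} centroids to use as proxy nodes.

    `positions` maps node -> (x, y); `membership` maps node -> community id
    (assumed to come from a community detection step run beforehand).
    """
    sums = {}
    for node, (x, y) in positions.items():
        community = membership[node]
        sx, sy, n = sums.get(community, (0.0, 0.0, 0))
        sums[community] = (sx + x, sy + y, n + 1)
    return {c: (sx / n, sy / n) for c, (sx, sy, n) in sums.items()}


# Hypothetical example: three map nodes split into two communities.
positions = {"r1": (0.0, 0.0), "r2": (2.0, 0.0), "r3": (10.0, 10.0)}
membership = {"r1": 0, "r2": 0, "r3": 1}
proxies = community_centroids(positions, membership)
```

The proxy coordinates, together with obstacle locations, would then be the sites from which a Voronoi diagram is built.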
Pub Date: 2026-04-25 | Epub Date: 2025-12-06 | DOI: 10.1016/j.ins.2025.122942
Hai Liao , Song Chen , Yun Xiao , Linyun Xiang , Fan Min
Large-scale Text-to-SQL models are vulnerable to perturbations in natural language query (NLQ) and database schemas. Most existing research has focused on adversarial attacks in input sequences, while neglecting defense. In this article, we argue that defense techniques are more critical, as they address diverse attacks. With this in mind, we posit that adversarial defense in large-scale Text-to-SQL poses a broader challenge than classical robustness. We also introduce two metrics to statistically evaluate defense performance. A framework for a certified robust method from an information theory perspective is proposed to address the new problem. One novel component is a regularizer (MIR), which uses active random masking to extract local features and maximize their mutual information with global features, ensuring theoretical robustness. Another new component is a Transformer-based Schema Linking (TSL) algorithm that enhances question-schema alignment under adversarial settings. To support its supervised training, we propose Spider-SL, a new fine-grained alignment dataset derived from Spider. Our method is evaluated on five benchmarks encompassing 20 perturbation attacks. To the best of our knowledge, the results demonstrate that our model, using only 3B parameters, achieves state-of-the-art robustness and learning performance. This study suggests new research trends concerning the robustness of Text-to-SQL. Our code is available at: https://github.com/iliaohai/infosql.
Title: Large-scale text-to-SQL generation with adversarial defense. Journal: Information Sciences, vol. 733, Article 122942.
Stock price forecasting is a critical yet inherently difficult task in quantitative finance due to the volatile and non-stationary nature of financial time series. While diffusion models have emerged as promising tools for capturing predictive uncertainty, their effectiveness is often limited by insufficient data and the absence of informative guidance during generation. To address these challenges, we propose VARDiff, a diffusion forecasting architecture conditioned on visual-semantic references retrieved from a historical database. Our core novelty is a cross-attention-based denoising network that operates on delay embedding (DE) image representations of time series. It fuses the target trajectory with visually similar historical counterparts, retrieved via a GAF-based visual encoding pipeline with a pre-trained VGG backbone, to provide structured guidance during iterative denoising. VARDiff transforms historical price sequences into image representations and extracts semantic embeddings using a pre-trained vision encoder. These embeddings facilitate the retrieval of visually similar historical trajectories, which serve as external references to guide the denoising process of the diffusion model. Extensive experiments on nine benchmark stock datasets show that VARDiff reduces forecasting errors by an average of 16.27% (MSE) and 8.12% (MAE) compared to state-of-the-art baselines. The results underscore the effectiveness of integrating vision-based retrieval into diffusion forecasting, leading to more robust and data-efficient financial prediction.
Title: VARDiff: Vision-augmented retrieval-guided diffusion for stock forecasting. Authors: Thi-Thu Nguyen, Xuan-Thong Truong, Thai-Binh Nguyen-Khac, Nhat-Hai Nguyen. DOI: 10.1016/j.ins.2026.123113. Journal: Information Sciences, vol. 736, Article 123113. Pub Date: 2026-04-25.
Pub Date: 2026-04-25  Epub Date: 2026-01-14  DOI: 10.1016/j.ins.2026.123107
Shiyu Chu, Feifei Du, Kejin Li, Qiang Li
This paper addresses the quasi-stabilization problem for discrete-time fractional-order (DTFO) Hopfield neural networks with time delays under non-convergent perturbations. Existing methods fail when the perturbations are neither constant nor convergent to a limit. To bridge this gap, a novel non-autonomous DTFO Halanay inequality that incorporates non-zero-limit perturbations is introduced. By integrating this inequality with an event-triggering mechanism, a quasi-stability criterion is established. The effectiveness of the approach is validated through numerical examples.
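Halanay-type inequalities bound a function whose rate of change depends on the supremum of its own delayed values. As background only, the classical continuous-time statement is sketched below. It is not the discrete-time fractional-order inequality introduced in the paper, and the symbols $v$, $a$, $b$, $\tau$, and $\lambda$ are notation chosen here for illustration.

```latex
% Classical Halanay inequality (background; not the paper's DTFO variant).
% Assumptions: v >= 0 is continuous, a > b > 0 are constants, tau > 0 is the delay.
D^{+} v(t) \le -a\, v(t) + b \sup_{t-\tau \le s \le t} v(s), \quad t \ge t_0
\;\Longrightarrow\;
v(t) \le \Big( \sup_{t_0-\tau \le s \le t_0} v(s) \Big)\, e^{-\lambda (t - t_0)},
\qquad \text{where } \lambda > 0 \text{ is the unique solution of } \lambda = a - b\, e^{\lambda \tau}.
```

With a perturbation term on the right-hand side, decay to zero is generally replaced by convergence to a bounded region, which is the sense in which the abstract speaks of quasi-stability.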
{"title":"Event-triggered quasi-stabilization for discrete-time fractional-order Hopfield neural networks with time delays","authors":"Shiyu Chu, Feifei Du, Kejin Li, Qiang Li","doi":"10.1016/j.ins.2026.123107","DOIUrl":"10.1016/j.ins.2026.123107","url":null,"PeriodicalId":51063,"journal":{"name":"Information Sciences","volume":"736","pages":"Article 123107"},"PeriodicalIF":6.8,"publicationDate":"2026-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146038885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}