Controlling dissolved oxygen (DO) in an uncertain aeration process is a critical challenge, owing to the inherent nonlinearity, dynamics, and unknown disturbances of the wastewater treatment process (WWTP). To address this issue, an incremental multi-subreservoirs echo state network (IMSESN) controller is proposed. First, an echo state network (ESN) is employed as the approximator for the unknown system state, and a disturbance observer is constructed to handle the unmeasurable disturbances. Second, to further improve controller adaptability, an error-driven subreservoir increment mechanism is incorporated, in which new subreservoirs are inserted into the network to enhance uncertainty approximation. Moreover, the minimum learning parameter (MLP) algorithm is introduced to update only the norm of the output weights, significantly reducing computational complexity while maintaining control accuracy. Third, Lyapunov stability theory is applied to demonstrate the semiglobal ultimate boundedness of the closed-loop signals. Under diverse weather conditions, simulations on the benchmark simulation model no. 1 (BSM1) show that the proposed controller outperforms existing methods in tracking accuracy and computational efficiency.
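The two core ingredients named above — an echo state network approximator and norm-only (MLP-style) weight adaptation — can be sketched as follows. This is a minimal illustration, not the authors' IMSESN controller: the reservoir sizes, the sinusoidal input, and the adaptation law `theta += gamma * |e| * ||x||` are generic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative echo state network: the reservoir state evolves as
# x(t+1) = tanh(W_in u(t) + W x(t)), with W scaled so its spectral
# radius is below 1 (the echo state property).
n_in, n_res = 1, 50
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))

def reservoir_step(x, u):
    return np.tanh(W_in @ u + W @ x)

# Minimum-learning-parameter idea: instead of adapting all n_res output
# weights, adapt one scalar theta bounding their norm.  The update rule
# below is a generic MLP-style law, not the paper's exact one.
theta, gamma = 0.0, 0.01
x = np.zeros(n_res)
for t in range(100):
    u = np.array([np.sin(0.1 * t)])       # stand-in reference signal
    x = reservoir_step(x, u)
    e = 1.0 - theta * np.linalg.norm(x)   # tracking-error surrogate
    theta += gamma * abs(e) * np.linalg.norm(x)

print(x.shape, round(theta, 4))
```

Adapting one scalar instead of an n_res-dimensional weight vector is what yields the claimed reduction in online computational cost.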
Incremental multi-subreservoirs echo state network control for uncertain aeration process.
Cuili Yang, Qingrun Zhang, Jiahang Zhang, Jian Tang
Neural Networks, vol. 196, art. no. 108454. Pub Date: 2026-04-01. DOI: 10.1016/j.neunet.2025.108454
Pub Date: 2026-04-01 | Epub Date: 2025-12-08 | DOI: 10.1016/j.neunet.2025.108450
Di Yuan, Huayi Zhu, Rui Chen, Sida Zhou, Jianing Tang, Xiu Shu, Qiao Liu
The rapid development of deep learning provides an excellent solution for end-to-end multi-modal image fusion. However, existing methods mainly focus on the spatial domain and fail to fully utilize valuable information in the frequency domain. Moreover, even when spatial-domain learning methods converge to an ideal solution, significant differences in high-frequency details remain between the fused image and the source images. Therefore, we propose a Cross-Modal Multi-Domain Learning (CMMDL) method for image fusion. First, CMMDL employs the Restormer structure equipped with the proposed Spatial-Frequency domain Cascaded Attention (SFCA) mechanism to provide comprehensive and detailed pixel-level features for subsequent multi-domain learning. Then, we propose a dual-domain parallel learning strategy. The proposed Spatial Domain Learning Block (SDLB) focuses on extracting modality-specific features in the spatial domain through a dual-branch invertible neural network, while the proposed Frequency Domain Learning Block (FDLB) captures continuous and precise global contextual information using cross-modal deep perceptual Fourier transforms. Finally, the proposed Heterogeneous Domain Feature Fusion Block (HDFFB) promotes feature interaction and fusion between different domains through various pixel-level attention structures to obtain the final output image. Extensive experiments demonstrate that the proposed CMMDL achieves state-of-the-art performance on multiple datasets. The code is available at: https://github.com/Ist-Zhy/CMMDL.
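The frequency-domain intuition behind FDLB can be illustrated with a minimal cross-modal Fourier mix. This is only a sketch under simplifying assumptions: the real block uses learned "deep perceptual" transforms, not the fixed amplitude/phase swap below, and the random arrays stand in for infrared and visible images.

```python
import numpy as np

rng = np.random.default_rng(1)
ir  = rng.random((64, 64))   # stand-in infrared image
vis = rng.random((64, 64))   # stand-in visible image

# 2-D FFT of each modality.
F_ir, F_vis = np.fft.fft2(ir), np.fft.fft2(vis)

# Combine the amplitude of one modality with the phase of the other:
# amplitude carries global intensity structure, phase carries edges,
# so mixing them exchanges information across modalities globally.
fused = np.real(np.fft.ifft2(np.abs(F_ir) * np.exp(1j * np.angle(F_vis))))

print(fused.shape)
```

Because every frequency coefficient depends on every pixel, a single operation in this domain captures global context that a local spatial convolution cannot.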
CMMDL: Cross-modal multi-domain learning method for image fusion.
Neural Networks, vol. 196, art. no. 108450. DOI: 10.1016/j.neunet.2025.108450
The technique of structural re-parameterization has been widely adopted in Convolutional Neural Networks (CNNs) and Multi-Layer Perceptrons (MLPs) for image-related tasks. However, its integration with attention mechanisms in the video domain remains relatively unexplored. Moreover, video analysis tasks continue to face challenges due to high computational costs, particularly during inference. In this paper, we investigate the re-parameterization of the widely used 3D attention mechanism for video understanding by incorporating a spatiotemporal coherence prior. This approach allows the learning of more robust video features while introducing negligible computational overhead at inference time. Specifically, we propose a SpatioTemporally Augmented 3D Attention (STA-3DA) module as a building block for Transformer architectures. The STA-3DA integrates 3D, spatial, and temporal attention branches during training, serving as an effective replacement for standard 3D attention in existing Transformer models and leading to improved performance. During testing, the different branches are merged into a single 3D attention operation via learned fusion weights, resulting in minimal additional computational cost.
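The merge-at-test-time principle is easiest to see on linear branches, where re-parameterization is exact. The sketch below uses plain linear projections as stand-ins for the three attention branches; merging full attention, as the paper does, is more involved, but the collapse of parallel weighted branches into one operator is the same idea.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 8

# Three parallel linear branches (stand-ins for the 3D / spatial /
# temporal projections) with learned fusion weights alpha.
W3d, Wsp, Wtm = (rng.standard_normal((d, d)) for _ in range(3))
alpha = np.array([0.5, 0.3, 0.2])

x = rng.standard_normal(d)

# Training-time path: run each branch, fuse the outputs.
y_train = alpha[0] * (W3d @ x) + alpha[1] * (Wsp @ x) + alpha[2] * (Wtm @ x)

# Inference-time path: linearity lets the branches collapse into a
# single weight matrix -- one matmul instead of three.
W_merged = alpha[0] * W3d + alpha[1] * Wsp + alpha[2] * Wtm
y_test = W_merged @ x

print(np.allclose(y_train, y_test))  # the merge is exact for linear ops
```

This is why the extra branches cost nothing at inference: they exist only in the training graph.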
RepAttn3D: Re-parameterizing 3D attention with spatiotemporal augmentation for video understanding.
Xiusheng Lu, Lechao Cheng, Sicheng Zhao, Ying Zheng, Yongheng Wang, Guiguang Ding, Mingli Song
Neural Networks, vol. 195, art. no. 108313. Pub Date: 2026-03-01. DOI: 10.1016/j.neunet.2025.108313
Pub Date: 2026-02-03 | DOI: 10.1016/j.neunet.2026.108680
Ben Liang, Yuan Liu, Chao Sui, Yihong Wang, Lin Xiao, Xiubao Sui, Qian Chen
With the advancement of high-precision remote sensing equipment and precision measurement technology, object detection based on remote sensing images (RSIs) has been widely used in military and civilian fields. Different from traditional general-purpose environments, remote sensing presents unique challenges that significantly complicate the detection process. Specifically: (1) RSIs cover extensive monitoring areas, resulting in complex and textured backgrounds; and (2) objects often exhibit cluttered distributions, small sizes, and considerable scale variations across categories. To effectively address these challenges, we propose a Multi-Scale Pattern-Aware Task-Gating Network (MPTNet) for remote sensing object detection. First, we design a Multi-Scale Pattern-Aware Network (MPNet) backbone that employs a complementary small- and large-kernel convolution strategy to capture both large-scale and small-scale spatial patterns, yielding more comprehensive semantic features. Next, we introduce a Multi-Head Cross-Space Encoder (MCE) that improves semantic fusion and spatial representation across hierarchical levels. By combining a multi-head mechanism with directional one-dimensional strip convolutions, MCE enhances spatial sensitivity at the pixel level, thus improving object localization in densely textured scenes. To coordinate the two detection sub-tasks, we propose a Dynamic Task-Gating (DTG) head that adaptively recalibrates spatial feature representations between classification and localization branches. Extensive experimental validations on three publicly available datasets, including VisDrone, DIOR, and COCO-mini, demonstrate that our method achieves excellent performance, obtaining AP50 scores of 43.3%, 80.6%, and 49.5%, respectively.
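The complementary small/large-kernel idea can be shown in one dimension. This is a toy illustration with hand-written fixed kernels, not the learned 2-D convolutions of the MPNet backbone: a short kernel responds sharply to fine, small-object detail, a long kernel to broad context, and the two responses are summed.

```python
import numpy as np

rng = np.random.default_rng(3)
signal = np.zeros(100)
signal[50] = 1.0                              # an isolated "small object"

k_small = np.array([1.0, 2.0, 1.0]) / 4.0     # 3-tap detail kernel
k_large = np.ones(11) / 11.0                  # 11-tap context kernel

fine   = np.convolve(signal, k_small, mode="same")
coarse = np.convolve(signal, k_large, mode="same")
combined = fine + coarse                      # complementary multi-scale response

# The small kernel keeps a sharp, well-localized peak; the large kernel
# spreads the same energy over a wide neighborhood.
print(fine.argmax(), round(fine.max(), 3), round(coarse.max(), 3))
```

The sharp peak survives in `combined`, which is why the small-kernel path matters for aerial small-object detection even when large receptive fields are also needed.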
Multi-Scale Pattern-Aware Task-Gating Network for aerial small object detection.
Neural Networks, vol. 199, art. no. 108680. DOI: 10.1016/j.neunet.2026.108680
Pub Date: 2026-02-02 | DOI: 10.1016/j.neunet.2026.108682
Tianyu Hu, Renda Han, Liu Mao, Jing Chen, Xia Xie
Graph-level clustering aims to group graphs into distinct clusters based on shared structural characteristics or semantic similarities. However, existing graph-level clustering methods generally assume that the input graph structure is complete and overlook the problem of missing relationships that commonly exist in real-world scenarios. These unmodeled missing relationships lead to an accumulation of structural information distortion during graph representation learning, significantly reducing clustering performance. To this end, we propose a novel method, Structure-Missing Graph-Level Clustering Network (SMGCN), which includes a structure augmentation module LR-SEA, an Anchor Positioning Mechanism, and Joint Contrastive Optimization. Specifically, we first output augmented graphs based on low-rank matrix completion, perform cluster matching using the Hungarian algorithm to obtain anchors, and then force graphs in the same cluster to converge to the corresponding anchors in the embedding space. To the best of our knowledge, this is the first work to address the graph-level clustering task with missing relations; the superiority of our method is demonstrated through experiments on five benchmark datasets against state-of-the-art methods. Our source codes are available at https://github.com/MrHuSN/SMGCN.
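The anchor-matching step can be illustrated with a toy assignment problem. SMGCN uses the Hungarian algorithm; for a handful of clusters an exhaustive search over permutations yields the same optimal assignment, which keeps this sketch dependency-free. The centers and anchors below are made-up 2-D points.

```python
import numpy as np
from itertools import permutations

# Toy anchor positioning: match predicted cluster centers to anchors by
# minimising the total Euclidean distance of the assignment.
centers = np.array([[0.0, 0.0], [5.0, 5.0], [9.0, 0.0]])
anchors = np.array([[5.1, 4.9], [8.8, 0.2], [0.1, -0.1]])

# Pairwise cost matrix: cost[i, j] = distance(center i, anchor j).
cost = np.linalg.norm(centers[:, None, :] - anchors[None, :, :], axis=-1)

# Brute-force over all one-to-one assignments (fine for small k; the
# Hungarian algorithm does this in polynomial time for large k).
best = min(permutations(range(3)),
           key=lambda p: sum(cost[i, p[i]] for i in range(3)))
print(best)  # each center is matched to its nearest anchor
```

Once matched, a contrastive loss can pull every graph embedding toward the anchor of its assigned cluster.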
Structure-missing graph-level clustering network.
Neural Networks, vol. 199, art. no. 108682. DOI: 10.1016/j.neunet.2026.108682
Pub Date: 2026-02-01 | DOI: 10.1016/j.neunet.2026.108644
Yue Zhou, Liang Cao, Yan Lei, Hongru Ren
Six-rotor unmanned aerial vehicles (UAVs) offer significant potential, but still encounter persistent challenges in achieving efficient allocation of limited resources in dynamic and complex environments. Consequently, this paper explores the prescribed-time observer-based optimal consensus control problem for six-rotor UAVs with unified prescribed performance. A practical prescribed-time optimal control scheme is constructed by embedding the prescribed-time control method in a simplified reinforcement learning framework to realize efficient resource allocation. Leveraging a prescribed-time adjustment function, novel updating laws for the actor and critic neural networks are developed, which guarantee that the six-rotor UAVs reach a desired steady state within the prescribed time. Moreover, an improved distributed prescribed-time observer is established, ensuring that each follower can precisely estimate the velocity and position information of the leader within the prescribed time. Then, a series of nonlinear transformations and mappings is proposed, which not only satisfies diverse performance requirements under a unified control framework by adjusting the design parameters a priori, but also improves the user-friendliness of implementation and control design. Significantly, the global performance requirement simplifies the verification of initial constraints required by traditional performance control methods. Furthermore, an adaptive prescribed-time filter is introduced to address the complexity explosion issue of the backstepping method on six-rotor UAVs, while ensuring the filter error converges within the prescribed time. Eventually, simulation results confirm the effectiveness of the designed method.
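A standard ingredient of prescribed-time designs is a time-varying gain that grows without bound as the prescribed time T approaches, forcing the error to vanish by t = T regardless of the initial condition. The simulation below uses the generic gain mu(t) = (T/(T - t))^h and first-order error dynamics as an illustration; it is not the paper's exact adjustment function or closed-loop system.

```python
# Generic prescribed-time mechanism: error dynamics e' = -k * mu(t) * e
# with mu(t) = (T / (T - t))**h, integrated with forward Euler.
T, h, k, dt = 1.0, 2.0, 2.0, 1e-4
e, t = 1.0, 0.0            # initial tracking error
while t < 0.99 * T:        # stop just short of T to avoid the singularity
    mu = (T / (T - t)) ** h
    e += -k * mu * e * dt
    t += dt

print(abs(e))  # driven essentially to zero before the prescribed time
```

The closed form e(t) = exp(-k T^2 (1/(T-t) - 1/T)) e(0) shows the error decaying faster than any exponential as t approaches T, which is the defining property of prescribed-time convergence.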
Observer-based prescribed-time optimal neural consensus control for six-rotor UAVs: A novel actor-critic reinforcement learning strategy.
Neural Networks, vol. 199, art. no. 108644. DOI: 10.1016/j.neunet.2026.108644
Pub Date: 2026-02-01 | Epub Date: 2025-10-10 | DOI: 10.1016/j.neunet.2025.108188
Shengcai Zhang, Huiju Yi, Fanchang Zeng, Xuan Zhang, Zhiying Fu, Dezhi An
Time series forecasting is widely applied in fields such as energy and network security. Various prediction models based on Transformer and MLP architectures have been proposed. However, their performance may decline to varying degrees when applied to real-world sequences with significant non-stationarity. Traditional approaches generally adopt either stabilization or a combination of stabilization and non-stationarity compensation for prediction tasks. However, non-stationarity is a crucial attribute of time series; the former approach tends to eliminate useful non-stationary patterns, while the latter may inadequately capture non-stationary information. Therefore, we propose DiffMixer, which analyzes and predicts different frequencies in non-stationary time series. We use Variational Mode Decomposition (VMD) to obtain multiple frequency components of the sequence, Multi-scale Decomposition (MsD) to optimize the decomposition of downsampled sequences, and Improved Star Aggregate-Redistribute (iSTAR) to capture interdependencies between different frequency components. Additionally, we employ the Frequency domain Processing Block (FPB) to capture global features of different frequency components in the frequency domain, and Dual Dimension Fusion (DuDF) to fuse different frequency components in two dimensions, enhancing the predictive fit for various frequencies. Compared to previous state-of-the-art methods, DiffMixer reduces the Mean Squared Error (MSE), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Symmetric Mean Absolute Percentage Error (SMAPE) by 24.5%, 12.3%, 13.5%, and 6.1%, respectively.
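VMD itself is an iterative variational optimisation; the much simpler FFT band split below illustrates the underlying idea of treating a non-stationary series as separate frequency components that can be modelled independently. The cutoff of 0.05 cycles/sample and the two-sine test signal are arbitrary choices for the sketch.

```python
import numpy as np

# Synthetic series: a slow component (period 64) plus a fast one (period 8).
t = np.arange(256)
series = np.sin(2 * np.pi * t / 64) + 0.3 * np.sin(2 * np.pi * t / 8)

spec = np.fft.rfft(series)
freqs = np.fft.rfftfreq(series.size)

low  = spec * (freqs <  0.05)   # slow, trend-like band
high = spec * (freqs >= 0.05)   # fast, oscillatory band

slow = np.fft.irfft(low,  n=series.size)
fast = np.fft.irfft(high, n=series.size)

# The masks partition the spectrum, so the two bands reconstruct the
# original series exactly -- nothing is lost by the decomposition.
print(np.allclose(slow + fast, series))
```

Forecasting each band with a model suited to its dynamics, then recombining, is the high-level strategy that the frequency-component blocks in DiffMixer refine.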
DiffMixer: A prediction model based on mixing different frequency features.
Neural Networks, vol. 194, art. no. 108188. DOI: 10.1016/j.neunet.2025.108188
Pub Date: 2026-02-01 | Epub Date: 2025-10-03 | DOI: 10.1016/j.neunet.2025.108141
Yueyao Li, Bin Wu
Incomplete multi-view clustering (IMVC) has become an area of increasing focus due to the frequent occurrence of missing views in real-world multi-view datasets. Traditional methods often address this by attempting to recover the missing views before clustering. However, these methods face two main limitations: (1) inadequate modeling of cross-view consistency, which weakens the relationships between views, especially with a high missing rate, and (2) limited capacity to generate realistic and diverse missing views, leading to suboptimal clustering results. To tackle these issues, we propose a novel framework, Joint Generative Adversarial Network and Alignment Adversarial (JGA-IMVC). Our framework leverages adversarial learning to simultaneously generate missing views and enforce consistency alignment across views, ensuring effective reconstruction of incomplete data while preserving underlying structural relationships. Extensive experiments on benchmark datasets with varying missing rates demonstrate that JGA-IMVC consistently outperforms current state-of-the-art methods. The model achieves improvements of 3% to 5% in key clustering metrics such as Accuracy, Normalized Mutual Information (NMI), and Adjusted Rand Index (ARI). JGA-IMVC excels under high missing conditions, confirming its robustness and generalization capabilities, providing a practical solution for incomplete multi-view clustering scenarios.
Joint generative and alignment adversarial learning for robust incomplete multi-view clustering.
Neural Networks, vol. 194, art. no. 108141. DOI: 10.1016/j.neunet.2025.108141
Pub Date: 2026-02-01. Epub Date: 2025-10-08. DOI: 10.1016/j.neunet.2025.108191
Jiayi Mao, Hanle Zheng, Huifeng Yin, Hanxiao Fan, Lingrui Mei, Hao Guo, Yao Li, Jibin Wu, Jing Pei, Lei Deng
Brain-inspired neural networks, drawing insights from biological neural systems, have emerged as a promising paradigm for temporal information processing due to their inherent neural dynamics. Among existing brain-inspired models, Spiking Neural Networks (SNNs) have gained extensive attention. However, they often struggle to capture multi-timescale temporal features because of their static parameters across time steps and low-precision spike activities. To this end, we propose a dynamic SNN with enhanced dendritic heterogeneity to strengthen its multi-timescale feature extraction capability. We design a Leaky Integrate Modulation neuron model with Dendritic Heterogeneity (DH-LIM) that replaces traditional spike activities with a continuous modulation mechanism, preserving nonlinear behaviors while enhancing feature expression. We also introduce an Adaptive Dendritic Plasticity (ADP) mechanism that dynamically adjusts dendritic timing factors based on the frequency-domain information of input signals, enabling the model to capture both rapidly and slowly changing temporal patterns. Extensive experiments on multiple datasets with rich temporal features demonstrate that the proposed method achieves excellent performance on complex temporal signals. These optimizations offer fresh solutions for improving the multi-timescale feature extraction capability of SNNs, showcasing their broad application potential.
{"title":"Adaptive dendritic plasticity in brain-inspired dynamic neural networks for enhanced multi-timescale feature extraction.","authors":"Jiayi Mao, Hanle Zheng, Huifeng Yin, Hanxiao Fan, Lingrui Mei, Hao Guo, Yao Li, Jibin Wu, Jing Pei, Lei Deng","doi":"10.1016/j.neunet.2025.108191","DOIUrl":"10.1016/j.neunet.2025.108191","url":null,"abstract":"<p><p>Brain-inspired neural networks, drawing insights from biological neural systems, have emerged as a promising paradigm for temporal information processing due to their inherent neural dynamics. Spiking Neural Networks (SNNs) have gained extensive attention among existing brain-inspired neural models. However, they often struggle with capturing multi-timescale temporal features due to the static parameters across time steps and the low-precision spike activities. To this end, we propose a dynamic SNN with enhanced dendritic heterogeneity to enhance the multi-timescale feature extraction capability. We design a Leaky Integrate Modulation neuron model with Dendritic Heterogeneity (DH-LIM) that replaces traditional spike activities with a continuous modulation mechanism for preserving the nonlinear behaviors while enhancing the feature expression capability. We also introduce an Adaptive Dendritic Plasticity (ADP) mechanism that dynamically adjusts dendritic timing factors based on the frequency domain information of input signals, enabling the model to capture both rapid- and slow-changing temporal patterns. Extensive experiments on multiple datasets with rich temporal features demonstrate that our proposed method achieves excellent performance in processing complex temporal signals. 
These optimizations provide fresh solutions for optimizing the multi-timescale feature extraction capability of SNNs, showcasing their broad application potential.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"108191"},"PeriodicalIF":6.3,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145287548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
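As a rough illustration of the ideas in the DH-LIM abstract above — not the authors' implementation — the following sketch gives each dendritic branch its own time constant (the heterogeneity) and lets the soma emit a continuous sigmoidal modulation instead of a binary spike; all names, constants, and the specific update rule are assumptions:

```python
import numpy as np

def dh_lim_step(state, x, tau_dend, tau_mem=10.0, dt=1.0):
    """One step of a LIF-style neuron with per-dendrite time constants.

    Hypothetical sketch: each dendritic branch low-pass filters its input
    with its own tau, the soma integrates the branch sum with tau_mem, and
    the output is a continuous modulation rather than a 0/1 spike.
    state: dict with 'dend' (branch potentials, shape (B,)) and 'soma' (float)
    x: input current per branch, shape (B,)
    """
    alpha = np.exp(-dt / np.asarray(tau_dend))       # per-branch decay
    state['dend'] = alpha * state['dend'] + (1 - alpha) * x
    beta = np.exp(-dt / tau_mem)                     # somatic decay
    state['soma'] = beta * state['soma'] + (1 - beta) * state['dend'].sum()
    # Continuous modulation around an assumed threshold of 1.0.
    return 1.0 / (1.0 + np.exp(-(state['soma'] - 1.0)))

state = {'dend': np.zeros(3), 'soma': 0.0}
taus = np.array([2.0, 10.0, 50.0])  # fast, medium, and slow branches
outs = [dh_lim_step(state, np.ones(3), taus) for _ in range(20)]
```

With a constant input, the fast branch saturates within a few steps while the slow branch is still charging, so the mixture of time constants is what lets a single neuron respond to both rapidly and slowly varying signals; an ADP-style mechanism would adapt `tau_dend` online from the input's frequency content.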