
Neurocomputing: Latest Publications

DS-MVC: A deep multi-view clustering method based on dynamic confidence fusion and differential guidance
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-17; DOI: 10.1016/j.neucom.2026.133079
Guanghong Zhou , Yong Wang , Xingyuan Ji , Chao Kong , Guifu Lu
Multi-view clustering aims to uncover discriminative data structures by integrating complementary information from multiple feature views. However, existing approaches often encounter several limitations: they struggle to handle high-dimensional and heterogeneous features, tend to assign suboptimal weights to different views, suffer from severe feature redundancy, and fail to effectively account for variations in view quality. These issues can substantially degrade clustering performance. To address these challenges, we propose a novel deep multi-view clustering framework, named DS-MVC. The proposed approach incorporates an enhanced feature fusion strategy and a multi-scale view contrastive learning scheme. First, we propose Dynamic View-Confidence Fusion, operating at the feature level. Specifically, we estimate the prediction confidence of each sample in each view and assign adaptive, sample-specific weights accordingly. This mechanism effectively emphasizes high-quality views while suppressing the influence of noisy or low-quality views, thereby enabling more accurate and fine-grained feature integration. Second, we propose Multi-Scale View Contrastive Learning, which leverages inter-view discrepancies to guide representation learning. By constructing hierarchical contrastive objectives based on prediction discrepancies between samples, the model is able to capture underlying structural relationships and contextual dependencies across views, leading to richer and more discriminative representations. Extensive experiments on multiple benchmark datasets demonstrate that DS-MVC achieves superior performance in terms of clustering accuracy and robustness. Furthermore, ablation studies validate the effectiveness of each component and confirm the generalization capability of the proposed framework.
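The abstract does not give the exact form of Dynamic View-Confidence Fusion. As an illustration only, a minimal numpy sketch of sample-specific confidence weighting, assuming confidence is taken as the maximum softmax probability of each view's cluster logits (that choice is an assumption, not from the paper):

```python
import numpy as np

def fuse_views(view_feats, view_logits, temperature=1.0):
    """Fuse per-view features with sample-specific confidence weights.

    view_feats: list of (n_samples, d) arrays, one per view.
    view_logits: list of (n_samples, n_clusters) cluster logits per view.
    Returns an (n_samples, d) fused representation.
    """
    confs = []
    for logits in view_logits:
        # Per-sample confidence: max softmax probability in this view.
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        confs.append(p.max(axis=1))                    # (n,)
    confs = np.stack(confs, axis=1)                    # (n, n_views)
    # Normalise confidences into per-sample view weights.
    w = np.exp(confs / temperature)
    w /= w.sum(axis=1, keepdims=True)
    # Confidence-weighted sum of the view features.
    feats = np.stack(view_feats, axis=1)               # (n, n_views, d)
    return (w[:, :, None] * feats).sum(axis=1)
```

Under this sketch, a noisy view with near-uniform predictions receives a lower weight for that sample, mirroring the "emphasize high-quality views, suppress low-quality views" behavior the abstract describes.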
Neurocomputing, Volume 676, Article 133079.
Citations: 0
Label distribution learning via implicit distribution representation
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-11; DOI: 10.1016/j.neucom.2026.133029
Zhuoran Zheng , Han Hu , Xin Su , Chen Lyu
In contrast to multi-label learning, label distribution learning characterizes the polysemy of examples by a label distribution to represent richer semantics. In the learning process of label distribution, the training data is collected mainly through manual annotation or label enhancement algorithms to generate label distribution. Unfortunately, the complexity of the manual annotation task or the inaccuracy of the label enhancement algorithm leads to noise and uncertainty in the label distribution training set. To alleviate this problem, we introduce the implicit distribution in the label distribution learning framework to characterize the uncertainty of each label value. Specifically, we use deep implicit representation learning to construct a label distribution matrix with Gaussian prior constraints, where each row component corresponds to the distribution estimate of each label value, and this row component is constrained by a prior Gaussian distribution to moderate the noise and uncertainty interference in the label distribution dataset. Finally, each row component of the label distribution matrix is transformed into a standard label distribution form by using the self-attention algorithm. We evaluate our model using several representative metrics, such as Chebyshev distance (0.0779 ± 0.0021) and KL divergence (0.0404 ± 0.0020), and demonstrate that our method achieves significant improvements in performance, mitigating noise and enhancing label distribution accuracy. The code is publicly available at: https://github.com/WaterHQH/SNNGCN.
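The two reported metrics compare predicted and ground-truth label distributions. A minimal sketch of how these are commonly computed (generic definitions, not the authors' code):

```python
import numpy as np

def chebyshev(p, q):
    """Chebyshev distance between label distributions: the maximum
    absolute per-label gap, averaged over samples. Lower is better."""
    return np.abs(p - q).max(axis=1).mean()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) averaged over samples, with clipping for numerical
    stability when a label probability is zero. Lower is better."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return (p * np.log(p / q)).sum(axis=1).mean()
```

Both take `(n_samples, n_labels)` arrays whose rows sum to 1; the reported values (0.0779 and 0.0404) are averages of this form over the test set.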
Neurocomputing, Volume 676, Article 133029.
Citations: 0
Finite-time bipartite output synchronization and H∞ bipartite output synchronization for multi-weighted coupled memristive neural networks
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-03; DOI: 10.1016/j.neucom.2026.132721
Hong-An Tang , Zi-Yi Xia , Xiaofang Hu , Shukai Duan , Lidan Wang
This article focuses on the finite-time bipartite output synchronization (FTBOS) and finite-time H∞ bipartite output synchronization (FTHBOS) of multi-weighted coupled memristive neural networks (CMNNs). Firstly, as an extension of the concept of finite-time bipartite synchronization, a new definition of FTBOS for CMNNs is proposed. Compared to bipartite synchronization, FTBOS in CMNNs means that only partial states are required to achieve bipartite synchronization within a finite time interval. Secondly, considering that many researchers have studied the state synchronization of single weighted CMNNs but have not yet discussed the FTBOS of multi-weighted CMNNs, this article uses the proposed definition to solve the FTBOS and adaptive FTBOS problems for multi-weighted CMNNs. Thirdly, an output feedback control scheme is utilized to investigate the FTHBOS of multi-weighted CMNNs, and some different adaptive laws are designed according to different coupling weights to obtain the adaptive FTHBOS criterion of multi-weighted CMNNs. Eventually, the feasibility of the theoretical results is illustrated by two numerical examples.
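For readers unfamiliar with the setting, finite-time results of this kind typically rest on the standard finite-time stability lemma of Bhat and Bernstein; the exact conditions used in the paper may differ:

```latex
% If a Lyapunov function V along the synchronization-error dynamics satisfies
\dot{V}(t) \le -\alpha V(t)^{\eta}, \qquad \alpha > 0,\; 0 < \eta < 1,
% then V (and hence the error) reaches zero within the settling time
T \le \frac{V(0)^{1-\eta}}{\alpha\,(1-\eta)}.
```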
Neurocomputing, Volume 676, Article 132721.
Citations: 0
Wrinkles in time: Multi-scale patching and super-resolution for efficient time series forecasting
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-14; DOI: 10.1016/j.neucom.2026.133021
Yuwei Chen, Wenjing Jia, Qiang Wu
Time series forecasting has become an indispensable part of many fields, from weather prediction to equipment failure forecasting, where accuracy and efficiency directly impact the quality and speed of decision-making. Recently, models with self-attention mechanisms, such as Transformers, have gained widespread attention due to their strong performance in natural language processing and computer vision. However, self-attention mechanisms face certain limitations when applied to time series forecasting tasks, such as high computational complexity and inefficiency in handling long sequences. In contrast, Multilayer Perceptrons (MLPs) exhibit superior efficiency compared to self-attention mechanisms, allowing for faster processing times, especially when dealing with long sequences. However, MLPs have limitations in capturing long-range dependencies and complex patterns within time series data, which can hinder their predictive performance. To this end, we propose a hybrid Convolutional Neural Network–Multilayer Perceptron (CNN–MLP) framework designed to retain the efficiency of MLP-based forecasting while addressing its limitations. Our approach begins by decomposing the input series into its seasonal and trend components. We then apply two targeted enhancements: (i) a multi-scale patching scheme that reshapes the seasonal component into 2D patches processed by a 2D CNN to extract intra- and inter-period patterns; and (ii) a diffusion-based super-resolution module that enriches the trend component with fine-scale temporal detail. Collectively, these innovations introduce wrinkles in time, improving feature extraction at both global and local scales and yielding a model that achieves competitive accuracy while maintaining low parameter counts, low inference latency, and stable performance across diverse forecasting scenarios. Our source code is available at https://github.com/sonnybjs/Wrinkles-in-time.
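The core of the patching idea, reshaping a 1D seasonal series into 2D period-aligned patches so a 2D CNN can see intra-period patterns along rows and inter-period patterns along columns, can be sketched for a single scale as follows (an illustration; the paper's actual multi-scale scheme may differ):

```python
import numpy as np

def patch_2d(series, period):
    """Reshape a 1D series into (n_periods, period) patches.

    Each row holds one period of the signal, so convolving over rows
    captures intra-period structure and convolving over columns
    captures inter-period structure. The tail that does not fill a
    whole period is zero-padded.
    """
    n = len(series)
    n_periods = -(-n // period)                 # ceil division
    padded = np.zeros(n_periods * period, dtype=float)
    padded[:n] = series
    return padded.reshape(n_periods, period)
```

A multi-scale variant would apply this for several candidate periods (e.g. daily and weekly) and feed each resulting 2D array to the CNN branch.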
Neurocomputing, Volume 676, Article 133021.
Citations: 0
ReCoD: Enhancing image description for cross-modal understanding via retrieval and comparison feedback mechanism
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-11; DOI: 10.1016/j.neucom.2026.133025
Geunyoung Jung , Jun Park , Hankyeol Lee , Kyungwoo Song , Jiyoung Jung
To effectively utilize the large language models (LLMs) in the vision domain, it is essential to establish a strong connection between the visual and textual modalities. While deep embeddings can facilitate this connection, representing images as detailed textual descriptions offers significant advantages in terms of the usability and interpretability inherent in natural language. In this paper, we introduce a method of image description enhancement designed to generate highly detailed descriptions that include discriminative attributes of the given image, without requiring additional training. Our method, ReCoD, consists of two main components: 1) “image retrieval”, which retrieves the image most similar to the descriptions of the target image, and 2) “comparison”, which identifies and describes the differences between the target image and the retrieved image. These two components are complementary and form an iterative feedback mechanism. As this process iterates, the retrieved image becomes visually closer to the target image, and the descriptions become progressively more informative. Extensive experiments demonstrate the effectiveness of bridging the gap between the two modalities and the quality of our enhanced descriptions. The code is available at https://github.com/gyjung975/ReCoD.
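The "image retrieval" step amounts to nearest-neighbour search in a joint text-image embedding space. A minimal cosine-similarity sketch, assuming precomputed CLIP-style embeddings (the abstract does not specify the embedding model):

```python
import numpy as np

def retrieve_most_similar(text_emb, image_embs):
    """Return the index of the gallery image whose embedding has the
    highest cosine similarity to the current description embedding.

    text_emb:   (d,) embedding of the target image's description.
    image_embs: (n_images, d) embeddings of the retrieval gallery.
    """
    t = text_emb / np.linalg.norm(text_emb)
    g = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    return int(np.argmax(g @ t))
```

In the full ReCoD loop described in the abstract, this retrieval would alternate with a comparison step that describes the differences between the target and the retrieved image, enriching the description before the next retrieval round.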
Neurocomputing, Volume 676, Article 133025.
Citations: 0
A context-aware temporal knowledge graph completion method based on logical paths
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-13; DOI: 10.1016/j.neucom.2026.133049
Yuchao Zhang , Xiangjie Kong , Kailun Ye , Shangfei Zheng , Qihong Pan , Guojiang Shen , Jianxin Li
Temporal Knowledge Graph Completion (TKGC) plays a critical role in dynamic knowledge expansion by predicting entity relationships at specific time points based on existing facts. This enhances the performance of downstream applications. Existing TKGC methods mainly generate dynamic entity representations by capturing the structural characteristics of facts and the sequential properties of data. Some approaches also attempt to extract temporal logical rules from historical query data. However, these methods do not fully account for the contextual relevance of queries, which limits the accuracy of entity predictions. To address this gap, we propose a Context-Aware TKGC method based on Logical Paths (LPCA). This method integrates embedding representations and logical rules to create an interpretable TKGC completion framework, improving both prediction transparency and accuracy. Our approach mines query-specific logical paths, incorporating a time-aware sampling mechanism that prioritizes temporally recent facts to enhance temporal relevance and adaptability. Additionally, we design a self-attention mechanism to capture enriched contextual features of these temporal logical paths, modeling dependencies among the head entity, tail entity, query relation, and associated path elements to strengthen semantic and structural relevance. Experimental results show that the proposed method improves MRR scores by 3.59%, 0.99%, and 2.67% on the ICEWS14, ICEWS18, and ICEWS05-15 datasets, respectively.
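A time-aware sampling mechanism that prioritizes temporally recent facts can be sketched with exponential decay weighting. This is illustrative only; the decay form and the `decay` parameter are assumptions, not details from the paper:

```python
import numpy as np

def time_aware_sample(timestamps, query_time, k, decay=0.1, seed=0):
    """Sample k historical facts without replacement, weighting facts
    closer to query_time more heavily via exponential decay.

    timestamps: timestamps of candidate historical facts.
    query_time: timestamp of the query being completed.
    Returns the indices of the sampled facts.
    """
    ts = np.asarray(timestamps, dtype=float)
    age = query_time - ts                  # recency of each fact
    w = np.exp(-decay * age)               # recent facts get larger weight
    w /= w.sum()
    rng = np.random.default_rng(seed)
    return rng.choice(len(ts), size=k, replace=False, p=w)
```

Logical paths mined from the facts selected this way would then feed the self-attention module the abstract describes.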
Neurocomputing, Volume 676, Article 133049.
Citations: 0
Explainable AI: Context-aware layer-wise integrated gradients for explaining transformer models 可解释的AI:用于解释变压器模型的上下文感知分层集成梯度
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-12; DOI: 10.1016/j.neucom.2026.133050
Melkamu Abay Mersha, Jugal Kalita
Transformer models achieve state-of-the-art performance across domains and tasks, yet their deeply layered representations make their predictions difficult to interpret. Existing explainability methods rely on final-layer attributions, capture either local token-level attributions or global attention patterns without unification, and lack context-awareness of inter-token dependencies and structural components. They also fail to capture how relevance evolves across layers and how structural components shape decision-making. To address these limitations, we propose the Context-Aware Layer-wise Integrated Gradients (CA-LIG) Framework, a unified hierarchical attribution framework that computes layer-wise Integrated Gradients within each Transformer block and fuses these token-level attributions with class-specific attention gradients. This integration yields signed, context-sensitive attribution maps that capture supportive and opposing evidence while tracing the hierarchical flow of relevance through the Transformer layers. We evaluate the CA-LIG Framework across diverse tasks, domains, and Transformer model families, including sentiment analysis and long and multi-class document classification with BERT, hate speech detection in a low-resource language setting with XLM-R and AfroLM, and image classification with a Masked Autoencoder Vision Transformer model. Across all tasks and architectures, CA-LIG provides more faithful attributions, shows stronger sensitivity to contextual dependencies, and produces clearer, more semantically coherent visualizations than established explainability methods. These results indicate that CA-LIG provides a more comprehensive, context-aware, and reliable explanation of Transformer decision-making, advancing both the practical interpretability and conceptual understanding of deep neural models.
The implementation code will be made publicly available at https://github.com/melkamumersha/Context-Aware-XAI upon acceptance of the paper.
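CA-LIG builds on layer-wise Integrated Gradients; the underlying IG estimator is standard and can be sketched as a Riemann-sum approximation along the straight-line path from a baseline to the input (a generic sketch, not the authors' implementation):

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=100):
    """Approximate Integrated Gradients via a right Riemann sum:

        IG_i ≈ (x_i - b_i) * (1/m) * sum_{k=1..m} grad_i(b + (k/m)(x - b))

    grad_fn: maps an input point to the gradient of the target output.
    x, baseline: input and reference point, same shape.
    """
    delta = x - baseline
    total = np.zeros_like(x)
    for k in range(1, steps + 1):
        # Gradient at an interpolated point on the baseline-to-input path.
        total += grad_fn(baseline + (k / steps) * delta)
    return delta * total / steps
```

A useful sanity check is the completeness property: the attributions should sum (approximately) to the difference in model output between the input and the baseline. CA-LIG computes such attributions per Transformer block and then fuses them with class-specific attention gradients.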
Explainable AI: Context-aware layer-wise integrated gradients for explaining transformer models. Melkamu Abay Mersha, Jugal Kalita. Neurocomputing, vol. 676, Article 133050. DOI: 10.1016/j.neucom.2026.133050
Citations: 0
Robust learning with time series noisy labels via self-supervised learning and soft labels refurbishment
IF 6.5 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-05-01 Epub Date: 2026-02-16 DOI: 10.1016/j.neucom.2026.133059
Jiarong Liu , Kaixiang Yang , Jian He , Chengrong Yang , Shuang-hua Yang , Yujue Zhou
Label noise significantly degrades the generalization performance of Deep Neural Networks (DNNs). While Learning with Noisy Labels (LNL) is well-established in computer vision, its application to time-series data presents unique challenges: (1) Feature extraction in supervised learning is corrupted by noisy labels, as the critical assumption that clean samples yield lower loss than noisy ones is frequently violated. This leads to learned representations with poor class separability and ambiguous boundaries, which degrades downstream task performance. (2) Conventional soft-labeling methods generate soft-labels from a single training epoch, ignoring historical and cross-model information. This often leads to unstable supervision and inconsistent soft-labels. To address these challenges, this paper proposes the Robust Representation Learning Network (RoRLNet) for noisy time-series classification. RoRLNet employs a two-stage robust learning paradigm that decouples feature extraction from classifier training. In the first stage, it learns noise-robust spatio-temporal representations by integrating MixDecomposition, a data augmentation strategy based on trend-seasonality decomposition, with MSSFE, a multi-scale self-supervised feature extractor. In the second stage, it trains the classifier using EnBootstrap, a soft-label correction module that stabilizes supervision by ensembling predictions from multiple models and historical epochs. Extensive experiments on multiple benchmarks under diverse noise conditions demonstrate that RoRLNet consistently outperforms state-of-the-art methods by 7.76%. The source code is available at: https://github.com/JingGu-hub/RoRLNet.
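The EnBootstrap idea described above — stabilizing supervision by ensembling predictions across models and historical epochs, then blending with the observed hard label — can be sketched as follows. The blending weight `beta` and the 4-D prediction-history layout are assumptions of this sketch, not details from the paper.

```python
import numpy as np

def enbootstrap_soft_labels(pred_history, observed_onehot, beta=0.7):
    """Soft-label refurbishment in the spirit of EnBootstrap (sketch):
    average predictions over an ensemble of models and historical epochs,
    then blend with the (possibly noisy) observed hard label.
    pred_history: shape (n_models, n_epochs, n_samples, n_classes).
    beta: trust placed in the observed label -- an assumed hyperparameter."""
    ensemble_mean = pred_history.mean(axis=(0, 1))      # (n_samples, n_classes)
    soft = beta * observed_onehot + (1.0 - beta) * ensemble_mean
    return soft / soft.sum(axis=1, keepdims=True)       # renormalize to distributions

rng = np.random.default_rng(0)
# 2 models x 3 epochs x 4 samples x 3 classes of softmax outputs
logits = rng.normal(size=(2, 3, 4, 3))
preds = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
labels = np.eye(3)[[0, 1, 2, 0]]                        # observed (noisy) labels

soft = enbootstrap_soft_labels(preds, labels)
assert np.allclose(soft.sum(axis=1), 1.0)               # valid distributions
```

Averaging over both axes damps single-epoch fluctuations, which is what makes the resulting soft-labels more consistent than epoch-local corrections.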
Neurocomputing, vol. 676, Article 133059. DOI: 10.1016/j.neucom.2026.133059
Citations: 0
Reinforcement learning control for once-through boiler-turbine units based on mechanistic directed multi-subgraph integration
IF 6.5 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-05-01 Epub Date: 2026-02-13 DOI: 10.1016/j.neucom.2026.133010
Zilong Liu, Chiqiang Liu, Dazi Li
The once-through boiler-turbine (OTBT) unit, as the core component of large-scale coal-fired power units, exhibits strong interdependencies among its state variables, which pose substantial challenges to conventional control strategies such as PID. Existing reinforcement learning (RL) approaches in complex process control often suffer from insufficient utilization of mechanistic knowledge and the presence of steady-state error. To address these issues, this paper proposes a new RL control framework named multi-subgraph integration reinforcement learning (MSIRL) for the OTBT unit. Based on the mechanistic model, a directed graph is constructed to characterize the dependencies of state variables, through which a graph attention network (GAT) extracts both local and global coupling features that are subsequently integrated into the Actor-Critic framework for end-to-end training. An integral compensation module is integrated at the model output to actively mitigate steady-state error induced by model uncertainties or external disturbances. Under setpoint tracking and disturbance rejection scenarios, the proposed method demonstrates significant advantages in both dynamic response speed and steady-state accuracy compared with conventional PID and standard RL approaches. Finally, ablation experiments further validate the critical roles of mechanistic directed graph embedding and integral compensation in enhancing the control performance of the system.
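The role of the integral compensation module at the controller output can be illustrated on a toy first-order plant with a constant disturbance. Everything below is an assumed sketch: the plant parameters, gains, and the proportional term standing in for the RL actor are illustrative, not the MSIRL design.

```python
def simulate(kp=2.0, ki=0.5, a=0.9, b=0.1, d=0.05, r=1.0, steps=400):
    """Toy first-order plant x' = a*x + b*u + d with constant disturbance d.
    The 'policy' is a proportional term standing in for the RL actor, plus
    an integral compensation term added at the controller output."""
    x, integ = 0.0, 0.0
    for _ in range(steps):
        e = r - x
        integ += e                      # discrete error integral
        u = kp * e + ki * integ         # actor output + integral compensation
        x = a * x + b * u + d
    return x

x_final = simulate()
assert abs(1.0 - x_final) < 1e-3        # steady-state error driven to zero

x_no_integral = simulate(ki=0.0)
assert abs(1.0 - x_no_integral) > 1e-2  # proportional-only leaves an offset
```

Without the integral term, the constant disturbance `d` forces a nonzero steady-state offset; the accumulated-error term keeps adjusting the output until the error vanishes, which is exactly the behavior the abstract attributes to the compensation module.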
Neurocomputing, vol. 676, Article 133010. DOI: 10.1016/j.neucom.2026.133010
Citations: 0
3MU-Net: A multi-layer, multi-view and multi-modal segmentation model for PET/CT images of lung tumors
IF 6.5 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-05-01 Epub Date: 2026-02-11 DOI: 10.1016/j.neucom.2026.133027
Yuxia Niu , Tao Zhou , Yang Liu , Qitao Liu
Multimodal medical images provide reliable and accurate information on lesion categories, which is essential for diagnosing lung cancer. In medical images, adequate extraction of cross-modal lesion features and contextual semantic information about lesions is a key issue. This paper proposes 3MU-Net, a Multi-layer, Multi-view and Multi-modal lung tumor segmentation model. The main contributions of 3MU-Net are as follows. First, CMformer uses parallel CNN and Transformer branches to learn cross-modal medical image features, building cross-modal pixel dependencies between global and local features. Second, a Cross-view Context-aware Processor is designed, comprising four Deep-Shallow Feature Enhancement Modules and a Cross-view Attention Module. The Deep-Shallow Feature Enhancement Module uses a bidirectional learning method to guide information interaction between shallow-level and deep-level features, reducing the semantic information gap between adjacent layers; it also aggregates coarse-grained semantic features and fine-grained detail features. The Cross-view Attention Module obtains multi-scale lesion information through multi-view feature extraction, extending the diversity of spatial features in the lesion area. Finally, the effectiveness of 3MU-Net is validated on a clinical multimodal lung medical image dataset. The results for mIoU, Dice, VOE, RVD, Acc, and Recall are 90.97%, 95.23%, 93.87%, 94.13%, 97.97%, and 94.45%, respectively, which is of great significance for lung tumor segmentation. Code is available at: https://github.com/xiaoniu1030/3MU-Net.
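The cross-view attention step — tokens from one view attending over tokens from another to gather multi-scale lesion context — can be sketched with plain scaled dot-product attention. This is an illustrative skeleton only: the view names, feature shapes, and the omission of learned projections are assumptions of the sketch, not the 3MU-Net architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_attention(query_view, context_view):
    """Tokens of one view attend over tokens of another view
    (scaled dot-product attention; learned projections omitted)."""
    d = query_view.shape[-1]
    scores = query_view @ context_view.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)       # each query row sums to 1
    return weights @ context_view

rng = np.random.default_rng(1)
view_a = rng.normal(size=(16, 32))   # hypothetical features from one image view
view_b = rng.normal(size=(24, 32))   # hypothetical features from a second view

fused = cross_view_attention(view_a, view_b)
assert fused.shape == (16, 32)       # one context vector per query token
```

Each output row is a convex combination of the second view's features, which is how attention lets one view borrow spatial detail that is only visible in the other.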
Neurocomputing, vol. 676, Article 133027. DOI: 10.1016/j.neucom.2026.133027
Citations: 0