Neural Networks: Latest Articles

Online ensemble model compression for nonstationary data stream learning.
IF 6 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-22 | DOI: 10.1016/j.neunet.2025.107151
Rodrigo G F Soares, Leandro L Minku

Learning from data streams generated in nonstationary environments has many real-world applications and poses various challenges. A key characteristic of this task is that the underlying data distributions vary over time (concept drift). The most common data stream learning approaches are ensembles, which train multiple base learners. This can severely increase computational cost, especially when the learners have to recover from concept drift, rendering ensembles inadequate for applications with tight time and space constraints. In this work, we propose Online Weight Averaging (OWA), a robust and fast online model compression method for nonstationary data streams based on stochastic weight averaging. It is the first online model compression method for nonstationary data streams, and it can compress an evolving ensemble of neural networks into a single model continuously over time. It combines several snapshots of a neural network by averaging its weights at specific time steps to find promising regions in the loss landscape, and it can forget weights from outdated time steps when a concept drift occurs. In this way, at any point in time, a single neural network is maintained to represent a whole ensemble, leveraging the power of ensembles while remaining suitable for applications with tight speed requirements. Our experiments show that this key advantage also translates into (1) significant savings in computational cost compared with state-of-the-art data stream ensemble methods while (2) delivering similar predictive performance.
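
The idea of averaging weight snapshots and forgetting them after a drift can be illustrated with a small sketch. This is not the authors' OWA implementation: the class name `RunningWeightAverage`, the flat list-of-floats weight representation, and the explicit `reset()` call standing in for a drift detector are all assumptions made for illustration.

```python
from typing import List, Optional


class RunningWeightAverage:
    """Keeps one averaged weight vector instead of storing every snapshot."""

    def __init__(self) -> None:
        self.avg: Optional[List[float]] = None
        self.count = 0  # number of snapshots folded into the average

    def add_snapshot(self, weights: List[float]) -> None:
        # Incremental mean: avg <- avg + (w - avg) / count
        if self.avg is None:
            self.avg = list(weights)
            self.count = 1
            return
        self.count += 1
        self.avg = [a + (w - a) / self.count for a, w in zip(self.avg, weights)]

    def reset(self) -> None:
        # Hypothetical hook: call this when a drift detector fires,
        # so that snapshots from the outdated concept are forgotten.
        self.avg = None
        self.count = 0


averager = RunningWeightAverage()
for snapshot in ([1.0, 2.0], [3.0, 4.0], [5.0, 6.0]):
    averager.add_snapshot(snapshot)
before_drift = list(averager.avg)  # mean of the three snapshots

averager.reset()                   # drift assumed detected here
averager.add_snapshot([10.0, 20.0])
after_drift = list(averager.avg)   # only the post-drift snapshot remains
```

Because the mean is updated incrementally, memory stays at one weight vector regardless of how many snapshots are averaged, which is the property that makes a single model a plausible stand-in for an ensemble.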

Citations: 0
When bipartite graph learning meets anomaly detection in attributed networks: Understand abnormalities from each attribute.
IF 6 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-22 | DOI: 10.1016/j.neunet.2025.107194
Zhen Peng, Yunfan Wang, Qika Lin, Bo Dong, Chao Shen

Detecting anomalies in attributed networks has become a subject of interest in both academia and industry due to its wide spectrum of applications. Although most existing methods achieve desirable performance by virtue of various graph neural networks, the way they bundle node-affiliated multidimensional attributes into a whole for embedding calculation hinders their ability to model and analyze anomalies at the fine-grained feature level. To characterize anomalies from each feature dimension, we propose Eagle, a deep framework based on bipartitE grAph learninG for anomaLy dEtection. Specifically, we disentangle instances and attributes into two disjoint and independent node sets, then formulate the input attributed network as an intra-connected bipartite graph that involves two different relations: edges across the two types of nodes, described by attribute values, and links between nodes of the same type, recorded in the network topology. By learning a self-supervised edge-level prediction task, named affinity inference, Eagle can explain abnormal deviations from each attribute in a physically meaningful way. Experiments corroborate the effectiveness of Eagle under transductive and inductive task settings. Moreover, case studies illustrate that Eagle is more user-friendly, as it opens the door for humans to understand abnormalities from the perspective of different feature combinations.
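
The graph construction described above can be sketched as a plain data structure. The function name `build_bipartite_graph`, the node tagging scheme, and the choice to skip zero-valued attributes are assumptions made for this illustration; the abstract does not specify how Eagle stores the graph.

```python
from typing import Dict, List, Tuple

Node = Tuple[str, int]  # ("inst", i) for an instance, ("attr", j) for an attribute


def build_bipartite_graph(
    attributes: List[List[float]],
    topology: List[Tuple[int, int]],
) -> Tuple[Dict[Tuple[Node, Node], float], List[Tuple[Node, Node]]]:
    # Cross-type edges: instance i -- attribute j, weighted by the attribute value.
    # Assumption: zero-valued attributes produce no edge.
    cross_edges: Dict[Tuple[Node, Node], float] = {}
    for i, row in enumerate(attributes):
        for j, value in enumerate(row):
            if value != 0:
                cross_edges[(("inst", i), ("attr", j))] = value
    # Same-type links: instance -- instance, copied from the network topology.
    intra_edges = [(("inst", u), ("inst", v)) for u, v in topology]
    return cross_edges, intra_edges


attributes = [[1.0, 0.0, 3.5], [0.0, 2.0, 0.0], [4.0, 0.0, 1.0]]
topology = [(0, 1), (1, 2)]
cross_edges, intra_edges = build_bipartite_graph(attributes, topology)
```

With this layout, an edge-level prediction task operates on the keys of `cross_edges`, so a deviation can be attributed to one specific instance-attribute pair.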

Citations: 0
Two algorithms for improving model-based diagnosis using multiple observations and deep learning.
IF 6 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-22 | DOI: 10.1016/j.neunet.2025.107185
Ran Tai, Dantong Ouyang, Liming Zhang

Model-based diagnosis (MBD) is a critical problem in artificial intelligence. Recent advancements have made it possible to address this challenge using methods like deep learning. However, current approaches that use deep learning for MBD often struggle with accuracy and computation time due to the limited diagnostic information provided by a single observation. To address this challenge, we introduce two novel algorithms, Discret2DiMO (Discret2Di with Multiple Observations) and Discret2DiMO-DC (Discret2Di with Multiple Observations and Dictionary Cache), which enhance MBD by integrating multiple observations with deep learning techniques. Experimental evaluations on a simulated three-tank model demonstrate that Discret2DiMO significantly improves diagnostic accuracy, achieving up to a 685.06% increase and an average improvement of 59.18% over Discret2Di across all test cases. To address computational overhead, Discret2DiMO-DC additionally implements a caching mechanism that eliminates redundant computations during diagnosis. Remarkably, Discret2DiMO-DC achieves comparable accuracy while reducing computation time by an average of 95.74% compared to Discret2DiMO and 89.42% compared to Discret2Di, with computation times reduced by two orders of magnitude. These results indicate that our proposed algorithms significantly enhance diagnostic accuracy and efficiency in MBD compared with the state-of-the-art algorithm, highlighting the potential of integrating multiple observations with deep learning for more accurate and efficient diagnostics in complex systems.
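
A dictionary cache of the kind mentioned above can be sketched as plain memoization keyed by the observation. This is only an illustration of the caching idea, not the Discret2DiMO-DC algorithm: `make_cached_diagnoser` and `toy_diagnose` are hypothetical names, and the toy rule stands in for an expensive diagnosis step.

```python
from typing import Callable, Dict, Tuple

Observation = Tuple[int, ...]  # assumed: a discretized, hashable observation


def make_cached_diagnoser(diagnose: Callable[[Observation], str]):
    cache: Dict[Observation, str] = {}
    stats = {"calls": 0}  # counts how often the expensive function really runs

    def cached(obs: Observation) -> str:
        if obs not in cache:
            stats["calls"] += 1
            cache[obs] = diagnose(obs)
        return cache[obs]

    return cached, stats


def toy_diagnose(obs: Observation) -> str:
    # Hypothetical stand-in for an expensive model-based diagnosis call.
    return "fault" if sum(obs) > 2 else "ok"


cached_diagnose, stats = make_cached_diagnoser(toy_diagnose)
observations = [(0, 1, 0), (1, 1, 1), (0, 1, 0), (1, 1, 1), (0, 1, 0)]
results = [cached_diagnose(o) for o in observations]
```

Five observations trigger only two real diagnosis calls here, because repeated observations are answered from the dictionary.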

Citations: 0
SAC-BL: A hypothesis testing framework for unsupervised visual anomaly detection and location.
IF 6 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-22 | DOI: 10.1016/j.neunet.2025.107147
Xinsong Ma, Jie Wu, Weiwei Liu

Reconstruction-based methods achieve promising performance for visual anomaly detection (AD), relying on the underlying assumption that anomalies cannot be accurately reconstructed. However, this assumption does not always hold, especially for weak anomalous (a.k.a. normal-like) examples. More significantly, existing methods are primarily devoted to obtaining strongly discriminative score functions while neglecting a systematic investigation of the decision rule based on the proposed score function. Unlike previous work, this paper approaches the AD problem starting from the decision rule within a statistical framework, providing a new insight for the AD community. Specifically, we frame the AD task as a multiple hypothesis testing problem. Then, we propose a novel betting-like (BL) procedure with an embedded strong anomaly constraint network (SACNet), called SAC-BL, to address this testing problem. In SAC-BL, the BL procedure serves as the decision rule and SACNet is trained to capture the critical discriminative information from weak anomalies. Theoretically, SAC-BL can control the false discovery rate (FDR) at the prescribed level. Finally, we conduct extensive experiments to verify the superiority of SAC-BL over previous methods.
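
To make "a decision rule that controls the FDR" concrete, the sketch below uses the classical Benjamini-Hochberg step-up procedure. It is a swapped-in, well-known technique and not the betting-like (BL) procedure of the paper, whose details the abstract does not give. The p-values are made-up inputs, and the FDR guarantee of Benjamini-Hochberg assumes independent (or positively dependent) tests.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float) -> List[bool]:
    """Return one flag per hypothesis: True means 'declare anomalous'."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. one per image or per pixel region.
p_values = [0.001, 0.8, 0.012, 0.04, 0.5]
flags = benjamini_hochberg(p_values, alpha=0.05)
```

Note that the item with p-value 0.04 is not flagged even though 0.04 < 0.05: the threshold adapts to the number of tests, which is what distinguishes a multiple-testing decision rule from a fixed per-item cutoff.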

Citations: 0
ShadowGAN-Former: Reweighting self-attention based on mask for shadow removal.
IF 6 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-22 | DOI: 10.1016/j.neunet.2025.107175
Jianyi Hu, Shuhuan Wen, Jiaqi Li, Hamid Reza Karimi

Shadow removal remains a challenging visual task aimed at restoring the original brightness of shadow regions in images. Many existing methods overlook the implicit clues within non-shadow regions, leading to inconsistencies in the color, texture, and illumination of the reconstructed shadow-free images. To address these issues, we propose an efficient hybrid model of Transformer and Generative Adversarial Network (GAN), named ShadowGAN-Former, which utilizes information from non-shadow regions to assist in shadow removal. We introduce the Multi-Head Transposed Attention (MHTA) and Gated Feed-Forward Network (Gated FFN), designed to enhance focus on key features while reducing computational costs. Furthermore, we propose the Shadow Attention Reweight Module (SARM) to reweight the self-attention maps based on the correlation between shadow and non-shadow regions, thereby emphasizing the contextual relevance between them. Experimental results on the ISTD and SRD datasets show that our method outperforms popular and state-of-the-art shadow removal algorithms, with the SARM module improving PSNR by 5.42% and reducing RMSE by 14.76%.

Citations: 0
A multi-agent reinforcement learning framework for cross-domain sequential recommendation.
IF 6 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-22 | DOI: 10.1016/j.neunet.2025.107192
Huiting Liu, Junyi Wei, Kaiwen Zhu, Peipei Li, Peng Zhao, Xindong Wu

Sequential recommendation models aim to predict the next item based on the sequence of items users interact with, ordered chronologically. However, these models face the challenge of data sparsity. Recent studies have explored cross-domain sequential recommendation, where users' interaction data across multiple source domains are leveraged to enhance recommendations in data-sparse target domains. Despite this, users' interests in the target and source domains may not align perfectly. Additionally, current research often neglects the collaboration between different transfer strategies across source domains, leading to suboptimal performance. To address these challenges, we propose a multi-agent reinforcement learning framework for cross-domain sequential recommendation (MARL4CDSR). Unlike traditional approaches that transfer knowledge from the entire source domain sequence, MARL4CDSR uses agents to select relevant items from source domain sequences for transfer. This approach optimizes the transfer process by coordinating agents' strategies within each source domain through a multi-agent reinforcement learning framework. Additionally, we introduce an information fusion module with a cross-attention mechanism to align the embedding representations of selected source domain items with target domain items. A reward function based on score differences for the next item optimizes the multi-agent system. We evaluate the method on three Amazon domains: Movies_and_TV, Toys_and_Games, and Books. Our proposed model MARL4CDSR outperforms all baselines on all metrics. Specifically, for the Movies&Books→Toys task, where the target domain interaction sequence is relatively sparse, MARL4CDSR improves NDCG@10 and HR@10 by 14.76% and 10.25%, respectively.
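
The two reported metrics, HR@10 and NDCG@10, can be computed as in the sketch below for next-item prediction with a single ground-truth item per user. The function names and the toy rankings are assumptions for illustration; the paper's exact evaluation protocol (for example, negative sampling) is not stated in the abstract.

```python
import math
from typing import Sequence


def hit_rate_at_k(ranked: Sequence[Sequence[int]], targets: Sequence[int], k: int) -> float:
    # Fraction of users whose true next item appears in the top-k list.
    hits = sum(1 for items, t in zip(ranked, targets) if t in items[:k])
    return hits / len(targets)


def ndcg_at_k(ranked: Sequence[Sequence[int]], targets: Sequence[int], k: int) -> float:
    # With one relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 2)
    # for a 0-based rank, and 0 if the item is missing from the top-k list.
    total = 0.0
    for items, t in zip(ranked, targets):
        top = list(items[:k])
        if t in top:
            total += 1.0 / math.log2(top.index(t) + 2)
    return total / len(targets)


# Toy example: three users, top-3 recommendations each.
ranked = [[5, 3, 9], [2, 7, 1], [4, 8, 6]]
targets = [5, 1, 0]  # the third user's true item is not recommended
hr = hit_rate_at_k(ranked, targets, k=3)
ndcg = ndcg_at_k(ranked, targets, k=3)
```

HR only counts whether the item is present, while NDCG also rewards placing it near the top, which is why the two metrics can improve by different amounts.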

Citations: 0
EHM: Exploring dynamic alignment and hierarchical clustering in unsupervised domain adaptation via high-order moment-guided contrastive learning.
IF 6 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-22 | DOI: 10.1016/j.neunet.2025.107188
Tengyue Xu, Jun Dan

Unsupervised domain adaptation (UDA) aims to annotate unlabeled target domain samples using transferable knowledge learned from the labeled source domain. Optimal transport (OT) is a widely adopted probability metric in transfer learning for quantifying domain discrepancy. However, many existing OT-based UDA methods employ an entropic regularization term to solve the OT optimization problem, inevitably resulting in a biased estimation of domain discrepancy. Furthermore, to achieve precise alignment of class distributions, numerous UDA methods employ deep features to guide contrastive learning, overlooking the loss of discriminative information. Additionally, prior studies frequently use a conditional entropy regularization term to cluster unlabeled target samples, which may guide the model toward optimizing in the wrong direction. To address these issues, this paper proposes a new UDA framework called EHM, which employs a Dynamic Domain Alignment (DDA) strategy, a Reliable High-order Contrastive Alignment (RHCA) strategy, and a Trustworthy Hierarchical Clustering (THC) strategy. Specifically, DDA leverages a dynamically adjusted Sinkhorn divergence to measure domain discrepancy, effectively eliminating the biased estimation issue. RHCA conducts contrastive learning in a high-order moment space, significantly enhancing the representation power of transferable features and reducing the domain discrepancy at the class level. Moreover, THC integrates multi-view information to guide unlabeled samples toward robust clustering. Extensive experiments on various benchmarks demonstrate the effectiveness of EHM.
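
For reference, the standard (non-dynamic) Sinkhorn divergence is usually written as below. This is the textbook definition, given here as background; how EHM adjusts it dynamically is not described in the abstract. The symbols $\alpha, \beta$ (source and target distributions), $c$ (ground cost), and $\varepsilon$ (regularization strength) are notation chosen for this note.

```latex
\begin{aligned}
\mathrm{OT}_\varepsilon(\alpha,\beta)
  &= \min_{\pi \in \Pi(\alpha,\beta)} \int c(x,y)\,\mathrm{d}\pi(x,y)
     + \varepsilon\,\mathrm{KL}\!\left(\pi \,\middle\|\, \alpha \otimes \beta\right), \\
S_\varepsilon(\alpha,\beta)
  &= \mathrm{OT}_\varepsilon(\alpha,\beta)
     - \tfrac{1}{2}\,\mathrm{OT}_\varepsilon(\alpha,\alpha)
     - \tfrac{1}{2}\,\mathrm{OT}_\varepsilon(\beta,\beta).
\end{aligned}
```

The bias mentioned above shows up as $\mathrm{OT}_\varepsilon(\alpha,\alpha) \neq 0$ in general for $\varepsilon > 0$; subtracting the two self-transport terms gives $S_\varepsilon(\alpha,\alpha) = 0$.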

Citations: 0
Fast Co-clustering via Anchor-guided Label Spreading.
IF 6 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-21 | DOI: 10.1016/j.neunet.2025.107187
Fangyuan Xie, Feiping Nie, Weizhong Yu, Xuelong Li

Clustering with anchor graphs has attracted growing attention due to its effectiveness and efficiency. As the most representative points in the original data, anchors are also regarded as connecting the sample space to the label space. However, when the original data contain noise, anchor-guided label spreading may fail. To alleviate this, we propose a Fast Co-clustering method via Anchor-guided Label Spreading (FCALS), in which the labels of samples and anchors are obtained simultaneously. Our method not only maximizes the intra-cluster similarity among anchors but also ensures that the relationship between anchors and the original data is preserved. In addition, to avoid trivial solutions, a size constraint is introduced in our model, requiring that the number of samples within each cluster not fall below a certain value. Furthermore, the model is insensitive to this lower limit over a relatively broad range of values. Considering that the label matrix of the original data can be fuzzy or discrete, continuous and discrete models are proposed, named FCALS-C and FCALS-D, respectively. Since the labels of anchors are obtained directly, the proposed methods are naturally applicable to out-of-sample problems. The superiority of the proposed methods is demonstrated through experimental results on both synthetic and real-world datasets.
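
The role of anchors as a bridge between sample space and label space can be sketched with a generic anchor graph. This is not the FCALS optimization: the Gaussian similarity, the row normalization, and the names `anchor_graph` and `spread_labels` are assumptions, and the anchor labels are simply given rather than learned jointly.

```python
import math
from typing import List


def anchor_graph(samples: List[List[float]], anchors: List[List[float]],
                 sigma: float = 1.0) -> List[List[float]]:
    # Z[i][j]: normalized similarity between sample i and anchor j (rows sum to 1).
    Z = []
    for x in samples:
        sims = [
            math.exp(-sum((xi - ai) ** 2 for xi, ai in zip(x, a)) / (2 * sigma ** 2))
            for a in anchors
        ]
        total = sum(sims)
        Z.append([s / total for s in sims])
    return Z


def spread_labels(Z: List[List[float]], anchor_labels: List[int],
                  n_clusters: int) -> List[int]:
    # Each sample accumulates cluster scores from its anchors and takes the best one.
    labels = []
    for row in Z:
        scores = [0.0] * n_clusters
        for weight, c in zip(row, anchor_labels):
            scores[c] += weight
        labels.append(max(range(n_clusters), key=lambda c: scores[c]))
    return labels


samples = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [4.9, 5.0]]
anchors = [[0.0, 0.0], [5.0, 5.0]]
anchor_labels = [0, 1]  # assumed known here for illustration
Z = anchor_graph(samples, anchors)
sample_labels = spread_labels(Z, anchor_labels, n_clusters=2)
```

A new, unseen sample only needs its similarities to the anchors to receive a label, which is why anchor-based methods extend naturally to out-of-sample data.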

Citations: 0
On spectral bias reduction of multi-scale neural networks for regression problems.
IF 6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-01-21 DOI: 10.1016/j.neunet.2025.107179
Bo Wang, Heng Yuan, Lizuo Liu, Wenzhong Zhang, Wei Cai

In this paper, we derive diffusion equation models in the spectral domain to study the evolution of the training error of two-layer multiscale deep neural networks (MscaleDNN) (Cai and Xu, 2019; Liu et al., 2020), which are designed to reduce the spectral bias of fully connected deep neural networks in approximating oscillatory functions. The diffusion models are obtained from the spectral form of the error equation of the MscaleDNN, derived with a neural tangent kernel approach, gradient descent training, and a sine activation function, assuming a vanishing learning rate and infinite network width and domain size. The diffusion coefficients involved are shown to have larger supports when more scales are used in the MscaleDNN; thus, the proposed diffusion equation models in the frequency domain explain the MscaleDNN's ability to reduce spectral bias. The diffusion model in the Fourier-spectral domain allows us to understand clearly how the training error decays at different Fourier frequencies. The numerical results of the diffusion models for two-layer MscaleDNN training match the error evolution of actual gradient descent training with a reasonably large network width, thus validating the effectiveness of the diffusion models. Meanwhile, the numerical results for MscaleDNN show error decay over a wide frequency range and confirm the advantage of using MscaleDNN to approximate functions with a wide range of frequencies.
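To make the multi-scale idea concrete, here is a minimal sketch of a two-layer network with a sine activation in which each sub-network sees the input multiplied by its own scale factor. The exact architecture analysed in the paper is not stated in the abstract, so the function `mscale_forward`, the array shapes, and the way scales enter the pre-activation are assumptions made for illustration.

```python
import numpy as np

def mscale_forward(x, scales, W, b, c):
    """Forward pass of a two-layer multi-scale network with sine activation.

    Illustrative form (an assumption, not the paper's exact model):
        f(x) = sum_i sum_j c[i, j] * sin(W[i, j] * scales[i] * x + b[i, j])

    x:       (n,)   one-dimensional inputs
    scales:  (s,)   one scale factor per sub-network
    W, b, c: (s, m) hidden weights, biases, and output weights
    """
    x = np.asarray(x, dtype=float)
    # Scaled input for every (sample, sub-network) pair, broadcast over neurons
    scaled = scales[None, :, None] * x[:, None, None]   # (n, s, 1)
    z = W[None] * scaled + b[None]                      # (n, s, m)
    return (c[None] * np.sin(z)).sum(axis=(1, 2))       # (n,)
```

With unit weights and zero biases, a network with scales 1 and 2 reduces to sin(x) + sin(2x), which shows how a larger scale lets the same hidden unit represent a higher frequency.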

Neural Networks, vol. 185, article 107179.
Citations: 0
Tensor neural networks for high-dimensional Fokker-Planck equations.
IF 6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-01-21 DOI: 10.1016/j.neunet.2025.107165
Taorui Wang, Zheyuan Hu, Kenji Kawaguchi, Zhongqiang Zhang, George Em Karniadakis

We solve high-dimensional steady-state Fokker-Planck equations on the whole space by applying tensor neural networks. The tensor networks are either a linear combination of tensor products of one-dimensional feedforward networks or a linear combination of several selected radial basis functions. Tensor feedforward networks allow us to efficiently exploit auto-differentiation (in physical variables) in major Python packages, while radial basis functions avoid auto-differentiation, which is rather expensive in high dimensions, altogether. We then use physics-informed neural networks and stochastic gradient descent methods to learn the tensor networks. One essential step is to determine a proper bounded domain or numerical support for the Fokker-Planck equation. To better train the tensor radial basis function networks, we impose some constraints on parameters, which lead to relatively high accuracy. We demonstrate numerically that tensor neural networks in physics-informed machine learning are efficient for steady-state Fokker-Planck equations from two to ten dimensions.
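The phrase "linear combination of tensor products of one-dimensional feedforward networks" can be read as u(x) = sum_r c_r * prod_d phi_{r,d}(x_d). The sketch below evaluates such a function with NumPy. The tanh hidden layer, the parameter names, and the helper `init_params` are assumptions for illustration; the abstract does not specify the sub-network architecture or the training loss, and no training is shown here.

```python
import numpy as np

def init_params(dim, rank, hidden, seed=0):
    """Random parameters for rank * dim one-dimensional sub-networks (illustrative)."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(size=(rank, dim, hidden)),                    # input weights
        "b1": rng.normal(size=(rank, dim, hidden)),                    # hidden biases
        "w2": rng.normal(size=(rank, dim, hidden)) / np.sqrt(hidden),  # output weights
        "c": rng.normal(size=rank),                                    # combination weights
    }

def tensor_net(x, p):
    """Evaluate u(x) = sum_r c_r * prod_d phi_{r,d}(x_d) for x of shape (n, dim)."""
    x = np.asarray(x, dtype=float)
    # Hidden activations of every 1-D sub-network: (n, rank, dim, hidden)
    h = np.tanh(x[:, None, :, None] * p["w1"][None] + p["b1"][None])
    phi = (h * p["w2"][None]).sum(axis=-1)   # sub-network outputs: (n, rank, dim)
    return phi.prod(axis=-1) @ p["c"]        # product over dimensions, sum over rank
```

Each sub-network takes a single coordinate as input, so the number of parameters grows linearly with the dimension; with rank 1 the output is fully separable across coordinates.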

Neural Networks, vol. 185, article 107165.
Citations: 0