Neurocomputing: Latest Articles

Improving generalization performance of adaptive gradient method via bounded step sizes
IF 5.5 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-20 | DOI: 10.1016/j.neucom.2024.128966
Yangchuan Wang, Lianhong Ding, Peng Shi, Juntao Li, Ruiping Yuan
While adaptive gradient methods such as Adam are widely used to train deep neural networks, a recent study provided a synthetic function on which Adam fails to converge. This issue stems from extreme gradients and a mismatch between the first and second moments. Several adaptive optimizers have since been developed; however, designing a fast optimizer with excellent generalization capability remains challenging. We propose an adaptive method with bounded step sizes, named AdaBS, which removes extreme step sizes and adjusts the adaptive step sizes to mitigate the over-adaptation of step sizes in Adam. In particular, AdaBS clips step sizes that are too large or too small by using two static bounds with a predetermined boundary to control updates. When determining the step size, static bound clipping is used if the preconditioner is outside the modest boundary, and vanilla Adam is used if the preconditioner is inside the boundary. AdaBS establishes a trust region around the basic step size and obtains the benefits of both Adam and SGD, i.e., fast convergence and better generalization. Finally, we conduct extensive experiments on a variety of practical tasks with benchmark datasets, including image classification and language modeling. Empirical results demonstrate AdaBS’s promising performance, with remarkably fast convergence, superior generalization, and robustness.
Neurocomputing, Volume 617, Article 128966
Citations: 0
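The clipping idea described in the AdaBS abstract above can be illustrated with a minimal sketch. The abstract does not give the exact update rule or the bound values, so everything here is an assumption: the function name `bounded_adam_step`, the bounds `lower` and `upper`, and the choice to clip the per-coordinate Adam step size `lr / (sqrt(v_hat) + eps)` are hypothetical and only show the general "Adam inside the bounds, static bound outside" behavior.

```python
import numpy as np


def bounded_adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                      eps=1e-8, lower=1e-4, upper=1e-2):
    """One Adam-style update whose per-coordinate step size is clipped.

    `lower` and `upper` are hypothetical static bounds; the abstract does not
    state the values or the exact clipping rule used by AdaBS.
    """
    m = beta1 * m + (1.0 - beta1) * grad         # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2    # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)               # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    step_size = lr / (np.sqrt(v_hat) + eps)      # plain Adam step size
    # Inside [lower, upper] the Adam step size is kept unchanged;
    # outside it is replaced by the nearest static bound.
    step_size = np.clip(step_size, lower, upper)
    return param - step_size * m_hat, m, v
```

With these assumed bounds, a coordinate with a tiny gradient no longer receives a very large step size, and a coordinate with a huge gradient no longer receives a vanishing one, while moderate gradients follow plain Adam.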
Dynamic synchronous graph transformer network for region-level air-quality forecasting
IF 5.5 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128924
Hanzhong Xia, Xiaoxia Chen, Binjie Chen, Yue Hu
Accurate forecasting of air quality aids in mitigating air pollution, enhancing the well-being of residents, and supporting the city’s sustainable growth. Recent works have used graph neural networks for spatial dependency modeling in air-quality forecasting tasks. However, many existing methods rely on separate components to capture temporal and spatial correlations individually, which makes it difficult to synchronously capture the multiscale spatiotemporal correlations (MSTCs) from the spatiotemporal graph. This paper proposes a dynamic synchronous graph transformer (DSGT), based on the Encoder-Decoder structure, to forecast the air quality of urban regions. It captures time-varying observed station readings through dynamic graph convolution operations and can learn the influence of auxiliary features. We design a multiscale dynamic synchronous graph construction method that builds graphs which effectively encode the MSTCs. DSGT contains a multiscale spatiotemporal synchronous graph convolution component that extracts multiscale spatiotemporal representations from the constructed graphs. A synchronous graph attention mechanism and a temporal attention mechanism are integrated into the Encoder-Decoder structure to focus on the long-term influence of auxiliary features and the short-term influence of the multiscale spatiotemporal representation. Extensive experiments on two real-world datasets demonstrate that the proposed model outperforms existing methods in both short- and long-term forecasting.
Neurocomputing, Volume 616, Article 128924
Citations: 0
Self-training: A survey
IF 5.5 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128904
Massih-Reza Amini, Vasilii Feofanov, Loïc Pauletto, Liès Hadjadj, Émilie Devijver, Yury Maximov
Self-training methods have gained significant attention in recent years due to their effectiveness in leveraging small labeled datasets and large unlabeled observations for prediction tasks. These models identify decision boundaries in low-density regions without additional assumptions about data distribution, using the confidence scores of a learned classifier. The core principle of self-training involves iteratively assigning pseudo-labels to unlabeled samples with confidence scores above a certain threshold, enriching the labeled dataset and retraining the classifier. This paper presents self-training methods for binary and multi-class classification, along with variants and related approaches such as consistency-based methods and transductive learning. We also briefly describe self-supervised learning and reinforced self-training. Furthermore, we highlight popular applications of self-training and discuss the importance of dynamic thresholding and reducing pseudo-label noise for performance improvement.
To the best of our knowledge, this is the first thorough and complete survey on self-training.
Neurocomputing, Volume 616, Article 128904
Citations: 0
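The core loop described in the self-training abstract above (pseudo-label the unlabeled samples whose confidence exceeds a threshold, add them to the labeled set, retrain) can be sketched as follows. This is an illustrative sketch only: the nearest-centroid classifier, the softmax-over-distances confidence score, the fixed `threshold`, and all function names are assumptions chosen to keep the example self-contained, not methods taken from the survey.

```python
import numpy as np


def fit_centroids(X, y, n_classes):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])


def predict_proba(centroids, X):
    """Softmax over negative squared distances, used as confidence scores."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def self_train(X_lab, y_lab, X_unl, n_classes, threshold=0.9, max_rounds=10):
    """Iteratively pseudo-label confident unlabeled points and retrain."""
    X_lab, y_lab, X_unl = X_lab.copy(), y_lab.copy(), X_unl.copy()
    centroids = fit_centroids(X_lab, y_lab, n_classes)
    for _ in range(max_rounds):
        if len(X_unl) == 0:
            break
        proba = predict_proba(centroids, X_unl)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break  # nothing passes the threshold, so stop early
        # Move confident samples, with their pseudo-labels, to the labeled set.
        X_lab = np.vstack([X_lab, X_unl[confident]])
        y_lab = np.concatenate([y_lab, proba[confident].argmax(axis=1)])
        X_unl = X_unl[~confident]
        centroids = fit_centroids(X_lab, y_lab, n_classes)
    return centroids, len(X_unl)
```

The function returns the retrained centroids and the number of samples that never passed the threshold. A dynamic threshold, which the abstract highlights as important, would replace the constant `threshold` with a value updated each round.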
Generalization and risk bounds for recurrent neural networks
IF 5.5 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128825
Xuewei Cheng, Ke Huang, Shujie Ma
Recurrent Neural Networks (RNNs) have achieved great success in the prediction of sequential data. However, their theoretical studies are still lagging behind because of their complex interconnected structures. In this paper, we establish a new generalization error bound for vanilla RNNs, and provide a unified framework to calculate the Rademacher complexity that can be applied to a variety of loss functions. When the ramp loss is used, we show that our bound is tighter than the existing bounds based on the same assumptions on the Frobenius and spectral norms of the weight matrices and a few mild conditions. Our numerical results show that our new generalization bound is the tightest among all existing bounds in three public datasets. Our bound improves the second tightest one by an average percentage of 13.80% and 3.01% when the tanh and ReLU activation functions are used, respectively. Moreover, we derive a sharp estimation error bound for RNN-based estimators obtained through empirical risk minimization (ERM) in multi-class classification problems when the loss function satisfies a Bernstein condition.
Neurocomputing, Volume 616, Article 128825
Citations: 0
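For readers unfamiliar with the ramp loss mentioned in the RNN generalization abstract above, a common margin-based form is sketched below. The abstract does not state the paper's exact definition, so the parameter name `gamma` and this particular form (1 for non-positive margins, 0 for margins of at least `gamma`, linear in between) are assumptions based on the usual textbook definition.

```python
import numpy as np


def ramp_loss(margin, gamma=1.0):
    """Ramp loss: 1 for margin <= 0, 0 for margin >= gamma, linear in between."""
    margin = np.asarray(margin, dtype=float)
    return np.clip(1.0 - margin / gamma, 0.0, 1.0)
```

Because this loss is bounded in [0, 1] and Lipschitz with constant 1/gamma, it is a convenient surrogate in Rademacher-complexity arguments.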
A fast optimization approach for seeking Nash equilibrium based on Nikaido–Isoda function, state transition algorithm and Gauss–Seidel technique
IF 5.5 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128922
Xiaojun Zhou, Zheng Wang, Tingwen Huang
This paper proposes a fast optimization approach for non-cooperative games with complicated payoff functions (non-smooth, non-concave, etc.). The Nikaido–Isoda function is employed to convert knotty Nash equilibrium problems (NEPs) into large-scale optimization problems with complex objective functions. To seek a Nash equilibrium efficiently, the resulting optimization problems are decomposed into many subproblems in which each player tries to maximize its payoff while observing the others’ current strategies. All players’ strategies are updated iteratively until a Nash equilibrium is reached. Specifically, a dynamic state transition algorithm (STA) is proposed to seek the global optima of the subproblems at each iteration, and sequential quadratic programming (SQP) is embedded into the dynamic STA to accelerate convergence. A Gauss–Seidel technique is used for the players’ strategy updates to further improve computational efficiency. Numerical examples drawn from multidisciplinary contexts validate that the proposed approach can effectively find a Nash equilibrium while remarkably reducing computation time.
Neurocomputing, Volume 616, Article 128922
Citations: 0
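The Gauss–Seidel style of strategy update described in the Nash equilibrium abstract above can be illustrated on a toy game. The sketch below is not the paper's method: it swaps the dynamic STA/SQP subproblem solver for a closed-form best response in a linear Cournot game, and the game itself, the parameters `a`, `b`, `c`, and the function names are all assumptions made for illustration.

```python
def cournot_best_response(q_others, a=10.0, b=1.0, c=1.0):
    """Maximizer of q * (a - b * (q + q_others)) - c * q over q >= 0."""
    return max(0.0, (a - c - b * q_others) / (2.0 * b))


def gauss_seidel_nash(n_players=2, tol=1e-10, max_iter=1000):
    """Sequential best responses; each update sees the newest strategies."""
    q = [0.0] * n_players
    for _ in range(max_iter):
        largest_change = 0.0
        for i in range(n_players):
            q_others = sum(q) - q[i]
            new_qi = cournot_best_response(q_others)
            largest_change = max(largest_change, abs(new_qi - q[i]))
            q[i] = new_qi  # Gauss-Seidel: overwrite immediately
        if largest_change < tol:
            break
    return q
```

For this toy game the symmetric equilibrium quantity is (a - c) / (b * (n + 1)), so `gauss_seidel_nash(2)` should approach 3.0 for each player with the default parameters. In a game with non-smooth or non-concave payoffs, `cournot_best_response` would be replaced by a global optimizer for each player's subproblem.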
RGBT tracking via frequency-aware feature enhancement and unidirectional mixed attention
IF 5.5 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128908
Jianming Zhang, Jing Yang, Zikang Liu, Jin Wang
RGBT object tracking is widely used due to the complementary nature of the RGB and TIR modalities. However, RGBT trackers based on Transformer or CNN face significant challenges in effectively enhancing and extracting features from one modality and fusing them into the other. To achieve effective regional feature representation and adequate information fusion, we propose a novel tracking method that employs frequency-aware feature enhancement and bidirectional multistage feature fusion. Firstly, we propose an Early Region Feature Enhancement (ERFE) module, which comprises the Frequency-aware Self-region Feature Enhancement (FSFE) block and the Cross-attention Cross-region Feature Enhancement (CCFE) block. The FFT-based FSFE block enhances the features of the template or search region separately, while the CCFE block improves feature representation by considering the template and search region jointly. Secondly, we propose a Bidirectional Multistage Feature Fusion (BMFF) module, with the Complementary Feature Extraction Attention (CFEA) module as its core component. The CFEA module, which includes the Unidirectional Mixed Attention (UMA) block and the Context Focused Attention (CFA) block, extracts information from one modality. When RGB is the primary modality, TIR is the auxiliary modality, and vice versa. The auxiliary modal features processed by CFEA are added to the primary modal features. This information fusion process is bidirectional and multistage. Thirdly, extensive experiments on three benchmark datasets (RGBT234, LaSHeR, and GTOT) demonstrate that our tracker outperforms advanced RGBT tracking methods.
Neurocomputing, Volume 616, Article 128908
Citations: 0
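The abstract above only says that the FSFE block is FFT-based, so the following is a generic sketch of frequency-domain feature filtering rather than the authors' block. The function name `frequency_filter` and the `weights` array are hypothetical; in a trained tracker such weights would typically be learned parameters.

```python
import numpy as np


def frequency_filter(feature_map, weights):
    """Scale each 2-D frequency component of an (H, W) feature map.

    `weights` is a hypothetical (H, W) array of per-frequency gains.
    """
    spectrum = np.fft.fft2(feature_map)      # spatial domain -> frequency domain
    filtered = spectrum * weights            # element-wise re-weighting
    return np.real(np.fft.ifft2(filtered))   # back to the spatial domain
```

With all-ones weights the map is returned unchanged; keeping only the zero-frequency component replaces every location with the mean of the map, which shows how the gains select global versus fine-grained content.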
Observer-based fully distributed bipartite consensus of multiagent systems with disturbance rejection
IF 5.5 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128842
Jian-Qiao Wang, Jin-Liang Wang, Xueming Dong
An observer-based output feedback control method is used to address the bipartite consensus problem of multiagent systems (MASs) subject to deterministic disturbances. Based on the leaderless and leader-follower methods, two fully distributed observer-based output feedback controllers are devised to guarantee the bipartite consensus of MASs. Moreover, because the bandwidth of communication channels in practical systems is limited, an event-triggered output feedback controller for the bipartite consensus of MASs is also developed, and it guarantees that Zeno behavior does not occur. Finally, the effectiveness and advantages of the control protocols are verified via illustrative examples.
Neurocomputing, Volume 616, Article 128842
Citations: 0
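To show what bipartite consensus means in the abstract above, the sketch below simulates the basic signed-graph consensus dynamics x' = -L x, where L is the signed Laplacian. It is not the paper's controller: there is no observer, no disturbance rejection, and no event triggering, and the function name, the forward-Euler integration, and the step size `dt` are assumptions for illustration.

```python
import numpy as np


def simulate_signed_consensus(adjacency, x0, dt=0.01, steps=2000):
    """Forward-Euler simulation of x' = -L x on a signed graph.

    Positive adjacency entries model cooperation, negative ones antagonism.
    """
    A = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(np.abs(A).sum(axis=1)) - A  # signed Laplacian
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x = x - dt * (laplacian @ x)
    return x
```

For two antagonistic agents, `simulate_signed_consensus([[0, -1], [-1, 0]], [1.0, 3.0])` should approach `[-1.0, 1.0]`: the states agree in magnitude and differ in sign, which is the bipartite consensus pattern.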
Diversified recommendation with weighted hypergraph embedding: Case study in music
IF 5.5 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128905
Chaoguang Luo, Liuying Wen, Yong Qin, Philip S. Yu, Liangwei Yang, Zhineng Hu
Recommender systems serve a dual purpose for users: sifting out inappropriate or mismatched information while accurately identifying items that align with their preferences. Numerous recommendation algorithms rely on rich feature data to deliver personalized suggestions. However, in scenarios without explicit features, balancing accuracy and diversity in recommendations is a pressing concern. To address this challenge, exemplified by music recommendation, we introduce the Diversified Weighted Hypergraph Recommendation algorithm (DWHRec). In DWHRec, the initial connections between users and items are modeled using a weighted hypergraph, where additional entities linked to users and items, such as artists, albums, and tags, are simultaneously integrated into the hypergraph structure. To capture users’ latent preferences, a random-walk embedding method is applied to the hypergraph. Accuracy is measured by the match between users and items, and diversity is gauged by the variety of recommended item types. Extensive experiments conducted on two real-world music datasets show that DWHRec substantially outperforms eight state-of-the-art algorithms in terms of accuracy and diversity. Beyond music recommendation, DWHRec is a versatile framework that can be applied to other domains with similar data structures. The algorithm code is available on GitHub.
推荐系统为用户提供双重目的:筛选不合适或不匹配的信息,同时准确识别符合他们偏好的项目。许多推荐算法依赖于丰富的特征数据来提供个性化的建议。然而,在没有明确特征的情况下,平衡推荐的准确性和多样性是一个紧迫的问题。为了解决这一挑战,以音乐推荐为例,我们引入了多元化加权超图推荐算法(DWHRec)。在DWHRec中,用户和项目之间的初始连接使用加权超图建模,其中链接到用户和项目的附加实体(如艺术家、专辑和标签)同时集成到超图结构中。为了捕获用户的潜在偏好,对超图应用了随机游走嵌入方法。准确性是通过用户和项目之间的匹配来衡量的,多样性是通过推荐的项目类型的多样性来衡量的。在两个真实音乐数据集上进行的广泛实验表明,DWHRec在准确性和多样性方面大大优于八种最先进的算法。除了音乐推荐之外,DWHRec是一个通用框架,可以应用于具有类似数据结构的其他领域。算法代码可在GitHub.1上获得
Citations: 0
An optimal transport-guided diffusion framework with mitigating mode mixture
IF 5.5 Zone 2, Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-11-19 DOI: 10.1016/j.neucom.2024.128910
Shenghao Li, Zhanpeng Wang, Zhongxuan Luo, Na Lei
Diffusion probability models (DPMs) have achieved excellent results in image generation; however, their inference is slow and they tend to produce mixed images. The autoencoder optimal transport (OT) model addresses the mode collapse/mixture problem from the OT perspective but produces low-quality images. To generate high-quality images while mitigating mode mixture, we propose an OT-guided diffusion framework. The key is to find the optimal truncation step M such that the class boundaries of the original data do not intersect during the forward process, which ensures that the generated image belongs to the same class as the initial point of the reverse process. The value of M is determined by evaluating the Peak Signal-to-Noise Ratio (PSNR), which mitigates the generation of mixed images. Specifically, our approach first embeds the image manifold into the latent space through an encoder. Images are then decoded from latent codes, which are generated by an OT map from the Gaussian distribution to the empirical latent distribution. Finally, the trained M-step DPM refines the image generated by the decoder. Experimental results demonstrate that our method not only improves image quality but also alleviates mode mixture in diffusion models. It also improves sampling efficiency and reduces training cost compared with classical diffusion models.
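The abstract states that the truncation step M is chosen by evaluating the Peak Signal-to-Noise Ratio, but it does not give the selection rule. As a minimal sketch of the metric only, the snippet below computes the standard PSNR, 10·log10(MAX² / MSE), between two equally sized pixel sequences. The function name `psnr` and the `max_value` parameter are illustrative assumptions, not taken from the paper, and how the authors map PSNR values to a choice of M is not shown here.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Standard PSNR in dB between two equally sized, flat pixel sequences.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        # Identical signals: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [0.0, 0.5, 1.0, 0.25]
    noisy = [0.1, 0.4, 0.9, 0.35]
    print(f"PSNR: {psnr(clean, noisy, max_value=1.0):.2f} dB")
```

A higher PSNR means the second signal is closer to the reference; one could, for instance, track PSNR between an image and its noised version across forward steps, though that usage is an assumption rather than something the abstract specifies.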
Citations: 0
Working condition decoupling adversarial network: A novel method for multi-target domain fault diagnosis
IF 5.5 Zone 2, Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-11-19 DOI: 10.1016/j.neucom.2024.128953
Xuepeng Zhang, Jinrui Wang, Xue Jiang, Zongzhen Zhang, Baokun Han, Huaiqian Bao, Xingxing Jiang
In practical applications of rotating machinery, working conditions change to meet different manufacturing requirements. When fault diagnosis is performed on monitoring data collected under different working conditions, the shift in data distribution introduces interference information that is highly related to the working conditions, and it causes inconsistent matching during multi-target domain transfer. To solve these problems, a working condition decoupling adversarial network (WCDAN) is proposed for multi-target domain fault diagnosis. Specifically, a prototype discrepancy alignment module is constructed after a weight-shared wavelet convolution feature extractor to ensure a clear prototype representation boundary. Then, the adaptive domain discriminator weight and the acquired multi-domain discrepancy are used to decouple the working conditions. This process filters out interference information that is highly associated with the source-domain working conditions while preserving the inherent fault characteristics. Furthermore, a multi-domain hybrid alignment strategy minimizes the disparity between different domains and resolves the inconsistent matching issue. Comparative experiments on two gearbox fault datasets under stable and unstable conditions show that WCDAN generalizes from a single source domain to multiple target domains simultaneously and achieves excellent fault diagnosis performance.
Citations: 0