
Neural Processing Letters: Latest Publications

MDGCL: Graph Contrastive Learning Framework with Multiple Graph Diffusion Methods
IF 3.1 | CAS Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-07-13 | DOI: 10.1007/s11063-024-11672-3
Yuqiang Li, Yi Zhang, Chun Liu

In recent years, some classical graph contrastive learning (GCL) frameworks have been proposed to address the problem of sparse labeling of graph data in the real world. However, in node classification tasks, there are two obvious problems with existing GCL frameworks: first, the stochastic augmentation methods they adopt lose a lot of semantic information; second, the local–local contrasting mode selected by most frameworks ignores the global semantic information of the original graph, which limits the node classification performance of these frameworks. To address the above problems, this paper proposes a novel graph contrastive learning framework, MDGCL, which introduces two graph diffusion methods, Markov and PPR, and a deterministic–stochastic data augmentation strategy while retaining the local–local contrasting mode. Specifically, before applying the two stochastic augmentation methods (FeatureDrop and EdgeDrop), MDGCL first uses two deterministic augmentation methods (Markov diffusion and PPR diffusion) to augment the original graph and increase its semantic information; this step ensures that the subsequent stochastic augmentation methods do not lose too much semantic information. Meanwhile, the diffusion matrices carried by the augmented views contain global semantic information of the original graph, allowing the framework to utilize global semantic information while retaining the local–local contrasting mode, which further enhances its node classification performance. We conduct extensive comparative experiments on multiple benchmark datasets, and the results show that MDGCL outperforms representative baseline frameworks on node classification tasks. In particular, compared with COSTA, MDGCL improves node classification accuracy by 1.07% and 0.41% on the two representative datasets Amazon-Photo and Coauthor-CS, respectively. In addition, we conduct ablation experiments on two datasets, Cora and CiteSeer, to verify the effectiveness of each component of our framework.
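As a concrete illustration of the deterministic augmentation step, the sketch below computes a Personalized PageRank (PPR) diffusion matrix from an adjacency matrix in the commonly used closed form; the teleport probability alpha and the dense-inverse formulation are illustrative assumptions rather than the authors' exact configuration.

```python
import numpy as np

def ppr_diffusion(adj: np.ndarray, alpha: float = 0.15) -> np.ndarray:
    """Closed-form PPR diffusion S = alpha * (I - (1 - alpha) * T)^-1, where T is
    the symmetrically normalized adjacency with self-loops (a common choice)."""
    n = adj.shape[0]
    adj_hat = adj + np.eye(n)                      # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(adj_hat.sum(axis=1)))
    t = d_inv_sqrt @ adj_hat @ d_inv_sqrt          # normalized transition matrix
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * t)

# Toy 4-node path graph: the diffused view densifies connectivity, injecting
# global structure before the stochastic FeatureDrop/EdgeDrop augmentations.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(ppr_diffusion(A).round(3))
```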

Citations: 0
On Stage-Wise Backpropagation for Improving Cheng’s Method for Fully Connected Cascade Networks
IF 3.1 | CAS Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-07-11 | DOI: 10.1007/s11063-024-11655-4
Eiji Mizutani, Naoyuki Kubota, Tam Chi Truong

In this journal, Cheng has proposed a backpropagation (BP) procedure called BPFCC for deep fully connected cascaded (FCC) neural network learning, in comparison with a neuron-by-neuron (NBN) algorithm of Wilamowski and Yu. Both BPFCC and NBN are designed to implement the Levenberg-Marquardt method, which requires an efficient evaluation of the Gauss-Newton (approximate Hessian) matrix \(\nabla \textbf{r}^\textsf{T} \nabla \textbf{r}\), the cross product of the Jacobian matrix \(\nabla \textbf{r}\) of the residual vector \(\textbf{r}\) in the nonlinear least squares sense. Here, the dominant cost is to form \(\nabla \textbf{r}^\textsf{T} \nabla \textbf{r}\) by rank updates on each data pattern. Notably, NBN is better than BPFCC for multiple \(q \, (>1)\)-output FCC-learning when q rows (per pattern) of the Jacobian matrix \(\nabla \textbf{r}\) are evaluated; however, the dominant cost (for rank updates) is common to both BPFCC and NBN. The purpose of this paper is to present a new, more efficient stage-wise BP procedure (for q-output FCC-learning) that reduces the dominant cost with no rows of \(\nabla \textbf{r}\) explicitly evaluated, just as standard BP evaluates the gradient vector \(\nabla \textbf{r}^\textsf{T} \textbf{r}\) with no explicit evaluation of any rows of the Jacobian matrix \(\nabla \textbf{r}\).
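To make the cost argument concrete, the sketch below accumulates the Gauss-Newton matrix by a rank-q update per data pattern and, for contrast, the gradient vector, which never needs explicit Jacobian rows; this dense NumPy formulation is only illustrative and is not the stage-wise procedure proposed in the paper.

```python
import numpy as np

def gauss_newton_terms(jacobian_blocks, residual_blocks):
    """Accumulate the Gauss-Newton matrix and the gradient over data patterns.

    jacobian_blocks: list of (q, n) arrays, the q Jacobian rows per pattern.
    residual_blocks: list of (q,) arrays, the corresponding residuals.
    """
    n = jacobian_blocks[0].shape[1]
    gn = np.zeros((n, n))   # nabla r^T nabla r: dominant O(q n^2) rank update per pattern
    grad = np.zeros(n)      # nabla r^T r: only O(q n) per pattern
    for J_p, r_p in zip(jacobian_blocks, residual_blocks):
        gn += J_p.T @ J_p
        grad += J_p.T @ r_p
    return gn, grad

def levenberg_marquardt_step(gn, grad, mu=1e-3):
    """One damped Gauss-Newton step: solve (J^T J + mu * I) dw = -J^T r."""
    return np.linalg.solve(gn + mu * np.eye(gn.shape[0]), -grad)
```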

Citations: 0
Detection Method of Manipulator Grasp Pose Based on RGB-D Image
IF 3.1 | CAS Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-07-09 | DOI: 10.1007/s11063-024-11662-5
Cheng Huang, Zhen Pang, Jiazhong Xu

In order to better solve the visual detection problem of a manipulator grasping non-cooperative targets, we propose a grasp pose detection method based on pixel points and feature fusion. The improved U2net network serves as the backbone for feature extraction and feature fusion of the input image, and a grasp prediction layer detects the grasp pose at each pixel. In order to adapt U2net to grasp pose detection and improve its detection performance, we increase detection speed and control sampling depth by simplifying its network structure, while retaining some shallow features in feature fusion to enhance its feature extraction capability. We introduce depthwise separable convolution in the grasp prediction layer, further fusing the features extracted from the backbone to obtain predictive feature maps with stronger feature expressiveness. FocalLoss is selected as the loss function to address the imbalance between positive and negative samples in network training. We use the Cornell dataset for training and testing, perform pixel-level labeling on the images, and replace labels that are not conducive to actual grasping. This adaptation helps the dataset better suit network training and testing while meeting the real-world grasping requirements of the manipulator. The evaluation results on the image-wise and object-wise splits are 95.65% and 91.20%, respectively, and the detection speed is 0.007 s/frame. We also applied the method in real manipulator grasping experiments. The results show that our method improves accuracy and speed compared with previous methods, and has strong generalization ability and portability.
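For reference, the depthwise separable convolution used in the grasp prediction layer can be sketched in PyTorch as a per-channel depthwise convolution followed by a 1x1 pointwise convolution; the channel sizes below are placeholders, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise conv (one filter per input channel) followed by a 1x1 pointwise
    conv, using far fewer parameters than a standard KxK convolution."""
    def __init__(self, in_channels: int, out_channels: int, kernel_size: int = 3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                   padding=kernel_size // 2, groups=in_channels)
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))

# Example: fuse a 64-channel backbone feature map into 32 prediction channels.
head = DepthwiseSeparableConv(64, 32)
print(head(torch.randn(1, 64, 80, 80)).shape)  # torch.Size([1, 32, 80, 80])
```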

Citations: 0
A Lightweight Task-Agreement Meta Learning for Low-Resource Speech Recognition
IF 3.1 | CAS Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-07-05 | DOI: 10.1007/s11063-024-11661-6
Yaqi Chen, Hao Zhang, Wenlin Zhang, Dan Qu, Xukui Yang

Meta-learning has proven to be a powerful paradigm for transferring knowledge from prior tasks to facilitate the quick learning of new tasks in automatic speech recognition. However, differences between languages (tasks) lead to variations in task learning directions, causing harmful competition for the model’s limited resources. To address this challenge, we introduce task-agreement multilingual meta-learning (TAMML), which adopts the gradient agreement algorithm to guide the model parameters towards a direction in which tasks exhibit greater consistency. However, the computation and storage costs of TAMML grow dramatically as the model’s depth increases. To address this, we further propose a simplification called TAMML-Light, which uses only the output layer for gradient calculation. Experiments on three datasets demonstrate that TAMML and TAMML-Light outperform existing meta-learning approaches, yielding superior results. Furthermore, TAMML-Light reduces the relative increase in computation expense by at least 80% compared to TAMML.
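One common way to realize a gradient agreement rule is to weight each task's gradient by its alignment with the summed gradient across tasks, so that directions the languages agree on dominate the update; the sketch below follows that generic formulation and is only an assumption about the flavour of algorithm involved, not TAMML's exact rule.

```python
import numpy as np

def agreement_weighted_update(task_grads: np.ndarray) -> np.ndarray:
    """Combine per-task gradients, favouring directions on which tasks agree.

    task_grads: (num_tasks, num_params) array of flattened task gradients.
    Returns a single update direction of shape (num_params,).
    """
    g_sum = task_grads.sum(axis=0)
    align = np.clip(task_grads @ g_sum, 0.0, None)   # inner product with the consensus
    weights = align / (align.sum() + 1e-12)
    return weights @ task_grads

# Three toy "language" gradients over a 4-parameter model; the third task
# conflicts on the last parameter and is down-weighted accordingly.
g = np.array([[0.9,  0.2, -0.1,  0.5],
              [0.8, -0.3, -0.2,  0.4],
              [1.1,  0.1, -0.3, -0.6]])
print(agreement_weighted_update(g))
```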

Citations: 0
PANet: Pluralistic Attention Network for Few-Shot Image Classification
IF 3.1 | CAS Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-29 | DOI: 10.1007/s11063-024-11638-5
Wenming Cao, Tianyuan Li, Qifan Liu, Zhiquan He

Traditional deep learning methods require a large amount of labeled data for model training, which is laborious and costly in the real world. Few-shot learning (FSL) aims to recognize novel classes with only a small number of labeled samples to address these challenges. We focus on metric-based few-shot learning, with improvements in both feature extraction and the metric method. In our work, we propose the Pluralistic Attention Network (PANet), a novel attention-oriented framework involving both a local encoded intra-attention (LEIA) module and a global encoded reciprocal attention (GERA) module. The LEIA is designed to capture comprehensive local feature dependencies within every single sample. The GERA concentrates on the correlation between two samples and learns the discriminability of representations obtained from the LEIA. The two modules are complementary and ensure that the feature information within and between images can be fully utilized. Furthermore, we design a dual-centralization (DC) cosine similarity to eliminate the disparity of data distribution in different dimensions and enhance the metric accuracy between support and query samples. Our method is thoroughly evaluated with extensive experiments, and the results demonstrate that, with the contribution of each component, our model achieves high performance on four widely used few-shot classification benchmarks: miniImageNet, tieredImageNet, CUB-200-2011 and CIFAR-FS.
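One plausible reading of the dual-centralization cosine similarity is that the feature matrices are centred along both the sample and the feature dimensions before the cosine is taken; the sketch below follows that reading and should not be taken as the authors' exact definition.

```python
import torch
import torch.nn.functional as F

def dual_centralized_cosine(support: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
    """Cosine similarity after centring along both the sample (dim 0) and the
    feature (dim 1) dimensions; support is (n_support, d), query is (n_query, d)."""
    def centre(x: torch.Tensor) -> torch.Tensor:
        x = x - x.mean(dim=0, keepdim=True)     # remove per-feature offset
        return x - x.mean(dim=1, keepdim=True)  # remove per-sample offset
    s, q = centre(support), centre(query)
    return F.normalize(q, dim=1) @ F.normalize(s, dim=1).T  # (n_query, n_support)

prototypes = torch.randn(5, 64)   # e.g. 5-way class prototypes
queries = torch.randn(15, 64)
print(dual_centralized_cosine(prototypes, queries).shape)  # torch.Size([15, 5])
```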

Citations: 0
Exponential Stability of Impulsive Stochastic Neutral Neural Networks with Lévy Noise Under Non-Lipschitz Conditions
IF 3.1 | CAS Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-28 | DOI: 10.1007/s11063-024-11663-4
Shuo Ma, Jiangman Li, Ruonan Liu, Qiang Li

In this paper, the exponential stability of stochastic impulsive neutral neural networks driven by Lévy noise is explored. By resorting to a Lyapunov-Krasovskii function that involves neutral time-delay components, the properties of the Lévy process, and various inequality approaches, some sufficient exponential stability criteria for the non-Lipschitz case are obtained. Moreover, the achieved results depend on the time delay, noise intensity, and impulse factor. At the end of the paper, two numerical examples with simulations are presented to demonstrate the effectiveness and feasibility of the obtained results.
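For orientation, models of this class are often written in a form like the one below, combining a neutral delay term, Brownian and Lévy-jump noise, and impulses; this generic statement is illustrative only, and the concrete coefficients, delays, and conditions in the paper may differ.

```latex
% Generic impulsive neutral stochastic neural network with Lévy jumps (illustrative)
\begin{aligned}
d\bigl[x(t) - D\,x(t-\tau)\bigr]
  &= \bigl[-C\,x(t) + A f\bigl(x(t)\bigr) + B g\bigl(x(t-\tau)\bigr)\bigr]\,dt
   + \sigma\bigl(t, x(t), x(t-\tau)\bigr)\,dW(t) \\
  &\quad + \int_{\mathcal{Y}} h\bigl(t, x(t^{-}), y\bigr)\,\tilde{N}(dt, dy),
   \qquad t \neq t_k, \\
x(t_k) &= I_k\bigl(x(t_k^{-})\bigr), \qquad k \in \mathbb{N}.
\end{aligned}
```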

Citations: 0
Knowledge Graph-Aware Deep Interest Extraction Network on Sequential Recommendation
IF 3.1 | CAS Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-28 | DOI: 10.1007/s11063-024-11665-2
Zhenhai Wang, Yuhao Xu, Zhiru Wang, Rong Fan, Yunlong Guo, Weimin Li

Sequential recommendation is the mainstream approach in the field of click-through-rate (CTR) prediction for modeling users’ behavior. This behavior implies a change in the user’s interest, and the goal of sequential recommendation is to capture this dynamic change. However, existing studies have focused on designing complex dedicated networks to capture user interests from user behavior sequences, while neglecting the use of auxiliary information. Recently, the knowledge graph (KG) has gradually attracted the attention of researchers as a form of structured auxiliary information. Items and their attributes in the recommendation can be mapped to knowledge triples in the KG. Therefore, introducing the KG into recommendation can help us obtain more expressive item representations. Since the KG can be considered a special type of graph, a graph neural network (GNN) can be used to propagate the rich information contained in the KG into the item representation. Based on this idea, this paper proposes a recommendation method that uses the KG as auxiliary information. The method first propagates the knowledge information in the KG using a GNN to obtain a knowledge-rich item representation. Then the temporal features in the item sequence are extracted using a transformer for CTR prediction; we call the resulting model the Knowledge Graph-Aware Deep Interest Extraction network (KGDIE). To evaluate the performance of this model, we conducted extensive experiments on two real datasets with different scenarios. The results showed that the KGDIE method can outperform several state-of-the-art baselines. The source code of our model is available at https://github.com/gylgyl123/kgdie.
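The KG enrichment step can be pictured as a simple neighbourhood aggregation over knowledge triples, blending each item embedding with the embeddings of its linked entities; the mean aggregation and the mixing weight below are minimal assumptions for illustration, not the exact GNN used in KGDIE.

```python
import torch

def enrich_item_embeddings(item_emb, entity_emb, kg_neighbors, mix=0.5):
    """Blend each item embedding with the mean embedding of its KG neighbours.

    item_emb:     (num_items, d) item embedding table
    entity_emb:   (num_entities, d) embeddings of KG entities (attributes, etc.)
    kg_neighbors: dict mapping item index -> list of neighbouring entity indices
    mix:          blending weight between the item and its KG neighbourhood
    """
    enriched = item_emb.clone()
    for item, neighbors in kg_neighbors.items():
        if neighbors:
            enriched[item] = (1 - mix) * item_emb[item] + mix * entity_emb[neighbors].mean(dim=0)
    return enriched

items, entities = torch.randn(3, 8), torch.randn(10, 8)
print(enrich_item_embeddings(items, entities, {0: [1, 4], 1: [7], 2: []}).shape)
```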

Citations: 0
Gudermannian Neural Networks for Two-Point Nonlinear Singular Model Arising in the Thermal-Explosion Theory
IF 3.1 | CAS Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-26 | DOI: 10.1007/s11063-024-11512-4
Samara Fatima, Zulqurnain Sabir, Dumitru Baleanu, Sharifah E. Alhazmi

The goal of this research is to design Gudermannian neural networks (GNNs) to solve a type of two-point nonlinear singular boundary value problem (TPN-SBVP) that arises within thermal-explosion theory. The results of these investigations are provided for different numbers of neurons (4, 12 and 20), together with the absolute error and the time complexity. For solving the TPN-SBVPs, a genetic algorithm (GA) and sequential quadratic programming (SQP) are used to optimize the error function. The accuracy of the designed GNNs is assessed using the hybrid GA–SQP combination, based on a comparison of the obtained and exact solutions. Furthermore, a statistical analysis of the data is proposed in order to establish the competence, effectiveness, and efficacy of the designed computing framework for solving the TPN-SBVPs.
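The Gudermannian function, gd(x) = 2 arctan(tanh(x/2)), is a bounded, sigmoid-like map, and a Gudermannian network parameterizes the approximate solution of the boundary value problem with it as the hidden activation; the tiny forward pass below is a minimal sketch, with randomly chosen weights standing in for the GA–SQP-optimized ones.

```python
import numpy as np

def gudermannian(x):
    """Gudermannian activation: gd(x) = 2 * arctan(tanh(x / 2))."""
    return 2.0 * np.arctan(np.tanh(x / 2.0))

def gnn_forward(x, w_in, b_in, w_out):
    """Single-hidden-layer network y(x) with Gudermannian hidden units, as could
    parameterize an approximate solution of a two-point boundary value problem."""
    hidden = gudermannian(np.outer(x, w_in) + b_in)   # shape (len(x), neurons)
    return hidden @ w_out                             # shape (len(x),)

rng = np.random.default_rng(0)
neurons = 4                                           # the paper also reports 12 and 20
x = np.linspace(0.0, 1.0, 5)
print(gnn_forward(x, rng.normal(size=neurons), rng.normal(size=neurons),
                  rng.normal(size=neurons)))
```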

Citations: 0
Adaptive Sliding Mode Fixed-/Preassigned-Time Synchronization of Stochastic Memristive Neural Networks with Mixed-Delays
IF 3.1 | CAS Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-21 | DOI: 10.1007/s11063-024-11669-y
Jie Gao, Xiangyong Chen, Jianlong Qiu, Chunmei Wang, Tianyuan Jia

The paper addresses the fixed-/preassigned-time synchronization of stochastic memristive neural networks (MNNs) with uncertain parameters and mixed delays. Adaptive sliding mode control (ASMC) technology is mainly utilized. First, a proper sliding surface is constructed and the adaptive laws are given. A synchronization control scheme is then designed which ensures that the error system achieves fixed-time stability. Second, a preassigned-time sliding mode control scheme is provided to realize fast synchronization of MNNs. The presented theoretical methods guarantee convergence and stability of the error system in both the reaching and sliding modes within the preassigned time. The synchronization criteria and an explicit expression for the settling time (ST) are obtained, where the ST does not depend on initial values or controller parameters and can be predefined in advance. Finally, a numerical example is offered to illustrate the practicability and validity of the innovations in this paper.
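Fixed-time results of this kind typically rest on a standard Lyapunov inequality whose settling-time bound depends only on design constants, never on the initial state; the statement below is that standard form, given for orientation, while the paper's criteria specialize it to the sliding-mode error dynamics.

```latex
% Standard fixed-time stability bound (illustrative; a, b > 0, 0 < p < 1 < q)
\dot{V}(t) \le -a\,V^{p}(t) - b\,V^{q}(t)
\;\Longrightarrow\;
T_{\mathrm{settle}} \le \frac{1}{a\,(1-p)} + \frac{1}{b\,(q-1)},
\quad \text{independently of } V(0).
```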

Citations: 0
SS-CRE: A Continual Relation Extraction Method Through SimCSE-BERT and Static Relation Prototypes
IF 3.1 | CAS Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-20 | DOI: 10.1007/s11063-024-11647-4
Jinguang Chen, Suyue Wang, Lili Ma, Bo Yang, Kaibing Zhang

Continual relation extraction aims to learn new relations from a continuous stream of data while avoiding forgetting old relations. Existing methods typically use the BERT encoder to obtain semantic embeddings, ignoring the fact that the vector representations suffer from anisotropy and uneven distribution. Furthermore, the relation prototypes are usually computed directly from memory samples, making the model overly sensitive to those samples. To solve these problems, we propose a new continual relation extraction method. Firstly, we modify the basic structure of the sample encoder to generate uniformly distributed semantic embeddings using the supervised SimCSE-BERT, obtaining richer sample information. Secondly, we introduce static relation prototypes and dynamically adjust their proportion relative to dynamic relation prototypes to adapt to the feature space. Lastly, experimental analysis on the widely used FewRel and TACRED datasets demonstrates that the proposed method effectively enhances semantic embeddings and relation prototypes, further alleviating catastrophic forgetting in the model. The code will be released soon at https://github.com/SuyueW/SS-CRE.
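The static/dynamic prototype combination can be sketched as a convex mix of a frozen prototype, stored when a relation is first learned, and a prototype recomputed from the current memory samples; the fixed mixing weight below is a placeholder, whereas SS-CRE adjusts the proportion dynamically.

```python
import torch

def mixed_relation_prototype(static_proto, memory_embs, lam=0.5):
    """Blend a frozen (static) prototype with the mean of current memory embeddings.

    static_proto: (d,) prototype stored when the relation was first learned
    memory_embs:  (m, d) re-encoded embeddings of the relation's memory samples
    lam:          mixing weight; fixed here, dynamically adjusted in SS-CRE
    """
    return lam * static_proto + (1.0 - lam) * memory_embs.mean(dim=0)

static_p = torch.randn(768)      # e.g. a BERT-sized relation prototype
memory = torch.randn(10, 768)    # ten stored samples for this relation
print(mixed_relation_prototype(static_p, memory).shape)  # torch.Size([768])
```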

Citations: 0