
Neural Processing Letters: Latest Articles

A Unified Asymmetric Knowledge Distillation Framework for Image Classification
IF 3.1 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-05-10 | DOI: 10.1007/s11063-024-11606-z
Xin Ye, Xiang Tian, Bolun Zheng, Fan Zhou, Yaowu Chen

Knowledge distillation is a model compression technique that transfers knowledge learned by a teacher network to a student network. Existing knowledge distillation methods greatly expand the forms of knowledge, but they also make distillation models complex and symmetric. However, few studies have explored the commonalities among these methods. In this study, we propose a concise distillation framework that unifies these methods, together with a method for constructing asymmetric knowledge distillation under the framework. Asymmetric distillation aims to enable differentiated knowledge transfer for different distillation objects. We designed a multi-stage shallow-wide branch bifurcation method to distill different knowledge representations and a grouping ensemble strategy that supervises the network to teach and learn selectively. Finally, we conducted experiments on image classification benchmarks to verify the proposed method. The results show that our implementation achieves considerable improvements over existing methods, demonstrating the effectiveness of the method and the potential of the framework.
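The teacher-to-student transfer that the abstract builds on can be illustrated with the standard soft-label distillation loss; this is only the generic KD term (temperature-softened KL divergence), not the paper's asymmetric branches or grouping strategy, which the abstract does not specify.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=4.0):
    # KL(teacher || student) on temperature-softened outputs,
    # scaled by T^2 as in standard knowledge distillation.
    p = softmax(teacher_logits, T)   # soft teacher targets
    q = softmax(student_logits, T)   # student predictions
    return (T ** 2) * np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean()

# Identical logits give zero loss; diverging logits give a positive loss.
s = np.array([[2.0, 1.0, 0.1]])
t = np.array([[2.0, 1.0, 0.1]])
```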

Citations: 0
Pinning Group Consensus of Multi-agent Systems Under DoS Attacks
IF 3.1 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-05-10 | DOI: 10.1007/s11063-024-11630-z
Qian Lang, Jing Xu, Huiwen Zhang, Zhengxin Wang

In this paper, group consensus is investigated for a class of nonlinear multi-agent systems subject to DoS attacks. First, a first-order nonlinear multi-agent system is constructed and divided into M subsystems, each with a unique leader. A control protocol is then proposed and a Lyapunov function candidate is chosen. By means of stability theory, a sufficient criterion involving the duration of DoS attacks, the coupling strength, and the control gain is obtained for achieving group consensus in the first-order system; that is, the nodes in each subsystem can track the leader of that group. The result is then extended to nonlinear second-order multi-agent systems, and the controller is improved accordingly to obtain sufficient conditions for group consensus. Additionally, lower bounds on the coupling strength and the average interval of DoS attacks can be determined from the obtained sufficient conditions. Finally, several numerical simulations are presented to illustrate the effectiveness of the proposed controllers and the derived theoretical results.
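The setup the abstract describes can be sketched numerically: two subgroups of first-order agents, each pinned to its own leader, with the control input dropped during DoS intervals. This is a minimal linear illustration under assumed gains and a periodic attack pattern, not the paper's nonlinear protocol or its stability criterion.

```python
import numpy as np

# Minimal sketch (not the paper's exact protocol): two subgroups of
# first-order integrator agents, each pinned to its own static leader.
# During DoS intervals the control input is dropped entirely.
def simulate(T=2000, dt=0.01, c=1.0, k=2.0):
    leaders = np.array([1.0, -1.0])          # one leader per subgroup
    group = np.array([0, 0, 1, 1])           # agent -> subgroup index
    A = np.array([[0, 1, 0, 0],              # intra-group adjacency
                  [1, 0, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 1, 0]], float)
    x = np.array([0.3, -0.2, 0.5, 0.1])      # initial agent states
    for step in range(T):
        dos = (step % 200) < 40              # periodic DoS: 20% of the time
        if dos:
            u = np.zeros_like(x)             # attack: no control available
        else:
            coupling = c * (A @ x - A.sum(1) * x)   # sum_j a_ij (x_j - x_i)
            pinning = k * (leaders[group] - x)      # leader-tracking term
            u = coupling + pinning
        x = x + dt * u                       # Euler integration step
    return x, leaders[group]

x, ref = simulate()
```

Despite the control being disabled 20% of the time, each agent still converges to its own group's leader, which is the qualitative behavior the sufficient criterion guarantees.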

Citations: 0
Use of a Modified Threshold Function in Fuzzy Cognitive Maps for Improved Failure Mode Identification
IF 3.1 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-05-09 | DOI: 10.1007/s11063-024-11623-y
Manu Augustine, Om Prakash Yadav, Ashish Nayyar, Dheeraj Joshi

Fuzzy cognitive maps (FCMs) provide a rapid and efficient approach to system modeling and simulation. The literature demonstrates numerous successful applications of FCMs in identifying failure modes. The standard process of failure mode identification using FCMs involves monitoring crucial concept/node values for excesses. Threshold functions are used to limit the value of nodes to a pre-specified range, usually [0, 1] or [-1, +1]. However, traditional FCMs using the tanh threshold function have two crucial drawbacks for this purpose: (i) a tendency to reduce the values of state vector components, and (ii) a potential inability to reach a limit state with clearly identifiable failure states. The reason is the inherent mathematical nature of the tanh function, which is only asymptotic to the horizontal lines demarcating the edges of the specified range. To overcome these limitations, this paper introduces a novel modified tanh threshold function that effectively addresses both issues.
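The asymptote problem is easy to demonstrate. The abstract does not give the paper's exact modified function, so the sketch below uses one hypothetical fix for illustration: rescaling tanh so it attains ±1 at a finite input and clipping beyond it, applied inside a standard synchronous FCM update.

```python
import numpy as np

# Standard tanh never attains -1/+1 (it is only asymptotic to them), so
# an FCM thresholded with tanh may never show a clearly saturated
# failure state. One hypothetical fix (NOT the paper's exact function,
# which the abstract does not specify) rescales tanh to reach +/-1 at a
# finite input L and clips beyond it.
def modified_tanh(x, L=2.0):
    return np.clip(np.tanh(x) / np.tanh(L), -1.0, 1.0)

def fcm_step(state, W, f):
    # One synchronous FCM update: new state = f(W^T state),
    # where W[i, j] is the causal weight from concept i to concept j.
    return f(W.T @ state)

W = np.array([[0.0, 0.8],      # concept 0 excites concept 1
              [0.0, 0.0]])
s = np.array([1.0, 0.0])
```

With plain tanh the activated concept stays strictly below its driving weight's saturation level; the rescaled function pushes node values closer to (and, at finite inputs, onto) the range boundary.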

Citations: 0
Unsupervised Domain Adaptation Depth Estimation Based on Self-attention Mechanism and Edge Consistency Constraints
IF 3.1 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-05-09 | DOI: 10.1007/s11063-024-11621-0
Peng Guo, Shuguo Pan, Peng Hu, Ling Pei, Baoguo Yu

In the unsupervised domain adaptation (UDA) depth estimation task (Akada et al., Self-supervised learning of domain invariant features for depth estimation, in: 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 3377–3387 (2022), 10.1109/WACV51458.2022.00107), a recent adaptive approach uses a bidirectional transformation network to transfer style between the target- and source-domain inputs, and then trains the depth estimation network in each domain. However, the domain adaptation process and the style transfer may introduce defects and biases, often leading to depth holes and missing instance-edge depth in the target domain's output. To address these issues, we propose a training network improved in terms of both model structure and supervision constraints. First, we introduce an edge-guided self-attention mechanism into the task network of each domain to strengthen the network's attention to high-frequency edge features, maintain clear boundaries, and fill in missing depth regions. Furthermore, we utilize an edge detection algorithm to extract edge features from the target-domain input, and we establish edge consistency constraints between inter-domain entities in order to narrow the gap between domains and ease domain-to-domain transfer. Our experiments demonstrate that the proposed method effectively solves the aforementioned problems, producing higher-quality depth maps and outperforming existing state-of-the-art methods.
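One plausible form of an edge-consistency constraint, sketched here as an assumption (the abstract does not give the paper's exact loss), is to penalize disagreement between the gradient-magnitude map of the predicted depth and that of the input image:

```python
import numpy as np

# Hypothetical edge-consistency term: gradient magnitudes of the
# predicted depth map are encouraged to align with those of the image.
def grad_mag(img):
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, :-1] = img[:, 1:] - img[:, :-1]   # horizontal finite difference
    gy[:-1, :] = img[1:, :] - img[:-1, :]   # vertical finite difference
    return np.sqrt(gx ** 2 + gy ** 2)

def edge_consistency_loss(depth, image):
    # L1 distance between max-normalized edge maps.
    e_d, e_i = grad_mag(depth), grad_mag(image)
    e_d = e_d / (e_d.max() + 1e-8)
    e_i = e_i / (e_i.max() + 1e-8)
    return np.abs(e_d - e_i).mean()

img = np.zeros((8, 8)); img[:, 4:] = 1.0              # a vertical edge
depth_good = img.copy()                                # edges aligned
depth_bad = np.zeros((8, 8)); depth_bad[4:, :] = 1.0   # edge elsewhere
```

A depth map whose discontinuities coincide with the image's edges incurs a lower penalty than one whose edges fall elsewhere, which is the behavior the constraint is meant to enforce.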

Citations: 0
A Prototype-Based Neural Network for Image Anomaly Detection and Localization
IF 3.1 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-05-08 | DOI: 10.1007/s11063-024-11466-7
Chao Huang, Zhao Kang, Hong Wu

Image anomaly detection and localization perform not only image-level anomaly classification but also pixel-level localization of anomalous regions. The task has recently received much research attention owing to its wide application in various fields. This paper proposes ProtoAD, a prototype-based neural network for image anomaly detection and localization. First, the patch features of normal images are extracted by a deep network pre-trained on natural images. Then, the prototypes of the normal patch features are learned by non-parametric clustering. Finally, we construct an image anomaly localization network (ProtoAD) by appending the feature extraction network with L2 feature normalization, a 1×1 convolutional layer, channel max-pooling, and a subtraction operation. We use the prototypes as the kernels of the 1×1 convolutional layer; therefore, our neural network needs no training phase and can conduct anomaly detection and localization in an end-to-end manner. Extensive experiments on two challenging industrial anomaly detection datasets, MVTec AD and BTAD, demonstrate that ProtoAD achieves competitive performance compared to state-of-the-art methods at a higher inference speed. The code and pre-trained models are publicly available at https://github.com/98chao/ProtoAD.
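The scoring head described in the abstract (normalize, 1×1 convolution with prototype kernels, channel max-pool, subtraction) can be sketched directly; with L2-normalized features the 1×1 convolution computes per-location cosine similarity to each prototype. This is a NumPy illustration of that pipeline, not the released implementation.

```python
import numpy as np

# Sketch of the ProtoAD scoring head as described in the abstract:
# L2-normalize patch features, correlate with prototype kernels
# (equivalent to a 1x1 convolution), max-pool over the prototype
# channel, and subtract from 1 to obtain a per-pixel anomaly score.
def anomaly_map(features, prototypes):
    # features:   (H, W, C) patch features from a pre-trained backbone
    # prototypes: (K, C) cluster centers of normal patch features
    f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + 1e-8)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + 1e-8)
    sim = np.einsum('hwc,kc->hwk', f, p)   # 1x1 conv with K kernels
    best = sim.max(axis=-1)                # channel max-pooling
    return 1.0 - best                      # subtraction: anomaly score

protos = np.array([[1.0, 0.0], [0.0, 1.0]])      # two "normal" prototypes
feat = np.zeros((2, 2, 2)); feat[..., 0] = 1.0   # patches matching proto 0
feat[1, 1] = np.array([-1.0, 0.0])               # one anomalous patch
amap = anomaly_map(feat, protos)
```

Patches close to any prototype score near zero; the patch far from all prototypes scores high, giving the pixel-level localization map.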

Citations: 0
WaveVC: Speech and Fundamental Frequency Consistent Raw Audio Voice Conversion
IF 3.1 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-05-08 | DOI: 10.1007/s11063-024-11613-0
Kyungdeuk Ko, Donghyeon Kim, Kyungseok Oh, Hanseok Ko

Voice conversion (VC) is the task of changing the speech of a source speaker to a target voice while preserving the linguistic information of the source speech. Existing VC methods typically use the mel-spectrogram as both input and output, so a separate vocoder is required to transform the mel-spectrogram into a waveform. Consequently, VC performance varies with vocoder performance, and noisy speech can be generated due to problems such as train-test mismatch. In this paper, we propose a speech- and fundamental-frequency-consistent raw-audio voice conversion method called WaveVC. Unlike other methods, WaveVC does not require a separate vocoder and can perform VC directly on the raw audio waveform using 1D convolution, eliminating the performance degradation caused by the vocoder's train-test mismatch. In the training phase, WaveVC employs a speech loss and an F0 loss to preserve the content of the source speech and to generate F0-consistent speech using pre-trained networks. In the test phase, the F0 feature of the source speech is concatenated with a content embedding vector to ensure that the converted speech follows the fundamental-frequency contour of the source speech. WaveVC achieves higher performance than baseline methods in both many-to-many and any-to-any VC. The converted samples are available online.
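The test-phase conditioning step the abstract mentions amounts to concatenating the F0 contour with the frame-level content embedding. A minimal sketch under assumed shapes (the paper's decoder and exact F0 normalization are not given in the abstract):

```python
import numpy as np

# Sketch of the test-phase conditioning described in the abstract: the
# source speech's F0 contour is concatenated with a content embedding
# along the channel axis before decoding. The decoder itself is omitted,
# and the mean/variance normalization of F0 is an assumption.
def condition(content_emb, f0):
    # content_emb: (T, C) frame-level content embedding
    # f0:          (T,) fundamental-frequency contour in Hz
    f0_norm = (f0 - f0.mean()) / (f0.std() + 1e-8)   # normalize F0
    return np.concatenate([content_emb, f0_norm[:, None]], axis=1)

emb = np.zeros((5, 16))
f0 = np.array([100.0, 110.0, 120.0, 130.0, 140.0])
z = condition(emb, f0)   # (5, 17): one extra channel carries F0
```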

Citations: 0
Multi-view Self-supervised Learning and Multi-scale Feature Fusion for Automatic Speech Recognition
IF 3.1 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-05-08 | DOI: 10.1007/s11063-024-11614-z
Jingyu Zhao, Ruwei Li, Maocun Tian, Weidong An

To address the poor representation capability and low data-utilization rate of end-to-end speech recognition models in deep learning, this study proposes an end-to-end speech recognition model based on multi-scale feature fusion and multi-view self-supervised learning (MM-ASR), trained under a multi-task learning paradigm. The proposed method emphasizes the importance of inter-layer information within shared encoders, aiming to enhance the model's representation capability via a multi-scale feature fusion module. Moreover, we apply multi-view self-supervised learning to exploit the data effectively. Our approach is rigorously evaluated on the Aishell-1 dataset, and its effectiveness is further validated on the English WSJ corpus. The experimental results demonstrate a noteworthy 4.6% reduction in character error rate, indicating significantly improved speech recognition performance. These findings showcase the effectiveness and potential of our proposed MM-ASR model for end-to-end speech recognition tasks.
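One common way to fuse inter-layer encoder features, shown here purely as a hypothetical illustration (the abstract does not describe the module's internals), is a softmax-weighted sum of same-shaped features from several layers:

```python
import numpy as np

# Hypothetical multi-scale feature fusion: frame-level features from
# several encoder layers are combined with softmax-normalized weights.
# In a trained model the logits would be learnable; here they are fixed.
def fuse(features, logits):
    # features: list of (T, C) arrays from different encoder layers
    w = np.exp(logits) / np.exp(logits).sum()   # softmax fusion weights
    return sum(wi * f for wi, f in zip(w, features))

f1 = np.ones((4, 8)) * 1.0   # shallow-layer features
f2 = np.ones((4, 8)) * 3.0   # deep-layer features
fused = fuse([f1, f2], logits=np.array([0.0, 0.0]))  # equal weights
```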

Citations: 0
TLCE: Transfer-Learning Based Classifier Ensembles for Few-Shot Class-Incremental Learning
IF 3.1 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-05-08 | DOI: 10.1007/s11063-024-11605-0
Shuangmei Wang, Yang Cao, Tieru Wu

Few-shot class-incremental learning (FSCIL) struggles to incrementally recognize novel classes from few examples without catastrophically forgetting old classes or overfitting to new ones. We propose TLCE, which ensembles multiple pre-trained models to improve the separation of novel and old classes. Specifically, we use episodic training to map images from old classes to quasi-orthogonal prototypes, minimizing interference between old and new classes. We then ensemble diverse pre-trained models to further tackle the challenge of data imbalance and enhance adaptation to novel classes. Extensive experiments on various datasets demonstrate that our transfer-learning ensemble approach outperforms state-of-the-art FSCIL methods.
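The quasi-orthogonal-prototype idea can be sketched by assigning each class an orthonormal target vector and classifying by cosine similarity; orthogonal targets mean one class's direction contributes nothing to another's score, which is the interference-minimization the abstract refers to. The episodic training and the backbone ensemble are omitted.

```python
import numpy as np

# Sketch: build (quasi-)orthogonal class prototypes via QR decomposition
# of a random matrix, then classify embeddings by nearest prototype.
def make_prototypes(n_classes, dim, seed=0):
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, n_classes)))
    return q.T                              # rows are orthonormal prototypes

def classify(embedding, prototypes):
    e = embedding / np.linalg.norm(embedding)
    return int(np.argmax(prototypes @ e))   # nearest prototype by cosine

protos = make_prototypes(n_classes=5, dim=64)
```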

Citations: 0
A Parallel Model for Jointly Extracting Entities and Relations
IF 3.1 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-05-07 | DOI: 10.1007/s11063-024-11616-x
Zuqin Chen, Yujie Zheng, Jike Ge, Wencheng Yu, Zining Wang

Extracting relational triples from a piece of text is an essential task in knowledge graph construction. However, most existing methods either identify entities before predicting their relations, or detect relations before recognizing the associated entities. This ordering can lead to error accumulation: once an error occurs in the initial step, it propagates to subsequent steps. To solve this problem, we propose a parallel model for jointly extracting entities and relations, called PRE-Span, which consists of two mutually independent submodules. Specifically, candidate entities and relations are first generated by enumerating token sequences in sentences. Then, two independent submodules (an Entity Extraction Module and a Relation Detection Module) predict entities and relations. Finally, the predictions of the two submodules are analyzed to select entities and relations, which are jointly decoded to obtain relational triples. The advantage of this method is that all triples can be extracted in a single step. Extensive experiments on the WebNLG*, NYT*, NYT, and WebNLG datasets show that our model outperforms other baselines at 94.4%, 88.3%, 86.5%, and 83.0%, respectively.
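The candidate-generation step described in the abstract is span enumeration: every token subsequence up to a maximum width becomes a candidate. A minimal sketch (the maximum width and the two downstream scoring submodules are assumptions/omissions):

```python
# Sketch of the candidate-generation step: candidate entities are
# produced by enumerating token subsequences (spans) of a sentence up
# to a maximum width. Downstream scoring submodules are not reproduced.
def enumerate_spans(tokens, max_width=3):
    spans = []
    for start in range(len(tokens)):
        # end is exclusive; span width is capped at max_width tokens
        for end in range(start + 1, min(start + max_width, len(tokens)) + 1):
            spans.append((start, end, " ".join(tokens[start:end])))
    return spans

cands = enumerate_spans(["Marie", "Curie", "won", "the", "prize"], max_width=2)
```

For a 5-token sentence with width cap 2 this yields 5 single-token plus 4 two-token candidates; the Entity Extraction Module and Relation Detection Module would then score these spans independently.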

从文本中提取关系三元组是构建知识图谱的一项基本任务。然而,现有的大多数方法要么是先识别实体再预测其关系,要么是先检测关系再识别相关实体。这种顺序可能会导致错误累积,因为一旦初始步骤出现错误,就会累积到后续步骤。为了解决这个问题,我们提出了一种联合提取实体和关系的并行模型,称为 PRE-Span,它由两个相互独立的子模块组成。具体来说,首先通过枚举句子中的标记序列生成候选实体和关系。然后,设计两个独立的子模块(实体提取模块和关系检测模块)来预测实体和关系。最后,对两个子模块的预测结果进行分析,选出实体和关系,并对它们进行联合解码,得到关系三。这种方法的优点是只需一步就能提取所有三元组。在 WebNLG*、NYT*、NYT 和 WebNLG 数据集上进行的大量实验表明,我们的模型优于其他基线模型的比例分别为 94.4%、88.3%、86.5% 和 83.0%。
{"title":"A Parallel Model for Jointly Extracting Entities and Relations","authors":"Zuqin Chen, Yujie Zheng, Jike Ge, Wencheng Yu, Zining Wang","doi":"10.1007/s11063-024-11616-x","DOIUrl":"https://doi.org/10.1007/s11063-024-11616-x","url":null,"abstract":"<p>Extracting relational triples from a piece of text is an essential task in knowledge graph construction. However, most existing methods either identify entities before predicting their relations, or detect relations before recognizing associated entities. This order may lead to error accumulation because once there is an error in the initial step, it will accumulate to subsequent steps. To solve this problem, we propose a parallel model for jointly extracting entities and relations, called PRE-Span, which consists of two mutually independent submodules. Specifically, candidate entities and relations are first generated by enumerating token sequences in sentences. Then, two independent submodules (Entity Extraction Module and Relation Detection Module) are designed to predict entities and relations. Finally, the predicted results of the two submodules are analyzed to select entities and relations, which are jointly decoded to obtain relational triples. The advantage of this method is that all triples can be extracted in just one step. Extensive experiments on the WebNLG*, NYT*, NYT and WebNLG datasets show that our model outperforms other baselines at 94.4%, 88.3%, 86.5% and 83.0%, respectively.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140887757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Non-linear Time Series Prediction using Improved CEEMDAN, SVD and LSTM 利用改进的 CEEMDAN、SVD 和 LSTM 进行非线性时间序列预测
IF 3.1 4区 计算机科学 Q2 Computer Science Pub Date : 2024-05-06 DOI: 10.1007/s11063-024-11622-z
Sameer Poongadan, M. C. Lineesh

This study recommends a new time series forecasting model, the ICEEMDAN - SVD - LSTM model, which combines Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (ICEEMDAN), Singular Value Decomposition (SVD) and a Long Short-Term Memory (LSTM) network. It can be applied to analyse non-linear and non-stationary data. The framework of this model comprises three levels: the ICEEMDAN level, the SVD level and the LSTM level. The first level uses ICEEMDAN to decompose the series into a set of IMF components and a residue. The second level uses SVD to de-noise each IMF component and the residue. The third level uses LSTM to forecast each resulting IMF component and the residue. The forecasts of all IMF components and the residue are then summed to obtain the forecast of the original series. The proposed model is contrasted with existing models, namely the LSTM, EMD - LSTM, EEMD - LSTM, CEEMDAN - LSTM, EEMD - SVD - LSTM, ICEEMDAN - LSTM and CEEMDAN - SVD - LSTM models. The comparison demonstrates the advantage of the recommended model over the traditional ones.
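The SVD de-noising step in the second level can be illustrated with a minimal singular-spectrum-style sketch in NumPy. This is an illustrative assumption, not the paper's procedure: the Hankel window length, the retained rank and the function interface are all made up for the example.

```python
import numpy as np

def svd_denoise(x, window=20, rank=2):
    """De-noise a 1-D series: embed it in a Hankel (trajectory) matrix,
    keep only the leading singular components, then average back."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = n - window + 1
    # Trajectory matrix: each column is a length-`window` slice of the series
    H = np.column_stack([x[i:i + window] for i in range(k)])
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    s[rank:] = 0.0                      # truncate: drop the noise subspace
    Hr = (U * s) @ Vt                   # low-rank reconstruction of H
    out = np.zeros(n)
    cnt = np.zeros(n)
    for j in range(k):                  # diagonal averaging back to a series
        out[j:j + window] += Hr[:, j]
        cnt[j:j + window] += 1.0
    return out / cnt
```

In the full pipeline, each ICEEMDAN component and the residue would be passed through a de-noising step like this before being forecast by its own LSTM, and the component forecasts summed to recover the original series.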

本研究推荐了一种新的时间序列预测模型,即 ICEEMDAN - SVD - LSTM 模型,它将改进的完整集合 EMD 与自适应噪声、奇异值分解和长短期记忆网络结合在一起。它可用于分析非线性和非平稳数据。该模型的框架包括三个层次,即 ICEEMDAN 层次、SVD 层次和 LSTM 层次。第一级利用 ICEEMDAN 将序列分解为一些 IMF 成分和残差。第二级中的 SVD 对每个 IMF 分量和残差进行去噪处理。在第三级中,LSTM 对所有产生的 IMF 分量和残差进行预测。为了获得原始数据的预测值,需要将所有 IMF 分量和残差的预测值相加。建议的模型与其他现有模型进行了对比,即 LSTM 模型、EMD - LSTM 模型、EEMD - LSTM 模型、CEEMDAN - LSTM 模型、EEMD - SVD - LSTM 模型、ICEEMDAN - LSTM 模型和 CEEMDAN - SVD - LSTM 模型。对比结果证明了推荐模型比传统模型更有潜力。
{"title":"Non-linear Time Series Prediction using Improved CEEMDAN, SVD and LSTM","authors":"Sameer Poongadan, M. C. Lineesh","doi":"10.1007/s11063-024-11622-z","DOIUrl":"https://doi.org/10.1007/s11063-024-11622-z","url":null,"abstract":"<p>This study recommends a new time series forecasting model, namely ICEEMDAN - SVD - LSTM model, which coalesces Improved Complete Ensemble EMD with Adaptive Noise, Singular Value Decomposition and Long Short Term Memory network. It can be applied to analyse Non-linear and non-stationary data. The framework of this model is comprised of three levels, namely ICEEMDAN level, SVD level and LSTM level. The first level utilized ICEEMDAN to break up the series into some IMF components along with a residue. The SVD in the second level accounts for de-noising of every IMF component and residue. LSTM forecasts all the resultant IMF components and residue in third level. To obtain the forecasted values of the original data, the predictions of all IMF components and residue are added. The proposed model is contrasted with other extant ones, namely LSTM model, EMD - LSTM model, EEMD - LSTM model, CEEMDAN - LSTM model, EEMD - SVD - LSTM model, ICEEMDAN - LSTM model and CEEMDAN - SVD - LSTM model. The comparison bears witness to the potential of the recommended model over the traditional models.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140887562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0