Neural Processing Letters: Latest Articles, Page 8

Dissipativity Analysis of Memristive Inertial Competitive Neural Networks with Mixed Delays

IF 3.1 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Processing Letters

Pub Date : 2024-04-20 DOI: 10.1007/s11063-024-11610-3

Jin Yang, Jigui Jian

Without transforming the inertial system into two first-order differential systems, this paper studies the global exponential dissipativity (GED) of memristive inertial competitive neural networks (MICNNs) with mixed delays. For this purpose, a novel differential inequality is first established for the system under discussion. Then, by applying this inequality and constructing some novel Lyapunov functionals, GED criteria in algebraic form and in linear matrix inequality (LMI) form are given, respectively. Furthermore, an estimate of the global exponential attractive set (GEAS) is furnished. Finally, a specific illustrative example is analyzed to check the correctness and feasibility of the obtained findings.
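
To illustrate how a differential inequality of this general kind can yield an exponentially attractive set, consider the textbook scalar case below. It is a generic comparison argument, not the novel inequality established in the paper; the function V and the constants \alpha and \beta are assumptions introduced only for illustration.

```latex
% Generic illustration (assumed symbols V, \alpha, \beta; not the paper's inequality).
% Suppose a Lyapunov-type function V(t) \ge 0 satisfies, for constants \alpha > 0 and \beta \ge 0,
\[
  \dot V(t) \le -\alpha V(t) + \beta, \qquad t \ge 0 .
\]
% Multiplying by e^{\alpha t} and integrating over [0, t] gives
\[
  V(t) \le \frac{\beta}{\alpha} + \Bigl( V(0) - \frac{\beta}{\alpha} \Bigr) e^{-\alpha t},
\]
% so every trajectory approaches the set \{ V \le \beta / \alpha \} at exponential rate \alpha.
```

In this simplified setting the sublevel set \{ V \le \beta/\alpha \} plays the role of an attractive set estimate.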

Citations: 0

SSGAN: A Semantic Similarity-Based GAN for Small-Sample Image Augmentation

IF 3.1 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Processing Letters

Pub Date : 2024-04-16 DOI: 10.1007/s11063-024-11498-z

Congcong Ma, Jiaqi Mi, Wanlin Gao, Sha Tao

Image sample augmentation refers to strategies for increasing sample size by modifying current data or synthesizing new data based on existing data. This technique is vital for enhancing the performance of downstream learning tasks in the many scenarios where only small samples are available. In recent years, GAN-based image augmentation methods have attracted significant research attention and have achieved remarkable generation results on large-scale datasets. However, their performance tends to be unsatisfactory when applied to datasets with limited samples. Therefore, this paper proposes a semantic similarity-based small-sample image augmentation method named SSGAN. Firstly, a relatively shallow pyramid-structured GAN-based backbone network is designed, aiming to enhance the model’s feature extraction capabilities to adapt to small sample sizes. Secondly, a feature selection module based on high-dimensional semantics is designed to optimize the loss function, thereby improving the model’s learning capacity. Lastly, extensive comparative experiments and comprehensive ablation experiments are carried out on the “Flower” and “Animal” datasets. The results indicate that the proposed method outperforms other classical GAN methods on well-established evaluation metrics such as FID and IS, with improvements of 18.6 and 1.4, respectively. The dataset augmented by SSGAN significantly enhances the performance of the classifier, achieving a 2.2% accuracy improvement compared to the best-known method. Furthermore, SSGAN demonstrates excellent generalization and robustness.
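
For readers unfamiliar with the two metrics named in the abstract, their standard definitions are recalled below. The symbols are the conventional ones and are not taken from the paper itself.

```latex
% Standard definitions. (\mu_r, \Sigma_r) and (\mu_g, \Sigma_g) denote the mean and covariance
% of Inception-network features of real and generated images, respectively.
\[
  \mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
    + \operatorname{Tr}\bigl( \Sigma_r + \Sigma_g - 2 (\Sigma_r \Sigma_g)^{1/2} \bigr)
\]
% p(y \mid x) is the classifier's label distribution for a generated image x,
% and p(y) is its marginal over generated images.
\[
  \mathrm{IS} = \exp\Bigl( \mathbb{E}_{x \sim p_g} \,
    D_{\mathrm{KL}}\bigl( p(y \mid x) \,\Vert\, p(y) \bigr) \Bigr)
\]
```

Lower FID and higher IS indicate better generation quality, so the reported improvements presumably correspond to a decrease in FID and an increase in IS; the abstract does not state the direction explicitly.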

Citations: 0

Efficient Visual Metaphor Image Generation Based on Metaphor Understanding

IF 3.1 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Processing Letters

Pub Date : 2024-04-16 DOI: 10.1007/s11063-024-11609-w

Chang Su, Xingyue Wang, Shupin Liu, Yijiang Chen

Metaphor has significant implications for revealing cognitive and thinking mechanisms. Visual metaphor image generation not only presents metaphorical connotations intuitively but also reflects AI’s understanding of metaphor through the generated images. This paper investigates the task of generating images from text that contains visual metaphors. We explore metaphor image generation and create a dataset of sentences with visual metaphors. Then, we propose a visual metaphor image generation framework based on metaphor understanding, which is more tailored to the essence of metaphor, better utilizes visual features, and has stronger interpretability. Specifically, the framework extracts the source domain, target domain, and metaphor interpretation from metaphorical sentences, separating the elements of the metaphor to deepen the understanding of its themes and intentions. Additionally, the framework introduces image data from the source domain to capture visual similarities and generate visual enhancement prompts specific to the domain. Finally, these prompts are combined with metaphorical interpretation sentences to form the final prompt text. Experimental results demonstrate that this approach effectively captures the essence of metaphor and generates metaphorical images consistent with the textual meaning.
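
The final step described in the abstract, combining an interpretation sentence with visual enhancement prompts, can be sketched as follows. This is only a minimal illustration: the `MetaphorParse` container, the `build_prompt` function, the comma-joined prompt format, and the example sentence are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class MetaphorParse:
    """Hypothetical container for the three elements extracted from a sentence."""
    source_domain: str
    target_domain: str
    interpretation: str


def build_prompt(parse: MetaphorParse, visual_hints: list[str]) -> str:
    """Join the interpretation sentence with non-empty visual enhancement hints."""
    hints = [h.strip() for h in visual_hints if h.strip()]
    parts = [parse.interpretation.strip()] + hints
    return ", ".join(parts)


# Illustrative input only; the sentence and hints are made up for this sketch.
parse = MetaphorParse(
    source_domain="lion",
    target_domain="soldier",
    interpretation="a soldier who is as brave as a lion",
)
prompt = build_prompt(parse, ["golden mane", " fierce gaze ", ""])
print(prompt)
```

Keeping the parsed elements in a separate structure, as assumed here, makes it easy to inspect which part of the final prompt came from the interpretation and which from the visual hints.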

Citations: 0

Learning Reliable Dense Pseudo-Labels for Point-Level Weakly-Supervised Action Localization

IF 3.1 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Processing Letters

Pub Date : 2024-04-10 DOI: 10.1007/s11063-024-11598-w

Yuanjie Dang, Guozhu Zheng, Peng Chen, Nan Gao, Ruohong Huan, Dongdong Zhao, Ronghua Liang

Point-level weakly-supervised temporal action localization aims to accurately recognize and localize action segments in untrimmed videos, using only point-level annotations during training. Current methods primarily focus on mining sparse pseudo-labels and generating dense pseudo-labels. However, due to the sparsity of point-level labels and the impact of scene information on action representations, the reliability of dense pseudo-label methods remains an issue. In this paper, we propose a point-level weakly-supervised temporal action localization method based on local representation enhancement and global temporal optimization. This method comprises two modules that enhance the representation capacity of action features and improve the reliability of class activation sequence classification, thereby enhancing the reliability of dense pseudo-labels and strengthening the model’s capability for completeness learning. Specifically, we first generate representative features of actions using pseudo-label features and calculate weights based on the feature similarity between the representative action features and the segment features to adjust the class activation sequence. Additionally, we maintain fixed-length queues for annotated segments and design an action contrastive learning framework between videos. The experimental results demonstrate that our modules indeed enhance the model’s capability for comprehensive learning, particularly achieving state-of-the-art results at high IoU thresholds.
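
Two ingredients mentioned in the abstract, similarity-based reweighting of a class activation sequence and a fixed-length feature queue, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function names, the use of cosine similarity clipped at zero, and the mean-of-queue prototype are hypothetical choices, not the paper's actual formulation.

```python
import math
from collections import deque


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def reweight_cas(cas, segment_feats, action_proto):
    """Scale each per-segment activation by its clipped similarity to the prototype."""
    weights = [max(0.0, cosine(f, action_proto)) for f in segment_feats]
    return [score * w for score, w in zip(cas, weights)]


# Fixed-length queue: appending beyond maxlen silently drops the oldest entry.
queue = deque(maxlen=3)
for feat in ([1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.7, 0.3]):
    queue.append(feat)

# Assumed prototype: element-wise mean of the queued features.
proto = [sum(col) / len(queue) for col in zip(*queue)]

# The second segment points away from the prototype, so its activation is zeroed.
adjusted = reweight_cas([0.5, 0.5], [[1.0, 0.0], [0.0, -1.0]], proto)
print(list(queue), adjusted)
```

A bounded `deque` is a natural fit for such a queue because eviction of the oldest feature is handled automatically.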

Citations: 0

Novel GCN Model Using Dense Connection and Attention Mechanism for Text Classification

IF 3.1 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Processing Letters

Pub Date : 2024-04-09 DOI: 10.1007/s11063-024-11599-9

Yinbin Peng, Wei Wu, Jiansi Ren, Xiang Yu

Text classification algorithms currently in use that are based on Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs) can successfully extract local textual features but disregard global information. Due to their ability to understand complex text structures and maintain global information, Graph Neural Networks (GNNs) have demonstrated considerable promise in text classification. However, most GNN text classification models in use at present are shallow, unable to capture long-distance node information or to reflect features of the text at various scales (such as words, phrases, etc.). All of this negatively impacts the performance of the final classification. A novel Graph Convolutional Neural Network (GCN) with dense connections and an attention mechanism for text classification is proposed to address these constraints. By increasing the depth of the GCN, the densely connected graph convolutional network (DC-GCN) gathers information about distant nodes. The DC-GCN reuses the small-scale features of shallow layers and produces features at different scales through dense connections. Finally, an attention mechanism is added to combine features and determine their relative importance. Experimental results on four benchmark datasets demonstrate that our model’s classification accuracy greatly exceeds that of conventional deep learning text classification models. Our model also performs exceptionally well when compared to other GCN-based text classification algorithms.
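
As background for the dense-connection idea, the standard GCN propagation rule is recalled below, followed by one possible way to write a densely connected layer. The second formula is an assumption made for illustration; the abstract does not give the exact form used in DC-GCN.

```latex
% Standard GCN layer (Kipf and Welling), with \tilde A = A + I and \tilde D the degree matrix of \tilde A:
\[
  H^{(l+1)} = \sigma\bigl( \hat A \, H^{(l)} W^{(l)} \bigr),
  \qquad \hat A = \tilde D^{-1/2} \tilde A \, \tilde D^{-1/2}
\]
% One possible dense-connection variant (an assumption, not taken from the paper):
% layer l consumes the concatenation of the outputs of all earlier layers.
\[
  H^{(l+1)} = \sigma\bigl( \hat A \, [\, H^{(0)} \,\Vert\, H^{(1)} \,\Vert\, \cdots \,\Vert\, H^{(l)} \,] \, W^{(l)} \bigr)
\]
```

Because each application of \hat A mixes information from one-hop neighbors, stacking l layers reaches l-hop neighbors, which is why added depth helps with distant nodes.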

Citations: 0

Consensus Affinity Graph Learning via Structure Graph Fusion and Block Diagonal Representation for Multiview Clustering

IF 3.1 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Processing Letters

Pub Date : 2024-04-08 DOI: 10.1007/s11063-024-11589-x

Zhongyan Gui, Jing Yang, Zhiqiang Xie, Cuicui Ye

Learning a robust affinity graph is fundamental to graph-based clustering methods. However, some existing affinity graph learning methods have encountered the following problems. First, the constructed affinity graphs cannot capture the intrinsic structure of data well. Second, when fusing all view-specific affinity graphs, most of them obtain a fusion graph by simply taking the average of multiple views, or directly learning a common graph from multiple views, without considering the discriminative property among diverse views. Third, the fusion graph does not maintain an explicit cluster structure. To alleviate these problems, the adaptive neighbor graph learning approach and the data self-expression approach are first integrated into a structure graph fusion framework to obtain a view-specific structure affinity graph to capture the local and global structures of data. Then, all the structural affinity graphs are weighted dynamically into a consensus affinity graph, which not only effectively incorporates the complementary affinity structure of important views but also has the capability of preserving the consensus affinity structure that is shared by all views. Finally, a k–block diagonal regularizer is introduced for the consensus affinity graph to encourage it to have an explicit cluster structure. An efficient optimization algorithm is developed to tackle the resultant optimization problem. Extensive experiments on benchmark datasets validate the superiority of the proposed method.
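
The link between an explicit cluster structure and block diagonality can be made concrete: a graph with exactly k connected components has an affinity matrix that is block diagonal with k blocks, up to a permutation of the nodes. The sketch below fuses two toy view-specific graphs with fixed weights and counts the components of the result. The fixed weights and the function names are assumptions for illustration; the paper learns its weights dynamically and uses a regularizer rather than a component count.

```python
def fuse(graphs, weights):
    """Weighted sum of equally sized affinity matrices; weights are normalized to sum to 1."""
    total = sum(weights)
    n = len(graphs[0])
    return [[sum(w / total * g[i][j] for g, w in zip(graphs, weights))
             for j in range(n)] for i in range(n)]


def count_components(affinity, eps=1e-12):
    """Number of connected components, treating entries above eps as edges."""
    n = len(affinity)
    seen, components = set(), 0
    for start in range(n):
        if start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:  # depth-first traversal of one component
            i = stack.pop()
            for j in range(n):
                if j not in seen and (affinity[i][j] > eps or affinity[j][i] > eps):
                    seen.add(j)
                    stack.append(j)
    return components


# Two toy views over four samples; view_b misses the edge between samples 2 and 3.
view_a = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
view_b = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
consensus = fuse([view_a, view_b], weights=[2.0, 1.0])
print(count_components(view_b), count_components(consensus))
```

In this toy case the fused graph recovers the two-cluster structure that the second view alone does not show.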

Citations: 0

DialGNN: Heterogeneous Graph Neural Networks for Dialogue Classification

IF 3.1 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Processing Letters

Pub Date : 2024-04-08 DOI: 10.1007/s11063-024-11595-z

Yan Yan, Bo-Wen Zhang, Peng-hao Min, Guan-wen Ding, Jun-yuan Liu

Dialogue systems have attracted growing research interest due to their widespread applications in various domains. However, most research focuses on sentence-level intent recognition to interpret user utterances in dialogue systems, while comprehension of whole documents has not attracted sufficient attention. In this paper, we propose DialGNN, a heterogeneous graph neural network framework tailored to the problem of dialogue classification, which takes the entire dialogue as input. Specifically, a heterogeneous graph is constructed with nodes at different levels of semantic granularity. The graph framework allows flexible integration of various pre-trained language representation models, such as BERT and its variants, which endows DialGNN with powerful text representation capabilities. DialGNN achieves superior results on the CM and ECS datasets, which demonstrates its robustness and effectiveness. Specifically, our model achieves a notable enhancement in performance, optimizing the classification of document-level dialogue text. The implementation of DialGNN and related data are shared through https://github.com/821code/DialGNN.
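
A heterogeneous graph with nodes at several levels of granularity might be assembled along the following lines. This is a hedged sketch, not the DialGNN implementation: the three node types (dialogue, utterance, word), the whitespace tokenization, and the `build_dialogue_graph` function are assumptions chosen for illustration, and the linked repository should be consulted for the actual construction.

```python
from collections import defaultdict


def build_dialogue_graph(dialogue_id, utterances):
    """Build an undirected graph with dialogue, utterance, and word nodes.

    Nodes are (type, key) tuples; the three node types are illustrative assumptions.
    """
    graph = defaultdict(set)

    def link(a, b):
        graph[a].add(b)
        graph[b].add(a)

    dialogue = ("dialogue", dialogue_id)
    for idx, text in enumerate(utterances):
        utterance = ("utterance", f"{dialogue_id}:{idx}")
        link(dialogue, utterance)
        # Naive lowercase whitespace tokenization, used only for this sketch.
        for word in set(text.lower().split()):
            link(utterance, ("word", word))
    return graph


graph = build_dialogue_graph("d1", ["Hello there", "hello I need help"])
print(len(graph), sorted(graph[("word", "hello")]))
```

Shared word nodes, such as "hello" here, connect utterances to each other, which is what lets information flow across the whole dialogue rather than within a single sentence.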

Citations: 0

Semantic Spectral Clustering with Contrastive Learning and Neighbor Mining

IF 3.1 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Processing Letters

Pub Date : 2024-04-07 DOI: 10.1007/s11063-024-11597-x

Nongxiao Wang, Xulun Ye, Jieyu Zhao, Qing Wang

Deep spectral clustering techniques are considered among the most efficient clustering algorithms in the data mining field. The similarity between instances and the disparity among classes are two critical factors in clustering. However, most current deep spectral clustering approaches do not sufficiently take both into consideration. To tackle this issue, we propose the Semantic Spectral clustering with Contrastive learning and Neighbor mining (SSCN) framework, which performs instance-level pulling and cluster-level pushing cooperatively. Specifically, we obtain the semantic feature embedding using an unsupervised contrastive learning model. Next, we obtain the nearest neighbors partially and globally, and the neighbors, along with data augmentation information, enhance their effectiveness collaboratively at the instance level as well as the cluster level. The spectral constraint is applied through orthogonal layers to satisfy conventional spectral clustering. Extensive experiments demonstrate the superiority of our proposed spectral clustering framework.
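
The spectral constraint mentioned in the abstract refers to the orthogonality condition of conventional spectral clustering, whose relaxed objective is recalled below. The notation is the textbook one and is not taken from the paper; how SSCN's orthogonal layers enforce the constraint is not specified in the abstract.

```latex
% Relaxed spectral clustering objective. W is the n x n affinity matrix,
% D its diagonal degree matrix, L = D - W the graph Laplacian,
% and Y \in \mathbb{R}^{n \times k} the cluster embedding.
\[
  \min_{Y \in \mathbb{R}^{n \times k}} \operatorname{Tr}\bigl( Y^{\top} L \, Y \bigr)
  \quad \text{subject to} \quad Y^{\top} Y = I_k
\]
% Equivalent pairwise form of the objective, showing that similar points are pulled together:
\[
  \operatorname{Tr}\bigl( Y^{\top} L \, Y \bigr)
    = \tfrac{1}{2} \sum_{i,j} W_{ij} \, \lVert y_i - y_j \rVert_2^2
\]
```

Without the orthogonality constraint the objective would be minimized trivially by mapping every point to the same embedding.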

Citations: 0

SGNNRec: A Scalable Double-Layer Attention-Based Graph Neural Network Recommendation Model

IF 3.1 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Processing Letters

Pub Date : 2024-04-04 DOI: 10.1007/s11063-024-11555-7

Jing He, Le Tang, Dan Tang, Ping Wang, Li Cai

Because information from multi-relationship graphs is difficult to aggregate, graph neural network recommendation models focus on single-relational graphs (e.g., the user-item rating bipartite graph and user-user social relationship graphs). However, existing graph neural network recommendation models have insufficient flexibility. Recommendation accuracy actually decreases when low-quality auxiliary information is aggregated into the recommendation model. This paper proposes a scalable graph neural network recommendation model named SGNNRec. SGNNRec fuses a variety of auxiliary information (e.g., user social information, item tag information, and user-item interaction information) besides user-item ratings as supplements to address data sparsity. A tag cluster-based item-semantic graph method and an Apriori algorithm-based user-item interaction graph method are proposed to construct the graph relations. Furthermore, a double-layer attention network is designed to learn the influence of latent factors, so that the latent factors can be optimized to obtain the best recommendation results. Empirical results on real-world datasets verify the effectiveness of our model. SGNNRec can reduce the influence of poor auxiliary information; moreover, as the amount of auxiliary information increases, the model accuracy improves.
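
To give a feel for how an Apriori-style step could turn interaction records into graph edges, the sketch below mines item pairs that co-occur often enough and treats them as candidate edges. It covers only the first two levels of Apriori, and the function name, the toy baskets, and the idea of using frequent pairs directly as edges are assumptions for illustration rather than the paper's actual construction.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that co-occur in at least min_support transactions.

    Follows the Apriori pruning idea: only items that are frequent on their own
    can be part of a frequent pair.
    """
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(set(t) & frequent_items)  # sorted so each pair has one canonical order
        pair_counts.update(combinations(kept, 2))
    return {pair: c for pair, c in pair_counts.items() if c >= min_support}


# Toy interaction records (hypothetical): each list is one user's set of items.
baskets = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
edges = frequent_pairs(baskets, min_support=2)
print(edges)
```

Raising `min_support` yields a sparser graph, which is one simple way to keep weak co-occurrences out of the auxiliary information.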

Citations: 0

Hierarchical Patch Aggregation Transformer for Motion Deblurring

IF 3.1 Zone 4 Computer Science Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Processing Letters

Pub Date : 2024-04-04 DOI: 10.1007/s11063-024-11594-0

Yujie Wu, Lei Liang, Siyao Ling, Zhisheng Gao

The encoder-decoder framework based on Transformer components has become a paradigm in image deblurring architecture design. In this paper, we critically revisit this approach and find that many current architectures focus excessively on limited local regions during the feature extraction stage. These designs compromise the feature richness and diversity of the encoder-decoder framework, leading to bottlenecks in performance improvement. To address these deficiencies, a novel Hierarchical Patch Aggregation Transformer architecture (HPAT) is proposed. In the initial feature extraction stage, HPAT combines Axis-Selective Transformer Blocks with linear complexity and is supplemented by an adaptive hierarchical attention fusion mechanism. These mechanisms enable the model to effectively capture the spatial relationships between features and integrate features from different hierarchical levels. Then, we redesign the feedforward network of the Transformer block in the encoder-decoder structure and propose the Fused Feedforward Network. This aggregation enhances the ability to capture and retain local detailed features. We evaluate HPAT through extensive experiments and compare its performance with baseline methods on public datasets. Experimental results show that the proposed HPAT model achieves state-of-the-art performance in image deblurring tasks.

Citations: 0