Dissipativity Analysis of Memristive Inertial Competitive Neural Networks with Mixed Delays
Pub Date: 2024-04-20 | DOI: 10.1007/s11063-024-11610-3
Jin Yang, Jigui Jian
Without transforming the inertial system into two first-order differential systems, this paper investigates the global exponential dissipativity (GED) of memristive inertial competitive neural networks (MICNNs) with mixed delays. For this purpose, a novel differential inequality is first established for the system under discussion. Then, by applying the established inequality and constructing some novel Lyapunov functionals, GED criteria are given in algebraic form and in linear matrix inequality (LMI) form, respectively. Furthermore, an estimate of the global exponential attractive set (GEAS) is furnished. Finally, a specific illustrative example is analyzed to check the correctness and feasibility of the obtained findings.
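The abstract does not reproduce the model equations. Purely for orientation, a memristive inertial neural network with mixed (discrete plus distributed) delays is commonly written as a second-order system of the following generic form; the paper's actual competitive (STM/LTM) system may differ, so this is an illustrative sketch, not the authors' model:

```latex
% Generic illustrative form only -- not the paper's exact MICNN system.
% State-dependent weights c_{ij}(.), d_{ij}(.) switch values, memristor-style.
\ddot{x}_i(t) = -a_i\,\dot{x}_i(t) - b_i\,x_i(t)
  + \sum_{j=1}^{n} c_{ij}\bigl(x_j(t)\bigr) f_j\bigl(x_j(t)\bigr)
  + \sum_{j=1}^{n} d_{ij}\bigl(x_j(t)\bigr) f_j\bigl(x_j(t-\tau_{ij}(t))\bigr)
  + \sum_{j=1}^{n} e_{ij} \int_{t-\sigma}^{t} f_j\bigl(x_j(s)\bigr)\,ds + I_i .
```

In this setting, global exponential dissipativity asserts the existence of a compact set that every trajectory approaches at an exponential rate; the GEAS is an explicit estimate of such a set.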
{"title":"Dissipativity Analysis of Memristive Inertial Competitive Neural Networks with Mixed Delays","authors":"Jin Yang, Jigui Jian","doi":"10.1007/s11063-024-11610-3","DOIUrl":"https://doi.org/10.1007/s11063-024-11610-3","url":null,"abstract":"<p>Without altering the inertial system into the two first-order differential systems, this paper primarily works over the global exponential dissipativity (GED) of memristive inertial competitive neural networks (MICNNs) with mixed delays. For this purpose, a novel differential inequality is primarily established around the discussed system. Then, by applying the founded inequality and constructing some novel Lyapunov functionals, the GED criteria in the algebraic form and the linear matrix inequality (LMI) form are given, respectively. Furthermore, the estimation of the global exponential attractive set (GEAS) is furnished. Finally, a specific illustrative example is analyzed to check the correctness and feasibility of the obtained findings.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"24 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140630251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SSGAN: A Semantic Similarity-Based GAN for Small-Sample Image Augmentation
Pub Date: 2024-04-16 | DOI: 10.1007/s11063-024-11498-z
Congcong Ma, Jiaqi Mi, Wanlin Gao, Sha Tao
Image sample augmentation refers to strategies for increasing sample size by modifying current data or synthesizing new data based on existing data. This technique is of vital significance for enhancing the performance of downstream learning tasks in widespread small-sample scenarios. In recent years, GAN-based image augmentation methods have gained significant research attention and have achieved remarkable generation results on large-scale datasets. However, their performance tends to be unsatisfactory when applied to datasets with limited samples. Therefore, this paper proposes a semantic similarity-based small-sample image augmentation method named SSGAN. First, a relatively shallow pyramid-structured GAN-based backbone network is designed to enhance the model's feature extraction capability and adapt to small sample sizes. Second, a feature selection module based on high-dimensional semantics is designed to optimize the loss function, thereby improving the model's learning capacity. Finally, extensive comparative experiments and comprehensive ablation experiments are carried out on the "Flower" and "Animal" datasets. The results indicate that the proposed method outperforms other classical GAN methods on well-established evaluation metrics such as FID and IS, with improvements of 18.6 and 1.4, respectively. The dataset augmented by SSGAN significantly enhances classifier performance, achieving a 2.2% accuracy improvement over the best existing method. Furthermore, SSGAN demonstrates excellent generalization and robustness.
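The abstract does not specify the feature-selection loss in detail. As a minimal hedged sketch, with the function name, the channel-gap selection rule, and `top_k` all hypothetical rather than taken from the paper, a semantic-similarity term over selected high-level feature channels could look like this in PyTorch:

```python
import torch
import torch.nn.functional as F

def semantic_similarity_loss(real_feats: torch.Tensor,
                             fake_feats: torch.Tensor,
                             top_k: int = 128) -> torch.Tensor:
    """Hypothetical reading of a semantic feature-selection term: keep only
    the top-k feature channels whose real/fake batch statistics differ most,
    and penalize cosine dissimilarity on those channels."""
    mu_real = real_feats.mean(dim=0)   # per-channel means, shape (D,)
    mu_fake = fake_feats.mean(dim=0)
    gap = (mu_real - mu_fake).abs()    # channels with the largest gap
    idx = gap.topk(min(top_k, gap.numel())).indices
    # 1 - cosine similarity on the selected "semantic" channels
    return 1.0 - F.cosine_similarity(mu_real[idx], mu_fake[idx], dim=0)
```

Such a term would be added to the usual adversarial loss of the generator; the weighting between the two is another free choice not given in the abstract.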
{"title":"SSGAN: A Semantic Similarity-Based GAN for Small-Sample Image Augmentation","authors":"Congcong Ma, Jiaqi Mi, Wanlin Gao, Sha Tao","doi":"10.1007/s11063-024-11498-z","DOIUrl":"https://doi.org/10.1007/s11063-024-11498-z","url":null,"abstract":"<p>Image sample augmentation refers to strategies for increasing sample size by modifying current data or synthesizing new data based on existing data. This technique is of vital significance in enhancing the performance of downstream learning tasks in widespread small-sample scenarios. In recent years, GAN-based image augmentation methods have gained significant attention and research focus. They have achieved remarkable generation results on large-scale datasets. However, their performance tends to be unsatisfactory when applied to datasets with limited samples. Therefore, this paper proposes a semantic similarity-based small-sample image augmentation method named SSGAN. Firstly, a relatively shallow pyramid-structured GAN-based backbone network was designed, aiming to enhance the model’s feature extraction capabilities to adapt to small sample sizes. Secondly, a feature selection module based on high-dimensional semantics was designed to optimize the loss function, thereby improving the model’s learning capacity. Lastly, extensive comparative experiments and comprehensive ablation experiments were carried out on the “Flower” and “Animal” datasets. The results indicate that the proposed method outperforms other classical GANs methods in well-established evaluation metrics such as FID and IS, with improvements of 18.6 and 1.4, respectively. The dataset augmented by SSGAN significantly enhances the performance of the classifier, achieving a 2.2% accuracy improvement compared to the best-known method. Furthermore, SSGAN demonstrates excellent generalization and robustness.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"47 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140571991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efficient Visual Metaphor Image Generation Based on Metaphor Understanding
Pub Date: 2024-04-16 | DOI: 10.1007/s11063-024-11609-w
Chang Su, Xingyue Wang, Shupin Liu, Yijiang Chen
Metaphor has significant implications for revealing cognitive and thinking mechanisms. Visual metaphor image generation not only presents metaphorical connotations intuitively but also reflects an AI system's understanding of metaphor through the generated images. This paper investigates the task of generating images from text containing visual metaphors. We explore metaphor image generation and create a dataset of sentences with visual metaphors. We then propose a visual metaphor image generation framework based on metaphor understanding, which is more tailored to the essence of metaphor, makes better use of visual features, and offers stronger interpretability. Specifically, the framework extracts the source domain, target domain, and metaphor interpretation from metaphorical sentences, separating the elements of the metaphor to deepen the understanding of its themes and intentions. Additionally, the framework introduces image data from the source domain to capture visual similarities and generate domain-specific visual enhancement prompts. Finally, these prompts are combined with the metaphor interpretation sentences to form the final prompt text. Experimental results demonstrate that this approach effectively captures the essence of metaphor and generates metaphorical images consistent with the textual meaning.
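To make the pipeline concrete, here is a hedged sketch of the final prompt-composition step (source domain, target domain, and interpretation plus source-domain visual cues); the template wording and the `MetaphorParse` and `build_prompt` names are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class MetaphorParse:
    source: str          # source domain, e.g. "a blanket"
    target: str          # target domain, e.g. "snow"
    interpretation: str  # e.g. "snow covers the town completely and softly"

def build_prompt(parse: MetaphorParse, visual_cues: list[str]) -> str:
    """Compose the final text-to-image prompt from the metaphor parse and
    visual enhancement cues mined from source-domain images (hypothetical
    template; the paper's actual prompt format is not given)."""
    cues = ", ".join(visual_cues)
    return (f"{parse.target} depicted as {parse.source}: "
            f"{parse.interpretation}. Visual emphasis: {cues}.")

# Example: "snow is a blanket over the town"
prompt = build_prompt(
    MetaphorParse("a blanket", "snow",
                  "snow covers the town completely and softly"),
    ["soft folds", "uniform white texture"])
print(prompt)
```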
{"title":"Efficient Visual Metaphor Image Generation Based on Metaphor Understanding","authors":"Chang Su, Xingyue Wang, Shupin Liu, Yijiang Chen","doi":"10.1007/s11063-024-11609-w","DOIUrl":"https://doi.org/10.1007/s11063-024-11609-w","url":null,"abstract":"<p>Metaphor has significant implications for revealing cognitive and thinking mechanisms. Visual metaphor image generation not only presents metaphorical connotations intuitively but also reflects AI’s understanding of metaphor through the generated images. This paper investigates the task of generating images based on text with visual metaphors. We explore metaphor image generation and create a dataset containing sentences with visual metaphors. Then, we propose a visual metaphor generation image framework based on metaphor understanding, which is more tailored to the essence of metaphor, better utilizes visual features, and has stronger interpretability. Specifically, the framework extracts the source domain, target domain, and metaphor interpretation from metaphorical sentences, separating the elements of the metaphor to deepen the understanding of its themes and intentions. Additionally, the framework introduces image data from the source domain to capture visual similarities and generate visual enhancement prompts specific to the domain. Finally, these prompts are combined with metaphorical interpretation sentences to form the final prompt text. Experimental results demonstrate that this approach effectively captures the essence of metaphor and generates metaphorical images consistent with the textual meaning.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"26 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140616317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning Reliable Dense Pseudo-Labels for Point-Level Weakly-Supervised Action Localization
Pub Date: 2024-04-10 | DOI: 10.1007/s11063-024-11598-w
Yuanjie Dang, Guozhu Zheng, Peng Chen, Nan Gao, Ruohong Huan, Dongdong Zhao, Ronghua Liang
Point-level weakly-supervised temporal action localization aims to accurately recognize and localize action segments in untrimmed videos using only point-level annotations during training. Current methods focus primarily on mining sparse pseudo-labels and generating dense pseudo-labels. However, because point-level labels are sparse and scene information affects action representations, the reliability of dense pseudo-label methods remains an issue. In this paper, we propose a point-level weakly-supervised temporal action localization method based on local representation enhancement and global temporal optimization. The method comprises two modules that enhance the representation capacity of action features and improve the reliability of class activation sequence classification, thereby improving the reliability of dense pseudo-labels and strengthening the model's capacity for completeness learning. Specifically, we first generate representative features of actions using pseudo-label features, then compute weights from the feature similarity between these representative features and segment features to adjust the class activation sequence. Additionally, we maintain fixed-length queues for annotated segments and design an action contrastive learning framework across videos. Experimental results demonstrate that our modules indeed enhance the model's capability for comprehensive learning, achieving state-of-the-art results particularly at high IoU thresholds.
Novel GCN Model Using Dense Connection and Attention Mechanism for Text Classification
Pub Date: 2024-04-09 | DOI: 10.1007/s11063-024-11599-9
Yinbin Peng, Wei Wu, Jiansi Ren, Xiang Yu
Text classification algorithms based on Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs) can successfully extract local textual features but disregard global information. Due to its ability to understand complex text structures and maintain global information, the Graph Neural Network (GNN) has demonstrated considerable promise in text classification. However, most GNN text classification models currently in use are shallow, unable to capture long-distance node information or to reflect features of the text at multiple scales (words, phrases, etc.), all of which negatively impacts the final classification performance. A novel Graph Convolutional Neural Network (GCN) with dense connections and an attention mechanism is proposed to address these constraints. By increasing the depth of the GCN, the densely connected graph convolutional network (DC-GCN) gathers information about distant nodes. The DC-GCN reuses the small-scale features of shallow layers and produces features at different scales through dense connections. To combine these features and determine their relative importance, an attention mechanism is finally added. Experimental results on four benchmark datasets demonstrate that our model's classification accuracy greatly outpaces that of conventional deep learning text classification models. Our model also performs exceptionally well compared with other GCN-based text classification algorithms.
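As a rough sketch of dense connectivity with attention fusion over layer outputs (not the authors' code; `DenseGCN` is an illustrative name, and `adj` is assumed to be a normalized adjacency matrix):

```python
import torch
import torch.nn as nn

class DenseGCN(nn.Module):
    """Minimal sketch of a densely connected GCN stack: each layer receives
    the concatenation of the input and all earlier layer outputs, so deep
    layers see both small-scale (shallow) and large-scale features; an
    attention scorer then fuses the per-layer outputs."""
    def __init__(self, in_dim: int, hid_dim: int, n_layers: int):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Linear(in_dim + i * hid_dim, hid_dim) for i in range(n_layers))
        self.attn = nn.Linear(hid_dim, 1)  # scores each layer's output

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        feats = [x]                        # x: (N, in_dim), adj: (N, N)
        for layer in self.layers:
            h = torch.relu(adj @ layer(torch.cat(feats, dim=-1)))
            feats.append(h)
        outs = torch.stack(feats[1:], dim=0)           # (L, N, hid)
        alpha = torch.softmax(self.attn(outs), dim=0)  # attention over layers
        return (alpha * outs).sum(dim=0)               # fused node features
```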
{"title":"Novel GCN Model Using Dense Connection and Attention Mechanism for Text Classification","authors":"Yinbin Peng, Wei Wu, Jiansi Ren, Xiang Yu","doi":"10.1007/s11063-024-11599-9","DOIUrl":"https://doi.org/10.1007/s11063-024-11599-9","url":null,"abstract":"<p>Convolutional Neural Network (CNN) or Recurrent Neural Network (RNN) based text classification algorithms currently in use can successfully extract local textual features but disregard global data. Due to its ability to understand complex text structures and maintain global information, Graph Neural Network (GNN) has demonstrated considerable promise in text classification. However, most of the GNN text classification models in use presently are typically shallow, unable to capture long-distance node information and reflect the various scale features of the text (such as words, phrases, etc.). All of which will negatively impact the performance of the final classification. A novel Graph Convolutional Neural Network (GCN) with dense connections and an attention mechanism for text classification is proposed to address these constraints. By increasing the depth of GCN, the densely connected graph convolutional network (DC-GCN) gathers information about distant nodes. The DC-GCN multiplexes the small-scale features of shallow layers and produces different scale features through dense connections. To combine features and determine their relative importance, an attention mechanism is finally added. Experiment results on four benchmark datasets demonstrate that our model’s classification accuracy greatly outpaces that of the conventional deep learning text classification model. Our model performs exceptionally well when compared to other text categorization GCN algorithms.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"215 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140571911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Consensus Affinity Graph Learning via Structure Graph Fusion and Block Diagonal Representation for Multiview Clustering
Pub Date: 2024-04-08 | DOI: 10.1007/s11063-024-11589-x
Zhongyan Gui, Jing Yang, Zhiqiang Xie, Cuicui Ye
Learning a robust affinity graph is fundamental to graph-based clustering methods. However, some existing affinity graph learning methods have encountered the following problems. First, the constructed affinity graphs cannot capture the intrinsic structure of data well. Second, when fusing all view-specific affinity graphs, most of them obtain a fusion graph by simply taking the average of multiple views, or directly learning a common graph from multiple views, without considering the discriminative property among diverse views. Third, the fusion graph does not maintain an explicit cluster structure. To alleviate these problems, the adaptive neighbor graph learning approach and the data self-expression approach are first integrated into a structure graph fusion framework to obtain a view-specific structure affinity graph to capture the local and global structures of data. Then, all the structural affinity graphs are weighted dynamically into a consensus affinity graph, which not only effectively incorporates the complementary affinity structure of important views but also has the capability of preserving the consensus affinity structure that is shared by all views. Finally, a k–block diagonal regularizer is introduced for the consensus affinity graph to encourage it to have an explicit cluster structure. An efficient optimization algorithm is developed to tackle the resultant optimization problem. Extensive experiments on benchmark datasets validate the superiority of the proposed method.
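For intuition, a minimal NumPy sketch of the dynamically weighted fusion step, assuming a common exponential auto-weighting rule; the view-specific structure-graph construction and the k-block diagonal regularizer are omitted here:

```python
import numpy as np

def fuse_affinity_graphs(graphs: list[np.ndarray], gamma: float = 1.0,
                         n_iter: int = 20) -> np.ndarray:
    """Alternate between (a) a consensus graph as the weighted average of the
    view-specific structure graphs and (b) view weights that decay with each
    view's distance to the consensus (assumed auto-weighting scheme)."""
    n_views = len(graphs)
    w = np.full(n_views, 1.0 / n_views)
    consensus = sum(wi * g for wi, g in zip(w, graphs))
    for _ in range(n_iter):
        dists = np.array([np.linalg.norm(g - consensus) for g in graphs])
        w = np.exp(-dists / gamma)   # closer views get larger weights
        w /= w.sum()
        consensus = sum(wi * g for wi, g in zip(w, graphs))
    return consensus
```

In the full method, this fusion would be solved jointly with the block diagonal constraint rather than by simple alternation.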
{"title":"Consensus Affinity Graph Learning via Structure Graph Fusion and Block Diagonal Representation for Multiview Clustering","authors":"Zhongyan Gui, Jing Yang, Zhiqiang Xie, Cuicui Ye","doi":"10.1007/s11063-024-11589-x","DOIUrl":"https://doi.org/10.1007/s11063-024-11589-x","url":null,"abstract":"<p>Learning a robust affinity graph is fundamental to graph-based clustering methods. However, some existing affinity graph learning methods have encountered the following problems. First, the constructed affinity graphs cannot capture the intrinsic structure of data well. Second, when fusing all view-specific affinity graphs, most of them obtain a fusion graph by simply taking the average of multiple views, or directly learning a common graph from multiple views, without considering the discriminative property among diverse views. Third, the fusion graph does not maintain an explicit cluster structure. To alleviate these problems, the adaptive neighbor graph learning approach and the data self-expression approach are first integrated into a structure graph fusion framework to obtain a view-specific structure affinity graph to capture the local and global structures of data. Then, all the structural affinity graphs are weighted dynamically into a consensus affinity graph, which not only effectively incorporates the complementary affinity structure of important views but also has the capability of preserving the consensus affinity structure that is shared by all views. Finally, a <i>k</i>–block diagonal regularizer is introduced for the consensus affinity graph to encourage it to have an explicit cluster structure. An efficient optimization algorithm is developed to tackle the resultant optimization problem. Extensive experiments on benchmark datasets validate the superiority of the proposed method.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"57 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140571912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DialGNN: Heterogeneous Graph Neural Networks for Dialogue Classification
Pub Date: 2024-04-08 | DOI: 10.1007/s11063-024-11595-z
Yan Yan, Bo-Wen Zhang, Peng-hao Min, Guan-wen Ding, Jun-yuan Liu
Dialogue systems have attracted growing research interest due to their widespread applications in various domains. However, most research focuses on sentence-level intent recognition to interpret user utterances in dialogue systems, while comprehension of whole documents has not attracted sufficient attention. In this paper, we propose DialGNN, a heterogeneous graph neural network framework tailored to the problem of dialogue classification that takes the entire dialogue as input. Specifically, a heterogeneous graph is constructed with nodes at different levels of semantic granularity. The graph framework allows flexible integration of various pre-trained language representation models, such as BERT and its variants, which endows DialGNN with powerful text representation capabilities. DialGNN outperforms baseline methods on the CM and ECS datasets, demonstrating its robustness and effectiveness; in particular, it achieves a notable performance gain on document-level dialogue text classification. The implementation of DialGNN and related data are available at https://github.com/821code/DialGNN.
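A hedged sketch of the heterogeneous-graph construction, assuming token-, utterance-, and dialogue-level nodes (the paper's exact node and edge types are not given in the abstract); in the full model, node features would come from BERT-style encoders rather than raw text:

```python
import networkx as nx

def build_dialogue_graph(utterances: list[str]) -> nx.Graph:
    """Build a toy heterogeneous graph over one dialogue: a single dialogue
    node, one node per utterance, and one node per distinct token, linked
    token-to-utterance and utterance-to-dialogue (assumed structure)."""
    g = nx.Graph()
    g.add_node("dialogue", kind="dialogue")
    for i, utt in enumerate(utterances):
        u = f"utt_{i}"
        g.add_node(u, kind="utterance", text=utt)
        g.add_edge(u, "dialogue")            # utterance-dialogue edge
        for tok in set(utt.lower().split()):
            g.add_node(f"tok_{tok}", kind="token")
            g.add_edge(f"tok_{tok}", u)      # token-utterance edge
    return g

g = build_dialogue_graph(["Hello, I need help", "Sure, what happened?"])
print(g.number_of_nodes(), g.number_of_edges())
```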
{"title":"DialGNN: Heterogeneous Graph Neural Networks for Dialogue Classification","authors":"Yan Yan, Bo-Wen Zhang, Peng-hao Min, Guan-wen Ding, Jun-yuan Liu","doi":"10.1007/s11063-024-11595-z","DOIUrl":"https://doi.org/10.1007/s11063-024-11595-z","url":null,"abstract":"<p>Dialogue systems have attracted growing research interests due to its widespread applications in various domains. However, most research work focus on sentence-level intent recognition to interpret user utterances in dialogue systems, while the comprehension of the whole documents has not attracted sufficient attention. In this paper, we propose DialGNN, a heterogeneous graph neural network framework tailored for the problem of dialogue classification which takes the entire dialogue as input. Specifically, a heterogeneous graph is constructed with nodes in different levels of semantic granularity. The graph framework allows flexible integration of various pre-trained language representation models, such as BERT and its variants, which endows DialGNN with powerful text representational capabilities. DialGNN outperforms on CM and ECS datasets, which demonstrates robustness and the effectiveness. Specifically, our model achieves a notable enhancement in performance, optimizing the classification of document-level dialogue text. The implementation of DialGNN and related data are shared through https://github.com/821code/DialGNN.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"40 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140571915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Semantic Spectral Clustering with Contrastive Learning and Neighbor Mining
Pub Date: 2024-04-07 | DOI: 10.1007/s11063-024-11597-x
Nongxiao Wang, Xulun Ye, Jieyu Zhao, Qing Wang
Deep spectral clustering techniques are considered among the most efficient clustering algorithms in the data mining field. The similarity between instances and the disparity among classes are two critical factors in clustering. However, most current deep spectral clustering approaches do not take both sufficiently into consideration. To tackle this issue, we propose the Semantic Spectral clustering with Contrastive learning and Neighbor mining (SSCN) framework, which performs instance-level pulling and cluster-level pushing cooperatively. Specifically, we obtain semantic feature embeddings using an unsupervised contrastive learning model. Next, we mine nearest neighbors both partially and globally, and these neighbors, together with data augmentation information, collaboratively enhance effectiveness at both the instance level and the cluster level. The spectral constraint is applied through orthogonal layers to satisfy conventional spectral clustering. Extensive experiments demonstrate the superiority of the proposed spectral clustering framework.
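As a sketch of the spectral constraint, an orthogonal output layer can be realized with a QR decomposition so that the embedding Y satisfies Y^T Y = I, as conventional spectral clustering requires; whether SSCN uses QR or another orthogonalization (e.g., Cholesky-based) is an assumption here:

```python
import torch

def orthogonalize(h: torch.Tensor) -> torch.Tensor:
    """Map network embeddings H (N, k) to Y with orthonormal columns,
    one standard way to impose the spectral constraint."""
    q, _ = torch.linalg.qr(h)   # reduced QR: q is (N, k), q^T q = I
    return q

h = torch.randn(256, 10)        # toy example: N=256 samples, k=10 clusters
y = orthogonalize(h)
print(torch.allclose(y.T @ y, torch.eye(10), atol=1e-5))  # True
```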
{"title":"Semantic Spectral Clustering with Contrastive Learning and Neighbor Mining","authors":"Nongxiao Wang, Xulun Ye, Jieyu Zhao, Qing Wang","doi":"10.1007/s11063-024-11597-x","DOIUrl":"https://doi.org/10.1007/s11063-024-11597-x","url":null,"abstract":"<p>Deep spectral clustering techniques are considered one of the most efficient clustering algorithms in data mining field. The similarity between instances and the disparity among classes are two critical factors in clustering fields. However, most current deep spectral clustering approaches do not sufficiently take them both into consideration. To tackle the above issue, we propose Semantic Spectral clustering with Contrastive learning and Neighbor mining (SSCN) framework, which performs instance-level pulling and cluster-level pushing cooperatively. Specifically, we obtain the semantic feature embedding using an unsupervised contrastive learning model. Next, we obtain the nearest neighbors partially and globally, and the neighbors along with data augmentation information enhance their effectiveness collaboratively on the instance level as well as the cluster level. The spectral constraint is applied by orthogonal layers to satisfy conventional spectral clustering. Extensive experiments demonstrate the superiority of our proposed frame of spectral clustering.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"120 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140571905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SGNNRec: A Scalable Double-Layer Attention-Based Graph Neural Network Recommendation Model
Pub Date: 2024-04-04 | DOI: 10.1007/s11063-024-11555-7
Jing He, Le Tang, Dan Tang, Ping Wang, Li Cai
Because information from multi-relationship graphs is difficult to aggregate, graph neural network recommendation models have focused on single-relational graphs (e.g., the user-item rating bipartite graph and user-user social relationship graphs). However, existing graph neural network recommendation models lack flexibility: recommendation accuracy actually decreases when low-quality auxiliary information is aggregated into the model. This paper proposes a scalable graph neural network recommendation model named SGNNRec. SGNNRec fuses several kinds of auxiliary information (e.g., user social information, item tag information, and user-item interaction information) alongside user-item ratings as supplements to address data sparsity. A tag-cluster-based item-semantic graph method and an Apriori-algorithm-based user-item interaction graph method are proposed to construct the graph relations. Furthermore, a double-layer attention network is designed to learn the influence of latent factors, so that the latent factors can be optimized to obtain the best recommendation results. Empirical results on real-world datasets verify the effectiveness of our model: SGNNRec reduces the influence of poor auxiliary information, and its accuracy improves as the amount of auxiliary information increases.
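For orientation, a hedged PyTorch sketch of a double-layer attention design: node-level attention inside each auxiliary graph, then graph-level attention across the auxiliary sources; the module name and both scorers are illustrative, not the paper's implementation:

```python
import torch
import torch.nn as nn

class DoubleLayerAttention(nn.Module):
    """First layer attends over each auxiliary graph's neighbor embeddings;
    the second attends over the per-graph summaries, weighting auxiliary
    sources (social, tag, interaction) by their usefulness."""
    def __init__(self, dim: int):
        super().__init__()
        self.node_attn = nn.Linear(dim, 1)
        self.graph_attn = nn.Linear(dim, 1)

    def forward(self, graph_embs: list[torch.Tensor]) -> torch.Tensor:
        summaries = []
        for h in graph_embs:                       # h: (num_neighbors, dim)
            a = torch.softmax(self.node_attn(h), dim=0)
            summaries.append((a * h).sum(dim=0))   # per-graph summary, (dim,)
        s = torch.stack(summaries)                 # (num_graphs, dim)
        b = torch.softmax(self.graph_attn(s), dim=0)
        return (b * s).sum(dim=0)                  # fused user/item embedding
```

A low-quality auxiliary graph would then receive a small graph-level weight, which matches the abstract's claim that poor auxiliary information is down-weighted rather than degrading accuracy.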
{"title":"SGNNRec: A Scalable Double-Layer Attention-Based Graph Neural Network Recommendation Model","authors":"Jing He, Le Tang, Dan Tang, Ping Wang, Li Cai","doi":"10.1007/s11063-024-11555-7","DOIUrl":"https://doi.org/10.1007/s11063-024-11555-7","url":null,"abstract":"<p>Due to the information from the multi-relationship graphs is difficult to aggregate, the graph neural network recommendation model focuses on single-relational graphs (e.g., the user-item rating bipartite graph and user-user social relationship graphs). However, existing graph neural network recommendation models have insufficient flexibility. The recommendation accuracy instead decreases when low-quality auxiliary information is aggregated in the recommendation model. This paper proposes a scalable graph neural network recommendation model named SGNNRec. SGNNRec fuse a variety of auxiliary information (e.g., user social information, item tag information and user-item interaction information) beside user-item rating as supplements to solve the problem of data sparsity. A tag cluster-based item-semantic graph method and an apriori algorithm-based user-item interaction graph method are proposed to realize the construction of graph relations. Furthermore, a double-layer attention network is designed to learn the influence of latent factors. Thus, the latent factors are to be optimized to obtain the best recommendation results. Empirical results on real-world datasets verify the effectiveness of our model. SGNNRec can reduce the influence of poor auxiliary information; moreover, with increasing the number of auxiliary information, the model accuracy improves.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"25 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140571903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hierarchical Patch Aggregation Transformer for Motion Deblurring
Pub Date: 2024-04-04 | DOI: 10.1007/s11063-024-11594-0
Yujie Wu, Lei Liang, Siyao Ling, Zhisheng Gao
The encoder-decoder framework built from Transformer components has become a paradigm in image deblurring architecture design. In this paper, we critically revisit this approach and find that many current architectures focus narrowly on limited local regions during the feature extraction stage. Such designs compromise the feature richness and diversity of the encoder-decoder framework, leading to bottlenecks in performance improvement. To address these deficiencies, a novel Hierarchical Patch Aggregation Transformer architecture (HPAT) is proposed. In the initial feature extraction stage, HPAT combines Axis-Selective Transformer Blocks, which have linear complexity, with an adaptive hierarchical attention fusion mechanism. These mechanisms enable the model to effectively capture spatial relationships between features and integrate features from different hierarchical levels. We then redesign the feedforward network of the Transformer block in the encoder-decoder structure and propose the Fused Feedforward Network, whose effective aggregation enhances the ability to capture and retain local detail. We evaluate HPAT through extensive experiments and compare its performance with baseline methods on public datasets. Experimental results show that the proposed HPAT model achieves state-of-the-art performance on image deblurring tasks.
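A hedged PyTorch sketch of axis-wise attention of the kind the Axis-Selective Transformer Blocks suggest: attending along rows and then columns rather than over all H×W positions at once, which keeps cost linear in the other spatial dimension. The class name and the use of `nn.MultiheadAttention` are assumptions:

```python
import torch
import torch.nn as nn

class AxisSelectiveAttention(nn.Module):
    """Attend along the width axis, then the height axis, instead of over
    all H*W positions jointly (assumed reading of axis-selective attention).
    `dim` is the channel count C and must be divisible by `heads`."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, H, W, C)
        b, h, w, c = x.shape
        rows = x.reshape(b * h, w, c)                     # attend along width
        rows, _ = self.row_attn(rows, rows, rows)
        x = rows.reshape(b, h, w, c)
        cols = x.permute(0, 2, 1, 3).reshape(b * w, h, c) # attend along height
        cols, _ = self.col_attn(cols, cols, cols)
        return cols.reshape(b, w, h, c).permute(0, 2, 1, 3)
```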
{"title":"Hierarchical Patch Aggregation Transformer for Motion Deblurring","authors":"Yujie Wu, Lei Liang, Siyao Ling, Zhisheng Gao","doi":"10.1007/s11063-024-11594-0","DOIUrl":"https://doi.org/10.1007/s11063-024-11594-0","url":null,"abstract":"<p>The encoder-decoder framework based on Transformer components has become a paradigm in the field of image deblurring architecture design. In this paper, we critically revisit this approach and find that many current architectures severely focus on limited local regions during the feature extraction stage. These designs compromise the feature richness and diversity of the encoder-decoder framework, leading to bottlenecks in performance improvement. To address these deficiencies, a novel Hierarchical Patch Aggregation Transformer architecture (HPAT) is proposed. In the initial feature extraction stage, HPAT combines Axis-Selective Transformer Blocks with linear complexity and is supplemented by an adaptive hierarchical attention fusion mechanism. These mechanisms enable the model to effectively capture the spatial relationships between features and integrate features from different hierarchical levels. Then, we redesign the feedforward network of the Transformer block in the encoder-decoder structure and propose the Fused Feedforward Network. This effective aggregation enhances the ability to capture and retain local detailed features. We evaluate HPAT through extensive experiments and compare its performance with baseline methods on public datasets. Experimental results show that the proposed HPAT model achieves state-of-the-art performance in image deblurring tasks.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"1 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140571979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}