Minimum spanning tree clustering approach for effective feature partitioning in multi-view ensemble learning
Pub Date: 2024-07-18 | DOI: 10.1007/s10115-024-02182-8
Aditya Kumar, Jainath Yadav
This paper introduces a novel approach for feature set partitioning in multi-view ensemble learning (MVEL) utilizing the minimum spanning tree clustering (MSTC) algorithm. The proposed method aims to generate informative and diverse feature subsets to enhance classification performance in the MVEL framework. The MSTC algorithm constructs a minimum spanning tree based on correlation measures and divides the features into non-overlapping clusters, which serve as the distinct views used for ensemble learning. We evaluate the effectiveness of the MSTC-based MVEL framework on ten high-dimensional datasets using support vector machines. Results indicate significant improvements in classification performance compared to single-view learning and other state-of-the-art feature partitioning approaches. Statistical analysis confirms that the accuracy gains achieved by the proposed MVEL framework are both reliable and competitive.
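To make the partitioning idea concrete, here is a minimal sketch (not the authors' code): features are clustered by cutting the heaviest edges of a minimum spanning tree built on a correlation-derived distance, and one SVM is trained per resulting view. The distance definition, the edge-cutting heuristic, the function names, and the probability-averaging ensemble rule are all assumptions for illustration.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components
from sklearn.svm import SVC


def mst_feature_views(X, n_views=3):
    """Split feature indices into n_views non-overlapping clusters by cutting
    the heaviest edges of an MST built on a correlation-based distance."""
    dist = 1.0 - np.abs(np.corrcoef(X, rowvar=False))   # feature-feature distance
    mst = minimum_spanning_tree(dist).toarray()
    edges = np.argwhere(mst > 0)
    order = np.argsort(mst[edges[:, 0], edges[:, 1]])[::-1]
    for i, j in edges[order[: n_views - 1]]:            # drop the heaviest MST edges
        mst[i, j] = 0.0
    _, labels = connected_components(mst, directed=False)
    return [np.flatnonzero(labels == v) for v in np.unique(labels)]


def fit_predict_mvel(X_train, y_train, X_test, n_views=3):
    """Train one SVM per view and average the per-view class probabilities."""
    views = mst_feature_views(X_train, n_views)
    models = [SVC(probability=True).fit(X_train[:, idx], y_train) for idx in views]
    probs = np.mean([m.predict_proba(X_test[:, idx]) for m, idx in zip(models, views)], axis=0)
    return models[0].classes_[probs.argmax(axis=1)]
```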
{"title":"Minimum spanning tree clustering approach for effective feature partitioning in multi-view ensemble learning","authors":"Aditya Kumar, Jainath Yadav","doi":"10.1007/s10115-024-02182-8","DOIUrl":"https://doi.org/10.1007/s10115-024-02182-8","url":null,"abstract":"<p>This paper introduces a novel approach for feature set partitioning in multi-view ensemble learning (MVEL) utilizing the minimum spanning tree clustering (MSTC) algorithm. The proposed method aims to generate informative and diverse feature subsets to enhance classification performance in the MVEL framework. The MSTC algorithm constructs a minimum spanning tree based on correlation measures and divides features into non-overlapping clusters, representing distinct views used to improve ensemble learning. We evaluate the effectiveness of the MSTC-based MVEL framework on ten high-dimensional datasets using support vector machines. Results indicate significant improvements in classification performance compared to single-view learning and other cutting-edge feature partitioning approaches. Statistical analysis confirms the enhanced classification accuracy achieved by the proposed MVEL framework, reaching a level of accuracy that is both reliable and competitive.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"65 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141743299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust anomaly detection via adversarial counterfactual generation
Pub Date: 2024-07-17 | DOI: 10.1007/s10115-024-02172-w
Angelica Liguori, Ettore Ritacco, Francesco Sergio Pisani, Giuseppe Manco
The capability to devise robust outlier and anomaly detection tools is an important research topic in machine learning and data mining. Recent techniques have focused on reinforcing detection with sophisticated data generation tools that refine the learning process by generating variants of the data that expand the recognition capabilities of the outlier detector. In this paper, we propose ARN, a semi-supervised anomaly detection and generation method based on adversarial counterfactual reconstruction. ARN exploits a regularized autoencoder to optimize the reconstruction of variants of normal examples with minimal differences that are recognized as outliers. The combination of regularization and counterfactual reconstruction helps to stabilize the learning process, which results in both realistic outlier generation and substantially extended detection capability. In fact, the counterfactual generation enables a smart exploration of the search space by relating small changes in actual samples from the true distribution to high anomaly scores. Experiments on several benchmark datasets show that our model improves the current state of the art by substantial margins because of its ability to model the true boundaries of the data manifold.
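ARN's adversarial counterfactual machinery is not reproduced below; the sketch only illustrates the reconstruction-error scoring that autoencoder-based detectors of this kind build on, with weight decay standing in for the regularizer. Class names, layer sizes, and the training loop are assumptions.

```python
import torch
import torch.nn as nn


class TinyAutoencoder(nn.Module):
    def __init__(self, d_in, d_latent=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 32), nn.ReLU(), nn.Linear(32, d_latent))
        self.dec = nn.Sequential(nn.Linear(d_latent, 32), nn.ReLU(), nn.Linear(32, d_in))

    def forward(self, x):
        return self.dec(self.enc(x))


def train_and_score(X_normal, X_test, epochs=200):
    """Fit on (mostly) normal data; score test points by reconstruction error,
    where a higher error means more anomalous. Inputs are float tensors (n, d)."""
    model = TinyAutoencoder(X_normal.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)  # regularization
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X_normal), X_normal)
        loss.backward()
        opt.step()
    with torch.no_grad():
        return ((model(X_test) - X_test) ** 2).mean(dim=1)   # anomaly scores
```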
{"title":"Robust anomaly detection via adversarial counterfactual generation","authors":"Angelica Liguori, Ettore Ritacco, Francesco Sergio Pisani, Giuseppe Manco","doi":"10.1007/s10115-024-02172-w","DOIUrl":"https://doi.org/10.1007/s10115-024-02172-w","url":null,"abstract":"<p>The capability to devise robust outlier and anomaly detection tools is an important research topic in machine learning and data mining. Recent techniques have been focusing on reinforcing detection with sophisticated data generation tools that successfully refine the learning process by generating variants of the data that expand the recognition capabilities of the outlier detector. In this paper, we propose <span>(textrm{ARN})</span>, a semi-supervised anomaly detection and generation method based on adversarial counterfactual reconstruction. <span>(textrm{ARN})</span> exploits a regularized autoencoder to optimize the reconstruction of variants of normal examples with minimal differences that are recognized as outliers. The combination of regularization and counterfactual reconstruction helps to stabilize the learning process, which results in both realistic outlier generation and substantially extended detection capability. In fact, the counterfactual generation enables a smart exploration of the search space by successfully relating small changes in all the actual samples from the true distribution to high anomaly scores. Experiments on several benchmark datasets show that our model improves the current state of the art by valuable margins because of its ability to model the true boundaries of the data manifold.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"10 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141717792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
HyperMatch: long-form text matching via hypergraph convolutional networks
Pub Date: 2024-07-12 | DOI: 10.1007/s10115-024-02173-9
Junwen Duan, Mingyi Jia, Jianbo Liao, Jianxin Wang
Semantic text matching plays a vital role in diverse domains, such as information retrieval, question answering, and recommendation. However, longer texts present challenges, including noise, long-range dependency, and cross-sentence inference. Graph-based approaches have shown effectiveness in addressing these challenges, but traditional graph structures struggle to model complex higher-order relationships in long-form texts. To overcome this limitation, we propose HyperMatch, a hypergraph-based method for long-form text matching. HyperMatch leverages hypergraph modeling to capture high-order relationships and enhance matching performance. Our approach involves constructing a keyword graph using document keywords as nodes, connecting sentences to nodes based on inclusion relationships, creating a hypergraph based on sentence similarity across nodes, and utilizing hypergraph convolutional networks to aggregate matching signals. Extensive experiments on benchmark datasets demonstrate the superiority of our model over state-of-the-art long-form text matching approaches.
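The abstract does not give HyperMatch's exact pipeline, so the sketch below only shows the generic building blocks it relies on: a keyword-sentence incidence matrix (keywords as nodes, sentences as hyperedges) and one standard hypergraph-convolution step. The inclusion test, unit edge weights, and feature shapes are assumptions.

```python
import numpy as np


def incidence_matrix(keywords, sentences):
    """H[v, e] = 1 if keyword v occurs in sentence e (each sentence is a hyperedge)."""
    H = np.zeros((len(keywords), len(sentences)))
    for e, sent in enumerate(sentences):
        for v, kw in enumerate(keywords):
            if kw.lower() in sent.lower():
                H[v, e] = 1.0
    return H


def hypergraph_conv(H, X, Theta):
    """One standard hypergraph-convolution step (HGNN-style propagation):
    X' = ReLU(Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta), with unit edge weights.
    X: (n_keywords, d) node features, Theta: (d, d_out) learnable weights."""
    deg_v = H.sum(axis=1) + 1e-8     # node degrees
    deg_e = H.sum(axis=0) + 1e-8     # hyperedge degrees
    Dv = np.diag(deg_v ** -0.5)
    De = np.diag(1.0 / deg_e)
    return np.maximum(Dv @ H @ De @ H.T @ Dv @ X @ Theta, 0.0)
```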
{"title":"HyperMatch: long-form text matching via hypergraph convolutional networks","authors":"Junwen Duan, Mingyi Jia, Jianbo Liao, Jianxin Wang","doi":"10.1007/s10115-024-02173-9","DOIUrl":"https://doi.org/10.1007/s10115-024-02173-9","url":null,"abstract":"<p>Semantic text matching plays a vital role in diverse domains, such as information retrieval, question answering, and recommendation. However, longer texts present challenges, including noise, long-range dependency, and cross-sentence inference. Graph-based approaches have shown effectiveness in addressing these challenges, but traditional graph structures struggle to model complex higher-order relationships in long-form texts. To overcome this limitation, we propose <b>HyperMatch</b>, a hypergraph-based method for long-form text matching. HyperMatch leverages hypergraph modeling to capture high-order relationships and enhance matching performance. Our approach involves constructing a keyword graph using document keywords as nodes, connecting sentences to nodes based on inclusion relationships, creating a hypergraph based on sentence similarity across nodes, and utilizing hypergraph convolutional networks to aggregate matching signals. Extensive experiments on benchmark datasets demonstrate the superiority of our model over state-of-the-art long-form text matching approaches.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"41 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141612240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
VAE-GNA: a variational autoencoder with Gaussian neurons in the latent space and attention mechanisms
Pub Date: 2024-07-05 | DOI: 10.1007/s10115-024-02169-5
Matheus B. Rocha, Renato A. Krohling
Variational autoencoders (VAEs) are generative models known for learning compact and continuous latent representations of data. While they have proven effective in various applications, using their latent representations for classification tasks presents challenges. Typically, a straightforward approach involves concatenating the mean and variance vectors and inputting them into a shallow neural network. In this paper, we introduce a novel approach for variational autoencoders, named VAE-GNA, which integrates Gaussian neurons into the latent space along with attention mechanisms. These neurons directly process mean and variance values through a suitably modified sigmoid function, not only improving classification but also optimizing the VAE's feature extraction in synergy with the classification network. Additionally, we investigate both additive and multiplicative attention mechanisms to enhance the model's capabilities. We applied the proposed method to automatic cancer detection using near-infrared (NIR) spectral data, showing that VAE-GNA surpasses established baselines on spectral datasets. The results indicate the feasibility and effectiveness of our approach.
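For reference, this is a sketch of the "straightforward approach" the abstract contrasts against: concatenate the encoder's mean and log-variance vectors and feed them to a shallow classifier. VAE-GNA's Gaussian neurons and attention mechanisms are not reproduced here, and the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn


class VAEEncoder(nn.Module):
    """A minimal VAE encoder producing a mean and a log-variance vector."""
    def __init__(self, d_in, d_latent=16):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU())
        self.mu = nn.Linear(64, d_latent)
        self.logvar = nn.Linear(64, d_latent)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.logvar(h)


class ConcatLatentClassifier(nn.Module):
    """Baseline: concatenate mean and log-variance and use a shallow classifier."""
    def __init__(self, encoder, d_latent, n_classes):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Sequential(nn.Linear(2 * d_latent, 32), nn.ReLU(), nn.Linear(32, n_classes))

    def forward(self, x):
        mu, logvar = self.encoder(x)
        return self.head(torch.cat([mu, logvar], dim=-1))   # class logits
```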
{"title":"VAE-GNA: a variational autoencoder with Gaussian neurons in the latent space and attention mechanisms","authors":"Matheus B. Rocha, Renato A. Krohling","doi":"10.1007/s10115-024-02169-5","DOIUrl":"https://doi.org/10.1007/s10115-024-02169-5","url":null,"abstract":"<p>Variational autoencoders (VAEs) are generative models known for learning compact and continuous latent representations of data. While they have proven effective in various applications, using latent representations for classification tasks presents challenges. Typically, a straightforward approach involves concatenating the mean and variance vectors and inputting them into a shallow neural network. In this paper, we introduce a novel approach for variational autoencoders, named VAE-GNA, which integrates Gaussian neurons into the latent space along with attention mechanisms. These neurons directly process mean and variance values through a suitable modified sigmoid function, not only improving classification, but also optimizing the training of the VAE in extracting features, in synergy with the classification network. Additionally, we investigate both additive and multiplicative attention mechanisms to enhance the model’s capabilities. We applied the proposed method to automatic cancer detection using near-infrared (NIR) spectral data, showing that the experimental results of VAE-GNA surpass established baselines for spectral datasets. The results obtained indicate the feasibility and effectiveness of our approach.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"47 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141547252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Iterative missing value imputation based on feature importance
Pub Date: 2024-07-05 | DOI: 10.1007/s10115-024-02159-7
Cong Guo, Wei Yang, Chun Liu, Zheng Li
Many datasets suffer from missing values for various reasons, which not only increases the processing difficulty of related tasks but also reduces classification accuracy. To address this problem, the mainstream approach is to use missing value imputation to complete the dataset. Existing imputation methods treat all features as equally important during data completion, while in fact different features have different importance. We therefore design an imputation method that takes feature importance into account. The algorithm iteratively alternates between matrix completion and feature importance learning; in particular, matrix completion is performed with a completion loss function that incorporates feature importance. Our experimental analysis involves three types of datasets: synthetic datasets with different noisy features and missing values, real-world datasets with artificially generated missing values, and real-world datasets that originally contain missing values. The results on these datasets consistently show that the proposed method outperforms five existing imputation algorithms.
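A minimal sketch of the alternating scheme described here, under the assumption that "matrix completion" is a low-rank factorization whose observed-entry loss is weighted per feature, and that importance is re-estimated with a random forest; the abstract does not specify the actual completion model or importance learner, so all of this is illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def weighted_low_rank_complete(X, mask, w, rank=5, lr=0.01, steps=300):
    """Gradient-descent low-rank completion whose observed-entry squared error
    is weighted per feature by the importance vector w (assumes standardized X)."""
    rng = np.random.default_rng(0)
    n, d = X.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((d, rank))
    Xo = np.where(mask, X, 0.0)
    for _ in range(steps):
        R = (U @ V.T - Xo) * mask * w        # importance-weighted residual on observed cells
        U -= lr * (R @ V)
        V -= lr * (R.T @ U)
    return np.where(mask, X, U @ V.T)         # keep observed values, fill the rest


def iterative_impute(X, mask, y, rounds=3, rank=5):
    """Alternate completion and feature-importance learning."""
    w = np.ones(X.shape[1])                                   # start from uniform importance
    col_means = np.nanmean(np.where(mask, X, np.nan), axis=0)
    Xc = np.where(mask, X, col_means)
    for _ in range(rounds):
        Xc = weighted_low_rank_complete(Xc, mask, w, rank=rank)
        rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(Xc, y)
        w = rf.feature_importances_ / (rf.feature_importances_.max() + 1e-12)
    return Xc
```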
{"title":"Iterative missing value imputation based on feature importance","authors":"Cong Guo, Wei Yang, Chun Liu, Zheng Li","doi":"10.1007/s10115-024-02159-7","DOIUrl":"https://doi.org/10.1007/s10115-024-02159-7","url":null,"abstract":"<p>Many datasets suffer from missing values due to various reasons, which not only increases the processing difficulty of related tasks but also reduces the classification accuracy. To address this problem, the mainstream approach is to use missing value imputation to complete the dataset. Existing imputation methods treat all features as equally important during data completion, while in fact different features have different importance. Therefore, we have designed an imputation method that considers feature importance. This algorithm iteratively performs matrix completion and feature importance learning. In particular, matrix completion is performed based on a completion loss function that incorporates feature importance. Our experimental analysis involves three types of datasets: synthetic datasets with different noisy features and missing values, real-world datasets with artificially generated missing values, and real-world datasets originally containing missing values. The results on these datasets consistently show that the proposed method outperforms the existing five imputation algorithms.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"5 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141547242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep learning and embeddings-based approaches for keyphrase extraction: a literature review
Pub Date: 2024-07-05 | DOI: 10.1007/s10115-024-02164-w
Nikolaos Giarelis, Nikos Karacapilidis
Keyphrase extraction is a subtask of natural language processing referring to the automatic extraction of salient terms that semantically capture the key themes and topics of a document. Earlier literature reviews focus on classical approaches that employ various statistical or graph-based techniques; these approaches miss important keywords/keyphrases due to their inability to fully utilize the context (present or not) in a document, and thus achieve low F1 scores. Recent advances in deep learning and word/sentence embedding vectors have led to the development of new approaches that address the lack of context and outperform the majority of classical ones. Taking the above into account, the contribution of this review is fourfold: (i) we analyze the state-of-the-art keyphrase extraction approaches and categorize them by the techniques they employ; (ii) we provide a comparative evaluation of these approaches, using well-known datasets from the literature and popular evaluation metrics, such as the F1 score; (iii) we provide a series of insights on various keyphrase extraction issues, including alternative approaches and future research directions; (iv) we make the datasets and code used in our experiments public, aiming to further increase the reproducibility of this work and facilitate future research in the field.
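As a reminder of the evaluation protocol mentioned above, here is a tiny example of exact-match precision/recall/F1 for predicted versus gold keyphrases; it is illustrative only, since the reviewed works differ in matching rules and top-k cutoffs.

```python
def keyphrase_prf(predicted, gold):
    """Exact-match precision/recall/F1 between predicted and gold keyphrase sets."""
    pred, ref = {p.lower() for p in predicted}, {g.lower() for g in gold}
    tp = len(pred & ref)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# e.g. 2 of 3 predictions are correct and 2 of 4 gold phrases are found:
# keyphrase_prf(["deep learning", "embeddings", "survey"],
#               ["deep learning", "embeddings", "keyphrase extraction", "context"])
# -> (0.667, 0.5, 0.571)
```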
{"title":"Deep learning and embeddings-based approaches for keyphrase extraction: a literature review","authors":"Nikolaos Giarelis, Nikos Karacapilidis","doi":"10.1007/s10115-024-02164-w","DOIUrl":"https://doi.org/10.1007/s10115-024-02164-w","url":null,"abstract":"<p>Keyphrase extraction is a subtask of natural language processing referring to the automatic extraction of salient terms that semantically capture the key themes and topics of a document. Earlier literature reviews focus on classical approaches that employ various statistical or graph-based techniques; these approaches miss important keywords/keyphrases, due to their inability to fully utilize context (that is present or not) in a document, thus achieving low <i>F1</i> scores. Recent advances in deep learning and word/sentence embedding vectors lead to the development of new approaches, which address the lack of context and outperform the majority of classical ones. Taking the above into account, the contribution of this review is fourfold: (i) we analyze the state-of-the-art keyphrase extraction approaches and categorize them upon their employed techniques; (ii) we provide a comparative evaluation of these approaches, using well-known datasets of the literature and popular evaluation metrics, such as the <i>F1</i> score; (iii) we provide a series of insights on various keyphrase extraction issues, including alternative approaches and future research directions; (iv) we make the datasets and code used in our experiments public, aiming to further increase the reproducibility of this work and facilitate future research in the field.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"57 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141547243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Taxonomy of deep learning-based intrusion detection system approaches in fog computing: a systematic review
Sepide Najafli, Abolrazl Toroghi Haghighat, Babak Karasfi
Pub Date: 2024-07-05 | DOI: 10.1007/s10115-024-02162-y
The Internet of Things (IoT) is used in a wide range of applications, and fundamental security issues must be addressed for it to develop further. An intrusion detection system (IDS) is an essential element of network security, designed to detect attacks and determine their type. Deep learning (DL) shows promising results in the design of IoT-based IDS, as it facilitates analytics and learning in the dynamic IoT domain. However, some deep learning-based IDS cannot be executed on IoT sensors because of resource restrictions. Although cloud computing could overcome these limitations, the distance between the cloud and the end IoT sensors causes high communication costs, security problems, and delays. Fog computing has been proposed to handle these issues by bringing resources to the edge of the network. Many studies have investigated IoT-based IDS. Our goal is to survey and classify deep learning-based IDS for fog computing, giving researchers access to comprehensive resources in this field. First, we provide a complete classification of IDS in IoT. Then, practical and important IDSs proposed for the fog environment are discussed in three groups (binary, multi-class, and hybrid), and the advantages and disadvantages of each approach are examined. The results show that most of the studied methods adopt hybrid strategies (binary and multi-class), and that, in the reviewed papers, the average accuracy obtained by binary methods is higher than that of multi-class methods. Finally, we highlight some challenges and future directions for research on IDS techniques.
{"title":"Taxonomy of deep learning-based intrusion detection system approaches in fog computing: a systematic review","authors":"Sepide Najafli, Abolrazl Toroghi Haghighat, Babak Karasfi","doi":"10.1007/s10115-024-02162-y","DOIUrl":"https://doi.org/10.1007/s10115-024-02162-y","url":null,"abstract":"<p>The Internet of Things (IoT) has been used in various aspects. Fundamental security issues must be addressed to accelerate and develop the Internet of Things. An intrusion detection system (IDS) is an essential element in network security designed to detect and determine the type of attacks. The use of deep learning (DL) shows promising results in the design of IDS based on IoT. DL facilitates analytics and learning in the dynamic IoT domain. Some deep learning-based IDS in IOT sensors cannot be executed, because of resource restrictions. Although cloud computing could overcome limitations, the distance between the cloud and the end IoT sensors causes high communication costs, security problems and delays. Fog computing has been presented to handle these issues and can bring resources to the edge of the network. Many studies have been conducted to investigate IDS based on IoT. Our goal is to investigate and classify deep learning-based IDS on fog processing. In this paper, researchers can access comprehensive resources in this field. Therefore, first, we provide a complete classification of IDS in IoT. Then practical and important proposed IDSs in the fog environment are discussed in three groups (binary, multi-class, and hybrid), and are examined the advantages and disadvantages of each approach. The results show that most of the studied methods consider hybrid strategies (binary and multi-class). In addition, in the reviewed papers the average Accuracy obtained in the binary method is better than the multi-class. Finally, we highlight some challenges and future directions for the next research in IDS techniques.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"14 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141577121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
LightCapsGNN: light capsule graph neural network for graph classification
Pub Date: 2024-07-04 | DOI: 10.1007/s10115-024-02170-y
Yucheng Yan, Jin Li, Shuling Xu, Xinlong Chen, Genggeng Liu, Yang-Geng Fu
Graph neural networks (GNNs) have achieved excellent performance in many graph-related tasks. However, they need appropriate pooling operations to handle graph classification tasks, and may therefore suffer from limitations such as information loss and neglect of part-whole relationships. CapsGNN was proposed to solve these issues but suffers from high time and space complexity, leading to poor scalability. In this paper, we propose a novel, effective and efficient graph capsule network called LightCapsGNN. First, we devise a fast voting mechanism (called LightVoting), implemented via linear combinations of K shared transformation matrices, to reduce the number of trainable parameters in the voting procedure. Second, an improved reconstruction layer is proposed to encourage our model to capture more informative and essential knowledge of the input graph. Third, other improvements are combined to further accelerate our model, e.g., matrix capsules and a trainable routing mechanism. Finally, extensive experiments are conducted on popular real-world graph classification benchmarks, and the proposed model achieves competitive or even better performance compared to ten baseline or state-of-the-art models. Furthermore, compared to other CapsGNNs, the proposed model reduces the number of learnable parameters by almost 99% and the running time by 31.1%.
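One plausible reading of the LightVoting idea (the abstract does not give the exact formulation): each (input, output) capsule pair learns only K mixing coefficients over K shared transformation matrices, so the voting step needs K·d_out·d_in + N_in·N_out·K parameters instead of N_in·N_out·d_out·d_in. Routing over the resulting votes is omitted, and all names and shapes are assumptions.

```python
import torch
import torch.nn as nn


class LightVoting(nn.Module):
    """Votes from n_in to n_out capsules via K shared transforms (a sketch)."""
    def __init__(self, n_in, n_out, d_in, d_out, k=4):
        super().__init__()
        self.shared = nn.Parameter(0.1 * torch.randn(k, d_out, d_in))   # K shared transforms
        self.coeff = nn.Parameter(0.1 * torch.randn(n_in, n_out, k))    # per-pair mixing weights

    def forward(self, u):                    # u: (batch, n_in, d_in) input capsule poses
        # per-pair transformation matrices: (n_in, n_out, d_out, d_in)
        W = torch.einsum('iok,kpq->iopq', self.coeff, self.shared)
        # votes: (batch, n_in, n_out, d_out); a routing step would aggregate over n_in
        return torch.einsum('iopq,biq->biop', W, u)
```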
{"title":"LightCapsGNN: light capsule graph neural network for graph classification","authors":"Yucheng Yan, Jin Li, Shuling Xu, Xinlong Chen, Genggeng Liu, Yang-Geng Fu","doi":"10.1007/s10115-024-02170-y","DOIUrl":"https://doi.org/10.1007/s10115-024-02170-y","url":null,"abstract":"<p>Graph neural networks (GNNs) have achieved excellent performances in many graph-related tasks. However, they need appropriate pooling operations to deal with the graph classification tasks, and thus, they may suffer from some limitations such as information loss and ignorance of the part-whole relationships. CapsGNN is proposed to solve the above-mentioned issues, but suffers from high time and space complexities leading to its poor scalability. In this paper, we propose a novel, effective and efficient graph capsule network called <i>LightCapsGNN</i>. First, we devise a fast voting mechanism (called <i>LightVoting</i>) implemented via linear combinations of <i>K</i> shared transformation matrices to reduce the number of trainable parameters in the voting procedure. Second, an improved reconstruction layer is proposed to encourage our model to capture more informative and essential knowledge of the input graph. Third, other improvements are combined to further accelerate our model, <i>e.g.</i>, matrix capsules and a trainable routing mechanism. Finally, extensive experiments are conducted on the popular real-world graph benchmarks in the graph classification tasks and the proposed model can achieve competitive or even better performance compared to ten baselines or state-of-the-art models. Furthermore, compared to other CapsGNNs, the proposed model reduce almost <span>(99%)</span> learnable parameters and <span>(31.1%)</span> running time.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"37 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141547249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Latent side-information dynamic augmentation for incremental recommendation
Pub Date: 2024-06-26 | DOI: 10.1007/s10115-024-02165-9
Jing Zhang, Jin Shi, Jingsheng Duan, Yonggong Ren
Incremental recommendation involves updating existing models by extracting information from interaction data at the current time-step, with the aim of maintaining model accuracy while addressing limitations such as parameter dependencies and inefficient training. However, real-time user interaction data is often afflicted by substantial noise and invalid samples, which raises two key challenges for incremental model updating: (1) how to effectively extract valuable new knowledge from interaction data at the current time-step to ensure model accuracy and timeliness, and (2) how to safeguard against the catastrophic forgetting of long-term stable preference information, thus preserving the model's sensitivity during cold-starts. In response to these challenges, we propose Incremental Recommendation with Stable Latent Side-information Updating (SIIFR). The model employs a side-information augmenter to extract valuable latent side-information from user interaction behavior at time-step T, thereby sidestepping the interference caused by noisy interaction data and acquiring stable user preferences. Moreover, the model utilizes rough interaction data at time-step T+1, in conjunction with the existing side-information enhancements, to achieve incremental updates of latent preferences, thereby ensuring the model's efficacy during cold-start. Furthermore, SIIFR leverages the change rate in user latent side-information to mitigate the catastrophic forgetting that results in the loss of long-term stable preference information. The effectiveness of the proposed model is validated and compared against existing models on four popular incremental datasets. The model code is available at: https://github.com/LNNU-computer-research-526/FR-sii.
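The abstract does not specify how the change rate gates the update, so the following is only a guess at the idea: blend each user's old and new latent side-information with a gate driven by the relative change, so stable long-term profiles are preserved while fast-moving users adopt more of the new estimate. The gating rule, threshold, and function name are hypothetical.

```python
import numpy as np


def gated_side_info_update(z_old, z_new, tau=0.5):
    """Blend per-user latent side-information between time-steps.
    z_old, z_new: (n_users, d). Users whose vectors barely move keep their
    long-term profile (guarding against forgetting); rapidly changing users
    take more of the new estimate."""
    change = np.linalg.norm(z_new - z_old, axis=1) / (np.linalg.norm(z_old, axis=1) + 1e-8)
    gate = np.clip(change / tau, 0.0, 1.0)[:, None]   # 0 = keep old, 1 = adopt new
    return (1.0 - gate) * z_old + gate * z_new
```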
{"title":"Latent side-information dynamic augmentation for incremental recommendation","authors":"Jing Zhang, Jin Shi, Jingsheng Duan, Yonggong Ren","doi":"10.1007/s10115-024-02165-9","DOIUrl":"https://doi.org/10.1007/s10115-024-02165-9","url":null,"abstract":"<p>The incremental recommendation involves updating existing models by extracting information from interaction data at current time-step, with the aim of maintaining model accuracy while addressing limitations including parameter dependencies and inefficient training. However, real-time user interaction data is often afflicted by substantial noise and invalid samples, presenting the following key challenges for incremental model updating: (1) how to effectively extract valuable new knowledge from interaction data at the current time-step to ensure model accuracy and timeliness, and (2) how to safeguard against the catastrophic forgetting of long-term stable preference information, thus preserving the model’s sensitivity during cold-starts. In response to these challenges, we propose the Incremental Recommendation with Stable Latent Side-information Updating (SIIFR). This model employs a side-information augmenter to extract valuable latent side-information from user interaction behavior at time-step <i>T</i>, thereby sidestepping the interference caused by noisy interaction data and acquiring stable user preference. Moreover, the model utilizes rough interaction data at time-step <span>(T+1)</span>, in conjunction with existing side-information enhancements to achieve incremental updates of latent preferences, thereby ensuring the model’s efficacy during cold-start. Furthermore, SIIFR leverages the change rate in user latent side-information to mitigate catastrophic forgetting that results in the loss of long-term stable preference information. The effectiveness of the proposed model is validated and compared against existing models using four popular incremental datasets. The model code can be achieved at: https://github.com/LNNU-computer-research-526/FR-sii.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"245 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141510288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An overview of semantic-based process mining techniques: trends and future directions
Pub Date: 2024-06-26 | DOI: 10.1007/s10115-024-02147-x
Fadilul-lah Yassaanah Issahaku, Ke Lu, Fang Xianwen, Sumaiya Bashiru Danwana, Husein Mohammed Bandago
Process mining algorithms essentially reflect the execution behavior of events in an event log for conformance checking, model discovery, or enhancement. Domain experts have developed several process mining algorithms based on theoretical frameworks such as linear integer programming, heuristics, genetic algorithms, and region-based and semantic-based approaches. The idea is to generate insightful representations of information-system processes so that process mining practitioners can gain insight into their systems. Recently, there has been a shift toward semantic-based approaches for process mining, since they not only discover enhanced models but also emphasize context. To this effect, this paper conducts a comprehensive review of 30 articles on semantic process mining techniques. We found that 44.7% of the reviewed works use semantics for process discovery, 23.7% for model enhancement, and only 10.5% for conformance checking. We further indicate the benefits and contributions of these methods to process mining. Challenges, opportunities, and prospective future research areas are also discussed.
{"title":"An overview of semantic-based process mining techniques: trends and future directions","authors":"Fadilul-lah Yassaanah Issahaku, Ke Lu, Fang Xianwen, Sumaiya Bashiru Danwana, Husein Mohammed Bandago","doi":"10.1007/s10115-024-02147-x","DOIUrl":"https://doi.org/10.1007/s10115-024-02147-x","url":null,"abstract":"<p>Process mining algorithms essentially reflect the execution behavior of events in an event log for conformance checking, model discovery, or enhancement. Domain experts have developed several process mining algorithms based on theoretical frameworks such as linear integer programming, heuristics, and genetic algorithms, region-based and semantic-based approaches. The idea is to generate insightful representations of these processes of information systems to enable process mining practitioners to gain insight into their systems. Recently, there has been a shift toward semantic-based approaches for process mining since they not only discover enhanced models but also emphasize context. To this effect, this paper conducts a comprehensive review of 30 articles on semantic process mining techniques. It was found that 44.7% of all works used semantics for process discovery, 23.7% for model enhancement, and conformance checking was the least with 10.5%. We further indicate the benefits and contributions of these methods to process mining. Challenges, opportunities, and prospective future research areas are also discussed.\u0000</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"19 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141510287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}