
Knowledge-Based Systems: Latest Articles

SPIRF-CTA: Selection of parameter importance levels for reasonable forgetting in continuous task adaptation
IF 7.2; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-19; DOI: 10.1016/j.knosys.2024.112575
Qinglang Li, Jing Yang, Xiaoli Ruan, Shaobo Li, Jianjun Hu, Bingqi Hu
Humans can adapt to changing environments and tasks and consolidate their previous knowledge by constantly learning and practicing new knowledge; however, it is extremely challenging for artificial intelligence to achieve this goal. Research on overcoming catastrophic forgetting in continuous learning still faces three challenges: focusing on the most important parameters of the current task, retaining knowledge from previous tasks while adapting it to new ones, and making better use of knowledge from both previous and new tasks. To solve these problems, this article proposes a reasonable forgetting method called selection of parameter importance levels for reasonable forgetting in continuous task adaptation (SPIRF-CTA). SPIRF-CTA enables the model to identify and focus on the most important parameters for the current task through a normalized parameter importance selection mechanism and a loss function with parameter importance penalties, and it adjusts the parameter updates by incorporating Hessian matrix information to achieve reasonable forgetting and prevent the new task from completely overwriting the knowledge of the previous task. Moreover, we design a model alignment loss function and a multitask loss function to use the knowledge of the new and previous tasks. We evaluate the SPIRF-CTA method on the Split CIFAR-10, Split CIFAR-100, and Split mini-ImageNet datasets, and the results show that the image classification accuracies of the proposed approach improve by 3.6%, 4.4%, and 3.36%, respectively; moreover, the SPIRF-CTA method exhibits excellent control of the degree of forgetting, with a forgetting rate of only 3.54%. Code is available at https://github.com/ybyangjing/CTA.
Knowledge-Based Systems, Volume 305, Article 112575
Citations: 0
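The SPIRF-CTA abstract above mentions a normalized parameter importance mechanism and a loss with parameter importance penalties, but it does not state the formulas. The following is only a generic sketch of that family of techniques (an EWC-style quadratic penalty), not the authors' loss: the min-max normalization, the quadratic form, and the names `normalize_importance`, `importance_penalty`, and `lam` are all assumptions made for illustration.

```python
def normalize_importance(scores):
    """Min-max scale raw importance scores into [0, 1] (assumed normalization)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All parameters equally important: no parameter is singled out.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def importance_penalty(theta, theta_old, importance, lam=1.0):
    """Quadratic penalty: drifting away from the old value of an
    important parameter costs more than drifting on an unimportant one."""
    return 0.5 * lam * sum(
        w * (t - t0) ** 2 for w, t, t0 in zip(importance, theta, theta_old)
    )


# Hypothetical per-parameter importance scores after the previous task.
raw_scores = [1.0, 5.0, 3.0]
importance = normalize_importance(raw_scores)  # -> [0.0, 1.0, 0.5]

theta_old = [0.0, 0.0, 0.0]  # parameters after the previous task
theta_new = [1.0, 1.0, 1.0]  # candidate parameters on the new task
penalty = importance_penalty(theta_new, theta_old, importance)  # -> 0.75
```

In a training loop, a term like `penalty` would be added to the new task's loss, so the least important parameter (weight 0.0 here) is free to change while the most important one is held close to its old value.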
Topology modification against membership inference attack in Graph Neural Networks
IF 7.2; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-19; DOI: 10.1016/j.knosys.2024.112642
Faqian Guan, Tianqing Zhu, Hanjin Tong, Wanlei Zhou
Graph Neural Networks (GNNs) can effectively leverage information from graph data and be applied to various downstream tasks, such as recommendation systems and anomaly detection. Previous studies have shown that node-level GNNs are susceptible to membership inference attacks, potentially leaking private information about the nodes. However, current studies on defending against such attacks primarily focus on image and text data and may not be directly applicable to graph data. In addition, classic defense methods such as differential privacy, regularization techniques, and adversarial training may reduce model availability or require model retraining while protecting privacy. To deal with these problems, we propose a novel defense strategy against membership inference attacks in graph neural networks. The strategy involves augmenting the graph with additional nodes and modifying its topology to create a new privacy-preserving graph, thereby protecting the privacy of the original nodes. We conducted extensive experiments on three representative GNN models and compared them with state-of-the-art baseline methods to support our research. The experimental results demonstrate that our method significantly reduces the success rate of membership inference attacks while maintaining the basic performance of the target model.
Knowledge-Based Systems, Volume 305, Article 112642
Citations: 0
Exploiting memristive autapse and temporal distillation for training spiking neural networks
IF 7.2; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-19; DOI: 10.1016/j.knosys.2024.112627
Tao Chen, Shukai Duan, Lidan Wang
Spiking neural networks (SNNs) have attracted widespread attention due to their brain-inspired information processing mechanism and low-power, sparse accumulation computation on neuromorphic chips. The surrogate gradient method makes it possible to train deep SNNs using backpropagation and shows satisfactory performance on some tasks. However, as the network structure becomes deeper, the spike information may fail to transmit to deeper layers, thus causing the output layer to make wrong predictions in recognition tasks. Inspired by the autaptic structure in the cerebral cortex, which is formed by axons connecting to their own dendrites and is capable of modulating neuronal activity, we use discrete memristors to build feedback-connected autapses to adaptively regulate the precision of the spikes. Further, to prevent an outlier at a given time step from affecting the overall output, we distill the averaged knowledge into sub-models at each time step to correct potential errors. Combining these two methods, we propose a deep SNN optimized by a Leaky Integrate-and-Fire (LIF) model with memristive autapse and temporal distillation, referred to as MA-SNN. A series of experiments on static datasets (CIFAR10 and CIFAR100) as well as neuromorphic datasets (DVS-CIFAR10 and N-Caltech101) demonstrated the competitiveness of the proposed model and validated the effectiveness of its components. Code for MA-SNN is available at: https://github.com/CHNtao/MA-SNN.
Knowledge-Based Systems, Volume 305, Article 112627
Citations: 0
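The MA-SNN abstract above builds on the Leaky Integrate-and-Fire (LIF) neuron. For readers unfamiliar with it, here is a minimal sketch of a plain discrete-time LIF neuron with a hard reset. It deliberately omits the memristive autapse and the temporal distillation described in the abstract; the update rule, the function name `lif_run`, and the default values of `tau`, `v_threshold`, and `v_reset` are common textbook choices assumed here, not taken from the paper.

```python
def lif_run(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one LIF neuron over a sequence of input currents.

    Returns a list with 1 where the neuron fired and 0 elsewhere.
    """
    v = v_reset
    spikes = []
    for x in inputs:
        # Leaky integration: the potential moves toward the input,
        # while decaying back toward the reset value.
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset  # hard reset after a spike
        else:
            spikes.append(0)
    return spikes


# A constant input of 1.5 needs two steps to cross the threshold.
spike_train = lif_run([1.5, 1.5, 1.5, 1.5])  # -> [0, 1, 0, 1]
```

A weak or absent input never reaches the threshold, which is one way to picture the abstract's concern that spike information may fail to reach deeper layers.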
Transfer-learning enabled micro-expression recognition using dense connections and mixed attention
IF 7.2; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-19; DOI: 10.1016/j.knosys.2024.112640
Chenquan Gan, Junhao Xiao, Qingyi Zhu, Deepak Kumar Jain, Vitomir Štruc
Micro-expression recognition (MER) is a challenging computer vision problem, where the limited amount of available training data and insufficient intensity of the facial expressions are among the main issues adversely affecting the performance of existing recognition models. To address these challenges, this paper explores a transfer-learning enabled MER model using a densely connected feature extraction module with mixed attention. Unlike previous works that utilize transfer learning to facilitate MER and extract local facial-expression information, our model relies on pretraining with three diverse macro-expression datasets and, as a result, can: (i) overcome the problem of insufficient sample size and limited training data availability, (ii) leverage (related) domain-specific information from multiple datasets with diverse characteristics, and (iii) improve the model's adaptability to complex scenes. Furthermore, to enhance the intensity of the micro-expressions and improve the discriminability of the extracted features, the Euler video magnification (EVM) method is adopted in the preprocessing stage and then used jointly with a densely connected feature extraction module and a mixed attention mechanism to derive expressive feature representations for the classification procedure. The proposed feature extraction mechanism not only guarantees the integrity of the extracted features but also efficiently captures local texture cues by aggregating the most salient information from the generated feature maps, which is key for the MER task. The experimental results on multiple datasets demonstrate the robustness and effectiveness of our model compared with state-of-the-art methods.
Knowledge-Based Systems, Volume 305, Article 112640
Citations: 0
Robust Visual Question Answering utilizing Bias Instances and Label Imbalance
IF 7.2; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-19; DOI: 10.1016/j.knosys.2024.112629
Liang Zhao, Kefeng Li, Jiangtao Qi, Yanhan Sun, Zhenfang Zhu
Visual Question Answering (VQA) models often suffer from bias issues that cause their predictions to rely on superficial correlations in datasets rather than the intrinsic properties of the task. However, most current debiasing methods primarily capture implicit biases but lack fine granularity, overlooking the need to address biases at the sample level. Additionally, these methods mostly fail to adequately utilize information about the imbalanced distribution of labels within the dataset. In this paper, we address these issues by utilizing Bias Instances and Label Imbalance for robust Visual Question Answering (BILI-VQA). The method first identifies samples in the dataset that induce bias, defined as “Bias Instances”, and enhances their visual features during training to mitigate language shortcuts and achieve sample-level debiasing. It then introduces a new module that analyzes the label distribution, assigning higher weights to less frequent answer categories when computing the loss, thereby reducing the negative impact of label imbalance. Extensive experiments on the VQA-CP v2, VQA-CP v1, and VQA-CE datasets demonstrate the effectiveness of our BILI-VQA approach.
Knowledge-Based Systems, Volume 305, Article 112629
Citations: 0
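The BILI-VQA abstract above describes assigning higher weights to less frequent answer categories when computing the loss, without giving the weighting rule. One common way to do this is inverse-frequency weighting, sketched below. The specific rule `n / (k * count)` (the "balanced" heuristic) and the names `inverse_frequency_weights` and `weighted_nll` are assumptions for illustration and may differ from the paper's module.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count): rare classes get larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def weighted_nll(prob_of_true_class, label, weights):
    """Negative log-likelihood of one sample, scaled by its class weight."""
    return -weights[label] * math.log(prob_of_true_class)


# Hypothetical answer distribution: "yes" dominates the training set.
answers = ["yes"] * 8 + ["no"] * 2 + ["2"] * 2
weights = inverse_frequency_weights(answers)
# -> {"yes": 0.5, "no": 2.0, "2": 2.0}

# With equal predicted probability, a mistake on a rare answer costs more.
loss_rare = weighted_nll(0.5, "no", weights)
loss_common = weighted_nll(0.5, "yes", weights)
```

Here `loss_rare` is four times `loss_common`, so gradient updates are pulled less strongly toward the dominant answer.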
A federated advisory teacher–student framework with simultaneous learning agents
IF 7.2; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-19; DOI: 10.1016/j.knosys.2024.112637
Yunjiao Lei, Dayong Ye, Tianqing Zhu, Wanlei Zhou
Multi-agent reinforcement learning requires numerous interactions with the environment and other agents to learn an optimal policy. The teacher–student framework is one paradigm that can enhance the learning performance of reinforcement learning by allowing agents to seek advice from one another. However, recent studies show limitations in knowledge sharing between agents, as advisory learning is only peer-to-peer at a given time. Some methods enable a student to accept multiple pieces of advice, but they typically rely on pre-trained and/or policy-fixed teachers, rendering them unsuitable for agents with simultaneous advisory learning. Simultaneous learning with multiple pieces of advice has not been thoroughly investigated. Furthermore, most research has concentrated on the sharing of knowledge samples, a practice vulnerable to security breaches that could allow attackers to deduce details about the environment. To address these challenges, we propose a federated advisory framework that uses a federated learning structure to aggregate multiple sources of advice with deep reinforcement learning, ensuring that the shared advice is not sample-based. Our experimental comparisons with leading advisory learning techniques confirm that our approach significantly enhances learning performance.
Knowledge-Based Systems, Volume 305, Article 112637
Citations: 0
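The federated advisory abstract above says that advice from multiple sources is aggregated through a federated learning structure, so that what is shared is not sample-based. The abstract does not specify the aggregation rule; the sketch below assumes a FedAvg-style weighted mean over parameter vectors, which is the most common choice. The function name `federated_average` and the flat-list representation of model parameters are hypothetical simplifications.

```python
def federated_average(client_params, client_weights=None):
    """Weighted element-wise mean of parameter vectors from several agents.

    Only parameters are exchanged; no raw experience samples leave an agent.
    """
    if client_weights is None:
        client_weights = [1.0] * len(client_params)
    total = sum(client_weights)
    dim = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(client_weights, client_params)) / total
        for i in range(dim)
    ]


# Two hypothetical advisor agents, each holding a two-parameter model.
advisors = [[1.0, 2.0], [3.0, 4.0]]
plain_mean = federated_average(advisors)             # -> [2.0, 3.0]
weighted_mean = federated_average(advisors, [1, 3])  # -> [2.5, 3.5]
```

The optional weights could reflect, for example, how much experience each advisor has, although that interpretation is also an assumption here.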
Weighted average algorithm: A novel meta-heuristic optimization algorithm based on the weighted average position concept
IF 7.2; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-18; DOI: 10.1016/j.knosys.2024.112564
Jun Cheng, Wim De Waele
To address the ever-increasing complexity of real-world engineering challenges, meta-heuristic algorithms have been extensively studied and applied. However, balancing exploration and exploitation remains a significant challenge. In this paper, a novel meta-heuristic optimization algorithm based on the weighted average position concept, named the weighted average algorithm (WAA), is proposed and implemented. In this algorithm, the weighted average position of the whole population is first established at each iteration. Subsequently, WAA applies one of two movement strategies to balance exploration and exploitation, selected by a parameter function that depends on random constants and the iteration number. To validate the effectiveness and reliability of WAA, it is applied to various optimization challenges, including unconstrained benchmark functions and constrained engineering problems. Based on the Friedman and Wilcoxon analyses, it can be concluded that the proposed algorithm obtained the best performance for the considered benchmark functions and engineering problems. The WAA is then applied to the prediction and optimization of surface waviness in Wire Arc Additive Manufacturing (WAAM) components. First, two prediction models relating WAAM process parameters to surface waviness are established based on an Artificial Neural Network (ANN) optimized by WAA and by Particle Swarm Optimization (PSO), respectively. The WAA-optimized ANN (WAA/ANN) model outperforms both the standard ANN models and those optimized by PSO. Finally, leveraging the developed WAA/ANN prediction model, WAA and other optimization algorithms are applied to minimize the waviness of a WAAM component, with WAA exhibiting promising performance. Source code of WAA is publicly available at https://nl.mathworks.com/matlabcentral/fileexchange/174020-a-weighted-average-algorithm.
Knowledge-Based Systems, Volume 305, Article 112564
Citations: 0
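The WAA abstract says that a weighted average position of the whole population is established at each iteration, but it does not give the weighting rule. The sketch below is only an illustration of that one step under a loudly stated assumption: weights decrease linearly with fitness in a minimization problem. The function name `weighted_average_position` and the weighting scheme are hypothetical and are not taken from the paper or its MATLAB code.

```python
def weighted_average_position(positions, fitness):
    """Return a fitness-weighted mean position (minimization).

    Assumption (not from the paper): the best agent gets raw weight 1,
    the worst gets raw weight 0, and weights are normalized to sum to 1.
    """
    worst, best = max(fitness), min(fitness)
    span = worst - best
    if span == 0:
        # All agents are equally good: fall back to the plain mean.
        weights = [1.0 / len(fitness)] * len(fitness)
    else:
        raw = [(worst - f) / span for f in fitness]
        total = sum(raw)
        weights = [r / total for r in raw]

    dims = len(positions[0])
    return [
        sum(w * p[d] for w, p in zip(weights, positions))
        for d in range(dims)
    ]


population = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
scores = [0.0, 1.0, 2.0]  # lower is better
centre = weighted_average_position(population, scores)
```

With these inputs the best agent dominates the average, so `centre` lies closer to `[0.0, 0.0]` than the unweighted mean `[2.0, 2.0]` does.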
Backtest overfitting in the machine learning era: A comparison of out-of-sample testing methods in a synthetic controlled environment
IF 7.2 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-10-18 DOI: 10.1016/j.knosys.2024.112477
Hamid Arian, Daniel Norouzi Mobarekeh, Luis Seco
We present a comprehensive framework to assess out-of-sample testing methods, considering the unique characteristics of financial data such as non-stationarity, autocorrelation, and regime shifts. Our analysis shows the marked superiority of the Combinatorial Purged Cross-Validation (CPCV) method in mitigating overfitting risks: it outperforms traditional methods, as evidenced by its lower Probability of Backtest Overfitting (PBO) and superior Deflated Sharpe Ratio (DSR) test statistic. Walk-Forward, by contrast, exhibits notable shortcomings in false discovery prevention, characterized by increased temporal variability and weaker stationarity, in contrast with CPCV’s demonstrable stability and efficiency. We introduce novel variants of CPCV, including Bagged CPCV and Adaptive CPCV, which enhance robustness through ensemble approaches and dynamic adjustments based on market conditions. Our empirical validation using historical S&P 500 data confirms the practical applicability and resilience of these advanced cross-validation methods. The analysis also suggests that choosing between Purged K-Fold and K-Fold necessitates caution due to their comparable performance and potential impact on the robustness of training data in out-of-sample testing. Our investigation utilizes a Synthetic Controlled Environment incorporating advanced models such as the Heston Stochastic Volatility, Merton Jump Diffusion, and Drift-Burst Hypothesis models alongside regime-switching models. This approach provides a nuanced simulation of market conditions, offering new insights into evaluating cross-validation techniques. We also address the computational aspects of these methods, demonstrating that parallelization significantly improves efficiency, making them feasible for large-scale financial datasets. Our study underscores the necessity of specialized validation methods in financial modeling, especially in the face of growing regulatory demands and complex market dynamics.
Knowledge-Based Systems, Volume 305, Article 112477. Published 2024-10-18.
Citations: 0
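To make the CPCV idea in the abstract above more concrete, here is a minimal sketch of how combinatorial purged splits can be generated: the series is cut into contiguous groups, every combination of test groups forms one split, and training observations next to a test block are dropped. This is a simplification under stated assumptions, not the authors' implementation: purging here uses a fixed width (`purge`) rather than label horizons, no embargo is applied, and the function name `cpcv_splits` is hypothetical.

```python
from itertools import combinations
from math import comb


def cpcv_splits(n_samples, n_groups, n_test_groups, purge=0):
    """Yield (train, test) index lists for every combination of test groups.

    Assumption: a fixed number of observations (`purge`) is removed from the
    training set on each side of every test block.
    """
    # Boundaries of contiguous, near-equal groups.
    bounds = [i * n_samples // n_groups for i in range(n_groups + 1)]
    for test_ids in combinations(range(n_groups), n_test_groups):
        test = [i for g in test_ids for i in range(bounds[g], bounds[g + 1])]
        blocked = set(test)
        for g in test_ids:
            lo, hi = bounds[g], bounds[g + 1]
            blocked.update(range(max(0, lo - purge), lo))          # before block
            blocked.update(range(hi, min(n_samples, hi + purge)))  # after block
        train = [i for i in range(n_samples) if i not in blocked]
        yield train, test


splits = list(cpcv_splits(n_samples=12, n_groups=4, n_test_groups=2, purge=1))
n_expected = comb(4, 2)  # one split per combination of test groups
```

Each split keeps the test set intact and only shrinks the training set, so a larger `purge` trades training data for less leakage between neighbouring observations.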
U-AEFA: Online and offline learning-based unified artificial electric field algorithm for real parameter optimization
IF 7.2 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-10-18 DOI: 10.1016/j.knosys.2024.112636
Dikshit Chauhan, Anupam Trivedi, Anupam Yadav
Optimization problems in real-world scenarios require algorithms that effectively balance exploration and exploitation to avoid local optima and achieve global solutions. To address this, we propose a unified artificial electric field algorithm (U-AEFA) that integrates both offline learning (inspired by high-performing metaheuristic algorithms) and online learning (through historical data generated during evolution). U-AEFA introduces a unique three-layer population structure to enhance search efficiency, consisting of the best agent in the first layer, the top-performing agents in the second layer, and the remaining agents in the third layer. Key features of U-AEFA include (i) an effective Coulomb’s constant for improved exploration, (ii) a non-uniform mutation operator to mitigate premature convergence, and (iii) two acceleration coefficients for enhanced performance. These three factors constitute offline learning and have been implemented to improve different design elements of the algorithm. As part of online learning, it employs a difference vector reuse (DVR) strategy to evolve the first-layer agents. The algorithm is evaluated using the CEC 2017 test suite across multiple dimensions (10, 30, 50, 100), where it consistently outperforms seven state-of-the-art algorithms, demonstrating superior accuracy and convergence speed. Moreover, U-AEFA’s robustness is validated on 12 high-dimensional feature selection problems, further highlighting its effectiveness in solving complex optimization tasks. The source code of U-AEFA is available at
Knowledge-Based Systems, Volume 305, Article 112636. Published 2024-10-18.
Citations: 0
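The three-layer population structure described in the U-AEFA abstract above (best agent, top-performing agents, remaining agents) can be illustrated with a short partitioning sketch. The abstract does not say how large the second layer is, so the `second_layer_fraction` parameter, its default value, and the function name `split_into_layers` are assumptions made purely for illustration; minimization is also assumed.

```python
def split_into_layers(fitness, second_layer_fraction=0.2):
    """Partition agent indices into three layers by fitness (minimization).

    Layer 1: the single best agent.
    Layer 2: the next-best agents (size is a hypothetical fraction).
    Layer 3: everyone else.
    """
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    remaining = len(order) - 1
    # Keep at least one agent in layer 2 whenever any agents remain.
    n_second = max(1, int(second_layer_fraction * remaining)) if remaining > 0 else 0
    first = order[:1]
    second = order[1:1 + n_second]
    third = order[1 + n_second:]
    return first, second, third


scores = [5.0, 1.0, 3.0, 4.0, 2.0, 6.0]  # lower is better
first, second, third = split_into_layers(scores)
```

In an actual algorithm each layer would then be updated by a different rule; this sketch stops at the partition because the abstract does not specify those rules.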
Soft set-based MSER end-to-end system for occluded scene text detection, recognition and prediction
IF 7.2 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-10-18 DOI: 10.1016/j.knosys.2024.112593
Alloy Das, Shivakumara Palaiahnakote, Ayan Banerjee, Apostolos Antonacopoulos, Umapada Pal
The presence of unpredictable occlusions on natural scene text is a significant challenge, exacerbating the difficulties that the variability of such images already poses for text detection and recognition. To meet the need for a robust, consistently performing approach to these challenges, this paper presents a new Soft Set-based end-to-end system for text detection, recognition and prediction in occluded natural scene images. This is the first approach to integrate text detection, recognition and prediction, unlike existing systems developed for end-to-end text spotting (text detection and recognition) only. For candidate text component detection, the proposed combination of Soft Sets with Maximally Stable Extremal Regions (SS-MSER) improves text detection and spotting in natural scene images, irrespective of the presence of arbitrarily orientated and shaped text, complex backgrounds and occlusion. Furthermore, a Graph Recurrent Neural Network is proposed for grouping candidate text components into text lines and for fitting accurate bounding boxes to each word. Finally, a Convolutional Recurrent Neural Network (CRNN) is proposed for recognizing text and for predicting characters missing due to occlusion. Experimental results on a new occluded scene text dataset (OSTD) and on the most relevant benchmark natural scene text datasets demonstrate that the proposed system outperforms the state of the art in text detection, recognition and prediction. The code and dataset are available at https://github.com/alloydas/Softset-MSER-Based-Occluded-Scene-Text-Spotting/blob/master/Soft_set_MSER.ipynb
Knowledge-Based Systems, Volume 305, Article 112593. Published 2024-10-18.
Citations: 0