
Knowledge-Based Systems: Latest Publications

Learnable mixture distribution prior for image denoising
IF 7.6, CAS Tier 1 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2025-12-29. DOI: 10.1016/j.knosys.2025.115234
Zhuoxiao Li , Faqiang Wang , Li Cui , Jun Liu
In variational image denoising, regularizers derived from mixture distributions play a crucial role in preserving image details. However, such mixture-distribution priors have not been incorporated into deep learning-based denoising methods. In this paper, we propose a method for learning regularizers based on a learnable Laplacian mixture distribution for image denoising. Our approach is motivated by the assumption that deep image features follow a latent distribution given by a mixture model. Building on this assumption, we derive a regularizer with learnable weights by considering the dual problem of maximum likelihood estimation for the deep features. The dual variable in this problem acts as an attention weight, which can be learned using a numerical scheme with an unrolling technique. Notably, our method establishes a connection between the mixture distribution prior and the popular attention mechanism in deep learning. Additionally, to capture multi-scale features, we introduce a multigrid solver for the optimization involved in our method. This yields an encoder-decoder architecture based on a learnable mixture distribution network (LMDNet) for image denoising. Comparisons with several popular denoising methods demonstrate the superior performance of our approach for image denoising. The code is publicly available at https://github.com/ylyslzx/LMDNet.
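To make the prior concrete, here is a minimal numerical sketch of a Laplacian mixture used as a regularizer on feature values, together with the per-component posterior weights that play the attention-like role of the dual variable described above. Everything here (function names, parameterization) is an illustrative re-implementation under our own assumptions, not the authors' LMDNet code.

```python
import numpy as np

def laplacian_mixture_neglog(z, weights, locs, scales):
    """Negative log-likelihood of features z under a Laplacian mixture,
    usable as a regularizer R(z). The mixture weights are the quantities
    the abstract describes as learnable (illustrative sketch only)."""
    z = np.asarray(z, dtype=float).ravel()[:, None]          # (N, 1)
    w = np.asarray(weights, dtype=float)[None, :]            # (1, K)
    mu = np.asarray(locs, dtype=float)[None, :]
    b = np.asarray(scales, dtype=float)[None, :]
    densities = w / (2.0 * b) * np.exp(-np.abs(z - mu) / b)  # (N, K)
    return -np.sum(np.log(densities.sum(axis=1)))

def responsibilities(z, weights, locs, scales):
    """Posterior component weights for each feature value. These
    normalized per-component weights are the attention-like quantity
    that the dual variable plays in the abstract's description."""
    z = np.asarray(z, dtype=float).ravel()[:, None]
    w = np.asarray(weights, dtype=float)[None, :]
    mu = np.asarray(locs, dtype=float)[None, :]
    b = np.asarray(scales, dtype=float)[None, :]
    d = w / (2.0 * b) * np.exp(-np.abs(z - mu) / b)
    return d / d.sum(axis=1, keepdims=True)
```

In an unrolled scheme, each iteration would re-estimate these responsibilities and use them to reweight the data-fidelity update.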
Citations: 0
HMEN: A hybrid modular network with dynamic expansion for continual learning
IF 7.6, CAS Tier 1 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2025-12-29. DOI: 10.1016/j.knosys.2025.115182
Ziye Fang , Bo Wan , Shangqi Guo , Jian K. Liu
Continual Learning (CL) seeks to enable machine learning models to continuously learn from evolving task streams while retaining knowledge from previously learned tasks. However, existing approaches struggle with catastrophic forgetting and limited knowledge transfer across tasks, hindering their performance in complex task streams. Modular networks provide a promising solution, but their homogeneous module structures and rigid expansion strategies hinder flexibility and scalability. To address these limitations, we propose the Hybrid Modular Expansion Network (HMEN), a novel framework featuring a hierarchical architecture with diverse module types and a dynamic expansion mechanism. Unlike conventional modular continual learning methods that rely on structurally uniform modules, HMEN integrates both Autoencoders (AE) and Weighted Encoder Fusion Autoencoders (WEF-AE) in each layer. The WEF-AE incorporates multiple encoders with a weighted fusion strategy, enabling more flexible feature extraction and better adaptation to tasks of varying complexities. HMEN introduces the Hybrid Module Expansion Strategy (HMES), which combines partial module expansion and full module expansion, allowing the network to dynamically optimize its structure based on task requirements. We evaluate HMEN on the Continual Transfer Learning (CTrL) benchmark, which includes multiple complex task streams. Experimental results demonstrate that HMEN outperforms state-of-the-art modular continual learning methods, such as LMC and MNTDP, in key metrics including average accuracy, backward knowledge transfer, and forward knowledge transfer. These results establish HMEN as a robust and scalable framework for continual learning in dynamic environments.
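The hybrid expansion idea can be illustrated with a toy decision rule: per-layer errors on a new task decide whether to reuse existing modules, expand only a subset of layers (partial expansion), or expand every layer (full expansion). The thresholds and the rule itself are our illustrative assumptions, not HMEN's published criterion.

```python
def choose_expansion(layer_errors, partial_thr=0.1, full_thr=0.3):
    """Toy hybrid expansion rule (illustrative, not HMEN's actual one).

    layer_errors: per-layer reconstruction errors on the new task.
    Returns "reuse", a list of layer indices to expand (partial), or
    "full" when every layer needs a new module.
    """
    worst = max(layer_errors)
    if worst < partial_thr:
        return "reuse"                      # existing modules suffice
    if worst < full_thr:
        # expand only layers that cannot represent the new task
        return [i for i, e in enumerate(layer_errors) if e >= partial_thr]
    return "full"
```

A real implementation would measure these errors with the per-layer autoencoders and add either an AE or a WEF-AE module at the selected layers.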
Citations: 0
Multi-scale feature fusion-based dynamic framework using continual learning to identify text generated by multiple large language models
IF 7.6, CAS Tier 1 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2025-12-29. DOI: 10.1016/j.knosys.2025.115206
Muhammad Sohail , Zan Hongying , Muhammad Abdullah , Fabio Caraffini , Arifa Javed , Hassan Eshkiki
The rapid advancement of large language models has significantly enhanced the quality of AI-generated text, making it increasingly difficult for detection systems to distinguish it from human-written content. Existing detection methods, such as statistical, linguistic, machine learning, and deep learning approaches, often exhibit a decline in performance when applied to new or previously unseen large language models. Additionally, they tend to become outdated due to their static frameworks and inability to adapt to emerging patterns in generated text. To address this limitation, we introduce a novel dynamic fusion framework that integrates multi-scale feature fusion to capture diverse text patterns and employs continual learning with Elastic Weight Consolidation (EWC) to adapt to new models while mitigating catastrophic forgetting. To the best of our knowledge, this is the first attempt to develop such a dynamic framework for AI-generated text detection. Evaluated on the TuringBench and DeepfakeTextDetect benchmark datasets, our framework achieves average accuracies of 95.78% and 92.39%, outperforming the standard model by 5.88% and 7.98%, respectively, in distinguishing AI-generated from human-written text across various generative language model architectures. Continual learning ensures that the model remains adaptive and accurate over time, which is essential for practical applications in dynamic environments. This dynamic and adaptive approach paves the way for resilient AI-generated text detection systems capable of evolving alongside the rapidly advancing landscape of generative language technologies.
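The EWC component mentioned above has a standard, widely published form, sketched below: the task loss plus a quadratic penalty that anchors parameters in proportion to their (diagonal) Fisher importance from earlier tasks. The variable names are ours, and how this framework weights the penalty is an assumption.

```python
import numpy as np

def ewc_loss(task_loss, params, old_params, fisher, lam=1.0):
    """Elastic Weight Consolidation objective (standard form):

        L = L_task + (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2

    fisher holds the diagonal Fisher-information estimates from the
    previous task; old_params are the parameters learned for it.
    """
    params = np.asarray(params, dtype=float)
    old = np.asarray(old_params, dtype=float)
    f = np.asarray(fisher, dtype=float)
    penalty = 0.5 * lam * np.sum(f * (params - old) ** 2)
    return task_loss + penalty
```

Parameters that were unimportant for earlier tasks (small Fisher values) remain free to move when a new generator's text distribution arrives.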
Citations: 0
Weakly-supervised entity matching via LLM-guided data augmentation and knowledge transfer
IF 7.6, CAS Tier 1 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2025-12-29. DOI: 10.1016/j.knosys.2025.115238
Wenzhou Dou , Derong Shen , Xiangmin Zhou , Yue Kou , Tiezheng Nie , Hang Cui , Ge Yu
Entity Matching (EM) is a fundamental task in data integration, enabling the identification of records that refer to the same real-world entity across data sources. While accurate EM is critical for applications such as product catalog alignment, deploying EM systems in practice remains challenging due to the scarcity and noisiness of labeled training data. This setting, known as weakly supervised EM (WSEM), introduces two key difficulties: incomplete supervision, where labeled pairs cover only a narrow portion of the entity space, and inaccurate supervision, where labels are noisy. Existing WSEM approaches face a trade-off between supervision quality and deployability. To overcome these limitations, we propose LEMONADE, a novel framework that integrates Large Language Model (LLM) guidance into the training of a lightweight Small Language Model (SLM), combining accuracy with deployability. LEMONADE introduces two complementary mechanisms: (i) data-oriented guidance, which employs LLMs to synthesize challenging training pairs via soft and strong entity modifications, enriching supervision and enhancing robustness; and (ii) knowledge-oriented guidance, which transfers semantic similarity from an LLM-based encoder to an SLM-based encoder through compressed similarity embeddings, thereby aligning representations and injecting high-level reasoning. Extensive experiments on nine benchmark datasets demonstrate that LEMONADE outperforms the best baseline by up to 4.4 F1 points while achieving approximately 10× faster inference than LLM-based methods, making it highly deployable in practice.
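One plausible reading of the knowledge-oriented guidance step is to match the pairwise cosine-similarity structure of the small encoder's embeddings to that of the LLM encoder. The sketch below implements that reading as a mean-squared difference between the two similarity matrices; it is our illustrative formulation, not LEMONADE's exact loss.

```python
import numpy as np

def cosine_sim_matrix(emb):
    """Pairwise cosine similarities of a batch of row-vector embeddings."""
    emb = np.asarray(emb, dtype=float)
    norm = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norm, 1e-12, None)
    return unit @ unit.T

def similarity_transfer_loss(slm_emb, llm_emb):
    """Align the SLM's similarity structure with the LLM's by penalizing
    the mean-squared gap between the two cosine-similarity matrices
    (illustrative stand-in for the abstract's similarity transfer)."""
    s = cosine_sim_matrix(slm_emb)
    t = cosine_sim_matrix(llm_emb)
    return float(np.mean((s - t) ** 2))
```

Because only similarities are matched, the SLM is free to use a much smaller embedding dimension than the LLM encoder.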
Citations: 0
Adaptive NeuroSpider-HAR: Bio-inspired neural optimization for high-precision human activity recognition
IF 7.6, CAS Tier 1 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2025-12-28. DOI: 10.1016/j.knosys.2025.115245
S. Selvabharathi , K.S. Dhanalakshmi , P. Prabhu
Human Activity Recognition (HAR) is a crucial area in smart healthcare, fitness monitoring, and intelligent human-computer interaction. In this work, NeuroSpider-HAR is proposed, integrating adaptive signal preprocessing, deep feature extraction, and a powerful bio-inspired optimization framework to improve activity prediction accuracy. First, Adaptive Noise Cancellation (ANC) is applied to remove background noise and enhance signal clarity in accelerometer and gyroscope data. Next, an advanced Instantaneous Frequency and Spectrum Analysis (IFSA) approach extracts time-frequency features from the sensor signals. Finally, for classification, a hybrid deep learning model combining a Disjunctive Normal Form Network (DNF-Net) and a Self-Normalizing Neural Network (SNN) with Scaled Exponential Linear Unit (SELU) activation is adopted to ensure stable and consistent learning. The optimization process is enhanced by a bio-inspired Spider Wasp Optimization (SWO) algorithm that dynamically adjusts hyperparameters and learning weights based on environmental feedback and activity complexity. The proposed model is implemented and validated in Python. The experimental results show that NeuroSpider-HAR achieves state-of-the-art performance: in the comparative analysis, the PAMAP2 dataset attains the highest accuracy (0.9995) with minimal error relative to the other two datasets, UCI-HAR and WISDM, enabling effective, high-performance HAR prediction.
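The SELU activation named above is fully specified in the self-normalizing-networks literature (Klambauer et al.), with fixed constants chosen so that activations drift toward zero mean and unit variance. The sketch below is that generic definition, not code tied to this paper's implementation.

```python
import math

# Standard SELU constants from Klambauer et al. (2017).
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    """Scaled Exponential Linear Unit:
    scale * x for x > 0, and scale * alpha * (exp(x) - 1) otherwise."""
    return SCALE * x if x > 0 else SCALE * ALPHA * (math.exp(x) - 1.0)
```

Note the negative branch saturates at -SCALE * ALPHA (about -1.758), which is what keeps the activation statistics self-normalizing across layers.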
Citations: 0
All you need is two domains: Unified RGB-Wavelet transformer for visual representation learning
IF 7.6, CAS Tier 1 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2025-12-28. DOI: 10.1016/j.knosys.2025.115239
Yu Fu , Weichao Yi , Liquan Dong , Ming Liu , Lingqin Kong
Recent advances in visual representation learning have leveraged Transformer architectures to achieve remarkable performance in tasks such as image classification and dense prediction. However, traditional Vision Transformers (ViTs) often struggle with multi-scale feature handling and the preservation of fine-grained details due to pooling-based downsampling and random cropping operations, which can result in information loss. To address these challenges, we propose a novel unified dual-domain framework, named RWT, which jointly exploits RGB and wavelet domain representations to capture both global dependencies as well as localized frequency information. In the RGB domain, multi-head self-attention is employed to extract long-range interactions, while in the wavelet domain, the Discrete Wavelet Transform (DWT) facilitates invertible downsampling by decomposing images into low-frequency (structural) and high-frequency (textural) components, which are then processed via depthwise separable convolutions. A dynamic convolutional kernel adjustment allows the model to adapt to varying decomposition levels, ensuring efficient feature extraction without pooling artifacts. Furthermore, a cross-attention fusion module merges global RGB features with local wavelet details. Extensive experiments on ImageNet-1K demonstrate that RWT outperforms state-of-the-art models, while showing superior transferability on downstream datasets like CIFAR-10/100, Stanford Cars, and Flowers-102. Source code is available at http://github.com/Fuuu12/RWT.
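The invertible-downsampling property attributed to the DWT can be checked with a minimal single-level Haar transform: the image splits into one low-frequency and three high-frequency sub-bands at half resolution, and, unlike pooling, the original is exactly recoverable. This is a generic sketch, not the RWT codebase.

```python
import numpy as np

def haar_dwt2(x):
    """One level of the 2-D Haar DWT for an array with even side lengths:
    returns the LL (structural) and LH/HL/HH (textural) sub-bands, each
    at half resolution. No information is discarded."""
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0
    hl = (a + b - c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2: reassembles the original image."""
    a = (ll + lh + hl + hh) / 2.0
    b = (ll - lh + hl - hh) / 2.0
    c = (ll + lh - hl - hh) / 2.0
    d = (ll - lh - hl + hh) / 2.0
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w))
    x[0::2, 0::2] = a; x[0::2, 1::2] = b
    x[1::2, 0::2] = c; x[1::2, 1::2] = d
    return x
```

In the architecture described above, the LL band would feed the structural path and the LH/HL/HH bands the depthwise-separable texture path.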
Citations: 0
Uncertainty-based consistency regularisation for text classification with limited labels
IF 7.6, CAS Tier 1 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2025-12-28. DOI: 10.1016/j.knosys.2025.115227
Deepak Uniyal, Richi Nayak, Md Abul Bashar
Semi-supervised text classification has garnered significant attention for its ability to leverage unlabelled data in settings with limited labelled data. While current state-of-the-art methods employ consistency regularisation with pseudo-labels and augmentation techniques such as synonym replacement and back-translation, they often suffer from high computational cost and limited augmentation diversity. We propose SemiCR-VAE, a novel framework that uses a variational autoencoder with uncertainty-based sampling to generate diverse, task-aware augmentations from the latent space. Experiments on five datasets, including the newly introduced Hydrogen Energy dataset, show that SemiCR-VAE achieves the highest average accuracy (74%) and F1 score (76%), outperforming strong baselines by an average of 3% in accuracy and 4% in F1 score. Our approach exhibits robustness with tuned hyperparameters (λ_s = 100, λ_rl = 0.001, λ_kl = 10, λ_u = 0.5, T = 0.7), ensuring effective text classification in low-resource settings with limited labelled data.
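Assuming the reported hyperparameters weight a supervised term, a VAE reconstruction term, a KL term, and an unsupervised consistency term, the overall objective might be combined as below. Only the λ values come from the abstract; the additive combination and the term names are our assumptions.

```python
def semicr_vae_objective(l_sup, l_recon, l_kl, l_unsup,
                         lam_s=100.0, lam_rl=0.001, lam_kl=10.0, lam_u=0.5):
    """Weighted sum of the four loss terms suggested by the abstract's
    hyperparameters (lambda_s=100, lambda_rl=0.001, lambda_kl=10,
    lambda_u=0.5). Illustrative combination, not SemiCR-VAE's exact code."""
    return lam_s * l_sup + lam_rl * l_recon + lam_kl * l_kl + lam_u * l_unsup
```

The heavy λ_s relative to λ_rl would mean the few available labels dominate training, with the VAE terms acting as regularisers.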
Citations: 0
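The SemiCR-VAE abstract only names its augmentation mechanism. As a rough, hedged sketch of what uncertainty-scaled sampling around a VAE posterior could look like (the function name, shapes, and `uncertainty_scale` parameter are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_augmentations(mu, logvar, n_samples=4, uncertainty_scale=1.0):
    """Draw several latent codes around one encoded text.

    mu/logvar are the Gaussian posterior parameters a VAE encoder would
    emit; widening the sampling radius with an uncertainty term is one
    plausible reading of "uncertainty-based sampling".
    """
    std = np.exp(0.5 * logvar)                           # posterior std-dev
    eps = rng.standard_normal((n_samples, mu.shape[-1]))
    return mu + uncertainty_scale * std * eps            # reparameterisation trick

mu = np.zeros(8)            # toy posterior mean
logvar = np.full(8, -1.0)   # toy posterior log-variance
z = sample_augmentations(mu, logvar, n_samples=4)
print(z.shape)              # (4, 8): four latent-space augmentations of one input
```

Each row of `z` would then be decoded into one augmented view of the input text; diversity comes from the stochastic `eps` draws rather than from surface-level edits such as synonym replacement.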
ARETO: A joint entity and relation extraction model for the triple overlapping problem
IF 7.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2025-12-28 DOI: 10.1016/j.knosys.2025.115178
Jing Liao , Can Wu , Lei Jiang , Xiande Su , Wei Liang , Ling-Huey Li , Arcangelo Castiglione , Kuanching Li
Entity-relation triple extraction is a crucial task in knowledge graph research, and its performance can be affected by overlapping triples. Existing studies often apply a matrix structure to represent relational triples, where each matrix represents the correlation of all tokens. However, this structure is not well-suited to managing triple overlap, especially in cases involving the Entity Pair Overlap (EPO) problem. It struggles to clearly differentiate between multiple relational triples that involve the same pair of entities, leading to ambiguity and reduced extraction accuracy in such scenarios. To address this issue, we propose ARETO, a joint extraction model that incorporates an adjacency list structure to handle triple overlap. This structure represents entity relations without losing information, thus avoiding the complex mapping problems associated with the matrix structure in EPO cases. Moreover, ARETO includes a triple-element decoupling module that reduces errors caused by feature confusion, employing a step-by-step "subject-object-relation" extraction method to improve accuracy and effectively address the EPO problem. Furthermore, this investigation discovers and defines SEO_R, a new type of relation overlap where triples share the same entity and relation. Experimental results demonstrate that ARETO achieves state-of-the-art F1 scores (93.0% and 93.8%) on the NYT and WebNLG-star datasets, with an accuracy improvement of 0.4%, and reaches F1 scores of 95.3% and 94.7% for overlaps between EPO and SEO_R, outperforming other baseline models and demonstrating its effectiveness. The source code of our work is available at: https://github.com/104wucan/ARETO.
Citations: 0
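ARETO's central claim is that an adjacency list, unlike a token-pair matrix, can represent several relations over the same entity pair without collisions. A toy sketch of that idea (the triples and helper below are hypothetical illustrations, not ARETO's actual data structures):

```python
from collections import defaultdict

def build_adjacency(triples):
    """Store (subject, relation, object) triples as subject -> [(relation, object)].

    A single token-pair matrix cell can hold only one relation per entity
    pair, whereas this list keeps every relation, so Entity Pair Overlap
    (EPO) triples do not overwrite each other.
    """
    adj = defaultdict(list)
    for s, r, o in triples:
        adj[s].append((r, o))
    return dict(adj)

triples = [
    ("New York", "located_in", "USA"),
    ("New York", "largest_city_of", "USA"),  # EPO: same entity pair, second relation
]
adj = build_adjacency(triples)
print(adj["New York"])  # both relations survive side by side
```

The same property covers the newly defined SEO_R case: two entries under one subject may even share the relation, as long as the (relation, object) pairs are stored as a list rather than a single cell.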
FSMamba: A dual-expert architecture with fast global attention and local-enhanced state-space mamba for time series forecasting
IF 7.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2025-12-27 DOI: 10.1016/j.knosys.2025.115233
Shiming Fan , Hua Wang , Fan Zhang
Time series forecasting has become one of the key techniques in many practical applications. Real-world time series usually exhibit two distinct temporal patterns: local short-term fluctuations and long-term global trends. However, existing methods ignore the collaborative modeling between local and global temporal patterns, which limits further improvement of model performance. To meet this challenge, we propose FSMamba, a dual-expert model that integrates Transformer and Mamba. FSMamba contains two complementary expert modules. Parity-Corrected Local Mamba (PCLM), as the local expert, enhances Mamba's ability to capture micro-dynamics within the sequence by performing parity decomposition on the input sequence and introducing a parity correction mechanism. Fast Global Linear Attention (FGLA), as the global expert, efficiently models long-term dependencies and global dynamic information in the sequence with approximately O(N) complexity through a linear attention mechanism with structural reconstruction, compensating for the efficiency and global-dependency limitations of traditional attention mechanisms. FSMamba learns local and global temporal patterns through these two expert modules while maintaining O(N) computational complexity, achieving powerful multi-scale modeling capabilities. Extensive experiments show that FSMamba achieves state-of-the-art performance in both long-term and short-term prediction tasks. Our code is available at https://github.com/fsmss/FSMamba.
Citations: 0
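One common reading of PCLM's "parity decomposition" is an even/odd index split of the input sequence; the helpers below are a hypothetical illustration of that reading under this assumption, not FSMamba's implementation:

```python
import numpy as np

def parity_decompose(x):
    """Split a sequence along its last axis into even- and odd-indexed halves."""
    return x[..., 0::2], x[..., 1::2]

def parity_merge(even, odd):
    """Interleave the two sub-sequences back into the original order."""
    out = np.empty(even.shape[:-1] + (even.shape[-1] + odd.shape[-1],))
    out[..., 0::2] = even
    out[..., 1::2] = odd
    return out

x = np.arange(8.0)
even, odd = parity_decompose(x)
print(even, odd)                                 # [0. 2. 4. 6.] [1. 3. 5. 7.]
assert np.allclose(parity_merge(even, odd), x)   # lossless round trip
```

Because the split is lossless and invertible, a correction term learned on the two sub-sequences can be merged back without discarding any of the original signal, which is consistent with the abstract's claim of capturing micro-dynamics within the sequence.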
MoBA: Motion memory-augmented deblurring autoencoder for video anomaly detection
IF 7.6 Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2025-12-27 DOI: 10.1016/j.knosys.2025.115218
Jiahao Lyu, Minghua Zhao, Jing Hu, Xuewen Huang, Shuangli Du, Cheng Shi, Zhiyong Lv
Video anomaly detection (VAD) often learns the distribution of normal samples and detects anomalies by measuring significant deviations, but undesired generalization may reconstruct a few anomalies, thus suppressing those deviations. Meanwhile, most VAD methods cannot cope with cross-dataset validation on new target domains, and few-shot methods must laboriously rely on model tuning from the target domain to complete domain adaptation. To address these problems, we propose a novel unsupervised VAD method with a Motion memory-augmented deBlurring Autoencoder, i.e. MoBA. First, we add Gaussian blur to the raw appearance images to limit autoencoder recovery of anomalies. Next, memory items are obtained by recording motion features in the training phase, which are used to retrieve motion features from the raw images in the testing phase. Finally, our method can ignore the blurred real anomaly through attention and rely on motion memory items to widen the normality gap between normal and abnormal motion. Extensive experiments on four benchmark datasets (Ped2, Avenue, ShanghaiTech, and UBnormal) demonstrate the effectiveness of the proposed method, in which MoBA achieves AUC scores of 99.0%, 89.6%, 75.6%, and 58.1%, respectively. Compared with cross-domain methods, our method achieves state-of-the-art performance without any adaptation, achieving 97.71% on Ped2 and 87.47% on Avenue with zero-shot learning.
Citations: 0
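The memory retrieval step in methods of this family is typically an attention-weighted read over stored items; the sketch below is a generic, hedged illustration of that pattern (the toy memory, similarity choice, and function name are assumptions, not MoBA's code):

```python
import numpy as np

def memory_read(query, memory):
    """Reconstruct a motion feature as an attention-weighted sum of
    recorded memory items (softmax over cosine similarity).

    Because the output is a convex combination of items recorded from
    normal training data, an anomalous query is pulled back toward
    normal motion, widening the gap used for detection.
    """
    q = query / (np.linalg.norm(query) + 1e-8)
    m = memory / (np.linalg.norm(memory, axis=1, keepdims=True) + 1e-8)
    sim = m @ q                        # cosine similarity per memory item
    w = np.exp(sim - sim.max())
    w /= w.sum()                       # attention weights, sum to 1
    return w @ memory                  # convex combination of stored items

memory = np.eye(3)                     # three toy "normal motion" items
anomalous = np.array([1.0, 1.0, 5.0])  # feature unlike any single item
readout = memory_read(anomalous, memory)
print(readout)                         # lies inside the hull of normal items
```

The residual between `anomalous` and `readout` is large precisely because the memory cannot express the anomaly, which is the score-widening effect the abstract describes.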