A Survey on Knowledge Editing of Neural Networks

IF 8.9 · JCR Region 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · IEEE Transactions on Neural Networks and Learning Systems · Pub Date: 2024-11-25 · DOI: 10.1109/TNNLS.2024.3498935
Vittorio Mazzia; Alessandro Pedrani; Andrea Caciolai; Kay Rottmann; Davide Bernardi
IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 7, pp. 11759-11775. Published online: 2024-11-25. https://ieeexplore.ieee.org/document/10766891/
Citations: 0

Abstract

Deep neural networks are becoming increasingly pervasive in academia and industry, matching and surpassing human performance in a wide variety of fields and related tasks. However, just like humans, even the largest artificial neural networks (ANNs) make mistakes, and once-correct predictions can become invalid as the world changes over time. Augmenting datasets with samples that account for mistakes or up-to-date information has become a common workaround in practical applications. However, the well-known phenomenon of catastrophic forgetting poses a challenge to making precise changes to the knowledge implicitly memorized in neural network parameters, often requiring full model retraining to achieve the desired behavior. That is expensive, unreliable, and incompatible with the current trend of large self-supervised pretraining, making it necessary to find more efficient and effective methods for adapting neural network models to changing data. To address this need, knowledge editing (KE) is emerging as a novel area of research that aims to enable reliable, data-efficient, and fast changes to a pretrained target model, without affecting model behavior on previously learned tasks. In this survey, we provide a brief review of this recent field of artificial intelligence research. We first introduce the problem of editing neural networks, formalize it in a common framework, and differentiate it from more established branches of research such as continual learning. Next, we review the most relevant KE approaches and datasets proposed so far, grouping works into four families: regularization techniques, meta-learning, direct model editing, and architectural strategies. Finally, we outline some intersections with other fields of research and potential directions for future work.
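The first of the four families named in the abstract, regularization techniques, can be illustrated with a minimal sketch. The toy example below (a hand-rolled linear model with made-up numbers, not a method from the survey) applies a single edit request by gradient descent while an L2 penalty anchors the weights to their pretrained values, limiting interference with previously learned behavior:

```python
import numpy as np

# Toy "pretrained model": a linear map y = W @ x.
rng = np.random.default_rng(0)
W_pre = rng.normal(size=(2, 3))

# One edit request: an input whose desired output has changed.
x_edit = np.array([1.0, 0.5, -0.3])
y_new = np.array([2.0, -1.0])

# Regularization-style edit: minimize the error on the edited sample
# plus an L2 anchor that discourages drift from the pretrained weights.
W = W_pre.copy()
lam, lr = 0.1, 0.05
for _ in range(500):
    err = W @ x_edit - y_new                        # residual on the edit
    grad = np.outer(err, x_edit) + lam * (W - W_pre)  # edit loss + anchor
    W -= lr * grad

print("edit error before:", np.linalg.norm(W_pre @ x_edit - y_new))
print("edit error after: ", np.linalg.norm(W @ x_edit - y_new))
print("parameter drift:  ", np.linalg.norm(W - W_pre))
```

The anchor strength `lam` trades off edit fidelity against drift: with `lam = 0`, the edit is applied exactly but nothing constrains the weights; as `lam` grows, the edited model stays closer to the pretrained one and the residual error on the edit shrinks less.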
Source journal

IEEE Transactions on Neural Networks and Learning Systems (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, HARDWARE & ARCHITECTURE)
CiteScore: 23.80
Self-citation rate: 9.60%
Articles published: 2102
Review time: 3-8 weeks
About the journal: The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.