ACM Computing Surveys: Latest Articles


Class-Imbalanced Learning on Graphs: A Survey

IF 16.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2025-02-19 DOI: 10.1145/3718734

Yihong Ma, Yijun Tian, Nuno Moniz, Nitesh V. Chawla

Rapid advancement in machine learning is increasing the demand for effective graph data analysis. However, real-world graph data often exhibits class imbalance, leading to poor performance of standard machine learning models on underrepresented classes. To address this, Class-Imbalanced Learning on Graphs (CILG) has emerged as a promising solution that combines graph representation learning and class-imbalanced learning. This survey provides a comprehensive understanding of CILG’s current state-of-the-art, establishing the first systematic taxonomy of existing work and its connections to traditional imbalanced learning. We critically analyze recent advances and discuss key open problems. A continuously updated reading list of relevant papers and code implementations is available at https://github.com/yihongma/CILG-Papers.

Citations: 0

Automated Program Repair: Emerging Trends Pose and Expose Problems for Benchmarks

IF 16.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2025-02-19 DOI: 10.1145/3704997

Joseph Renzullo, Pemma Reiter, Westley Weimer, Stephanie Forrest

Machine learning (ML) pervades the field of Automated Program Repair (APR). Algorithms deploy neural machine translation and large language models (LLMs) to generate software patches, among other tasks. But, there are important differences between these applications of ML and earlier work, which complicates the task of ensuring that results are valid and likely to generalize. A challenge is that the most popular APR evaluation benchmarks were not designed with ML techniques in mind. This is especially true for LLMs, whose large and often poorly-disclosed training datasets may include problems on which they are evaluated. This paper reviews work in APR published in the field’s top five venues since 2018, emphasizing emerging trends in the field, including the dramatic rise of ML models, including LLMs. ML-based papers are categorized along structural and functional dimensions, and a variety of issues are identified that these new methods raise. Importantly, data leakage and contamination concerns arise from the challenge of validating ML-based APR using existing benchmarks, which were designed before these techniques were popular. We discuss inconsistencies in evaluation design and performance reporting and offer pointers to solutions where they are available. Finally, we highlight promising new directions that the field is already taking.


Citations: 0

Deep Learning for Cross-Domain Few-Shot Visual Recognition: A Survey

IF 16.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2025-02-17 DOI: 10.1145/3718362

Huali Xu, Shuaifeng Zhi, Shuzhou Sun, Vishal Patel, Li Liu

While deep learning excels in computer vision tasks with abundant labeled data, its performance diminishes significantly in scenarios with limited labeled samples. To address this, Few-shot learning (FSL) enables models to perform the target tasks with very few labeled examples by leveraging prior knowledge from related tasks. However, traditional FSL assumes that both the related and target tasks come from the same domain, which is a restrictive assumption in many real-world scenarios where domain differences are common. To overcome this limitation, Cross-domain few-shot learning (CDFSL) has gained attention, as it allows source and target data to come from different domains and label spaces. This paper presents the first comprehensive review of Cross-domain Few-shot Learning (CDFSL), a field that has received less attention compared to traditional FSL due to its unique challenges. We aim to provide both a position paper and a tutorial for researchers, covering key problems, existing methods, and future research directions. The review begins with a formal definition of CDFSL, outlining its core challenges, followed by a systematic analysis of current approaches, organized under a clear taxonomy. Finally, we discuss promising future directions in terms of problem setups, applications, and theoretical advancements.


Citations: 0

From Perception to Computation: Revisiting Delay Optimization for Connected Autonomous Vehicles

IF 16.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2025-02-17 DOI: 10.1145/3718361

Tianen Liu, Shuai Wang, Zheng Dong, Borui Li, Tian He

With the development of sensing, wireless communication, and real-time computing technologies, vehicles are gradually becoming more and more intelligent. To provide safe autonomous mobility services, connected autonomous vehicles (CAVs) need to obtain complete information about their environment and process it in real-time to make driving decisions. However, the rapid increase in data volume puts pressure on CAVs to process tasks in real time. This survey analyzes CAVs delay optimization from the perception layer, communication layer, computation layer, and cross-layer. According to different coordination modes, each layer of CAVs is divided, and the problem of delay optimization is classified in fine granularity. This survey will help researchers gain insight into the mechanism of delay optimization on CAVs and highlight the key role of optimized delay in autonomous driving.

Citations: 0

Green Federated Learning: A New Era of Green Aware AI

IF 16.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2025-02-15 DOI: 10.1145/3718363

Dipanwita Thakur, Antonella Guzzo, Giancarlo Fortino, Francesco Piccialli

The development of AI applications, especially in large-scale wireless networks, is growing exponentially, alongside the size and complexity of the architectures used. Particularly, machine learning is acknowledged as one of today’s most energy-intensive computational applications, posing a significant challenge to the environmental sustainability of next-generation intelligent systems. Achieving environmental sustainability entails ensuring that every AI algorithm is designed with sustainability in mind, integrating green considerations from the architectural phase onwards. Recently, Federated Learning (FL), with its distributed nature, presents new opportunities to address this need. Hence, it’s imperative to elucidate the potential and challenges stemming from recent FL advancements and their implications for sustainability. Moreover, it’s crucial to furnish researchers, stakeholders, and interested parties with a roadmap to navigate and understand existing efforts and gaps in green-aware AI algorithms. This survey primarily aims to achieve this objective by identifying and analyzing over a hundred FL works and assessing their contributions to green-aware artificial intelligence for sustainable environments, with a specific focus on IoT research. It delves into current issues in green federated learning from an energy-efficient standpoint, discussing potential challenges and future prospects for green IoT application research.


Citations: 0

A Comprehensive Survey on Big Data Analytics: Characteristics, Tools and Techniques

IF 16.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2025-02-15 DOI: 10.1145/3718364

Mohammad Shahnawaz, Manish Kumar

Modern computing devices generate vast amounts of diverse data. It means that a fast transition through various computing devices leads to big data production. Big data with high velocity, volume, and variety presents challenges like data inconsistency, scalability, real-time analysis, and tool selection. Although numerous solutions have been proposed for big data processing, they are often limited in scope and effectiveness. This survey aims to address the lack of comprehensive analysis of big data challenges in relation to machine learning (ML) and the Internet of Things (IoT) environments, particularly concerning the 7Vs of big data. It emphasizes the significance of selecting suitable tools to address each unique big data characteristic, providing a structured approach to manage these challenges effectively. The article systematically reviews big data characteristics and associated techniques, with a detailed discussion of various tools and their applications. Additionally, it analyzes existing ML methods and techniques for IoT data analytics in big data contexts. Through a systematic literature review (SLR), we examine key aspects, including core concepts, benefits, limitations, and the impact of big data on ML algorithms and IoT data analytics. We highlight groundbreaking studies addressing big data challenges to impact future research and enhance big data-driven applications.


Citations: 0

Blockchain-Empowered Trustworthy Data Sharing: Fundamentals, Applications, and Challenges

IF 16.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2025-02-15 DOI: 10.1145/3718082

Thanh Linh Nguyen, Lam Nguyen, Thong Hoang, Dilum Bandara, Qin Wang, Qinghua Lu, Xiwei Xu, Liming Zhu, Shiping Chen

The rise of data-sharing platforms, driven by public demand for open data and legislative mandates, has raised several pertinent issues. These encompass uncertainties over data accuracy, provenance and lineage, privacy concerns, consent management, and the lack of equitable incentives for data providers. The advanced nature of blockchain makes it well-suited to address these concerns. Yet, the limitations of blockchains, particularly their restricted performance, scalability, and high cost, make them less adept at managing the four “V” of big data - volume, variety, velocity, and veracity. As the body of work proposing blockchain-based data-sharing solutions grows, so does the confusion in selecting between these platforms, particularly in terms of sharing mechanisms, services, quality of services, and applications. In this paper, we aim to fill this knowledge gap through an in-depth survey of blockchain-based data-sharing architectures and applications. We first identify the key challenges of existing data-sharing techniques and lay out the foundations of blockchains. Our focus then shifts to the intersection of blockchain and data sharing, wherein we aim to clarify the existing landscape and propose a reference architecture for blockchain-based data sharing. Subsequently, we explore various industrial applications of blockchain-based data sharing, spanning healthcare, smart grids, transportation, and decarbonization. For each application, we draw from real-world deployments to present key lessons learned in the implementation of blockchain-based data sharing. Lastly, we shed light on current research challenges and open avenues for further study in this space. This paper aims to serve as a comprehensive resource for researchers/practitioners looking to navigate the complex terrain of blockchain-based data-sharing solutions.


Citations: 0

Facial Expression Analysis in Parkinson's Disease Using Machine Learning: A Review

IF 16.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2025-02-14 DOI: 10.1145/3716818

Guilherme Camargo, Quoc Ngo, Leandro Passos, Danilo Jodas, Joao Papa, Dinesh Kumar

Computerised facial expression analysis is performed for a range of social and commercial applications, and more recently its potential in medicine, such as for detecting Parkinson’s Disease (PD), is emerging. This has possibilities for use in telehealth and population screening. The advancement of facial expression analysis using machine learning is relatively recent, with the majority of the published work being post-2019. We have performed a systematic review of English-language publications on the topic from 2019 to 2024 to capture the trends and identify research opportunities that will facilitate the translation of this technology for recognising Parkinson’s disease. The review shows significant advancements in the field, with facial expressions emerging as a potential biomarker for PD. Different machine learning models, from shallow to deep learning, could detect PD faces. However, the main limitation is the reliance on limited datasets. Furthermore, while significant progress has been made, model generalization must be tested before clinical applications.

Citations: 0

Towards Robust Cyber Attack Taxonomies: A Survey with Requirements, Structures, and Assessment

IF 16.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2025-02-13 DOI: 10.1145/3717606

Paulo Roberto da Paz Ferraz Santos, Paulo Angelo Alves Resende, João José Costa Gondim, André Costa Drummond

Cyber attacks have become a growing threat in today’s interconnected society, and taxonomies play a crucial role in understanding and preventing these attacks. However, the lack of comprehensive assessment methods for evaluating attack taxonomies represents a significant gap in the literature, hindering their development and applicability. This paper aims to address this gap by conducting a survey of 20 attack taxonomies published between 2011 and 2022 and evaluating them with a novel set of qualitative and quantitative assessment criteria, grounded in fundamental taxonomy requirements and key structural attributes. In pursuit of clear and objective assessment criteria, the authors investigated the main taxonomy properties in the literature, identifying dependencies and relationships. This investigation extracted the fundamental requirements for a relevant and widely accepted attack taxonomy in the cybersecurity community. Noteworthy structural aspects, such as organization, scheme, labeling, and approach, are also addressed, considering their impact on taxonomy effectiveness and applicability constraints. Finally, the paper poses some open questions and challenges, along with suggestions for future research directions.



Citations: 0

Towards Lifelong Learning of Large Language Models: A Survey

IF 16.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2025-02-13 DOI: 10.1145/3716629

Junhao Zheng, Shengjie Qiu, Chengming Shi, Qianli Ma

As the applications of large language models (LLMs) expand across diverse fields, their ability to adapt to ongoing changes in data, tasks, and user preferences becomes crucial. Traditional training methods with static datasets are inadequate for coping with the dynamic nature of real-world information. Lifelong learning, or continual learning, addresses this by enabling LLMs to learn continuously and adapt over their operational lifetime, integrating new knowledge while retaining previously learned information and preventing catastrophic forgetting. Our survey explores the landscape of lifelong learning, categorizing strategies into two groups based on how new knowledge is integrated: Internal Knowledge, where LLMs absorb new knowledge into their parameters through full or partial training, and External Knowledge, which incorporates new knowledge as external resources like Wikipedia or APIs without updating model parameters. The key contributions of our survey include: (1) Introducing a novel taxonomy to categorize the extensive literature of lifelong learning into 12 scenarios; (2) Identifying common techniques across all lifelong learning scenarios and classifying existing literature into various technique groups; (3) Highlighting emerging techniques such as model expansion and data selection, which were less explored in the pre-LLM era. Resources are available at https://github.com/qianlima-lab/awesome-lifelong-learning-methods-for-llm.



Citations: 0
