ACM Computing Surveys: Latest Articles (Page 9)

Racial Bias within Face Recognition: A Survey

IF 16.6; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2024-11-22 DOI: 10.1145/3705295

Seyma Yucer, Furkan Tektas, Noura Al Moubayed, Toby Breckon

Facial recognition is one of the most academically studied and industrially developed areas within computer vision where we readily find associated applications deployed globally. This widespread adoption has uncovered significant performance variation across subjects of different racial profiles leading to focused research attention on racial bias within face recognition spanning both current causation and future potential solutions. In support, this study provides an extensive taxonomic review of research on racial bias within face recognition exploring every aspect and stage of the associated facial processing pipeline. Firstly, we discuss the problem definition of racial bias, starting with race definition, grouping strategies, and the societal implications of using race or race-related groupings. Secondly, we divide the common face recognition processing pipeline into four stages: image acquisition, face localisation, face representation, face verification and identification, and review the relevant corresponding literature associated with each stage. The overall aim is to provide comprehensive coverage of the racial bias problem with respect to each and every stage of the face recognition processing pipeline whilst also highlighting the potential pitfalls and limitations of contemporary mitigation strategies that need to be considered within future research endeavours or commercial applications alike.

Citations: 0

A Survey of Machine Learning for Urban Decision Making: Applications in Planning, Transportation, and Healthcare

IF 16.6; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2024-11-22 DOI: 10.1145/3695986

Yu Zheng, Qianyue Hao, Jingwei Wang, Changzheng Gao, Jinwei Chen, Depeng Jin, Yong Li

Developing smart cities is vital for ensuring sustainable development and improving human well-being. One critical aspect of building smart cities is designing intelligent methods to address various decision-making problems that arise in urban areas. As machine learning techniques continue to advance rapidly, a growing body of research has focused on utilizing these methods to achieve intelligent urban decision making. In this survey, we conduct a systematic literature review on the application of machine learning methods in urban decision making, with a focus on planning, transportation, and healthcare. First, we provide a taxonomy based on typical applications of machine learning methods for urban decision making. We then present background knowledge on these tasks and the machine learning techniques that have been adopted to solve them. Next, we examine the challenges and advantages of applying machine learning in urban decision making, including issues related to urban complexity, urban heterogeneity, and computational cost. Afterward, and as the main part of the survey, we elaborate on the existing machine learning methods that aim to solve urban decision-making tasks in planning, transportation, and healthcare, highlighting their strengths and limitations. Finally, we discuss open problems and the future directions of applying machine learning to enable intelligent urban decision making, such as developing foundation models and combining reinforcement learning algorithms with human feedback. We hope this survey can help researchers in related fields understand the recent progress made in existing works, and inspire novel applications of machine learning in smart cities.

Citations: 0

Tool Learning with Foundation Models

IF 16.6; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2024-11-21 DOI: 10.1145/3704435

Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Xuanhe Zhou, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, Zhenning Dai, Lan Yan, Xin Cong, Yaxi Lu, Weilin Zhao, Yuxiang Huang, Junxi Yan, Xu Han, Xian Sun, Dahai Li, Jason Phang, Cheng Yang, Tongshuang Wu, Heng Ji, Guoliang Li, Zhiyuan Liu, Maosong Sun

Humans possess an extraordinary ability to create and utilize tools. With the advent of foundation models, artificial intelligence systems have the potential to be as adept in tool use as humans. This paradigm, dubbed tool learning with foundation models, combines the strengths of tools and foundation models to achieve enhanced accuracy, efficiency, and automation in problem-solving. This paper presents a systematic investigation and comprehensive review of tool learning. We first introduce the background of tool learning, including its cognitive origins, the paradigm shift of foundation models, and the complementary roles of tools and models. Then we recapitulate existing tool learning research and formulate a general framework: starting from understanding the user instruction, models should learn to decompose a complex task into several subtasks, dynamically adjust their plan through reasoning, and effectively conquer each subtask by selecting appropriate tools. We also discuss how to train models for improved tool-use capabilities and facilitate generalization in tool learning. Finally, we discuss several open problems that require further investigation, such as ensuring trustworthy tool use, enabling tool creation with foundation models, and addressing personalization challenges. Overall, we hope this paper can inspire future research in integrating tools with foundation models.

Citations: 0

Collaborative Distributed Machine Learning

IF 16.6; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2024-11-20 DOI: 10.1145/3704807

David Jin, Niclas Kannengießer, Sascha Rank, Ali Sunyaev

Various collaborative distributed machine learning (CDML) systems, including federated learning systems and swarm learning systems, with different key traits were developed to leverage resources for the development and use of machine learning (ML) models in a confidentiality-preserving way. To meet use case requirements, suitable CDML systems need to be selected. However, comparison between CDML systems to assess their suitability for use cases is often difficult. To support comparison of CDML systems and introduce scientific and practical audiences to the principal functioning and key traits of CDML systems, this work presents a CDML system conceptualization and CDML archetypes.

Citations: 0

Motivations, Challenges, Best Practices, and Benefits for Bots and Conversational Agents in Software Engineering: A Multivocal Literature Review

IF 16.6; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2024-11-20 DOI: 10.1145/3704806

Stefano Lambiase, Gemma Catolino, Fabio Palomba, Filomena Ferrucci

Bots are software systems designed to support users by automating specific processes, tasks, or activities. When these systems implement a conversational component to interact with users, they are also known as conversational agents or chatbots. Bots, particularly conversation-oriented and AI-powered ones, have seen increased adoption over time for software development and engineering purposes. Despite their exciting potential, which has been further enhanced by the advent of Generative AI and Large Language Models, bots still face challenges in terms of development and integration into the development cycle, as practitioners report that bots can add difficulties rather than provide improvements. In this work, we aim to provide a taxonomy for characterizing bots, as well as a series of challenges for their adoption in software engineering, accompanied by potential mitigation strategies. To achieve our objectives, we conducted a multivocal literature review, examining both research and practitioner literature. Through such an approach, we hope to contribute to both researchers and practitioners by (i) providing a series of future research directions to pursue, (ii) providing a list of strategies to adopt for improving the use of bots for software engineering purposes, and (iii) fostering technology and knowledge transfer from the research field to practice, which is one of the primary goals of multivocal literature reviews.

Citations: 0

Private and Secure Distributed Deep Learning: A Survey

IF 16.6; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2024-11-16 DOI: 10.1145/3703452

Corinne Allaart, Saba Amiri, Henri Bal, Adam Belloum, Leon Gommans, Aart van Halteren, Sander Klous

Traditionally, deep learning practitioners would bring data into a central repository for model training and inference. Recent developments in distributed learning, such as federated learning and deep learning as a service (DLaaS), do not require centralized data and instead push computing to where the distributed datasets reside. These decentralized training schemes, however, introduce additional security and privacy challenges. This survey first structures the field of distributed learning into two main paradigms and then provides an overview of the recently published protective measures for each. This work highlights both secure training methods and private inference measures. Our analyses show that recent publications, while being highly dependent on the problem definition, report progress in terms of security, privacy, and efficiency. Nevertheless, we also identify several current issues within the private and secure distributed deep learning (PSDDL) field that require more research. We discuss these issues and provide a general overview of how they might be resolved.

Citations: 0

Backdoor Attacks and Defenses Targeting Multi-Domain AI Models: A Comprehensive Review

IF 16.6; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2024-11-15 DOI: 10.1145/3704725

Shaobo Zhang, Yimeng Pan, Qin Liu, Zheng Yan, Kim-Kwang Raymond Choo, Guojun Wang

Since the emergence of security concerns in artificial intelligence (AI), there has been significant attention devoted to the examination of backdoor attacks. Attackers can utilize backdoor attacks to manipulate model predictions, leading to significant potential harm. However, current research on backdoor attacks and defenses in both theoretical and practical fields still has many shortcomings. To systematically analyze these shortcomings and address the lack of comprehensive reviews, this paper presents a comprehensive and systematic summary of both backdoor attacks and defenses targeting multi-domain AI models. Simultaneously, based on the design principles and shared characteristics of triggers in different domains and the implementation stages of backdoor defense, this paper proposes a new classification method for backdoor attacks and defenses. We use this method to extensively review backdoor attacks in the fields of computer vision and natural language processing, and also examine the current applications of backdoor attacks in audio recognition, video action recognition, multimodal tasks, time series tasks, generative learning, and reinforcement learning, while critically analyzing the open problems of various backdoor attack techniques and defense strategies. Finally, this paper builds upon the analysis of the current state of AI security to further explore potential future research directions for backdoor attacks and defenses.

Citations: 0

Systematic Review of Generative Modelling Tools and Utility Metrics for Fully Synthetic Tabular Data

IF 16.6; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2024-11-14 DOI: 10.1145/3704437

Anton Danholt Lautrup, Tobias Hyrup, Arthur Zimek, Peter Schneider-Kamp

Sharing data with third parties is essential for advancing science, but it is becoming more and more difficult with the rise of data protection regulations, ethical restrictions, and growing fear of misuse. Fully synthetic data, which transcends anonymisation, may be the key to unlocking valuable untapped insights stored away in secured data vaults. This review examines current synthetic data generation methods and their utility measurement. We found that more traditional generative models such as Classification and Regression Tree models alongside Bayesian Networks remain highly relevant and are still capable of surpassing deep learning alternatives like Generative Adversarial Networks. However, our findings also display the same lack of agreement on metrics for evaluation, uncovered in earlier reviews, posing a persistent obstacle to advancing the field. We propose a tool for evaluating the utility of synthetic data and illustrate how it can be applied to three synthetic data generation models. By streamlining evaluation and promoting agreement on metrics, researchers can explore novel methods and generate compelling results that will convince data curators and lawmakers to embrace synthetic data. Our review emphasises the potential of synthetic data and highlights the need for greater collaboration and standardisation to unlock its full potential.

Citations: 0

Democratizing Container Live Migration for Enhanced Future Networks - A Survey

IF 16.6; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2024-11-14 DOI: 10.1145/3704436

Wissem Soussi, Gürkan Gür, Burkhard Stiller

Emerging cloud-centric networks span from edge clouds to large-scale datacenters with shared infrastructure among multiple tenants and applications with high availability, isolation, fault tolerance, security, and energy efficiency demands. Live migration (LiMi) plays an increasingly critical role in these environments by enabling seamless application mobility covering the edge-to-cloud continuum and maintaining these requirements. This article presents a comprehensive survey of recent advancements that democratize LiMi, making it more applicable to a broader range of scenarios and network environments both for virtual machines (VMs) and containers, and analyzes LiMi’s technical underpinnings and optimization techniques. It also delves into the issue of connection handover, presenting a taxonomy to categorize methods of traffic redirection synthesized from the existing literature. Finally, it identifies technical challenges and paves the way for future research directions in this key technology.


Citations: 0

Membership Inference Attacks and Defenses in Federated Learning: A Survey

IF 16.6 Zone 1 Computer Science Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys

Pub Date : 2024-11-14 DOI: 10.1145/3704633

Li Bai, Haibo Hu, Qingqing Ye, Haoyang Li, Leixia Wang, Jianliang Xu

Federated learning is a decentralized machine learning approach where clients train models locally and share model updates to develop a global model. This enables low-resource devices to collaboratively build a high-quality model without requiring direct access to the raw training data. However, despite only sharing model updates, federated learning still faces several privacy vulnerabilities. One of the key threats is membership inference attacks, which target clients’ privacy by determining whether a specific example is part of the training set. These attacks can compromise sensitive information in real-world applications, such as medical diagnoses within a healthcare system. Although there has been extensive research on membership inference attacks, a comprehensive and up-to-date survey specifically focused on them within federated learning is still absent. To fill this gap, we categorize and summarize membership inference attacks and their corresponding defense strategies based on their characteristics in this setting. We introduce a unique taxonomy of existing attack research and provide a systematic overview of various countermeasures. For these studies, we thoroughly analyze the strengths and weaknesses of different approaches. Finally, we identify and discuss key future research directions for readers interested in advancing the field.



Citations: 0