Large foundation models, including large language models, vision transformers, diffusion models, and LLM-based multimodal models, are revolutionizing the entire machine learning lifecycle, from training to deployment. However, the substantial advancements in versatility and performance these models offer come at a significant cost in terms of hardware resources. To support the growth of these large models in a scalable and environmentally sustainable way, there has been a considerable focus on developing resource-efficient strategies. This survey delves into the critical importance of such research, examining both algorithmic and systemic aspects. It offers a comprehensive analysis and valuable insights gleaned from existing literature, encompassing a broad array of topics from cutting-edge model architectures and training/serving algorithms to practical system designs and implementations. The goal of this survey is to provide an overarching understanding of how current approaches are tackling the resource challenges posed by large foundation models and to potentially inspire future breakthroughs in this field.
{"title":"Resource-efficient Algorithms and Systems of Foundation Models: A Survey","authors":"Mengwei Xu, Dongqi Cai, Wangsong Yin, Shangguang Wang, Xin Jin, Xuanzhe Liu","doi":"10.1145/3706418","DOIUrl":"https://doi.org/10.1145/3706418","journal":{"name":"ACM Computing Surveys"},"PeriodicalIF":16.6,"publicationDate":"2024-11-29","publicationTypes":"Journal Article"}
Sakuna Harinda Jayasundara, Nalin Asanka Gamagedara Arachchilage, Giovanni Russello
Administrator-centered access control failures can cause data breaches, putting organizations at risk of financial loss and reputation damage. Existing graphical policy configuration tools and automated policy generation frameworks attempt to help administrators configure and generate access control policies by avoiding such failures. However, graphical policy configuration tools are prone to human errors, making them unusable. On the other hand, automated policy generation frameworks are prone to erroneous predictions, making them unreliable. Therefore, to find ways to improve their usability and reliability, we conducted a Systematic Literature Review analyzing 49 publications. The thematic analysis of the publications revealed that graphical policy configuration tools are developed to write and visualize policies manually. Moreover, automated policy generation frameworks are developed using machine learning (ML) and natural language processing (NLP) techniques to automatically generate access control policies from high-level requirement specifications. Despite their utility in the access control domain, limitations of these tools, such as the lack of flexibility, and limitations of frameworks, such as the lack of domain adaptation, negatively affect their usability and reliability, respectively. Our study offers recommendations to address these limitations through real-world applications and recent advancements in the NLP domain, paving the way for future research.
{"title":"SoK: Access Control Policy Generation from High-level Natural Language Requirements","authors":"Sakuna Harinda Jayasundara, Nalin Asanka Gamagedara Arachchilage, Giovanni Russello","doi":"10.1145/3706057","DOIUrl":"https://doi.org/10.1145/3706057","journal":{"name":"ACM Computing Surveys"},"PeriodicalIF":16.6,"publicationDate":"2024-11-28","publicationTypes":"Journal Article"}
Decentralized finance (DeFi) represents a novel financial system but faces significant fraud challenges, leading to substantial losses. Recent advancements in artificial intelligence (AI) show potential for complex fraud detection. Despite growing interest, a systematic review of these methods is lacking. This survey correlates fraud types with DeFi project stages, presenting a taxonomy based on the project life cycle. We evaluate AI techniques, revealing notable findings such as the superiority of tree-based and graph-related models. Based on these insights, we offer recommendations and outline future research directions to aid researchers, practitioners, and regulators in enhancing DeFi security.
{"title":"AI-powered Fraud Detection in Decentralized Finance: A Project Life Cycle Perspective","authors":"Bingqiao Luo, Zhen Zhang, Qian Wang, Anli Ke, Shengliang Lu, Bingsheng He","doi":"10.1145/3705296","DOIUrl":"https://doi.org/10.1145/3705296","journal":{"name":"ACM Computing Surveys"},"PeriodicalIF":16.6,"publicationDate":"2024-11-25","publicationTypes":"Journal Article"}
Recent advancements in wireless communication technologies have made Wi-Fi signals indispensable in both personal and professional settings. The utilization of these signals for Human Activity Recognition (HAR) has emerged as a cutting-edge technology. By leveraging the fluctuations in Wi-Fi signals for HAR, this approach offers enhanced privacy compared to traditional visual surveillance methods. The essence of this technique lies in detecting subtle changes when Wi-Fi signals interact with the human body, which are then captured and interpreted by advanced algorithms. This paper initially provides an overview of the key methodologies in HAR and the evolution of non-contact sensing, introducing sensor-based recognition, computer vision, and Wi-Fi signal-based approaches, respectively. It then explores tools for Wi-Fi-based HAR signal collection and lists several high-quality datasets. Subsequently, the paper reviews various sensing tasks enabled by Wi-Fi signal recognition, highlighting the application of deep learning networks in Wi-Fi signal detection. The fourth section presents experimental results that assess the capabilities of different networks. The findings indicate significant variability in the generalization capacities of neural networks and notable differences in test accuracy for various motion analyses.
{"title":"Wi-Fi Sensing Techniques for Human Activity Recognition: Brief Survey, Potential Challenges, and Research Directions","authors":"Fucheng Miao, Youxiang Huang, Zhiyi Lu, Tomoaki Ohtsuki, Guan Gui, Hikmet Sari","doi":"10.1145/3705893","DOIUrl":"https://doi.org/10.1145/3705893","journal":{"name":"ACM Computing Surveys"},"PeriodicalIF":16.6,"publicationDate":"2024-11-25","publicationTypes":"Journal Article"}
Temporal data, representing chronological observations of complex systems, is a typical data structure generated in many domains, such as industry, finance, healthcare, and climatology. Analyzing the underlying structures, i.e., the causal relations, could be extremely valuable for various applications. Recently, causal discovery from temporal data has been considered an interesting yet critical task and has attracted much research attention. According to the nature and structure of temporal data, existing causal discovery works can be divided into two highly correlated categories, i.e., multivariate time series causal discovery and event sequence causal discovery. However, most previous surveys focus only on multivariate time series causal discovery and ignore the second category. In this paper, we specify the similarity between the two categories and provide an overview of existing solutions. Furthermore, we provide public datasets, evaluation metrics, and new perspectives for temporal data causal discovery.
{"title":"Causal Discovery from Temporal Data: An Overview and New Perspectives","authors":"Chang Gong, Chuzhe Zhang, Di Yao, Jingping Bi, Wenbin Li, YongJun Xu","doi":"10.1145/3705297","DOIUrl":"https://doi.org/10.1145/3705297","journal":{"name":"ACM Computing Surveys"},"PeriodicalIF":16.6,"publicationDate":"2024-11-23","publicationTypes":"Journal Article"}
Naeem Ullah, Javed Ali Khan, Ivanoe De Falco, Giovanna Sannino
There is an urgent need in many application areas for eXplainable Artificial Intelligence (XAI) approaches to boost people’s confidence and trust in Artificial Intelligence methods. Current works concentrate on specific aspects of XAI and avoid a comprehensive perspective. This study undertakes a systematic survey of importance, approaches, methods, and application domains to address this gap and provide a comprehensive understanding of the XAI domain. Applying the Systematic Literature Review approach has resulted in finding and discussing 155 papers, allowing a wide discussion on the strengths, limitations, and challenges of XAI methods and future research directions.
{"title":"Explainable Artificial Intelligence: Importance, Use Domains, Stages, Output Shapes, and Challenges","authors":"Naeem Ullah, Javed Ali Khan, Ivanoe De Falco, Giovanna Sannino","doi":"10.1145/3705724","DOIUrl":"https://doi.org/10.1145/3705724","journal":{"name":"ACM Computing Surveys"},"PeriodicalIF":16.6,"publicationDate":"2024-11-23","publicationTypes":"Journal Article"}
Seyma Yucer, Furkan Tektas, Noura Al Moubayed, Toby Breckon
Facial recognition is one of the most academically studied and industrially developed areas within computer vision, with associated applications deployed globally. This widespread adoption has uncovered significant performance variation across subjects of different racial profiles, leading to focused research attention on racial bias within face recognition, spanning both current causation and future potential solutions. In support, this study provides an extensive taxonomic review of research on racial bias within face recognition, exploring every aspect and stage of the associated facial processing pipeline. Firstly, we discuss the problem definition of racial bias, starting with race definition, grouping strategies, and the societal implications of using race or race-related groupings. Secondly, we divide the common face recognition processing pipeline into four stages (image acquisition, face localisation, face representation, and face verification and identification) and review the relevant literature associated with each stage. The overall aim is to provide comprehensive coverage of the racial bias problem with respect to every stage of the face recognition processing pipeline, whilst also highlighting the potential pitfalls and limitations of contemporary mitigation strategies that need to be considered in future research endeavours and commercial applications alike.
{"title":"Racial Bias within Face Recognition: A Survey","authors":"Seyma Yucer, Furkan Tektas, Noura Al Moubayed, Toby Breckon","doi":"10.1145/3705295","DOIUrl":"https://doi.org/10.1145/3705295","journal":{"name":"ACM Computing Surveys"},"PeriodicalIF":16.6,"publicationDate":"2024-11-22","publicationTypes":"Journal Article"}
Developing smart cities is vital for ensuring sustainable development and improving human well-being. One critical aspect of building smart cities is designing intelligent methods to address various decision-making problems that arise in urban areas. As machine learning techniques continue to advance rapidly, a growing body of research has focused on utilizing these methods to achieve intelligent urban decision making. In this survey, we conduct a systematic literature review on the application of machine learning methods in urban decision making, with a focus on planning, transportation, and healthcare. First, we provide a taxonomy based on typical applications of machine learning methods for urban decision making. We then present background knowledge on these tasks and the machine learning techniques that have been adopted to solve them. Next, we examine the challenges and advantages of applying machine learning in urban decision making, including issues related to urban complexity, urban heterogeneity, and computational cost. As the main body of the survey, we then elaborate on the existing machine learning methods that aim to solve urban decision-making tasks in planning, transportation, and healthcare, highlighting their strengths and limitations. Finally, we discuss open problems and the future directions of applying machine learning to enable intelligent urban decision making, such as developing foundation models and combining reinforcement learning algorithms with human feedback. We hope this survey can help researchers in related fields understand the recent progress made in existing works, and inspire novel applications of machine learning in smart cities.
{"title":"A Survey of Machine Learning for Urban Decision Making: Applications in Planning, Transportation, and Healthcare","authors":"Yu Zheng, Qianyue Hao, Jingwei Wang, Changzheng Gao, Jinwei Chen, Depeng Jin, Yong Li","doi":"10.1145/3695986","DOIUrl":"https://doi.org/10.1145/3695986","journal":{"name":"ACM Computing Surveys"},"PeriodicalIF":16.6,"publicationDate":"2024-11-22","publicationTypes":"Journal Article"}
Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Xuanhe Zhou, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, Zhenning Dai, Lan Yan, Xin Cong, Yaxi Lu, Weilin Zhao, Yuxiang Huang, Junxi Yan, Xu Han, Xian Sun, Dahai Li, Jason Phang, Cheng Yang, Tongshuang Wu, Heng Ji, Guoliang Li, Zhiyuan Liu, Maosong Sun
Humans possess an extraordinary ability to create and utilize tools. With the advent of foundation models, artificial intelligence systems have the potential to be as adept in tool use as humans. This paradigm, dubbed tool learning with foundation models, combines the strengths of tools and foundation models to achieve enhanced accuracy, efficiency, and automation in problem-solving. This paper presents a systematic investigation and comprehensive review of tool learning. We first introduce the background of tool learning, including its cognitive origins, the paradigm shift of foundation models, and the complementary roles of tools and models. Then we recapitulate existing tool learning research and formulate a general framework: starting from understanding the user instruction, models should learn to decompose a complex task into several subtasks, dynamically adjust their plan through reasoning, and effectively conquer each subtask by selecting appropriate tools. We also discuss how to train models for improved tool-use capabilities and facilitate generalization in tool learning. Finally, we discuss several open problems that require further investigation, such as ensuring trustworthy tool use, enabling tool creation with foundation models, and addressing personalization challenges. Overall, we hope this paper can inspire future research in integrating tools with foundation models.
{"title":"Tool Learning with Foundation Models","authors":"Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Xuanhe Zhou, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, Zhenning Dai, Lan Yan, Xin Cong, Yaxi Lu, Weilin Zhao, Yuxiang Huang, Junxi Yan, Xu Han, Xian Sun, Dahai Li, Jason Phang, Cheng Yang, Tongshuang Wu, Heng Ji, Guoliang Li, Zhiyuan Liu, Maosong Sun","doi":"10.1145/3704435","DOIUrl":"https://doi.org/10.1145/3704435","journal":{"name":"ACM Computing Surveys"},"PeriodicalIF":16.6,"publicationDate":"2024-11-21","publicationTypes":"Journal Article"}
David Jin, Niclas Kannengießer, Sascha Rank, Ali Sunyaev
Various collaborative distributed machine learning (CDML) systems, including federated learning systems and swarm learning systems, with different key traits were developed to leverage resources for the development and use of machine learning (ML) models in a confidentiality-preserving way. To meet use case requirements, suitable CDML systems need to be selected. However, comparison between CDML systems to assess their suitability for use cases is often difficult. To support comparison of CDML systems and introduce scientific and practical audiences to the principal functioning and key traits of CDML systems, this work presents a CDML system conceptualization and CDML archetypes.
{"title":"Collaborative Distributed Machine Learning","authors":"David Jin, Niclas Kannengießer, Sascha Rank, Ali Sunyaev","doi":"10.1145/3704807","DOIUrl":"https://doi.org/10.1145/3704807","journal":{"name":"ACM Computing Surveys"},"PeriodicalIF":16.6,"publicationDate":"2024-11-20","publicationTypes":"Journal Article"}