The rapid expansion of internet activities in daily life has elevated cyberattacks to a significant global threat. As a result, protecting the networks and systems of industries, organizations, and individuals against cybercrimes has become an increasingly critical challenge. This monograph provides a comprehensive review and analysis of national, international, and industry regulations on cybercrimes. It presents empirical evidence of the effectiveness of these regulatory measures and their impacts at the national, organizational, and individual levels. We also examine the challenges posed by emerging technologies to these regulations. Finally, the monograph identifies limitations in the current regulatory framework and proposes future directions to enhance the cybersecurity ecosystem.
{"title":"Regulating Information and Network Security: Review and Challenges","authors":"Tayssir Bouraffa, Kai-Lung Hui","doi":"10.1145/3711124","DOIUrl":"https://doi.org/10.1145/3711124","journal":{"name":"ACM Computing Surveys"},"publicationDate":"2025-01-06","publicationTypes":"Journal Article"}
To address the barrier that the black-box nature of Deep Learning (DL) poses to practical deployment, eXplainable Artificial Intelligence (XAI) has emerged and is developing rapidly. While significant progress has been made in explanation techniques for DL models that operate on images and text, research on explaining DL models for graph data is still in its infancy. As Graph Neural Networks (GNNs) have shown superiority in various network analysis tasks, their explainability has also gained attention from both academia and industry. However, despite the increasing number of GNN explanation methods, there is currently neither a fine-grained taxonomy of them nor a holistic set of criteria for quantitative and qualitative evaluation. To fill this gap, we conduct a comprehensive survey of existing explanation methods for GNNs in this paper. Specifically, we propose a novel four-dimensional taxonomy of GNN explanation methods and summarize evaluation criteria in terms of correctness, robustness, usability, understandability, and computational complexity. Based on the taxonomy and criteria, we thoroughly review the recent advances in GNN explanation methods and analyze their pros and cons. Finally, we identify a series of open issues and put forward future research directions to facilitate XAI research in the field of GNNs.
{"title":"Can Graph Neural Networks be Adequately Explained? A Survey","authors":"Xuyan Li, Jie Wang, Zheng Yan","doi":"10.1145/3711122","DOIUrl":"https://doi.org/10.1145/3711122","journal":{"name":"ACM Computing Surveys"},"publicationDate":"2025-01-06","publicationTypes":"Journal Article"}
Daochen Zha, Zaid Pervaiz Bhat, Kwei-Herng Lai, Fan Yang, Zhimeng Jiang, Shaochen Zhong, Xia Hu
Artificial Intelligence (AI) is making a profound impact in almost every domain. A vital enabler of its great success is the availability of abundant and high-quality data for building machine learning models. Recently, the role of data in AI has been significantly magnified, giving rise to the emerging concept of data-centric AI. The attention of researchers and practitioners has gradually shifted from advancing model design to enhancing the quality and quantity of the data. In this survey, we discuss the necessity of data-centric AI, followed by a holistic view of three general data-centric goals (training data development, inference data development, and data maintenance) and the representative methods. We also organize the existing literature from automation and collaboration perspectives, discuss the challenges, and tabulate the benchmarks for various tasks. We believe this is the first comprehensive survey that provides a global view of a spectrum of tasks across various stages of the data lifecycle. We hope it can help the readers efficiently grasp a broad picture of this field, and equip them with the techniques and further research ideas to systematically engineer data for building AI systems. A companion list of data-centric AI resources will be regularly updated on https://github.com/daochenzha/data-centric-AI
{"title":"Data-centric Artificial Intelligence: A Survey","authors":"Daochen Zha, Zaid Pervaiz Bhat, Kwei-Herng Lai, Fan Yang, Zhimeng Jiang, Shaochen Zhong, Xia Hu","doi":"10.1145/3711118","DOIUrl":"https://doi.org/10.1145/3711118","journal":{"name":"ACM Computing Surveys"},"publicationDate":"2025-01-06","publicationTypes":"Journal Article"}
The proliferation of social media has increased cyber-aggressive behavior carried out under the cover of freedom of speech, posing societal risks that range from online anonymity to real-world consequences. This article systematically reviews Aggression Content Detection and Behavioral Analysis to address these risks. Content detection is vital for handling explicit aggression, and behavioral analysis offers insights into the underlying dynamics. The paper analyzes diverse definitions, proposes a unified definition of cyber-aggression, and reviews the process of Aggression Content Detection, emphasizing dataset creation, feature extraction, and algorithm development. Additionally, it examines Behavioral Analysis studies that explore the influencing factors, consequences, and patterns of online aggression. We cross-examine content detection and behavioral analysis, revealing the effectiveness of integrating sociological insights into computational techniques for preventing cyber-aggression. We conclude by identifying research gaps that call for progress in the integrative domain of socio-computational aggressive behavior analysis.
{"title":"A Survey on Online Aggression: Content Detection and Behavioural Analysis on Social Media Platforms","authors":"Swapnil Mane, Suman Kundu, Rajesh Sharma","doi":"10.1145/3711125","DOIUrl":"https://doi.org/10.1145/3711125","journal":{"name":"ACM Computing Surveys"},"publicationDate":"2025-01-04","publicationTypes":"Journal Article"}
This paper surveys heuristic methods for profile and wavefront reductions. These graph layout problems pose a challenge for optimization methods in general and for heuristics in particular. The paper presents the graph layout problems with their formal definitions. The study offers a broad perspective on techniques for designing heuristic methods for these problems but concentrates on the approaches and methodologies that yield high-quality solutions. Thus, this survey references the most relevant studies in the associated literature and discusses the current state-of-the-art heuristics for these graph layout problems.
{"title":"A survey of heuristics for profile and wavefront reductions","authors":"Sanderson Gonzaga de Oliveira","doi":"10.1145/3711120","DOIUrl":"https://doi.org/10.1145/3711120","journal":{"name":"ACM Computing Surveys"},"publicationDate":"2025-01-03","publicationTypes":"Journal Article"}
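The abstract above refers to formal definitions of profile and wavefront without stating them. As a rough illustration only, the sketch below uses one common convention, which is an assumption here and may differ from the one adopted in the survey: for a vertex ordering, the row width of position i is i minus the smallest position among the vertex and its neighbours, the profile is the sum of those widths (diagonal excluded), and the wavefront at step i counts the rows j >= i whose first nonzero lies at or before column i. The function name `profile_and_wavefront` and the example graph are hypothetical.

```python
def profile_and_wavefront(adjacency, order):
    """Return (profile, wavefront list) of an undirected graph under a vertex ordering.

    adjacency: dict mapping each vertex to the set of its neighbours.
    order: list of all vertices; order[i] is the vertex placed at row/column i.
    Convention (assumed): the diagonal is excluded from the profile.
    """
    position = {v: i for i, v in enumerate(order)}
    n = len(order)
    # first[i] = leftmost nonzero column of row i (the diagonal counts as nonzero).
    first = [
        min([position[v]] + [position[u] for u in adjacency[v]])
        for v in order
    ]
    profile = sum(i - first[i] for i in range(n))
    # Row j is active at step i when first[j] <= i <= j.
    wavefront = [sum(1 for j in range(i, n) if first[j] <= i) for i in range(n)]
    return profile, wavefront


# A path graph a - b - c - d under two different orderings.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(profile_and_wavefront(path, ["a", "b", "c", "d"]))  # (3, [2, 2, 2, 1])
print(profile_and_wavefront(path, ["a", "c", "b", "d"]))  # (4, [2, 3, 2, 1])
```

Under this convention the wavefront values sum to the profile plus the number of vertices, and the natural ordering of the path gives the smaller profile. A reordering heuristic searches for orderings that lower these quantities.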
Deep Learning (DL) has achieved remarkable success in tackling complex Artificial Intelligence tasks. The standard training of neural networks employs backpropagation to compute gradients and utilizes various optimization algorithms in the Euclidean space \(\mathbb{R}^n\). However, this optimization process faces challenges, such as local optima and vanishing and exploding gradients. To address these problems, Riemannian optimization offers a powerful extension for solving optimization problems in deep learning. By incorporating the prior constraint structure and the metric information of the underlying geometry, Riemannian optimization-based DL offers a more stable and reliable optimization process, as well as enhanced adaptability to complex data structures. This article presents a comprehensive survey of applying geometric optimization in DL, including the basic procedure of geometric optimization, various geometric optimizers, and some concepts of the Riemannian manifold. In addition, it investigates various applications of geometric optimization in different DL networks for diverse tasks and discusses typical public toolboxes that implement optimization on the manifold. This article also includes a performance comparison among different deep geometric optimization methods in image recognition scenarios. Finally, this article elaborates on future opportunities and challenges in this field.
{"title":"A Survey of Geometric Optimization for Deep Learning: From Euclidean Space to Riemannian Manifold","authors":"Yanhong Fei, Yingjie Liu, Chentao Jia, Zhengyu Li, Xian Wei, Mingsong Chen","doi":"10.1145/3708498","DOIUrl":"https://doi.org/10.1145/3708498","journal":{"name":"ACM Computing Surveys"},"publicationDate":"2024-12-26","publicationTypes":"Journal Article"}
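The "basic procedure of geometric optimization" mentioned in the abstract can be illustrated on the simplest manifold, the unit sphere. The sketch below is not taken from the survey. It assumes a toy objective f(x) = -xᵀAx with a symmetric matrix A and shows the three usual steps: compute the Euclidean gradient, project it onto the tangent space at x, then take a step and retract back onto the sphere by normalization. All names (`riemannian_gd_sphere`, `normalize`) and the step size are hypothetical choices.

```python
import math


def normalize(vector):
    """Retraction onto the unit sphere: rescale to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector]


def riemannian_gd_sphere(matrix, x0, lr=0.1, steps=200):
    """Minimize f(x) = -x^T A x over the unit sphere (A assumed symmetric)."""
    x = normalize(x0)
    for _ in range(steps):
        # Euclidean gradient of f: -2 A x.
        ax = [sum(a * xj for a, xj in zip(row, x)) for row in matrix]
        egrad = [-2.0 * v for v in ax]
        # Riemannian gradient: remove the component of egrad along x.
        inner = sum(g * xi for g, xi in zip(egrad, x))
        rgrad = [g - inner * xi for g, xi in zip(egrad, x)]
        # Gradient step in the tangent direction, then retract to the sphere.
        x = normalize([xi - lr * g for xi, g in zip(x, rgrad)])
    return x


A = [[3.0, 0.0], [0.0, 1.0]]
x = riemannian_gd_sphere(A, [0.6, 0.8])
print(x)  # close to [1.0, 0.0], the leading eigenvector of A
```

Every iterate stays exactly on the constraint set, which is the point of working on the manifold rather than adding a penalty term to an unconstrained Euclidean update.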
Recommender systems (RS) play an integral role in many online platforms. Exponential growth and potential commercial interests are raising significant concerns around privacy, security, fairness, and overall responsibility. The existing literature around responsible recommendation services is diverse and multi-disciplinary. Most literature reviews cover a specific aspect or a single technology for responsible behavior, such as federated learning or blockchain. This study integrates relevant concepts across disciplines to provide a broader representation of the landscape. We review the latest advancements toward building privacy-preserved and responsible recommendation services for the e-commerce industry. The survey summarizes recent, high-impact works on diverse aspects and technologies that ensure responsible behavior in RS through an interconnected taxonomy. We contextualize potential privacy threats, practical significance, industrial expectations, and research remedies. From the technical viewpoint, we analyze conventional privacy defenses and provide an overview of emerging technologies including differential privacy, federated learning, and blockchain. The methods and concepts across technologies are linked based on their objectives, challenges, and future directions. In addition, we develop an open-source repository that summarizes a wide range of evaluation benchmarks, codebases, and toolkits to aid further research. The survey offers a holistic perspective on this rapidly evolving landscape by synthesizing insights from both recommender systems and responsible AI literature.
{"title":"Privacy-preserved and Responsible Recommenders: From Conventional Defense to Federated Learning and Blockchain","authors":"Waqar Ali, Xiangmin Zhou, Jie Shao","doi":"10.1145/3708982","DOIUrl":"https://doi.org/10.1145/3708982","journal":{"name":"ACM Computing Surveys"},"publicationDate":"2024-12-19","publicationTypes":"Journal Article"}
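Of the technologies named in the abstract, differential privacy is the easiest to show in a few lines. The sketch below is not drawn from the survey. It illustrates the standard Laplace mechanism on a hypothetical recommender statistic (how many users rated an item), assuming a sensitivity of 1 because adding or removing one user changes the count by at most one. The function names and the seeded generator are illustrative choices.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with epsilon-differential privacy via the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # Noise scale grows with sensitivity and shrinks as epsilon (privacy budget) grows.
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # seeded only to make the illustration repeatable
ratings_for_item = 100  # hypothetical true number of users who rated an item
print(private_count(ratings_for_item, epsilon=1.0, rng=rng))
print(private_count(ratings_for_item, epsilon=0.1, rng=rng))  # smaller epsilon, noisier
```

A smaller epsilon gives stronger privacy at the cost of a noisier released statistic, which is the privacy and utility trade-off that a recommender built on such counts has to manage.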
Claudio Filipi Goncalves dos Santos, Rodrigo Reis Arrais, Jhessica Victoria Santos da Silva, Matheus Henrique Marques da Silva, Wladimir Barroso Guedes de Araujo Neto, Leonardo Tadeu Lopes, Guilherme Augusto Bileki, Iago Oliveira Lima, Lucas Borges Rondon, Bruno Melo de Souza, Mayara Costa Regazio, Rodolfo Coelho Dalapicola, Arthur Alves Tasca
The Image Signal Processor (ISP) of a camera relies on several processes, such as demosaicing, denoising, and enhancement, to transform the data from the Color Filter Array (CFA) sensor. These processes can be executed either in hardware or in software. In recent years, Deep Learning (DL) has emerged as a solution for some of these processes, or even as a replacement for the entire ISP using a single neural network. In this work, we investigate several recent pieces of research in this area and provide a deeper analysis and comparison among them, including results and possible points of improvement for future researchers.
{"title":"ISP Meets Deep Learning: A Survey on Deep Learning Methods for Image Signal Processing","authors":"Claudio Filipi Goncalves dos Santos, Rodrigo Reis Arrais, Jhessica Victoria Santos da Silva, Matheus Henrique Marques da Silva, Wladimir Barroso Guedes de Araujo Neto, Leonardo Tadeu Lopes, Guilherme Augusto Bileki, Iago Oliveira Lima, Lucas Borges Rondon, Bruno Melo de Souza, Mayara Costa Regazio, Rodolfo Coelho Dalapicola, Arthur Alves Tasca","doi":"10.1145/3708516","DOIUrl":"https://doi.org/10.1145/3708516","journal":{"name":"ACM Computing Surveys"},"publicationDate":"2024-12-19","publicationTypes":"Journal Article"}
Procedural content generation (PCG) can be applied to a wide variety of tasks in games, from narratives, levels and sounds, to trees and weapons. A large amount of game content is comprised of graphical assets, such as clouds, buildings or vegetation, that do not require gameplay function considerations. There is also a breadth of literature examining the procedural generation of such elements for purposes outside of games. The body of research, focused on specific methods for generating specific assets, provides a narrow view of the available possibilities. Hence, it is difficult to have a clear picture of all approaches and possibilities, with no guide for interested parties to discover possible methods and approaches for their needs, and no facility to guide them through each technique or approach to map out the process of using them. Therefore, a systematic literature review has been conducted, yielding 239 accepted papers. This paper explores state-of-the-art approaches to graphical asset generation, examining research from a wide range of applications, inside and outside of games. Informed by the literature, a conceptual framework has been derived to address the aforementioned gaps.
{"title":"Intelligent Generation of Graphical Game Assets: A Conceptual Framework and Systematic Review of the State of the Art","authors":"Kaisei Fukaya, Damon Daylamani-Zad, Harry Agius","doi":"10.1145/3708499","DOIUrl":"https://doi.org/10.1145/3708499","journal":{"name":"ACM Computing Surveys"},"publicationDate":"2024-12-18","publicationTypes":"Journal Article"}
Artificial intelligence (AI), and especially its sub-field of Machine Learning (ML), is impacting the daily lives of everyone through its ubiquitous applications. In recent years, AI researchers and practitioners have introduced principles and guidelines to build systems that make reliable and trustworthy decisions. From a practical perspective, conventional ML systems process historical data to extract the features that are subsequently used to train ML models that perform the desired task. However, in practice, a fundamental challenge arises when the system needs to be operationalized and deployed to evolve and operate continuously in real-life environments. To address this challenge, Machine Learning Operations (MLOps) has emerged as a potential recipe for standardizing ML solutions in deployment. Although MLOps has demonstrated great success in streamlining ML processes, thoroughly defining the specifications of robust MLOps approaches remains of great interest to researchers and practitioners. In this paper, we provide a comprehensive overview of the trustworthiness property of MLOps systems. Specifically, we highlight technical practices to achieve robust MLOps systems. In addition, we survey the existing research approaches that address the robustness aspects of ML systems in production. We also review the tools and software available to build MLOps systems and summarize their support for handling the robustness aspects. Finally, we present the open challenges and propose possible future directions and opportunities within this emerging field. The aim of this paper is to provide researchers and practitioners working on practical AI applications with a comprehensive view for adopting robust ML solutions in production environments.
{"title":"Towards Trustworthy Machine Learning in Production: An Overview of the Robustness in MLOps Approach","authors":"Firas Bayram, Bestoun S. Ahmed","doi":"10.1145/3708497","DOIUrl":"https://doi.org/10.1145/3708497","journal":{"name":"ACM Computing Surveys"},"publicationDate":"2024-12-18","publicationTypes":"Journal Article"}