ACM Computing Surveys (CSUR): Latest Articles

Topic Modeling Using Latent Dirichlet allocation

ACM Computing Surveys (CSUR)

Pub Date : 2021-09-17 DOI: 10.1145/3462478

Uttam Chauhan, Apurva Shah

We cannot deal with a mammoth text corpus without summarizing it into a relatively small subset. A computational tool is badly needed to understand such a gigantic pool of text. Probabilistic topic modeling discovers and explains an enormous collection of documents by reducing them to a topical subspace. In this work, we study the background and advancement of topic modeling techniques. We first introduce the preliminaries of topic modeling techniques and review their extensions and variations, such as topic modeling over various domains, hierarchical topic modeling, word-embedded topic models, and topic models in multilingual perspectives. We also explore research on topic modeling in distributed environments and on topic visualization approaches, and we briefly cover implementation and evaluation techniques for topic models. Comparison matrices are shown over the experimental results of the various categories of topic modeling. Diverse technical challenges and future directions are discussed.

Citations: 72

Semantic Information Retrieval on Medical Texts

ACM Computing Surveys (CSUR)

Pub Date : 2021-09-17 DOI: 10.1145/3462476

L. Tamine, L. Goeuriot

The explosive growth and widespread accessibility of medical information on the Internet have led to a surge of research activity in a wide range of scientific communities including health informatics and information retrieval (IR). One of the common concerns of this research, across these disciplines, is how to design either clinical decision support systems or medical search engines capable of providing adequate support for both novices (e.g., patients and their next-of-kin) and experts (e.g., physicians, clinicians) tackling complex tasks (e.g., search for diagnosis, search for a treatment). However, despite the significant multi-disciplinary research advances, current medical search systems exhibit low levels of performance. This survey provides an overview of the state of the art in the disciplines of IR and health informatics and, by bridging these disciplines, shows how semantic search techniques can facilitate medical IR. First, we will give a broad picture of semantic search and medical IR and then highlight the major scientific challenges. Second, focusing on the semantic gap challenge, we will discuss representative state-of-the-art work related to feature-based as well as semantic-based representation and matching models that support medical search systems. In addition to seminal works, we will present recent works that rely on research advancements in deep learning. Third, we make a thorough cross-model analysis and provide some findings and lessons learned. Finally, we discuss some open issues and possible promising directions for future research trends.

Citations: 13

Centralized, Distributed, and Everything in between

ACM Computing Surveys (CSUR)

Pub Date : 2021-09-17 DOI: 10.1145/3465170

Sophie Dramé-Maigné, M. Laurent-Maknavicius, Laurent Castillo, H. Ganem

The Internet of Things is taking hold in our everyday life. Regrettably, the security of IoT devices is often overlooked. Among the vast array of security issues plaguing the emerging IoT, we decide to focus on access control, as privacy, trust, and other security properties cannot be achieved without controlled access. This article classifies IoT access control solutions from the literature according to their architecture (e.g., centralized, hierarchical, federated, distributed) and examines the suitability of each one for access control purposes. Our analysis concludes that important properties such as auditability and revocation are missing from many proposals while hierarchical and federated architectures are neglected by the community. Finally, we provide an architecture-based taxonomy and future research directions: a focus on hybrid architectures, usability, flexibility, privacy, and revocation schemes in serverless authorization.

Citations: 15

Temporal Relation Extraction in Clinical Texts

ACM Computing Surveys (CSUR)

Pub Date : 2021-09-17 DOI: 10.1145/3462475

Yohan Bonescki Gumiel, Lucas Emanuel Silva e Oliveira, V. Claveau, N. Grabar, E. Paraiso, C. Moro, D. Carvalho

Unstructured data in electronic health records, represented by clinical texts, are a vast source of healthcare information because they describe a patient's journey, including clinical findings, procedures, and information about the continuity of care. The publication of several studies on temporal relation extraction from clinical texts during the last decade and the realization of multiple shared tasks highlight the importance of this research theme. Therefore, we propose a review of temporal relation extraction in clinical texts. We analyzed 105 articles and verified that relations between events and document creation time, a coarse temporality type, were addressed with traditional machine learning–based models, with few recent initiatives to push the state of the art with deep learning–based models. For temporal relations between entities (event and temporal expressions) in the document, factors such as dataset imbalance because of candidate pair generation and task complexity directly affect the system's performance. The state of the art relies on attention-based models, with contextualized word representations being fine-tuned for temporal relation extraction. However, further experiments and advances in the research topic are required before real-time clinical domain applications can be released. Furthermore, most of the publications rely on the same dataset, which points to the need for new annotation projects that provide datasets for different medical specialties, clinical text types, and even languages.

Citations: 10

A Survey on Resilience in the IoT

ACM Computing Surveys (CSUR)

Pub Date : 2021-09-06 DOI: 10.1145/3462513

C. Berger, Philipp Eichhammer, Hans P. Reiser, Jörg Domaschka, F. Hauck, Gerhard Habiger

Internet-of-Things (IoT) ecosystems tend to grow both in scale and complexity, as they consist of a variety of heterogeneous devices that span over multiple architectural IoT layers (e.g., cloud, edge, sensors). Further, IoT systems increasingly demand the resilient operability of services, as they become part of critical infrastructures. This leads to a broad variety of research works that aim to increase the resilience of these systems. In this article, we create a systematization of knowledge about existing scientific efforts of making IoT systems resilient. In particular, we first discuss the taxonomy and classification of resilience and resilience mechanisms and subsequently survey state-of-the-art resilience mechanisms that have been proposed by research work and are applicable to IoT. As part of the survey, we also discuss questions that focus on the practical aspects of resilience, e.g., which constraints resilience mechanisms impose on developers when designing resilient systems by incorporating a specific mechanism into IoT systems.

Citations: 31

Text Mining in Cybersecurity

ACM Computing Surveys (CSUR)

Pub Date : 2021-07-18 DOI: 10.1145/3462477

Luciano Ignaczak, Guilherme Goldschmidt, C. Costa, R. Righi

The growth of data volume has changed cybersecurity activities, demanding a higher level of automation. In this new cybersecurity landscape, text mining emerged as an alternative to improve the efficiency of the activities involving unstructured data. This article proposes a Systematic Literature Review (SLR) to present the application of text mining in the cybersecurity domain. Using a systematic protocol, we identified 2,196 studies, out of which 83 were summarized. As a contribution, we propose a taxonomy to demonstrate the different activities in the cybersecurity domain supported by text mining. We also detail the strategies evaluated in the application of text mining tasks and the use of neural networks to support activities involving unstructured data. The work also discusses text classification performance with a view to its application in real-world solutions. The SLR also highlights open gaps for future research, such as the analysis of non-English content and the intensified use of neural networks.

Citations: 14

More than Privacy

ACM Computing Surveys (CSUR)

Pub Date : 2021-07-18 DOI: 10.1145/3460771

Lefeng Zhang, Tianqing Zhu, P. Xiong, Wanlei Zhou, Philip S. Yu

The vast majority of artificial intelligence solutions are founded on game theory, and differential privacy is emerging as perhaps the most rigorous and widely adopted privacy paradigm in the field. However, alongside all the advancements made in both these fields, there is not a single application that is not still vulnerable to privacy violations, security breaches, or manipulation by adversaries. Our understanding of the interactions between differential privacy and game theoretic solutions is limited. Hence, we undertook a comprehensive review of literature in the field, finding that differential privacy has several advantageous properties that can make more of a contribution to game theory than just privacy protection. It can also be used to build heuristic models for game-theoretic solutions, to avert strategic manipulations, and to quantify the cost of privacy protection. With a focus on mechanism design, the aim of this article is to provide a new perspective on the currently held impossibilities in game theory, potential avenues to circumvent those impossibilities, and opportunities to improve the performance of game-theoretic solutions with differentially private techniques.

Citations: 1

On the Use of Intelligent Models towards Meeting the Challenges of the Edge Mesh

ACM Computing Surveys (CSUR)

Pub Date : 2021-07-01 DOI: 10.1145/3456630

P. Oikonomou, Anna Karanika, C. Anagnostopoulos, Kostas Kolomvatsos

Nowadays, we are witnessing the advent of the Internet of Things (IoT) with numerous devices interacting with each other or with their environment. The huge number of devices leads to huge volumes of data that demand appropriate processing. The “legacy” approach is to rely on the Cloud, where increased computational resources can realize any desired processing. However, the need to support real-time applications requires reduced latency in the provision of outcomes. Edge Computing (EC) comes as the “solver” of the latency problem. Various processing activities can be performed at EC nodes having direct connection with IoT devices. A number of challenges must be met before we arrive at a fully automated ecosystem where nodes can cooperate or understand their status to efficiently serve applications. In this article, we perform a survey of the relevant research activities towards the vision of Edge Mesh (EM), i.e., a “cover” of intelligence upon the EC. We present the necessary hardware and discuss research outcomes in every aspect of EC/EM nodes functioning. We present technologies and theories adopted for data, tasks, and resource management while discussing how machine learning and optimization can be adopted in the domain.

Citations: 1

A Survey on Encrypted Network Traffic Analysis Applications, Techniques, and Countermeasures

ACM Computing Surveys (CSUR)

Pub Date : 2021-07-01 DOI: 10.1145/3457904

Eva Papadogiannaki, S. Ioannidis

The adoption of network traffic encryption is continually growing. Popular applications use encryption protocols to secure communications and protect the privacy of users. In addition, a large portion of malware is spread through the network traffic taking advantage of encryption protocols to hide its presence and activity. Entering into the era of completely encrypted communications over the Internet, we must rapidly start reviewing the state-of-the-art in the wide domain of network traffic analysis and inspection, to conclude if traditional traffic processing systems will be able to seamlessly adapt to the upcoming full adoption of network encryption. In this survey, we examine the literature that deals with network traffic analysis and inspection after the ascent of encryption in communication channels. We notice that the research community has already started proposing solutions on how to perform inspection even when the network traffic is encrypted and we demonstrate and review these works. In addition, we present the techniques and methods that these works use and their limitations. Finally, we examine the countermeasures that have been proposed in the literature in order to circumvent traffic analysis techniques that aim to harm user privacy.

Citations: 52

A Comprehensive Survey of Privacy-preserving Federated Learning

ACM Computing Surveys (CSUR)

Pub Date : 2021-07-01 DOI: 10.1145/3460427

Xuefei Yin, Yanming Zhu, Jiankun Hu

The past four years have witnessed the rapid development of federated learning (FL). However, new privacy concerns have also emerged during the aggregation of the distributed intermediate results. The emerging privacy-preserving FL (PPFL) has been heralded as a solution to generic privacy-preserving machine learning. However, the challenge of protecting data privacy while maintaining the data utility through machine learning still remains. In this article, we present a comprehensive and systematic survey on the PPFL based on our proposed 5W-scenario-based taxonomy. We analyze the privacy leakage risks in the FL from five aspects, summarize existing methods, and identify future research directions.

Citations: 209
