StripeMerge: Efficient Wide-Stripe Generation for Large-Scale Erasure-Coded Storage
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00053
Qiaori Yao, Yuchong Hu, Liangfeng Cheng, P. Lee, D. Feng, Weichun Wang, Wei Chen
Erasure coding has been widely deployed in modern large-scale storage systems for storage-efficient fault tolerance by storing stripes of data and parity chunks. Recently, enterprises have explored the notion of wide stripes, which suppress the fraction of parity chunks in each stripe to achieve extreme storage savings. However, efficiently generating wide stripes remains a non-trivial issue: re-encoding the currently stored stripes (termed narrow stripes) into wide stripes triggers substantial bandwidth overhead for relocating and regenerating chunks. We propose StripeMerge, a wide-stripe generation mechanism that selects and merges narrow stripes into wide stripes, with the primary objective of minimizing the wide-stripe generation bandwidth. We prove the existence of an optimal scheme that incurs no data transfer for wide-stripe generation, yet the optimal scheme is computationally expensive. We therefore propose two heuristics that can be executed efficiently with only limited wide-stripe generation bandwidth overhead. We prototype StripeMerge and show, via both simulations and Amazon EC2 experiments, that wide-stripe generation time can be reduced by up to 87.8% over a state-of-the-art storage scaling approach.
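To make the zero-transfer case concrete, the toy sketch below (our illustration, not StripeMerge's actual code) merges two narrow stripes that each carry a single XOR parity: the wide parity is simply the XOR of the two narrow parities, so a well-chosen pair of narrow stripes can be merged without moving any data chunks.

```python
# Toy sketch of parity merging with a single XOR parity per stripe.
# Assumes all chunks have equal length; Reed-Solomon codes need the
# encoding coefficients of the two stripes to align, which is the
# pairing problem StripeMerge's heuristics address.
from functools import reduce

def xor_parity(chunks: list[bytes]) -> bytes:
    """XOR all chunks together (single-parity erasure code)."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), chunks)

def merge_stripes(narrow_a: list[bytes], narrow_b: list[bytes]) -> tuple[list[bytes], bytes]:
    """Merge two k-chunk narrow stripes into a 2k-chunk wide stripe.

    The wide parity is computed from the two narrow parities alone,
    illustrating why a compatible pair needs no data-chunk transfer.
    """
    parity_a = xor_parity(narrow_a)
    parity_b = xor_parity(narrow_b)
    wide_data = narrow_a + narrow_b
    wide_parity = bytes(x ^ y for x, y in zip(parity_a, parity_b))
    assert wide_parity == xor_parity(wide_data)  # sanity check
    return wide_data, wide_parity

data, parity = merge_stripes([b"\x01\x02", b"\x03\x04"], [b"\x05\x06", b"\x07\x08"])
```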
Communication-Efficient Federated Learning with Adaptive Parameter Freezing
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00010
Chen Chen, Hongao Xu, Wei Wang, Baochun Li, Bo Li, Li Chen, Gong Zhang
Federated learning allows edge devices to collaboratively train a global model by synchronizing their local updates without sharing private data. Yet, with limited network bandwidth at the edge, communication often becomes a severe bottleneck. In this paper, we find that it is unnecessary to always synchronize the full model throughout training: many parameters gradually stabilize well before the model converges and can thus be excluded from synchronization at an early stage, reducing communication overhead without compromising model accuracy. The challenges are that local parameters excluded from global synchronization may diverge across clients, and that some parameters stabilize only temporarily. To address these challenges, we propose a novel scheme called Adaptive Parameter Freezing (APF), which fixes (freezes) the non-synchronized stable parameters for intermittent periods. Specifically, the freezing periods are tentatively adjusted in an additive-increase, multiplicative-decrease manner, depending on whether the previously frozen parameters remain stable in subsequent iterations. We implemented APF as a Python module in PyTorch. Extensive experiments show that APF can reduce data transfer by over 60%.
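A minimal sketch of the freezing logic described above (the stability heuristic and constants are our assumptions, not the authors' implementation):

```python
# Adaptive Parameter Freezing sketch: each parameter block tracks a
# freezing period that grows additively while the block stays stable
# and shrinks multiplicatively once it drifts again after thawing.
import torch

class FreezeState:
    def __init__(self, period: int = 1):
        self.period = period      # rounds to stay frozen
        self.frozen_until = 0     # round index at which the block thaws

def is_stable(prev: torch.Tensor, curr: torch.Tensor, tol: float = 1e-3) -> bool:
    # Stability heuristic (our assumption): relative change below a tolerance.
    return (curr - prev).norm() <= tol * (prev.norm() + 1e-12)

def update_freezing(state: FreezeState, stable: bool, round_idx: int,
                    add: int = 1, mult: float = 0.5) -> None:
    if stable:
        state.period += add                               # additive increase
    else:
        state.period = max(1, int(state.period * mult))   # multiplicative decrease
    state.frozen_until = round_idx + state.period

def should_sync(state: FreezeState, round_idx: int) -> bool:
    """Skip global synchronization for blocks that are still frozen."""
    return round_idx >= state.frozen_until
```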
Efficiently Recovering Stateful System Components of Multi-server Microkernels
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00054
Wentai Li, Jinyu Gu, Nian Liu, B. Zang
Microkernel OSes provide OS services through mutually isolated system servers running in different user processes, which brings stronger fault isolation than monolithic OSes. Nevertheless, when it comes to the fault recovery of system servers, most existing microkernel OSes do no more than restart a faulty server, which causes the server to lose all of its runtime state and may affect every application that relies on it. In this paper, we present TxIPC, a mechanism that can efficiently recover stateful system servers on microkernel OSes. Since a system server provides its service through inter-process communication (IPC), TxIPC makes it fault resilient by handling each IPC in a transaction-like manner: if a fault happens in a server during an IPC handling procedure, TxIPC aborts all the updates made by that IPC and thus recovers the server from the fault. Evaluations show that TxIPC enables servers to recover from 99.8% of (injected) faults with 3%-45% performance overhead on application benchmarks, significantly outperforming existing counterparts.
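The transaction-like IPC handling can be pictured as follows; this is a conceptual sketch under the assumption that server state can be shadow-copied per request, not TxIPC's kernel-level mechanism:

```python
# Each IPC runs against a shadow copy of the server state; updates are
# committed only if the handler finishes without a fault, so a crash
# during one request leaves the server's state intact.
import copy

class TxServer:
    def __init__(self, state: dict):
        self.state = state

    def handle_ipc(self, handler, request):
        shadow = copy.deepcopy(self.state)   # buffer all updates
        try:
            reply = handler(shadow, request)
        except Exception:
            # Fault during this IPC: abort, committed state untouched.
            return None
        self.state = shadow                  # commit atomically
        return reply
```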
A Suspicion-Free Black-box Adversarial Attack for Deep Driving Maneuver Classification Models
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00080
Ankur Sarker, Haiying Shen, Tanmoy Sen
Autonomous vehicles are equipped with onboard deep neural network (DNN) models to process data from different sensor and communication units. In the connected autonomous vehicle (CAV) scenario, each vehicle receives time-series driving signals (e.g., speed, brake status) from nearby vehicles through wireless communication. Several black-box adversarial attacks have been proposed for this scenario, in which an attacker deliberately sends false driving signals to a nearby vehicle to fool its onboard DNN model and cause traffic incidents. However, previously proposed black-box adversarial attacks can be easily detected. To address this problem, we propose the Suspicion-free Boundary Black-box Adversarial (SBBA) attack, in which the attacker uses the DNN model's output to design the adversarial perturbation. First, we formulate attack design as a goal-satisfying optimization problem with constraints, so that the attack is not easily flagged by detection methods. Second, we solve the optimization problem with Bayesian optimization: we use a Gaussian process to model the posterior distribution of the DNN model and the knowledge gradient function to choose the next sample point, and we devise a gradient estimation technique for the knowledge gradient method to reduce the solution search time. Finally, we conduct extensive experimental evaluations on two real driving datasets. The results show that SBBA outperforms previous adversarial attacks with a 56% higher success rate under detection methods, 238% less time to launch the attacks, 76% less perturbation (to avoid being detected), and 257% fewer queries (to the DNN model to verify attack success).
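For intuition, here is a generic Bayesian-optimization loop over candidate perturbations; it substitutes a simple lower-confidence-bound acquisition for the paper's knowledge-gradient function, whose gradient-estimation machinery is more involved:

```python
# Generic black-box Bayesian optimization sketch (illustrative only).
# objective(x) would query the victim model and return a score to minimize.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def bayes_opt(objective, dim, n_init=5, n_iter=20, n_cand=256, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1, 1, size=(n_init, dim))      # initial perturbations
    y = np.array([objective(x) for x in X])
    gp = GaussianProcessRegressor(normalize_y=True)
    for _ in range(n_iter):
        gp.fit(X, y)                                # posterior over objective
        cand = rng.uniform(-1, 1, size=(n_cand, dim))
        mu, sigma = gp.predict(cand, return_std=True)
        acq = mu - 1.96 * sigma                     # lower confidence bound
        x_next = cand[np.argmin(acq)]               # most promising candidate
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next))
    return X[np.argmin(y)]                          # best perturbation found
```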
ProChecker: An Automated Security and Privacy Analysis Framework for 4G LTE Protocol Implementations
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00079
Imtiaz Karim, Syed Rafiul Hussain, E. Bertino
Cellular protocol implementations must comply with the specifications as well as with security and privacy requirements. These implementations, however, often deviate from the security and privacy requirements due to underspecification in cellular standards, inherent protocol complexity, and design flaws that induce logical vulnerabilities. Detecting such logical vulnerabilities in the complex and stateful 4G LTE protocol is challenging because of operational dependencies on internal states and intertwined protocol interactions among multiple participants. In this paper, we address these challenges and develop ProChecker, which (1) extracts a precise semantic model of the implementation as a finite-state machine by combining dynamic testing with static instrumentation, and (2) verifies properties against the extracted model by combining a symbolic model checker with a cryptographic protocol verifier. We demonstrate the effectiveness of ProChecker by evaluating it with 62 properties on a closed-source implementation and two of the most popular open-source 4G LTE control-plane protocol implementations. ProChecker unveiled 3 new protocol-specific logical attacks and 6 implementation issues, and detected 14 prior attacks. The impact of the attacks ranges from denial-of-service, broken integrity, encryption, and replay protection to privacy leakage.
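A toy version of the second step, checking a safety property against an extracted finite-state machine (the FSM and the property below are hypothetical illustrations, not ProChecker's models):

```python
# Explore all reachable states of a protocol FSM and report transitions
# that violate a safety predicate, together with the trace leading there.
from collections import deque

def check_safety(fsm: dict, start: str, bad_predicate) -> list:
    """fsm maps state -> list of (message, next_state); returns violating traces."""
    violations, seen = [], {start}
    queue = deque([(start, [])])
    while queue:
        state, trace = queue.popleft()
        for msg, nxt in fsm.get(state, []):
            step = trace + [(state, msg, nxt)]
            if bad_predicate(nxt, msg):
                violations.append(step)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, step))
    return violations

# Hypothetical property: no plaintext identity request accepted after
# the security context is established.
fsm = {"idle": [("attach_request", "auth")],
       "auth": [("auth_response", "secured")],
       "secured": [("plain_identity_request", "leak")]}
print(check_safety(fsm, "idle", lambda state, msg: state == "leak"))
```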
Poster: WallGuard - A Deep Learning Approach for Avoiding Regrettable Posts in Social Media
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00127
Haya Shulman, Hervais Simo
We develop WallGuard to help users of online social networks (OSNs) avoid regrettable posts and disclosure of sensitive information. With WallGuard, users can control their posts and can (i) detect inappropriate, regrettable messages before they are posted, and (ii) identify already-posted messages that could negatively impact a user's reputation and life. WallGuard is based on deep learning architectures and NLP methods. To evaluate its effectiveness, we developed a semi-supervised self-training methodology, which we use to create a new large-scale corpus for regret detection comprising 4.7 million OSN messages. The corpus is generated by incrementally labelling messages from large OSN platforms, relying on both human-labelled and machine-labelled messages. By training Facebook's FastText word embeddings and Word2vec embeddings on our corpus, we created domain-specific word embeddings, which we refer to as regret embeddings. This approach allows us to extract features that are discriminative of regrettable disclosures. Leveraging both the regret embeddings and the new corpus, we train and evaluate five new multi-label deep learning models for automatically classifying regrettable posts. Our evaluation demonstrates that we can detect messages with regrettable topics, achieving up to 0.975 weighted AUC, 82.2% precision, and 74.6% recall. WallGuard is free and open-source.
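The semi-supervised self-training loop can be sketched as follows (a minimal illustration with an assumed confidence threshold and a simple classifier, not WallGuard's deep models):

```python
# Classic self-training: train on the human-labelled seed set, pseudo-label
# unlabelled posts on which the classifier is confident, fold them into the
# training set, and repeat.
import numpy as np
from scipy import sparse
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def self_train(lab_texts, lab_y, unlab_texts, threshold=0.95, rounds=5):
    vec = TfidfVectorizer().fit(lab_texts + unlab_texts)
    X_lab, y = vec.transform(lab_texts), np.array(lab_y)
    X_unlab = vec.transform(unlab_texts)
    clf = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        clf.fit(X_lab, y)
        if X_unlab.shape[0] == 0:
            break
        proba = clf.predict_proba(X_unlab)
        keep = proba.max(axis=1) >= threshold        # confident machine labels
        if not keep.any():
            break
        X_lab = sparse.vstack([X_lab, X_unlab[keep]])
        y = np.concatenate([y, proba[keep].argmax(axis=1)])
        X_unlab = X_unlab[~keep]                     # shrink the unlabelled pool
    return clf, vec
```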
Demo: Cloak: A Framework For Development of Confidential Blockchain Smart Contracts
Pub Date: 2021-06-25 | DOI: 10.1109/ICDCS51616.2021.00111
Qian Ren, Han Liu, Yue Li, Hong Lei
In recent years, as blockchain adoption has expanded across a wide range of domains (e.g., digital assets, supply chain finance), the confidentiality of smart contracts has become a fundamental demand for practical applications. However, while new privacy protection techniques keep coming out, how existing ones can best fit development settings is little studied. Suffering from limited architectural support in terms of programming interfaces, state-of-the-art solutions can hardly reach general developers. In this paper, we propose the Cloak framework for developing confidential smart contracts. The key capability of Cloak is allowing developers to implement and deploy practical solutions to multi-party transaction (MPT) problems, i.e., transactions with secret inputs and states owned by different parties, simply by specifying them as such. To this end, Cloak introduces a domain-specific annotation language for declaring privacy specifications and then automatically generates confidential smart contracts to be deployed with a trusted execution environment (TEE) on the blockchain. In our evaluation on both simple and real-world applications, developers managed to deploy business services on the blockchain concisely, writing Cloak smart contracts less than 30% of the size of the deployed code.
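To illustrate the MPT idea, here is a hypothetical Python simulation; it is not Cloak's annotation syntax or TEE integration:

```python
# Simulated multi-party transaction: each party contributes a secret input,
# the computation runs inside a trusted enclave (modelled here as a function
# scope), and only the declared public output is revealed.
def tee_mpt(secret_inputs: dict, compute):
    """Simulate an enclave: secrets never escape this function's scope."""
    public_output = compute(secret_inputs)
    return public_output          # only the result is published on-chain

# Example: a sealed-bid auction; parties never see each other's bids.
bids = {"alice": 10, "bob": 17, "carol": 12}      # hypothetical secrets
winner = tee_mpt(bids, lambda s: max(s, key=s.get))
print(winner)                     # "bob"; the bid values stay confidential
```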
GDDR: GNN-based Data-Driven Routing
Pub Date: 2021-04-20 | DOI: 10.1109/ICDCS51616.2021.00056
Oliver Hope, Eiko Yoneki
We explore the feasibility of combining Graph Neural Network-based policy architectures with Deep Reinforcement Learning as an approach to problems in systems. This fits particularly well with operations on networks, which naturally take the form of graphs. As a case study, we take the idea of data-driven routing in intradomain traffic engineering, whereby the routing of data in a network can be managed taking into account the data itself. The particular subproblem which we examine is minimising link congestion in networks using knowledge of historic traffic flows. We show through experiments that an approach using Graph Neural Networks (GNNs) performs at least as well as previous work using Multilayer Perceptron architectures. GNNs have the added benefit that they allow for the generalisation of trained agents to different network topologies with no extra work. Furthermore, we believe that this technique is applicable to a far wider selection of problems in systems research.
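A minimal message-passing layer of the kind such a policy could build on (an illustrative sketch, not GDDR's architecture) also shows why GNN policies transfer across topologies: the learned weights are shared over nodes and edges, so their shapes do not depend on the graph size.

```python
# One round of message passing over an adjacency matrix, followed by
# per-edge scores that an RL policy could turn into routing weights.
import numpy as np

def gnn_layer(adj: np.ndarray, h: np.ndarray, w_self: np.ndarray, w_nbr: np.ndarray):
    """Aggregate neighbour features, then mix with the node's own features."""
    msgs = adj @ h                            # sum of neighbour embeddings
    return np.tanh(h @ w_self + msgs @ w_nbr)

def edge_scores(adj: np.ndarray, h: np.ndarray) -> dict:
    """Score each directed edge (u, v) from its endpoint embeddings."""
    n = adj.shape[0]
    return {(u, v): float(h[u] @ h[v])
            for u in range(n) for v in range(n) if adj[u, v]}

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)  # toy topology
h = rng.normal(size=(3, 4))                                     # node features
h = gnn_layer(adj, h, rng.normal(size=(4, 4)), rng.normal(size=(4, 4)))
print(edge_scores(adj, h))
```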
Upper and Lower Bounds for Deterministic Approximate Objects
Pub Date: 2021-04-20 | DOI: 10.1109/ICDCS51616.2021.00049
Danny Hendler, A. Khattabi, A. Milani, Corentin Travers
Relaxing the sequential specification of shared objects has been proposed as a promising approach to obtain implementations with better complexity. In this paper, we study the step complexity of relaxed variants of two common shared objects: max registers and counters. In particular, we consider the $k$-multiplicative-accurate max register and the $k$-multiplicative-accurate counter, where read operations are allowed to err by a multiplicative factor of $k$ (for some $k \in \mathbb{N}$). More precisely, reads are allowed to return an approximate value $x$ of the maximum value $v$ previously written to the max register, or of the number $v$ of increments previously applied to the counter, respectively, such that $v/k \leq x \leq v \cdot k$. We provide upper and lower bounds on the complexity of implementing these objects in a wait-free manner in the shared memory model.
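The relaxed semantics can be illustrated with a sequential sketch of a $k$-multiplicative-accurate counter (illustration only; the paper's constructions are wait-free shared-memory algorithms). The shared value is rewritten only when the true count grows past a factor of $k$, which is exactly the slack that enables cheaper implementations.

```python
# Sequential illustration of k-multiplicative-accurate counter semantics:
# the reader-visible value is refreshed only when the true count leaves
# the window [shared, shared * k], so reads satisfy v/k <= x <= v * k
# while the number of visible writes stays logarithmic in v.
class ApproxCounter:
    def __init__(self, k: int):
        assert k >= 1
        self.k = k
        self.true_count = 0
        self.shared = 0                    # value visible to readers

    def increment(self) -> None:
        self.true_count += 1
        if self.shared == 0 or self.true_count > self.shared * self.k:
            self.shared = self.true_count  # occasional refresh

    def read(self) -> int:
        v, x = self.true_count, self.shared
        assert x == 0 or v / self.k <= x <= v * self.k  # accuracy invariant
        return x
```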
Sync-Switch: Hybrid Parameter Synchronization for Distributed Deep Learning
Pub Date: 2021-04-16 | DOI: 10.1109/ICDCS51616.2021.00057
Shijian Li, Oren Mangoubi, Lijie Xu, Tian Guo
Stochastic Gradient Descent (SGD) has become the de facto way to train deep neural networks in distributed clusters. A critical factor in determining training throughput and model accuracy is the choice of parameter synchronization protocol. For example, while Bulk Synchronous Parallel (BSP) often achieves better converged accuracy, its training throughput can be negatively impacted by stragglers. In contrast, Asynchronous Parallel (ASP) can have higher throughput, but its convergence and accuracy can be impacted by stale gradients. To improve synchronization performance, recent work often focuses on designing new protocols that rely heavily on hard-to-tune hyper-parameters. In this paper, we design a hybrid synchronization approach that exploits the benefits of both BSP and ASP, i.e., reducing training time while maintaining converged accuracy. Based on extensive empirical profiling, we devise a collection of adaptive policies that determine how and when to switch between synchronization protocols, including offline policies that target recurring jobs and online policies for handling transient stragglers. We implement the proposed policies in a prototype system called Sync-Switch on top of TensorFlow, and evaluate training performance with popular deep learning models and datasets. Our experiments show that Sync-Switch achieves ASP-level training speedup while maintaining converged accuracy similar to BSP. Moreover, Sync-Switch's elastic-based policy adequately mitigates the impact of transient stragglers.
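A simplified sketch of what such a switching policy might look like (the policy shape and all constants are our assumptions, not Sync-Switch's implementation):

```python
# Offline policy: run BSP for an initial fraction of epochs to lock in
# convergence quality, then switch to ASP for throughput. Online guard:
# fall back to ASP temporarily when a transient straggler is detected.
def choose_protocol(epoch: int, total_epochs: int, switch_frac: float,
                    worker_times: list[float], straggler_ratio: float = 2.0) -> str:
    median = sorted(worker_times)[len(worker_times) // 2]
    if max(worker_times) > straggler_ratio * median:
        return "ASP"                      # online policy: dodge the straggler
    return "BSP" if epoch < switch_frac * total_epochs else "ASP"

# Example: with switch_frac=0.25, a 40-epoch job runs BSP for epochs 0-9,
# then ASP, unless a straggler forces an early switch (as here).
print(choose_protocol(epoch=3, total_epochs=40, switch_frac=0.25,
                      worker_times=[1.0, 1.1, 1.0, 3.5]))   # prints "ASP"
```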