Cutting the Request Completion Time in Key-value Stores with Distributed Adaptive Scheduler
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00047
Wanchun Jiang, Haoyang Li, Yulong Yan, Fa Ji, M. Jiang, Jianxin Wang, Tong Zhang
Distributed key-value stores have become a basic building block of large-scale cloud applications. In such stores, a single end-user request usually generates many key-value access operations, which are processed in parallel on different servers. Hence, the completion time of the end request is determined by the last completed key-value access operation. Accordingly, scheduling the order of the key-value access operations of different end requests can effectively reduce their completion time and thereby improve the user experience. However, existing algorithms are either hard to employ in distributed key-value stores, owing to the relatively large coordination overhead of gathering centralized information, or unable to adapt to time-varying load and server performance under different traffic patterns. In this paper, we first formalize the scheduling problem for small mean request completion time. Because this problem is NP-hard, we then heuristically design the distributed adaptive scheduler (DAS) for distributed key-value stores. DAS reduces the average request completion time through a distributed combination of the largest-remaining-processing-time-last and shortest-remaining-processing-time-first algorithms, and it adapts to time-varying server load and performance. Extensive simulations show that DAS reduces the mean request completion time by 15-50% compared with the default first-come-first-served algorithm and outperforms the existing Rein-SBF algorithm under various scenarios.
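To make the scheduling idea concrete, below is a minimal, hypothetical sketch of a per-server operation queue that orders pending key-value operations by the estimated remaining processing time of their parent request, so short requests drain first (SRPT-first) and the largest ones finish last (LRPT-last). All names and the priority estimate are illustrative assumptions, not the authors' DAS implementation.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Op:
    # Priority: estimated remaining processing time of the parent request.
    # Smaller values are served first (SRPT-first); the request with the
    # largest remaining time naturally drains last (LRPT-last).
    remaining: float
    request_id: str = field(compare=False)
    key: str = field(compare=False)

class ServerQueue:
    """Per-server priority queue; a hypothetical sketch, not DAS itself."""
    def __init__(self):
        self._heap = []

    def enqueue(self, op: Op) -> None:
        heapq.heappush(self._heap, op)

    def next_op(self):
        return heapq.heappop(self._heap) if self._heap else None

q = ServerQueue()
q.enqueue(Op(remaining=3.0, request_id="r1", key="a"))
q.enqueue(Op(remaining=1.5, request_id="r2", key="b"))
print(q.next_op().request_id)  # -> "r2": the shorter request is served first
```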
{"title":"Cutting the Request Completion Time in Key-value Stores with Distributed Adaptive Scheduler","authors":"Wanchun Jiang, Haoyang Li, Yulong Yan, Fa Ji, M. Jiang, Jianxin Wang, Tong Zhang","doi":"10.1109/ICDCS51616.2021.00047","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00047","url":null,"abstract":"Nowadays, the distributed key-value stores have become the basic building block for large scale cloud applications. In large-scale distributed key-value stores, many key-value access operations, which will be processed in parallel on different servers, are usually generated for the data required by a single end-user request. Hence, the completion time of the end request is determined by the last completed key-value access operation. Accordingly, scheduling the order of key-value access operations of different end requests can effectively reduce their completion time, improving the user experience. However, existing algorithms are either hard to employ in distributed key-value stores due to the relatively large cooperation overhead for centralized information or unable to adapt to the time-varying load and server performance under different traffic patterns. In this paper, we first formalize the scheduling problem for small mean request completion time. As a step further, because of the NP-hardness of this problem, we heuristically design the distributed adaptive scheduler (DAS) for distributed key-value stores. DAS reduces the average request completion time by a distributed combination of the largest remaining processing time last and shortest remaining process time first algorithms. Moreover, DAS is adaptive to the time-varying server load and performance. Extensive simulations show that DAS reduces the mean request completion time by more than 15 ~ 50% compared to the default first come first served algorithm and outperforms the existing Rein-SBF algorithm under various scenarios.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"116 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128074975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Demo: Resource Allocation for Wafer-Scale Deep Learning Accelerator
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00114
Huihong Peng, Longkun Guo, Long Sun, Xiaoyan Zhang
The rapid development of deep learning (DL) has driven the invention of artificial intelligence (AI) chips, which combine the traditional computing architecture with a simulated neural-network structure to improve energy efficiency. Recently, emerging deep learning AI chips have posed the challenge of allocating computing resources according to a deep neural network (DNN), such that tasks using the DNN can be processed in a parallel and distributed manner. In this paper, we combine graph theory and combinatorial optimization to devise a fast floorplanning approach based on the kernel graph structure provided by Cerebras Systems Inc., mapping the layers of a DNN onto the mesh of computing units called the Wafer-Scale Engine (WSE). Numerical experiments using public benchmarks and evaluation criteria demonstrate the performance gain of our method compared with state-of-the-art algorithms.
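As a rough illustration of mesh floorplanning (not the paper's kernel-graph algorithm), the sketch below greedily places DNN layers on a 2D mesh of compute tiles, keeping connected layers close in Manhattan distance. The function name and the greedy cost are assumptions for illustration only.

```python
import itertools

def greedy_floorplan(layers, edges, mesh_w, mesh_h):
    """Greedily place each layer on a free mesh tile, minimizing total
    Manhattan distance to already-placed neighbors in the kernel graph.
    Illustrative only; the paper's floorplanner is more sophisticated."""
    free = set(itertools.product(range(mesh_w), range(mesh_h)))
    pos = {}
    for layer in layers:
        placed = [pos[v] for u, v in edges if u == layer and v in pos]
        placed += [pos[u] for u, v in edges if v == layer and u in pos]
        cost = lambda t: sum(abs(t[0] - x) + abs(t[1] - y) for x, y in placed)
        best = min(free, key=cost)
        pos[layer] = best
        free.remove(best)
    return pos

# A four-layer chain mapped onto a 2x2 mesh of compute tiles.
print(greedy_floorplan(["l1", "l2", "l3", "l4"],
                       [("l1", "l2"), ("l2", "l3"), ("l3", "l4")], 2, 2))
```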
{"title":"Demo: Resource Allocation for Wafer-Scale Deep Learning Accelerator","authors":"Huihong Peng, Longkun Guo, Long Sun, Xiaoyan Zhang","doi":"10.1109/ICDCS51616.2021.00114","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00114","url":null,"abstract":"Due to the rapid development of deep learning (DL) has brought, artificial intelligence (AI) chips were invented incorperating the traditional computing architecture with the simulated neural network structure for the sake of improving the energy efficiency. Recently, emerging deep learning AI chips imposed the challenge of allocating computing resources according to a deep neural networks (DNN), such that tasks using the DNN can be processed in a parallel and distributed manner. In this paper, we combine graph theory and combinatorial optimization technology to devise a fast floorplanning approach based on kernel graph structure, which is provided by Cerebras Systems Inc. for mapping the layers of DNN to the mesh of computing units called Wafer-Scale-Engine (WSE). Numerical experiments were carried out to evaluate our method using the public benchmarks and evaluation criteria, demonstrating its performance gain comparing to the state-of-art algorithms.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127439688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Defuse: A Dependency-Guided Function Scheduler to Mitigate Cold Starts on FaaS Platforms
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00027
Jiacheng Shen, Tianyi Yang, Yuxin Su, Yangfan Zhou, Michael R. Lyu
Function-as-a-Service (FaaS) is becoming a prevalent paradigm for developing cloud applications. With FaaS, clients can develop applications as serverless functions, leaving the burden of resource management to cloud providers. However, FaaS platforms suffer from the performance degradation caused by the cold starts of serverless functions. A cold start happens when a serverless function is invoked before it has been loaded into memory. The problem is unavoidable because the memory in datacenters is typically too limited to hold all serverless functions simultaneously, and the latency of cold function invocations greatly degrades the performance of FaaS platforms. Currently, FaaS platforms employ various scheduling methods to reduce the occurrence of cold starts, but they do not consider the ubiquitous dependencies between serverless functions. Observing the potential of using dependencies to mitigate cold starts, we propose Defuse, a Dependency-guided Function Scheduler for FaaS platforms. Specifically, Defuse identifies two types of dependencies between serverless functions, i.e., strong dependencies and weak ones, mining them from function invocation histories with frequent pattern mining and positive point-wise mutual information, respectively. In this way, Defuse constructs a function dependency graph whose connected components (i.e., dependent functions) can be scheduled together to diminish the occurrence of cold starts. We evaluate the effectiveness of Defuse by applying it to an industrial serverless dataset. The experimental results show that Defuse reduces memory usage by 22% while decreasing function cold-start rates by 35% compared with the state-of-the-art method.
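For intuition, here is a hedged sketch of the generic PPMI step: given windows of co-invoked functions, it scores function pairs by positive point-wise mutual information and keeps positive-scoring pairs as weak-dependency edges. The window representation and threshold are illustrative assumptions; Defuse's actual mining pipeline is more elaborate.

```python
import math
from collections import Counter
from itertools import combinations

def ppmi_dependencies(windows, threshold=0.0):
    """Mine weak dependencies between serverless functions via positive
    point-wise mutual information over co-invocation windows. A generic
    sketch of the technique, not Defuse's implementation."""
    n = len(windows)
    single = Counter()
    pair = Counter()
    for w in windows:
        funcs = set(w)
        single.update(funcs)
        pair.update(frozenset(p) for p in combinations(sorted(funcs), 2))
    deps = {}
    for p, c in pair.items():
        f, g = tuple(p)
        pmi = math.log((c / n) / ((single[f] / n) * (single[g] / n)))
        if pmi > threshold:
            deps[(f, g)] = pmi   # positive PMI -> weak-dependency edge
    return deps

windows = [["auth", "resize"], ["auth", "resize", "log"], ["log"]]
print(ppmi_dependencies(windows))  # ("auth","resize") scores positive
```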
{"title":"Defuse: A Dependency-Guided Function Scheduler to Mitigate Cold Starts on FaaS Platforms","authors":"Jiacheng Shen, Tianyi Yang, Yuxin Su, Yangfan Zhou, Michael R. Lyu","doi":"10.1109/ICDCS51616.2021.00027","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00027","url":null,"abstract":"Function-as-a-Service (FaaS) is becoming a prevalent paradigm in developing cloud applications. With FaaS, clients can develop applications as serverless functions, leaving the burden of resource management to cloud providers. However, FaaS platforms suffer from the performance degradation caused by the cold starts of serverless functions. Cold starts happen when serverless functions are invoked before they have been loaded into the memory. The problem is unavoidable because the memory in datacenters is typically too limited to hold all serverless functions simultaneously. The latency of cold function invocations will greatly degenerate the performance of FaaS platforms. Currently, FaaS platforms employ various scheduling methods to reduce the occurrences of cold starts. However, they do not consider the ubiquitous dependencies between serverless functions. Observing the potential of using dependencies to mitigate cold starts, we propose Defuse, a Dependency-guided Function Scheduler on FaaS platforms. Specifically, Defuse identifies two types of dependencies between serverless functions, i.e., strong dependencies and weak ones. It uses frequent pattern mining and positive point-wise mutual information to mine such dependencies respectively from function invocation histories. In this way, Defuse constructs a function dependency graph. The connected components (i.e., dependent functions) on the graph can be scheduled to diminish the occurrences of cold starts. We evaluate the effectiveness of Defuse by applying it to an industrial serverless dataset. The experimental results show that Defuse can reduce 22% of memory usage while having a 35% decrease in function cold-start rates compared with the state-of-the-art method.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125936788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Distributed Online Service Coordination Using Deep Reinforcement Learning
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00058
Stefan Schneider, Haydar Qarawlus, Holger Karl
Services often consist of multiple chained components, such as microservices in a service mesh or machine learning functions in a pipeline. Providing these services requires online coordination, including scaling the service, placing instances of all components in the network, scheduling traffic to these instances, and routing traffic through the network. Optimized service coordination remains a hard problem due to many influencing factors, such as rapidly arriving user demands and limited node and link capacity. Existing approaches to the problem are often built on rigid models and assumptions tailored to specific scenarios. If the scenario changes and the assumptions no longer hold, they easily break and require manual adjustments by experts. Novel self-learning approaches using deep reinforcement learning (DRL) are promising but still limited: they address only simplified versions of the problem and are typically centralized, so they do not scale to practical large-scale networks. To address these issues, we propose a distributed self-learning service coordination approach using DRL. After centralized training, we deploy a distributed DRL agent at each node in the network, which makes fast coordination decisions locally, in parallel with the other nodes. Each agent observes only its direct neighbors and does not need global knowledge; hence, our approach scales independently of the size of the network. In an extensive evaluation using real-world network topologies and traffic traces, our approach outperforms a state-of-the-art conventional heuristic as well as a centralized DRL approach (60% higher throughput on average) while requiring less time per online decision (1 ms).
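A minimal sketch of the inference-time loop such a per-node agent might run, assuming a centrally trained policy is deployed locally: each agent builds an observation from its own load and its direct neighbors' loads only, then picks an action. `NodeAgent`, `toy_policy`, and the observation layout are hypothetical stand-ins, not the paper's implementation.

```python
class NodeAgent:
    """One agent per network node; it observes only itself and its direct
    neighbors, which is why the scheme scales with network size."""
    def __init__(self, node_id, neighbors, policy):
        self.node_id = node_id
        self.neighbors = neighbors      # direct neighbors only
        self.policy = policy            # trained centrally, deployed locally

    def observe(self, loads):
        # Local observation: own load plus each direct neighbor's load.
        return [loads[self.node_id]] + [loads[n] for n in self.neighbors]

    def act(self, loads):
        # Decide locally where to process or forward incoming traffic.
        return self.policy(self.observe(loads))

def toy_policy(obs):
    # Stand-in for the trained DRL policy: process locally if least loaded,
    # otherwise forward toward the least-loaded neighbor (index into obs).
    return obs.index(min(obs))

agent = NodeAgent("n0", ["n1", "n2"], toy_policy)
print(agent.act({"n0": 0.9, "n1": 0.2, "n2": 0.5}))  # 1 -> forward to n1
```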
{"title":"Distributed Online Service Coordination Using Deep Reinforcement Learning","authors":"Stefan Schneider, Haydar Qarawlus, Holger Karl","doi":"10.1109/ICDCS51616.2021.00058","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00058","url":null,"abstract":"Services often consist of multiple chained components such as microservices in a service mesh, or machine learning functions in a pipeline. Providing these services requires online coordination including scaling the service, placing instance of all components in the network, scheduling traffic to these instances, and routing traffic through the network. Optimized service coordination is still a hard problem due to many influencing factors such as rapidly arriving user demands and limited node and link capacity. Existing approaches to solve the problem are often built on rigid models and assumptions, tailored to specific scenarios. If the scenario changes and the assumptions no longer hold, they easily break and require manual adjustments by experts. Novel self-learning approaches using deep reinforcement learning (DRL) are promising but still have limitations as they only address simplified versions of the problem and are typically centralized and thus do not scale to practical large-scale networks. To address these issues, we propose a distributed self-learning service coordination approach using DRL. After centralized training, we deploy a distributed DRL agent at each node in the network, making fast coordination decisions locally in parallel with the other nodes. Each agent only observes its direct neighbors and does not need global knowledge. Hence, our approach scales independently from the size of the network. In our extensive evaluation using real-world network topologies and traffic traces, we show that our proposed approach outperforms a state-of-the-art conventional heuristic as well as a centralized DRL approach (60 % higher throughput on average) while requiring less time per online decision (1 ms).","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"137 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124316501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Online Learning Algorithms for Offloading Augmented Reality Requests with Uncertain Demands in MECs
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00105
Zichuan Xu, Dongqi Liu, W. Liang, Wenzheng Xu, Haipeng Dai, Qiufen Xia, Pan Zhou
Augmented Reality (AR) has various practical applications in healthcare, education, and entertainment. To provide a fully interactive and immersive experience, AR applications require extremely high responsiveness and ultra-low processing latency. Mobile edge computing (MEC) has shown great potential to meet such stringent requirements by serving AR requests on edge servers in close proximity to these applications. In this paper, we investigate the problem of reward maximization for AR applications with uncertain demands in an MEC network, such that the reward of provisioning services for AR applications is maximized and the responsiveness of AR applications is enhanced, subject to network resource capacity constraints. We devise an exact solution for small problem instances and an efficient approximation algorithm with a provable approximation ratio for larger ones. We also devise an online learning algorithm with bounded regret for the dynamic reward maximization problem without knowledge of future AR request arrivals, by adopting the technique of Multi-Armed Bandits (MAB). We evaluate the performance of the proposed algorithms through simulations. Experimental results show that the proposed algorithms achieve 17% higher reward than existing approaches.
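For readers unfamiliar with MAB techniques, the following is a textbook UCB1 sketch of the exploration/exploitation loop that such online algorithms build on; the paper's algorithm and its regret analysis are more involved, and the arm/reward semantics here are illustrative assumptions.

```python
import math

class UCB1:
    """Upper-confidence-bound bandit, e.g., for choosing where to offload
    AR requests under uncertain demand. A generic MAB sketch, not the
    paper's algorithm."""
    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms
        self.t = 0

    def select(self):
        self.t += 1
        for a in range(len(self.counts)):       # play each arm once first
            if self.counts[a] == 0:
                return a
        ucb = [self.values[a] + math.sqrt(2 * math.log(self.t) / self.counts[a])
               for a in range(len(self.counts))]
        return ucb.index(max(ucb))              # optimism under uncertainty

    def update(self, arm, reward):
        self.counts[arm] += 1
        n = self.counts[arm]
        self.values[arm] += (reward - self.values[arm]) / n  # running mean

bandit = UCB1(3)
arm = bandit.select()
bandit.update(arm, reward=1.0)   # reward: e.g., the AR latency target was met
```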
{"title":"Online Learning Algorithms for Offloading Augmented Reality Requests with Uncertain Demands in MECs","authors":"Zichuan Xu, Dongqi Liu, W. Liang, Wenzheng Xu, Haipeng Dai, Qiufen Xia, Pan Zhou","doi":"10.1109/ICDCS51616.2021.00105","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00105","url":null,"abstract":"Augmented Reality (AR) has various practical applications in healthcare, education, and entertainment. To provide a fully interactive and immersive experience, AR applications require extremely high responsiveness and ultra-low processing latency. Mobile edge computing (MEC) has shown great potential in meeting such stringent requirements and demands of AR applications by implementing AR requests in edge servers within the close proximity of these applications. In this paper, we investigate the problem of reward maximization for AR applications with uncertain demands in an MEC network, such that the reward of provisioning services for AR applications is maximized and the responsiveness of AR applications is enhanced, subject to both network resource capacity. We devise an exact solution for the problem if the problem size is small, otherwise we develop an efficient approximation algorithm with a provable approximation ratio for the problem. We also devise an online learning algorithm with a bounded regret for the dynamic reward maximization problem without the knowledge of the future arrivals of AR requests, by adopting the technique of Multi-Armed Bandits (MAB). We evaluate the performance of the proposed algorithms through simulations. Experimental results show that the proposed algorithms outperform existing studies by 17 % higher reward.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134101986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Efficient Message Dissemination Scheme for Cooperative Drivings via Multi-Agent Hierarchical Attention Reinforcement Learning
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00039
Bingyi Liu, Weizhen Han, Enshu Wang, Xin Ma, Shengwu Xiong, C. Qiao, Jianping Wang
A group of connected and autonomous vehicles (CAVs) with common interests can drive in a cooperative manner, namely cooperative driving, which has been verified to significantly improve road safety, traffic efficiency, and environmental sustainability. In the foreseeable future, a more general scenario will arise in which various types of cooperative driving applications, such as truck platooning and vehicle clustering, coexist on roads. To support such multiple cooperative drivings, it is critical to design efficient message dissemination scheduling for vehicles to broadcast their kinetic status, i.e., to beacon periodically. Most ongoing research suggests designing communication protocols via traffic and communication modeling on top of dedicated short-range communications (DSRC) or cellular-based vehicle-to-vehicle (C-V2V) communications as a potential remedy. However, most existing studies are designed for a simple or specific traffic scenario, e.g., ignoring the impact of the complex communication environment and of emerging hybrid traffic scenarios. Moreover, some studies design beaconing strategies based on the channel and traffic conditions implied by the beacons of other vehicles, but the delayed perception of this information may seriously deteriorate beaconing performance. In this paper, we take the perspective of cooperative drivings and formulate their decision-making process as a Markov game. We then propose a multi-agent hierarchical attention reinforcement learning (MAHA) framework to solve the Markov game. More concretely, the hierarchical structure of MAHA makes cooperative drivings foresightful: even without immediate incentives, the well-trained agents can still take favorable actions that benefit their long-term rewards. Furthermore, we integrate each hierarchical level of MAHA with a graph attention network (GAT) to incorporate agents' mutual influence into the decision-making process. We also build a simulator and use it to generate dynamic traffic scenarios that reflect the different real-world situations faced by cooperative drivings. Extensive experiments show that MAHA significantly improves the beacon reception rate and guarantees low communication delay in all of these scenarios.
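To illustrate how attention can weight neighbors' influence, below is a textbook single-head graph-attention aggregation step (in the spirit of GAT): neighbor embeddings are linearly transformed, scored against the ego embedding with a LeakyReLU-activated attention vector, and combined by softmax weights. Shapes and parameters are illustrative; this is not the MAHA implementation.

```python
import numpy as np

def gat_aggregate(h_self, h_neighbors, W, a):
    """Single-head graph-attention aggregation of neighbor embeddings:
    linear map, LeakyReLU-scored attention, softmax, weighted sum."""
    z_self = W @ h_self
    zs = [W @ h for h in h_neighbors]
    scores = np.array([np.concatenate([z_self, z]) @ a for z in zs])
    scores = np.where(scores > 0, scores, 0.2 * scores)   # LeakyReLU
    weights = np.exp(scores - scores.max())
    alpha = weights / weights.sum()                        # softmax
    return sum(w * z for w, z in zip(alpha, zs))           # weighted sum

rng = np.random.default_rng(0)
h_self = rng.normal(size=4)                    # ego vehicle's embedding
neighbors = [rng.normal(size=4) for _ in range(3)]
W = rng.normal(size=(8, 4))                    # shared linear transform
a = rng.normal(size=16)                        # attention vector over [z_i||z_j]
print(gat_aggregate(h_self, neighbors, W, a).shape)  # (8,)
```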
{"title":"An Efficient Message Dissemination Scheme for Cooperative Drivings via Multi-Agent Hierarchical Attention Reinforcement Learning","authors":"Bingyi Liu, Weizhen Han, Enshu Wang, Xin Ma, Shengwu Xiong, C. Qiao, Jianping Wang","doi":"10.1109/ICDCS51616.2021.00039","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00039","url":null,"abstract":"A group of connected and autonomous vehicles (CAVs) with common interests can drive in a cooperative manner, namely cooperative driving, which has been verified to significantly improve road safety, traffic efficiency, and environmental sustainability. A more general scenario with various types of cooperative driving applications such as truck platooning and vehicle clustering will coexist on roads in the foreseeable future. To support such multiple cooperative drivings, it is critical to design an efficient message dissemination scheduling for vehicles to broadcast their kinetic status, i.e., beacon periodically. Most ongoing researches suggest designing the communication protocols via traffic and communication modeling on top of dedicated short range communications (DSRC) or cellular-based vehicle-to-vehicle (C-V2V) communications as a potential remedy. However, most of the existing researches are designed for a simple or specific traffic scenario, e.g., ignoring the impacts of the complex communication environment and emerging hybrid traffic scenarios. Moreover, some studies design beaconing strategies based on the implication of channel and traffic conditions in the beacons of other vehicles. However, the delayed perception of these information may seriously deteriorate the beaconing performance. In this paper, we take the perspective of cooperative drivings and formulate their decision-making process as a Markov game. Furthermore, we propose a multi-agent hierarchical attention reinforcement learning (MAHA) framework to solve the Markov game. More concretely, the hierarchical structure of the proposed MAHA can lead cooperative drivings to be foresightful. Hence, even without immediate incentives, the well-trained agents can still take favorable actions that benefit their long-term rewards. Besides, we integrate each hierarchical level of MAHA separately with the graph attention network (GAT) to incorporate agents' mutual influences in the decision-making process. Besides, we set up a simulator and adopt this simulator to generate dynamic traffic scenarios, which reflect the different real-world scenarios faced by cooperative drivings. We conduct extensive experiments to evaluate the proposed MAHA framework's performance. The results show that MAHA can significantly improve the beacon reception rate and guarantee low communication delay in all of these scenarios.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126133918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MSS: Lightweight network authentication for resource constrained devices via Mergeable Stateful Signatures
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00035
Abdulrahman Bin Rabiah, Yugarshi Shashwat, Fatemah Alharbi, Silas Richelson, N. Abu-Ghazaleh
Signature-based authentication is a core cryptographic primitive essential to most secure networking protocols. We introduce a new signature scheme, MSS, that allows a client to authenticate herself to a server efficiently. We model our scheme in an offline/online setting where client online time is at a premium. The offline component derives basis signatures that are then composed, based on the data being signed, to provide signatures efficiently and securely at run-time. MSS requires the server to maintain state and is suitable for applications where a device has a long-term association with the server. MSS allows direct comparison to hash-chain-based authentication schemes used in similar settings and is relevant to resource-constrained devices, e.g., in the IoT. We derive MSS instantiations for two cryptographic families, assuming the hardness of RSA and of decisional Diffie-Hellman (DDH) respectively, demonstrating the generality of the idea. We then use our new scheme to design an efficient time-based one-time password (TOTP) protocol, implementing two TOTP authentication systems from our RSA and DDH instantiations. We evaluate the TOTP implementations on Raspberry Pis, which demonstrate appealing gains: MSS reduces authentication latency and energy consumption by factors of ~82 and ~792, respectively, compared to a recent hash-chain-based TOTP system.
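For context, here is a minimal sketch of the classic hash-chain one-time-password construction that MSS is compared against: the server keeps one hash value as state, and each accepted password must hash to it. This is the baseline scheme, not MSS itself, and the helper names are illustrative.

```python
import hashlib

def H(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def make_chain(seed: bytes, n: int):
    """Client precomputes a hash chain; the server is given the last link."""
    chain = [seed]
    for _ in range(n):
        chain.append(H(chain[-1]))
    return chain

class Server:
    """Stateful verifier (MSS likewise requires server-side state)."""
    def __init__(self, anchor: bytes):
        self.state = anchor

    def verify(self, otp: bytes) -> bool:
        if H(otp) == self.state:     # each password hashes to the prior state
            self.state = otp         # accept and advance the state
            return True
        return False

chain = make_chain(b"secret-seed", 1000)
server = Server(anchor=chain[-1])
print(server.verify(chain[-2]))      # True: first one-time password
print(server.verify(chain[-2]))      # False: passwords cannot be replayed
```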
{"title":"MSS: Lightweight network authentication for resource constrained devices via Mergeable Stateful Signatures","authors":"Abdulrahman Bin Rabiah, Yugarshi Shashwat, Fatemah Alharbi, Silas Richelson, N. Abu-Ghazaleh","doi":"10.1109/ICDCS51616.2021.00035","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00035","url":null,"abstract":"Signature-based authentication is a core cryptographic primitive essential for most secure networking protocols. We introduce a new signature scheme, MSS, that allows a client to efficiently authenticate herself to a server. We model our new scheme in an offline/online model where client online time is premium. The offline component derives basis signatures that are then composed based on the data being signed to provide signatures efficiently and securely during run-time. MSS requires the server to maintain state and is suitable for applications where a device has long-term associations with the server. MSS allows direct comparison to hash chains-based authentication schemes used in similar settings, and is relevant to resource-constrained devices e.g., IoT. We derive MSS instantiations for two cryptographic families, assuming the hardness of RSA and decisional Diffie-Hellman (DDH) respectively, demonstrating the generality of the idea. We then use our new scheme to design an efficient time-based one-time password (TOTP) protocol. Specifically, we implement two TOTP authentication systems from our RSA and DDH instantiations. We evaluate the TOTP implementations on Raspberry Pis which demonstrate appealing gains: MSS reduces authentication latency and energy consumption by a factor of ~82 and 792, respectively, compared to a recent hash chain-based TOTP system.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"52 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116837077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Demo: Discover, Provision, and Orchestration of Machine Learning Inference Services in Heterogeneous Edge
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00115
Roberto Morabito, M. Chiang
In recent years, the research community has begun to study extensively how edge computing can enhance the provisioning of a seamless and performant Machine Learning (ML) experience. Boosting the performance of ML inference at the edge has become a driving factor, especially for use-cases in which proximity to the data sources, near-real-time requirements, and reduced network latency are determining factors. The growing demand for edge-based ML services has also been fueled by the increasing market availability of small-form-factor inference accelerator devices, which, however, feature heterogeneous and not fully interoperable software and hardware characteristics. A key aspect that has not yet been fully investigated is how to discover and efficiently optimize the provisioning of ML inference services in distributed edge systems featuring heterogeneous edge inference accelerators, bearing in mind that the devices' limited computation capabilities may require orchestrating the execution of inference among the different devices of the system. The main goal of this demo is to showcase how ML inference services can be agnostically discovered, provisioned, and orchestrated in a cluster of heterogeneous and distributed edge nodes.
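As a toy illustration of capability-based discovery in such a cluster (our assumption, not the demo's actual discovery protocol), the sketch below registers heterogeneous accelerator nodes with the models they can serve and answers discovery queries best-fit first.

```python
class InferenceRegistry:
    """Minimal capability registry for discovering inference services on
    heterogeneous edge accelerators; illustrative only."""
    def __init__(self):
        self.nodes = {}

    def register(self, node_id, accelerator, models, free_mem_mb):
        self.nodes[node_id] = {"acc": accelerator, "models": set(models),
                               "mem": free_mem_mb}

    def discover(self, model, min_mem_mb=0):
        # Return candidate nodes able to serve the model, best-fit
        # (smallest sufficient free memory) first.
        fits = [(n, d) for n, d in self.nodes.items()
                if model in d["models"] and d["mem"] >= min_mem_mb]
        return sorted(fits, key=lambda x: x[1]["mem"])

reg = InferenceRegistry()
reg.register("edge-1", "tpu", ["mobilenet"], 512)
reg.register("edge-2", "gpu", ["mobilenet", "resnet"], 2048)
print([n for n, _ in reg.discover("mobilenet", min_mem_mb=256)])
```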
{"title":"Demo: Discover, Provision, and Orchestration of Machine Learning Inference Services in Heterogeneous Edge","authors":"Roberto Morabito, M. Chiang","doi":"10.1109/ICDCS51616.2021.00115","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00115","url":null,"abstract":"In recent years, the research community started to extensively study how edge computing can enhance the provisioning of a seamless and performing Machine Learning (ML) experience. Boosting the performance of ML inference at the edge became a driving factor especially for enabling those use-cases in which proximity to the data sources, near real-time requirements, and need of a reduced network latency represent a determining factor. The growing demand of edge-based ML services has been also boosted by an increasing market release of small-form factor inference accelerators devices that feature, however, heterogeneous and not fully interoperable software and hardware characteristics. A key aspect that has not yet been fully investigated is how to discover and efficiently optimize the provision of ML inference services in distributed edge systems featuring heterogeneous edge inference accelerators - not neglecting also that the limited devices computation capabilities may imply the need of orchestrating the inference execution provisioning among the different system's devices. The main goal of this demo is to showcase how ML inference services can be agnostically discovered, provisioned, and orchestrated in a cluster of heterogeneous and distributed edge nodes.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130622756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MVCom: Scheduling Most Valuable Committees for the Large-Scale Sharded Blockchain
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00066
Huawei Huang, Zhenyi Huang, Xiaowen Peng, Zibin Zheng, Song Guo
In a large-scale sharded blockchain, transactions are processed collaboratively by a number of parallel committees, which can strongly boost blockchain throughput. A problem is that some groups of blockchain nodes incur large latency when forming committees at the beginning of each epoch. Furthermore, the heterogeneous processing capabilities of different committees result in unbalanced consensus latency. Such unbalanced two-phase latency brings a large cumulative age to the transactions waiting in the final committee, and consequently blockchain throughput can be significantly degraded. We believe that a good committee-scheduling strategy can reduce the cumulative age and thus benefit blockchain throughput; however, we have not found a committee-scheduling scheme that accelerates block formation in the context of blockchain sharding. To this end, this paper studies a finely balanced tradeoff between transaction throughput and cumulative age in a large-scale sharded blockchain. We formulate this tradeoff as a utility-maximization problem, which is proved NP-hard. To solve the problem, we propose an online distributed Stochastic-Exploration (SE) algorithm that guarantees near-optimal system utility. We also rigorously analyze the theoretical convergence time of the proposed algorithm and the performance perturbation caused by committee failures. We then evaluate the algorithm using a dataset of blockchain-sharding transactions. The simulation results demonstrate that the proposed SE algorithm substantially outperforms other baselines in terms of both system utility and contributing degree while processing shard transactions.
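For intuition about this algorithm class, here is a hedged sketch of log-linear stochastic exploration: in each round one randomly chosen committee re-samples its assignment with probability proportional to exp(beta * utility), which drifts the system toward high-utility configurations. The update rule, toy utility, and parameters are illustrative assumptions, not the paper's exact SE algorithm.

```python
import math
import random

def stochastic_exploration(utility, choices, state, beta=2.0, rounds=500):
    """Log-linear stochastic exploration over committee assignments.
    Generic sketch of the technique class, not the paper's update rule."""
    keys = list(state)
    for _ in range(rounds):
        k = random.choice(keys)                 # one committee updates per round
        weights = []
        for c in choices:
            trial = dict(state, **{k: c})       # tentative reassignment
            weights.append(math.exp(beta * utility(trial)))
        r, acc = random.uniform(0, sum(weights)), 0.0
        for c, w in zip(choices, weights):      # sample proportional to weight
            acc += w
            if r <= acc:
                state[k] = c
                break
    return state

# Toy utility: committees should spread evenly across two processing slots.
util = lambda s: -abs(sum(1 for v in s.values() if v == 0) - len(s) / 2)
print(stochastic_exploration(util, [0, 1], {f"c{i}": 0 for i in range(4)}))
```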
{"title":"MVCom: Scheduling Most Valuable Committees for the Large-Scale Sharded Blockchain","authors":"Huawei Huang, Zhenyi Huang, Xiaowen Peng, Zibin Zheng, Song Guo","doi":"10.1109/ICDCS51616.2021.00066","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00066","url":null,"abstract":"In a large-scale sharded blockchain, transactions are processed by a number of parallel committees collaboratively. Thus, the blockchain throughput can be strongly boosted. A problem is that some groups of blockchain nodes consume large latency to form committees at the beginning of each epoch. Furthermore, the heterogeneous processing capabilities of different committees also result in unbalanced consensus latency. Such unbalanced two-phase latency brings a large cumulative age to the transactions waited in the final committee. Consequently, the blockchain throughput can be significantly degraded because of the large transaction's cumulative age. We believe that a good committee-scheduling strategy can reduce the cumulative age, and thus benefit the blockchain throughput. However, we have not yet found a committee-scheduling scheme that works for accelerating block formation in the context of blockchain sharding. To this end, this paper studies a fine-balanced tradeoff between the transaction's throughput and their cumulative age in a large-scale sharded blockchain. We formulate this tradeoff as a utility-maximization problem, which is proved NP-hard. To solve this problem, we propose an online distributed Stochastic-Exploration (SE) algorithm, which guarantees a near-optimal system utility. The theoretical convergence time of the proposed algorithm as well as the performance perturbation brought by the committee's failure are also analyzed rigorously. We then evaluate the proposed algorithm using the dataset of blockchain-sharding transactions. The simulation results demonstrate that the proposed SE algorithm shows an overwhelming better performance comparing with other baselines in terms of both system utility and the contributing degree while processing shard transactions.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116057064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Poster: Learning Index on Content-based Pub/Sub
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00124
Cheng Lin, Qinpei Zhao, Weixiong Rao
The content-based pub/sub paradigm is widely used in many distributed applications, yet existing approaches suffer from highly redundant subscription index structures and low matching efficiency. To tackle this issue, we propose a multi-task learning framework to guide the construction of an efficient in-memory subscription index, namely PMIndex. The key idea of PMIndex is to merge redundant subscriptions into an optimal number of partitions for lower memory cost and faster matching. Our initial experimental results on a synthetic dataset demonstrate that PMIndex outperforms two state-of-the-art approaches in both matching time and memory cost.
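A hand-rolled illustration of the partitioning idea (not PMIndex's learned partitioner): subscriptions sharing the same predicate set are merged into one partition, so the matcher evaluates each predicate set once and reports all member subscriptions together.

```python
from collections import defaultdict

def build_partitions(subscriptions):
    """Group subscriptions with identical predicate sets into one partition,
    removing index redundancy. Illustrative sketch only."""
    parts = defaultdict(list)
    for sub_id, preds in subscriptions.items():
        parts[frozenset(preds)].append(sub_id)
    return parts

def match(event, partitions):
    # An event matches a partition if every predicate holds; then all
    # subscriptions in that partition match at once.
    hits = []
    for preds, subs in partitions.items():
        if all(event.get(attr) == val for attr, val in preds):
            hits.extend(subs)
    return hits

subs = {"s1": [("topic", "btc")], "s2": [("topic", "btc")],
        "s3": [("topic", "eth")]}
parts = build_partitions(subs)
print(match({"topic": "btc"}, parts))   # ['s1', 's2'] in one evaluation
```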
{"title":"Poster: Learning Index on Content-based Pub/Sub","authors":"Cheng Lin, Qinpei Zhao, Weixiong Rao","doi":"10.1109/ICDCS51616.2021.00124","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00124","url":null,"abstract":"Content-based Pub/Sub paradigm has been widely used in many distributed applications and existing approaches suffer from high redundancy subscription index structure and low matching efficiency. To tackle this issue, in this paper, we propose a learning framework to guide the construction of an efficient in-memory subscription index, namely PMIndex, via a multi-task learning framework. The key of PMIndex is to merge redundant subscriptions into an optimal number of partitions for less memory cost and faster matching time. Our initial experimental result on a synthetic dataset demonstrates that PMindex outperforms two state-of-the-arts by faster matching time and less memory cost.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116338553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}