
Latest publications from the 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS)

Toward Trustworthy Blockchain-as-a-Service with Auditing
Pub Date : 2020-11-01 DOI: 10.1109/ICDCS47774.2020.00068
Yongrae Jo, Jeonghyun Ma, Chanik Park
Many Blockchain-as-a-Service (BaaS) providers have emerged with the growing interest in BaaS among enterprises. However, a BaaS provider is a centralized service, which poses a potential security threat to the clients that depend on it. In this study, we first consider the problem of auditing BaaS and develop an Enforcer architecture for trustworthy BaaS.
Citations: 1
Exploration of TransE in a Distributed Environment
Pub Date : 2020-11-01 DOI: 10.1109/ICDCS47774.2020.00190
Meiyan Lu, L. Liao, Feng Zhang, Dandan Song
Knowledge graphs are popular in knowledge mining fields. TransE uses the structural information of triples $\left( \overrightarrow{e_h} + \overrightarrow{e_r} \approx \overrightarrow{e_t} \right)$ to embed knowledge graphs into a continuous vector space, which is a very important component of knowledge representation. However, current TransE models are only implemented on single-node machines. With the explosive growth of data volumes, single-node TransE cannot meet the demand for processing large knowledge graphs, so a distributed TransE is urgently needed. In this poster, we propose a distributed TransE written in MPI, which can run on HPC clusters. In our experiments, our distributed TransE exhibits high speedup and accuracy.
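The translation assumption stated above can be made concrete with a single-node scoring sketch: a triple (h, r, t) is plausible when the head embedding translated by the relation embedding lands close to the tail embedding. The numpy-based snippet below is only an illustration of that scoring and loss idea; it is not the distributed MPI implementation described in the poster.

```python
import numpy as np

def transe_score(e_h, e_r, e_t):
    # Lower is better: distance between (e_h + e_r) and e_t.
    return np.linalg.norm(e_h + e_r - e_t)

def margin_loss(pos, neg, margin=1.0):
    # Margin-based ranking loss used to train translation embeddings:
    # a true triple should score lower than a corrupted one by at least `margin`.
    return max(0.0, margin + transe_score(*pos) - transe_score(*neg))

rng = np.random.default_rng(0)
dim = 50
e_h, e_r, e_t = (rng.normal(size=dim) for _ in range(3))
e_t_corrupted = rng.normal(size=dim)  # corrupted tail for a negative sample
print(margin_loss((e_h, e_r, e_t), (e_h, e_r, e_t_corrupted)))
```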
Citations: 0
FastUp: Compute a Better TCAM Update Scheme in Less Time for SDN Switches
Pub Date : 2020-11-01 DOI: 10.1109/ICDCS47774.2020.00128
Ying Wan, Haoyu Song, Hao Che, Yang Xu, Yi Wang, Chuwen Zhang, Zhijun Wang, Tian Pan, Hao Li, Hong Jiang, Chengchen Hu, Zhikang Chen, Bin Liu
While widely used for flow tables in SDN switches, TCAM faces challenges in rule updates. Both the computation time and the interrupt time need to be short. We propose FastUp, a new TCAM update algorithm that improves on previous dynamic programming-based algorithms. Evaluations show that FastUp shortens the computation time by 40~100× and the interrupt time by 1.2~2.5×. In addition, we are the first to prove the NP-hardness of the optimal TCAM update problem, and we provide a practical method to evaluate an algorithm's degree of optimality. Experiments show that FastUp's optimality reaches 90%.
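For context on why update time matters: a TCAM stores rules in priority order, so inserting a rule at a given position may force a chain of entry moves toward the nearest free slot, and every move contributes to the lookup interrupt time. The sketch below only illustrates that cost model on a toy list-based TCAM; it is a hypothetical illustration of the problem setting, not FastUp's dynamic-programming scheme.

```python
def insert_rule(tcam, pos, rule):
    """Insert `rule` at index `pos` of a TCAM modeled as a list, where None
    marks a free slot. Returns the number of entry moves (the interrupt cost)."""
    free_slots = [i for i, r in enumerate(tcam) if r is None]
    if not free_slots:
        raise RuntimeError("TCAM full")
    free = min(free_slots, key=lambda i: abs(i - pos))   # nearest free slot
    step = 1 if free > pos else -1
    moves, i = 0, free
    while i != pos:
        tcam[i] = tcam[i - step]   # shift a neighbor into the free position
        i -= step
        moves += 1
    tcam[pos] = rule
    return moves

tcam = ["r0", "r1", None, "r3", "r4"]
print(insert_rule(tcam, 0, "new"), tcam)   # 2 moves -> ['new', 'r0', 'r1', 'r3', 'r4']
```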
Citations: 4
Toward Adaptive Disk Failure Prediction via Stream Mining
Pub Date : 2020-11-01 DOI: 10.1109/ICDCS47774.2020.00044
Shujie Han, P. Lee, Zhirong Shen, Cheng He, Yi Liu, Tao Huang
We explore machine learning for accurately predicting imminent disk failures and hence providing proactive fault tolerance for modern storage systems. Current disk failure prediction approaches are mostly offline and assume that the disk logs required for training learning models are available a priori. However, in large-scale disk deployments, disk logs are often generated continuously as an evolving data stream, in which the statistical patterns vary over time (also known as concept drift). Such a challenge motivates the need for online techniques that perform training and prediction on the incoming stream of disk logs in real time, while being adaptive to concept drift. We present StreamDFP, a general stream mining framework for disk failure prediction with concept-drift adaptation. We start with a measurement study and demonstrate the existence of concept drift in various disk models based on datasets from Backblaze and Alibaba Cloud. Motivated by our study, we design StreamDFP with three key techniques, namely (i) online labeling, (ii) concept-drift-aware training, and (iii) general prediction, with the primary objective of making StreamDFP support various machine learning algorithms as a general framework. Our evaluation shows that StreamDFP improves the prediction accuracy significantly compared to approaches without concept-drift adaptation under various settings, and achieves reasonably high stream processing performance.
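To illustrate the general idea of training and predicting on a stream while adapting to drift, the sketch below runs a test-then-train loop and rebuilds the model from a recent window when rolling accuracy drops. The synthetic data, the accuracy threshold, and the window size are all made-up assumptions for illustration; this is a generic concept-drift-aware loop, not StreamDFP's labeling or training techniques.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(1)
model = SGDClassifier()
classes = np.array([0, 1])
window, acc_hist = [], []

def synthetic_batch(t, n=64):
    # Disk-log-like feature vectors; the label rule flips at t=50 to simulate drift.
    X = rng.normal(size=(n, 4))
    y = (X[:, 0] > 0).astype(int) if t < 50 else (X[:, 0] < 0).astype(int)
    return X, y

for t in range(100):
    X, y = synthetic_batch(t)
    if t > 0:
        acc_hist.append(model.score(X, y))          # test-then-train evaluation
        if len(acc_hist) >= 5 and np.mean(acc_hist[-5:]) < 0.6:
            model = SGDClassifier()                 # suspected drift: rebuild
            for Xw, yw in window[-5:]:              # retrain on recent window only
                model.partial_fit(Xw, yw, classes=classes)
            acc_hist.clear()
    model.partial_fit(X, y, classes=classes)        # incremental online update
    window.append((X, y))

print("recent accuracy:", acc_hist[-1] if acc_hist else "n/a")
```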
Citations: 17
Automatic Rule Updating based on Machine Learning in Complex Event Processing
Pub Date : 2020-11-01 DOI: 10.1109/ICDCS47774.2020.00176
Yunhao Sun, Guan-yu Li, B. Ning
Complex Event Processing (CEP) is essential in the Semantic Web of Things (SWoT), which deploys large numbers of sensor devices in settings such as smart traffic and smart cities. CEP mainly addresses heterogeneous problems in stream data processing, where streaming data is connected to the Internet by a mass of wireless sensor devices. The core work of CEP is rule updating. Existing research on rule updating is designed for static environments, and it is quite laborious to transplant those rules to dynamic environments. To enhance the portability of event rules, a method of automatic rule updating based on machine learning is proposed to learn the rules of a dynamic environment. Experimental results reveal that the proposed methods are effective and efficient.
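As a point of reference for what an event rule is in this setting, the toy sketch below evaluates a hand-written complex-event rule over a sensor stream; "updating" here just means swapping the predicate. The event types, thresholds, and window size are invented for illustration, and the machine-learning-driven updating proposed in the paper is not reproduced.

```python
from dataclasses import dataclass

@dataclass
class Event:
    kind: str
    value: float
    ts: int

def spike_and_drop(window):
    # Toy rule: fire when a high temperature and a low humidity reading
    # appear within the same window.
    latest = {e.kind: e for e in window}
    return ("temp" in latest and "humidity" in latest
            and latest["temp"].value > 30 and latest["humidity"].value < 40)

stream = [Event("temp", 33, 1), Event("humidity", 35, 2), Event("temp", 20, 3)]
WINDOW = 2
for i in range(len(stream) - WINDOW + 1):
    window = stream[i:i + WINDOW]
    if spike_and_drop(window):
        print("complex event detected at ts", window[-1].ts)
```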
Citations: 3
SineKV: Decoupled Secondary Indexing for LSM-based Key-Value Stores
Pub Date : 2020-11-01 DOI: 10.1109/ICDCS47774.2020.00071
Fei Li, Youyou Lu, Zhe Yang, J. Shu
Secondary indexing is in high demand for key-value stores, as many applications rely on it to accelerate query performance. Current secondary indices on key-value stores are typically built on top of the primary index. In a secondary-key query, the primary keys retrieved from the secondary index must be looked up in the primary index to fetch the records. This record-fetching process invokes many point lookups in the primary index and exacerbates read amplification. In this paper, we present SineKV, a decoupled Secondary indexing Key-Value store that aims to avoid fetching records through the primary index and to improve secondary-key query performance. Firstly, SineKV separates the records from the indices and keeps each index pointing to the record values independently. Secondly, SineKV proposes a mapping-based lazy index maintenance strategy to ensure the consistency of secondary indices. Finally, SineKV leverages the CMB feature of the underlying NVMe SSDs to guarantee crash consistency. We implement and evaluate SineKV against LevelDB- and WiscKey-based designs. The evaluations show that SineKV outperforms LevelDB- and WiscKey-based systems by up to 6.12× and 2.78× under microbenchmarks and mixed workloads.
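The decoupling idea can be shown with an in-memory toy: in the conventional layout the secondary index stores primary keys, so every hit needs an extra primary-index lookup, while a decoupled index points straight at value locations. This is a sketch of the concept only, not SineKV's on-disk LSM layout or its lazy maintenance mechanism.

```python
# Value log (location -> record), primary index, and two secondary index variants.
values = {0: ("alice", "dept=eng"), 1: ("bob", "dept=eng")}
primary = {"alice": 0, "bob": 1}
sec_coupled = {"dept=eng": ["alice", "bob"]}   # secondary key -> primary keys
sec_decoupled = {"dept=eng": [0, 1]}           # secondary key -> value locations

def query_coupled(sk):
    # Each hit needs a point lookup in the primary index first.
    return [values[primary[pk]] for pk in sec_coupled.get(sk, [])]

def query_decoupled(sk):
    # Hits go straight to the value log, skipping the primary index.
    return [values[loc] for loc in sec_decoupled.get(sk, [])]

assert query_coupled("dept=eng") == query_decoupled("dept=eng")
print(query_decoupled("dept=eng"))
```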
Citations: 5
A Light in the Dark Web: Linking Dark Web Aliases to Real Internet Identities
Pub Date : 2020-11-01 DOI: 10.1109/ICDCS47774.2020.00081
Ehsan Arabnezhad, Massimo La Morgia, A. Mei, E. Nemmi, Julinda Stefa
Most users have several Internet names. On Facebook or LinkedIn, for example, people usually appear under their real name. On other standard websites, such as forums, people often use aliases to protect their real identities from other users, with no real privacy against the website and the authorities. Aliases in the Dark Web are different: users expect strong identity protection. In this paper, we show that using both "open" aliases (aliases used in the standard Web) and Dark Web aliases can be dangerous per se. Indeed, we develop tools to link Dark Web aliases to open aliases. For the first time, we perform a massive-scale experiment on real scenarios: first between two Dark Web forums, then between the Dark Web forums and standard forums. Due to the large number of possible pairs, we first reduce the search space by cutting the number of potential matches down to a small set of candidates, and then select the correct alias among these candidates. We show that our methodology has excellent precision, from 87% to 94%, and recall around 80%.
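The two-stage structure described above (prune the pair space, then pick the best candidate) can be sketched generically: a cheap blocking filter followed by a text-similarity ranking. The posts, the word-overlap filter, and the character n-gram cosine score below are placeholder assumptions for illustration; they are not the features or models used in the paper.

```python
from collections import Counter
import math

def ngrams(text, n=3):
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    # Cosine similarity between two n-gram count vectors.
    num = sum(a[k] * b[k] for k in a)
    return num / ((math.sqrt(sum(v * v for v in a.values())) *
                   math.sqrt(sum(v * v for v in b.values()))) or 1.0)

dark_posts = {"shadow42": "selling vpn access, pm me on the usual channel"}
open_posts = {"alice": "photos from my trip to rome last weekend",
              "v42": "selling vpn access, pm me on the usual channel please"}

for dark_alias, dark_text in dark_posts.items():
    # Stage 1: cheap blocking, keep only open aliases sharing some vocabulary.
    candidates = [a for a, t in open_posts.items()
                  if set(dark_text.split()) & set(t.split())]
    # Stage 2: rank the surviving candidates by stylistic similarity.
    best = max(candidates, key=lambda a: cosine(ngrams(dark_text), ngrams(open_posts[a])))
    print(dark_alias, "->", best)
```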
Citations: 5
Dragon: A Lightweight, High Performance Distributed Stream Processing Engine
Pub Date : 2020-11-01 DOI: 10.1109/ICDCS47774.2020.00177
A. Harwood, M. Read, Gayashan Amarasinghe
The performance of a distributed stream processing engine is traditionally considered in terms of fundamental measurements of latency and throughput. Recently, Apache Storm has demonstrated sub-millisecond latencies for inter-component tuple transmission, though it does so through aggressive throttling that leads to strict throughput limitations in order to keep tuple queues near empty. On the other hand, Apache Heron has excellent throughput characteristics, especially when operating near unstable conditions, but its inter-component latencies typically start around 10 milliseconds. Both of these systems require roughly 650MB of installation space. We have developed Dragon, loosely based on the same API as Storm and Heron, which is both lightweight, requiring just 7.5MB of installation space, and competitive in performance with Storm and Heron. In this paper we show experiments with all three systems using the Word Count benchmark. Dragon achieves throughput characteristics close to those of Heron and inter-component latencies of less than 10ms under high load. In particular, Dragon's maximum latency is significantly less than Storm's maximum latency under high load. Finally, Dragon remained stable at higher effective throughput than Heron. We believe Dragon is a good "all-rounder" solution and is particularly suitable for edge computing applications, given its small installation footprint.
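For readers unfamiliar with the Word Count benchmark mentioned above, it is a small topology that splits sentences into words and maintains running counts across components. The plain-Python generator pipeline below sketches that dataflow (source, splitter, counter); it is not Dragon's, Storm's, or Heron's actual API.

```python
from collections import Counter
import itertools

def source(sentences):
    # "Spout": emit an endless stream of sentences.
    yield from itertools.cycle(sentences)

def splitter(sentences):
    # First "bolt": split each sentence into word tuples.
    for s in sentences:
        yield from s.split()

def counter(words, report_every=10):
    # Second "bolt": maintain running counts and report periodically.
    counts = Counter()
    for i, w in enumerate(words, 1):
        counts[w] += 1
        if i % report_every == 0:
            yield dict(counts)

pipeline = counter(splitter(source(["the quick brown fox", "jumps over the lazy dog"])))
print(next(pipeline))   # first periodic count report
```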
Citations: 0
Achieving High Utilization for Approximate Fair Queueing in Data Center
Pub Date : 2020-11-01 DOI: 10.1109/ICDCS47774.2020.00099
Jingling Liu, Jiawei Huang, Ning Jiang, Weihe Li, Jianxin Wang
Modern data centers often host multiple applications with diverse network demands. To provide fair bandwidth allocation among several thousand traversing flows, Approximate Fair Queueing (AFQ) utilizes multiple priority queues in the switch to approximate ideal fair queueing. However, due to the limited number of queues in commodity switches, AFQ easily suffers high packet loss and low link utilization. In this paper, we propose Elastic Fair Queueing (EFQ), which leverages limited priority queues to flexibly achieve both high network utilization and fair bandwidth allocation. EFQ dynamically assigns the free buffer space in priority queues for each packet to obtain high utilization without sacrificing flow-level fairness. The results of simulation experiments and real implementations show that EFQ reduces the average flow completion time by up to 82% compared with state-of-the-art fair bandwidth allocation mechanisms.
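To see how a small, fixed number of queues can approximate fair queueing, the sketch below stamps each packet with its flow's virtual "round" (bytes already scheduled divided by a per-round quantum) and maps it to round mod N queues, so heavy flows are pushed to later queues. The quantum and packet trace are invented; this illustrates only the AFQ-style rotation idea, not EFQ's dynamic assignment of free buffer space.

```python
from collections import defaultdict

NUM_QUEUES, BYTES_PER_ROUND = 4, 1000
queues = defaultdict(list)        # queue index -> enqueued packets
flow_bytes = defaultdict(int)     # flow id -> bytes scheduled so far

def enqueue(flow, size, current_round=0):
    # A flow's round grows with the bytes it has sent, deferring heavy flows.
    round_no = max(current_round, flow_bytes[flow] // BYTES_PER_ROUND)
    flow_bytes[flow] += size
    queue_idx = round_no % NUM_QUEUES
    queues[queue_idx].append((flow, size))
    return queue_idx

# Heavy flow A gets pushed to a later queue while small flow B stays early.
for pkt in [("A", 1500), ("B", 100), ("A", 1500), ("B", 100), ("B", 100)]:
    print(pkt, "-> queue", enqueue(*pkt))
```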
Citations: 4
EdgeProg: Edge-centric Programming for IoT Applications
Pub Date : 2020-11-01 DOI: 10.1109/ICDCS47774.2020.00038
Borui Li, Wei Dong
IoT application development usually involves separate programming at the device side and the server side. While this separate programming style is sufficient for many simple applications, it is not suitable for applications that involve complex interactions and intensive data processing. We propose EdgeProg, an edge-centric programming approach to simplify IoT application programming, motivated by the increasing popularity of edge computing. With EdgeProg, users can write application logic in a centralized manner with an augmented If-This-Then-That (IFTTT) syntax and a virtual sensor mechanism. The program is processed at the edge server, which automatically generates the actual application code and intelligently partitions it into device code and server code to achieve optimal latency. EdgeProg employs dynamic linking and loading to deploy the device code on a variety of IoT devices, which do not run any application-specific code at the start. Results show that EdgeProg achieves an average reduction of 20.96% and 79.41% in execution latency and lines of code, respectively, compared with state-of-the-art approaches.
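The sketch below gives a flavor of a centrally written IFTTT-style rule and a latency-driven decision about where to evaluate it. The rule format, the cost numbers, and the placement heuristic are all hypothetical assumptions for illustration; they are not EdgeProg's actual augmented syntax or code-generation and partitioning algorithm.

```python
# A centrally written if-this-then-that rule over a hypothetical virtual sensor.
rule = {
    "if":   {"sensor": "room.temperature", "op": ">", "value": 30},
    "then": {"actuator": "room.fan", "action": "on"},
}

# Made-up per-step cost model (milliseconds): evaluate the condition on-device
# versus ship the reading to the edge server and evaluate it there.
costs = {"on_device": {"eval": 2.0, "network": 0.0},
         "on_edge":   {"eval": 0.5, "network": 8.0}}

def place(step_costs):
    # Naive latency-driven partitioning: pick whichever side finishes sooner.
    return min(step_costs, key=lambda side: sum(step_costs[side].values()))

placement = place(costs)
print("evaluate", rule["if"], "at:", placement)   # -> on_device (2.0 ms < 8.5 ms)
```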
Citations: 7