Industrial IoT applications often require both dependability and flexibility from the underlying networks. Restructuring production lines brings topological changes that directly affect the interference levels per link. When a scheduled network, e.g. IEEE 802.15.4-TSCH (Time Synchronized Channel Hopping), is used to ensure dependability in low-power networks, transmissions must be rescheduled to re-establish effective and reliable end-to-end communication. Typical approaches focus on either centralized or distributed schedulers, with little attention paid to how the chosen solution performs compared to other solutions or in different topologies. In this work, we introduce the concept of online assessment of TSCH schedules and present an automated method for evaluating schedules that takes internal interference and conflicts into consideration. The network and its TSCH schedule are mapped to a common representation, the interference graph, which is easy to analyze. Experimental results suggest that this evaluation method reflects the performance of the network as measured by packet reception ratio, end-to-end delivery ratio, and latency.
T. Lee, A. Liotta, and Georgios Exarchakos, "Time-Scheduled Network Evaluation Based on Interference," 2018 IEEE International Conference on Cloud Engineering (IC2E), May 16, 2018. doi:10.1109/IC2E.2018.00063
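The core idea above, mapping a network and its schedule to an interference graph, can be sketched as follows. The cell representation and the `interferes` predicate are hypothetical simplifications for illustration, not the paper's actual model:

```python
from itertools import combinations

def interference_graph(cells, interferes):
    """Build a graph over scheduled TSCH cells.

    cells: list of (timeslot, channel, (tx, rx)) tuples.
    interferes(a, b): True if links a and b are within interference range.
    Two cells are connected if they share a timeslot and either involve a
    common node (a conflict) or use the same channel on interfering links.
    """
    edges = set()
    for (i, a), (j, b) in combinations(enumerate(cells), 2):
        slot_a, ch_a, link_a = a
        slot_b, ch_b, link_b = b
        if slot_a != slot_b:
            continue  # cells in different timeslots never collide
        if set(link_a) & set(link_b):  # a node appears in both cells
            edges.add((i, j))
        elif ch_a == ch_b and interferes(link_a, link_b):
            edges.add((i, j))
    return edges

# Toy schedule: two cells colliding in slot 0 on channel 11.
cells = [(0, 11, ("A", "B")), (0, 11, ("C", "D")), (1, 11, ("A", "C"))]
print(interference_graph(cells, lambda x, y: True))  # {(0, 1)}
```

A denser interference graph then signals a schedule more likely to suffer internal collisions, which is what the online assessment exploits.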
Upcoming exascale high-performance computing (HPC) systems are expected to comprise a multi-tier storage hierarchy, and thus will necessitate innovative storage and I/O mechanisms. Traditional disk- and block-based interfaces and file systems face severe challenges in utilizing the capabilities of storage hierarchies due to the lack of hierarchy support and semantic interfaces. Object-based and semantically rich data abstractions for scientific data management on large-scale systems offer a sustainable solution to these challenges. Such data abstractions can also simplify users' involvement in data movement. In this paper, we take the first steps toward realizing such an object abstraction and explore storage mechanisms for these objects to enhance I/O performance, especially for scientific applications. We explore how an object-based interface can facilitate next-generation scalable computing systems by presenting the mapping of data I/O from two real-world HPC scientific use cases: a plasma physics simulation code (VPIC) and a cosmology simulation code (HACC). Our storage model stores data objects in different physical organizations to support data movement across layers of the memory/storage hierarchy. Our implementation scales well to 16K parallel processes, and compared to the state of the art, such as MPI-IO and HDF5, our object-based data abstractions and data placement strategy in a multi-level storage hierarchy achieve up to 7× I/O performance improvement for scientific data.
Bharti Wadhwa, S. Byna, and A. Butt, "Toward Transparent Data Management in Multi-Layer Storage Hierarchy of HPC Systems," 2018 IEEE International Conference on Cloud Engineering (IC2E), May 16, 2018. doi:10.1109/IC2E.2018.00046
We present OPTiC, a multi-tenant scheduler intended for distributed graph processing frameworks. OPTiC proposes opportunistic scheduling, whereby queued jobs can be pre-scheduled at cluster nodes when the cluster is fully busy running jobs. This allows overlapping of data ingress with ongoing computation. To pre-schedule wisely, OPTiC's novel contribution is a profile-free and cluster-agnostic approach to compare progress of graph processing jobs. OPTiC is implemented inside Apache Giraph, with YARN underneath. Our experiments with real workload traces and network models show that OPTiC's opportunistic scheduling improves run time (both at the median and at the tail) by 20%-82% compared to baseline multi-tenancy, in a variety of scenarios.
Muntasir Raihan Rahman, Indranil Gupta, Akash Kapoor, and Haozhen Ding, "OPTiC: Opportunistic Graph Processing in Multi-Tenant Clusters," 2018 IEEE International Conference on Cloud Engineering (IC2E), May 16, 2018. doi:10.1109/IC2E.2018.00034
W. Lloyd, S. Ramesh, Swetha Chinthalapati, Lan Ly, S. Pallickara
Serverless computing platforms provide Function(s)-as-a-Service (FaaS) to end users while promising reduced hosting costs, high availability, fault tolerance, and dynamic elasticity for hosting individual functions known as microservices. Serverless computing environments, unlike Infrastructure-as-a-Service (IaaS) cloud platforms, abstract infrastructure management, including the creation of virtual machines (VMs), operating system containers, and request load balancing, away from users. To conserve cloud server capacity and energy, cloud providers allow hosting infrastructure to go COLD, deprovisioning containers when service demand is low and freeing infrastructure to be harnessed by others. In this paper, we present results from our comprehensive investigation into the factors that influence microservice performance afforded by serverless computing. We examine hosting implications related to infrastructure elasticity, load balancing, provisioning variation, infrastructure retention, and memory reservation size. We identify four states of serverless infrastructure, namely provider cold, VM cold, container cold, and warm, and demonstrate how microservice performance varies by up to 15x across these states.
W. Lloyd, S. Ramesh, Swetha Chinthalapati, Lan Ly, and S. Pallickara, "Serverless Computing: An Investigation of Factors Influencing Microservice Performance," 2018 IEEE International Conference on Cloud Engineering (IC2E), April 17, 2018. doi:10.1109/IC2E.2018.00039
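The four infrastructure states can be pictured as a request paying progressively fewer provisioning steps. The step names below mirror the states from the abstract, but the mapping itself is only an illustrative mental model, not the paper's measurement methodology:

```python
# Hypothetical provisioning steps a request pays for at each infrastructure
# state; names follow the paper's four states, costs are illustrative only.
STEPS = ["allocate_host", "boot_vm", "start_container", "run_function"]

STATE_FIRST_STEP = {
    "provider_cold": 0,   # nothing provisioned yet: pay every step
    "vm_cold": 1,         # host known, but a VM must boot
    "container_cold": 2,  # VM up, but a container must start
    "warm": 3,            # only the function body runs
}

def steps_paid(state):
    """Return the provisioning steps a request in `state` must wait for."""
    return STEPS[STATE_FIRST_STEP[state]:]

print(steps_paid("warm"))           # ['run_function']
print(steps_paid("provider_cold"))  # all four steps
```

The further left a request lands in this chain, the larger the latency penalty, which is consistent with the up-to-15x variation the study reports.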
Mathias Björkqvist, C. Cachin, Felix Engelmann, A. Sorniotti
As the use of cryptography increases in all areas of computing, efficient solutions for key management in distributed systems are needed. Large deployments in the cloud can require millions of keys for thousands of clients. Current approaches for serving keys rely on centralized components, which do not scale as desired. This work reports on the realization of a key manager that uses an untrusted distributed key-value store (KVS) and offers consistent key distribution over the Key Management Interoperability Protocol (KMIP). To achieve confidentiality, it uses a key hierarchy in which every key except the root key is encrypted by its parent key. The hierarchy also allows for key rotation and, ultimately, for secure deletion of data. The design permits key rotation to proceed concurrently with key-serving operations. A prototype was integrated with IBM Spectrum Scale, a highly scalable cluster file system, where it serves keys for file encryption. Linear scalability was achieved even under load from concurrent key updates. The implementation shows that the approach is viable, works as intended, and is suitable for high-throughput key serving in cloud platforms.
Mathias Björkqvist, C. Cachin, Felix Engelmann, and A. Sorniotti, "Scalable Key Management for Distributed Cloud Storage," 2018 IEEE International Conference on Cloud Engineering (IC2E), April 17, 2018. doi:10.1109/IC2E.2018.00051
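The key hierarchy described above (each key wrapped by its parent, enabling rotation and secure deletion) can be sketched as follows. The XOR-keystream `wrap` is a deliberately toy stand-in for a real wrapping cipher such as AES key wrap; it is here only to make the structure concrete and must not be taken as secure:

```python
import hashlib
import secrets

def _stream(key, n):
    # Toy keystream derived from the key; NOT a real cipher.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def wrap(parent, child):
    """Encrypt `child` under `parent` (toy XOR construction)."""
    return bytes(a ^ b for a, b in zip(child, _stream(parent, len(child))))

unwrap = wrap  # XOR with the same keystream is its own inverse

root = secrets.token_bytes(32)
file_key = secrets.token_bytes(32)
stored = wrap(root, file_key)  # only the wrapped form is persisted in the KVS

# Key rotation: re-wrap the child under a fresh parent; the child is unchanged,
# so rotation can proceed while the old wrapped copy is still being served.
new_root = secrets.token_bytes(32)
stored = wrap(new_root, unwrap(root, stored))
assert unwrap(new_root, stored) == file_key

# Secure deletion: discarding new_root makes `stored` unrecoverable.
```

The same re-wrapping step, applied level by level, is what lets a deep hierarchy rotate keys without touching the bulk data they protect.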
Publish–subscribe middleware is a popular technology for facilitating device-to-device communication in large-scale distributed Internet of Things (IoT) scenarios. However, the stringent quality of service (QoS) requirements imposed by many applications cannot be met by cloud-based solutions alone. Edge computing is considered a key enabler for such applications. Client mobility and dynamic resource availability are prominent challenges in edge computing architectures. In this paper, we present EMMA, an edge-enabled publish–subscribe middleware that addresses these challenges. EMMA continuously monitors network QoS and orchestrates a network of MQTT protocol brokers. It transparently migrates MQTT clients to brokers in close proximity to optimize QoS. Experiments in a real-world testbed show that EMMA can significantly reduce end-to-end latencies incurred by network link usage, even in the face of client mobility and unpredictable resource availability.
T. Rausch, Stefan Nastic, and S. Dustdar, "EMMA: Distributed QoS-Aware MQTT Middleware for Edge Computing Applications," 2018 IEEE International Conference on Cloud Engineering (IC2E), April 17, 2018. doi:10.1109/IC2E.2018.00043
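A minimal sketch of latency-driven broker selection, assuming a monitored per-client latency map; the hysteresis threshold is a hypothetical detail added here to avoid oscillation, not a parameter taken from EMMA:

```python
def select_broker(latencies, current, hysteresis_ms=10.0):
    """Pick the broker with the lowest measured latency for a client.

    latencies: dict broker -> round-trip latency in ms (monitored QoS).
    The client only migrates when the best broker beats its current one
    by more than `hysteresis_ms`, so near-ties do not cause flapping.
    (Illustrative policy; EMMA's reconnection logic is richer.)
    """
    best = min(latencies, key=latencies.get)
    if latencies[current] - latencies[best] > hysteresis_ms:
        return best
    return current

latencies = {"edge-a": 4.2, "edge-b": 3.9, "cloud": 48.0}
print(select_broker(latencies, current="cloud"))   # migrates to 'edge-b'
print(select_broker(latencies, current="edge-a"))  # stays on 'edge-a'
```

Running such a check whenever monitoring reports fresh latencies is one simple way a middleware can keep mobile clients attached to nearby brokers.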
Michael Hornacek, D. Schall, Philipp Glira, Sebastian Geiger, Andreas Egger, Andrei Filip, C. Windisch, Mike Liepe
Operators of pipeline infrastructure buried underground are in many countries required to ensure that depth of cover—a measure of the quantity of soil covering a pipeline—lies within prescribed bounds. Traditionally, monitoring depth of cover at scale has been carried out qualitatively by means of visual inspection. We instead rely on airborne remote sensing techniques to obtain densely sampled ground surface point measurements from the pipeline's right of way, from which we determine depth of cover using automated algorithms. This presents a reproducible, quantitative approach to monitoring depth of cover, yet the demands that the scale of real-world pipeline monitoring scenarios places on compute and storage resources can be substantial. We show that the scalability afforded by the cloud can be leveraged to address such scenarios, distributing the algorithms we employ to take advantage of multiple compute nodes and exploiting elastic storage. While the use case underlying this paper is monitoring depth of cover, our proposed architecture applies more broadly to a wide variety of geospatial analytics tasks carried out 'in the large', including change detection, semantic classification or segmentation, and computation of vegetation indices.
Michael Hornacek, D. Schall, Philipp Glira, Sebastian Geiger, Andreas Egger, Andrei Filip, C. Windisch, and Mike Liepe, "Geospatial Analytics in the Large for Monitoring Depth of Cover for Buried Pipeline Infrastructure," 2018 IEEE International Conference on Cloud Engineering (IC2E), April 17, 2018. doi:10.1109/IC2E.2018.00049
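Given densely sampled surface elevations and a pipe elevation profile, depth of cover reduces to a per-station subtraction. A minimal sketch, where the 1.2 m bound and the per-station dict layout are illustrative assumptions, not values from the paper:

```python
def depth_of_cover(surface_pts, pipe_top, min_depth=1.2):
    """Depth of cover per station along a pipeline.

    surface_pts: dict station -> ground elevation in metres (e.g. from
    airborne LiDAR). pipe_top: dict station -> elevation of the top of
    the pipe. Returns stations whose cover falls below `min_depth`
    (an illustrative regulatory bound).
    """
    violations = {}
    for station, ground in surface_pts.items():
        cover = ground - pipe_top[station]  # soil thickness above the pipe
        if cover < min_depth:
            violations[station] = round(cover, 2)
    return violations

surface = {0: 101.5, 50: 101.1, 100: 100.4}
pipe = {0: 100.0, 50: 100.0, 100: 99.5}
print(depth_of_cover(surface, pipe))  # {50: 1.1, 100: 0.9}
```

The cloud-scale part of the problem is not this arithmetic but running it, plus the point-cloud processing that feeds it, over thousands of kilometres of right of way.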
Wei-Tsung Lin, C. Krintz, R. Wolski, Michael Zhang, Xiaogang Cai, Tongjun Li, W. Xu
Serverless computing is a new cloud programming and deployment paradigm that is receiving widespread uptake. Serverless offerings such as Amazon Web Services (AWS) Lambda, Google Functions, and Azure Functions automatically execute simple functions uploaded by developers, in response to cloud-based event triggers. The serverless abstraction greatly simplifies the integration of concurrency and parallelism into cloud applications, and enables deployment of scalable distributed systems and services at very low cost. Although a significant first step, the serverless abstraction requires tools that software engineers can use to reason about, debug, and optimize their increasingly complex, asynchronous applications. Toward this end, we investigate the design and implementation of GammaRay, a cloud service that extracts causal dependencies across functions and through cloud services, without programmer intervention. We implement GammaRay for AWS Lambda and evaluate the overheads that it introduces for serverless micro-benchmarks and applications written in Python.
Wei-Tsung Lin, C. Krintz, R. Wolski, Michael Zhang, Xiaogang Cai, Tongjun Li, and W. Xu, "Tracking Causal Order in AWS Lambda Applications," 2018 IEEE International Conference on Cloud Engineering (IC2E), April 17, 2018. doi:10.1109/IC2E.2018.00027
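One classic way to make causal order across invocations explicit, shown here purely as a conceptual sketch, is a Lamport-style clock carried through event records; note that GammaRay itself extracts these dependencies without programmer intervention, so the explicit `trigger` threading below is a hypothetical simplification:

```python
def invoke(name, trigger=None):
    """Record a function invocation caused by `trigger` (another record).

    Each record carries a logical clock so that causally related
    invocations are ordered: cause.clock < effect.clock.
    """
    clock = (trigger["clock"] + 1) if trigger else 1
    cause = trigger["name"] if trigger else None
    return {"name": name, "clock": clock, "cause": cause}

# A toy Lambda-style chain: an object upload triggers a resize function,
# whose completion triggers a notification function.
upload = invoke("s3_put")
resize = invoke("resize_fn", trigger=upload)
notify = invoke("notify_fn", trigger=resize)
print([(r["name"], r["clock"]) for r in (upload, resize, notify)])
# [('s3_put', 1), ('resize_fn', 2), ('notify_fn', 3)]
```

Sorting records by clock then yields an order consistent with happens-before, which is the property a causal-order tracker must preserve.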
M. Khan, Tobias Becker, Perumal Kuppuudaiyar, A. Elster
Building and successfully deploying applications on high-end heterogeneous resources such as GPUs, MICs, or FPGAs, typically with several library dependencies, is a complex task. Modern containerization provides a lightweight virtualization environment that can help solve some of these complex deployment and execution issues. By using containers, the software can be packaged with all its dependencies, tested in a single environment, and easily deployed on heterogeneous architectures. In this paper, we present our experiences with the container-based virtualized solutions that we developed for the use-case applications in the EU H2020 project CloudLightning. We present specifics of their management and orchestration with specific software on the heterogeneous resources of our test bed. The use cases include a genomics application targeting FPGA-based DFEs, an upscaling application for reservoir modeling, a ray tracing application targeting the MIC (Intel Xeon Phi co-processor), and a BLAS application with libraries optimized for both CPU and GPU. An overview of the CloudLightning project, and of how the use-case applications have been developed for use as cloud services in its self-organizing, self-managing cloud technology, is also included.
M. Khan, Tobias Becker, Perumal Kuppuudaiyar, and A. Elster, "Container-Based Virtualization for Heterogeneous HPC Clouds: Insights from the EU H2020 CloudLightning Project," 2018 IEEE International Conference on Cloud Engineering (IC2E), April 17, 2018. doi:10.1109/IC2E.2018.00074
Many applications, like geo-replication, need to deliver multiple copies of data from a single datacenter to multiple datacenters, which improves fault tolerance, increases availability, and achieves high service quality. These applications usually require completing multicast transfers before certain deadlines. Some existing works consider only unicast transfers, which is not appropriate for the multicast transmission type. An alternative approach in existing works is to find a minimum-weight Steiner tree for each transfer. Instead of using only one tree per transfer, we propose to use one or multiple trees, which increases the flexibility of routing, improves the utilization of available bandwidth, and increases the throughput of each transfer. In this paper, we focus on the multicast transmission type and propose an efficient and effective solution that maximizes throughput for all transfer requests while meeting deadlines. We also show that our solution can reduce packet reordering by selecting very few Steiner trees for each transfer. We have implemented our solution on a software-defined overlay network at the application layer, and our real-world experiments on the Google Cloud Platform show that our system effectively improves network throughput and has a lower traffic rejection rate than existing related works.
Siqi Ji, Shuhao Liu, and Baochun Li, "Deadline-Aware Scheduling and Routing for Inter-Datacenter Multicast Transfers," 2018 IEEE International Conference on Cloud Engineering (IC2E), April 17, 2018. doi:10.1109/IC2E.2018.00035
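Building a single low-weight multicast tree, the baseline the paper generalizes, can be approximated with the classic shortest-path heuristic for Steiner trees, sketched below; the paper's selection of multiple trees per transfer is more sophisticated, and the topology here is a made-up four-datacenter example:

```python
import heapq

def dijkstra(graph, src):
    # graph: {u: {v: weight}}; returns distance and predecessor maps.
    dist, prev = {src: 0}, {}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(pq, (d + w, v))
    return dist, prev

def steiner_tree(graph, source, terminals):
    """Shortest-path heuristic for a low-weight multicast tree.

    Repeatedly attaches the terminal closest to the partially built tree
    via its shortest path (the classic 2-approximation-style heuristic).
    """
    tree_nodes, edges = {source}, set()
    remaining = set(terminals) - {source}
    while remaining:
        best = None  # (distance, terminal, predecessor map)
        for root in tree_nodes:
            dist, prev = dijkstra(graph, root)
            for t in remaining:
                if t in dist and (best is None or dist[t] < best[0]):
                    best = (dist[t], t, prev)
        _, t, prev = best
        remaining.discard(t)
        while t not in tree_nodes:  # splice the path into the tree
            edges.add((prev[t], t))
            tree_nodes.add(t)
            t = prev[t]
    return edges

# Four datacenters; dc1 multicasts to dc3 and dc4.
g = {"dc1": {"dc2": 1, "dc3": 4}, "dc2": {"dc1": 1, "dc3": 1, "dc4": 5},
     "dc3": {"dc1": 4, "dc2": 1, "dc4": 1}, "dc4": {"dc2": 5, "dc3": 1}}
print(sorted(steiner_tree(g, "dc1", {"dc3", "dc4"})))
# [('dc1', 'dc2'), ('dc2', 'dc3'), ('dc3', 'dc4')]  (total weight 3)
```

Using several such trees per transfer, as the paper proposes, lets a scheduler spread a multicast over more of the available inter-datacenter bandwidth than any single tree could use.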