
2022 IEEE Cloud Summit: Latest Publications

Quantitative Evaluation of Cloud Elasticity based on Fuzzy Analytic Hierarchy Process
Pub Date: 2022-10-01 | DOI: 10.1109/CloudSummit54781.2022.00022
Bolin Yang, Fan Zhang, S. Khan
Elasticity is one of the most important cloud computing characteristics: it enables deployed applications to dynamically adapt to changing workload demands by acquiring and releasing shared computing resources at runtime. However, the existing cloud elasticity metrics are either oversimplified or hard to use, and thus lack a comprehensive evaluation mechanism for properly comparing the elastic features of different cloud providers. To address this gap, we propose an assessment method for cloud elasticity based on fuzzy hierarchical analysis. We use a fuzzy hierarchical model to quantitatively assess the qualitative metrics under a unified standard model. We compare three public cloud providers (Ali Cloud, HUAWEI Cloud, Tencent Cloud) as case studies and measure their cloud elasticity on a cluster based on the proposed model. To verify the effectiveness of our method, we also measure the three cloud platforms using the auto scaling performance metrics proposed by the SPEC Cloud Group. The results show that our proposed elasticity quantification method is feasible.
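To make the fuzzy AHP weighting step concrete, below is a minimal sketch of Buckley's geometric-mean method with triangular fuzzy numbers. The three criteria and the pairwise judgments are illustrative placeholders, not values from the paper.

```python
# Fuzzy-AHP weighting sketch (Buckley's geometric-mean method).
# Criteria and pairwise judgments are hypothetical, e.g. scaling speed,
# resource precision, and cost efficiency for cloud elasticity.
import numpy as np

# Triangular fuzzy pairwise comparisons (l, m, u) for 3 criteria.
M = np.array([
    [[1, 1, 1],       [2, 3, 4],     [4, 5, 6]],
    [[1/4, 1/3, 1/2], [1, 1, 1],     [1, 2, 3]],
    [[1/6, 1/5, 1/4], [1/3, 1/2, 1], [1, 1, 1]],
])

n = M.shape[0]
# Fuzzy geometric mean per row: r_i = (prod_j M_ij)^(1/n), component-wise.
r = np.prod(M, axis=1) ** (1.0 / n)              # shape (n, 3): (l, m, u)
# Fuzzy weight: w_i = r_i * (sum_i r_i)^(-1); inverting a TFN swaps bounds.
s = r.sum(axis=0)                                # (sum_l, sum_m, sum_u)
w_fuzzy = r * np.array([1 / s[2], 1 / s[1], 1 / s[0]])
# Centroid defuzzification, then normalize to crisp criterion weights.
w = w_fuzzy.mean(axis=1)
w /= w.sum()
print("crisp criterion weights:", np.round(w, 3))
```

The crisp weights would then score each provider's measured elasticity metrics; the paper's actual criteria hierarchy and judgment values are not reproduced here.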
Citations: 0
Facial Expression Recognition System on a Distributed Edge-Cloud Infrastructure
Pub Date: 2022-10-01 | DOI: 10.1109/CloudSummit54781.2022.00014
Kai Cui, Guoting Zhang, Fan Zhang, S. Khan
Time-sensitive AI applications usually pre-process the raw data on edge devices without having to offload it all to the cloud. However, deploying AI applications on a distributed edge-cloud infrastructure is still an open issue, since there is no existing rule for separating the roles between the edge and the cloud. In this paper, we implemented a Facial Expression Recognition (FER) system, as a case-study AI application, on an edge-cloud infrastructure to bridge this gap. The FER system is distributed, fault-tolerant, performant, and completely edge-cloud separated. FER performs lightweight algorithms, such as extracting facial feature points, on the edge, while it performs heavyweight algorithms, such as deep neural network inference, on the cloud. We performed experiments on different cloud providers and found that, by transferring only the feature data to the cloud instead of all the raw data, we reduced the network overhead significantly and improved performance by 25% compared with deploying the system solely on the cloud.
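The edge/cloud split can be illustrated with a minimal sketch: the edge extracts a compact facial-feature vector and ships only that to a cloud inference endpoint. The endpoint URL and the landmark extractor below are hypothetical stand-ins, not the authors' implementation.

```python
# Edge-cloud split sketch: lightweight feature extraction on the edge,
# heavyweight classification behind a (hypothetical) cloud HTTP API.
import json
import numpy as np
import requests

CLOUD_ENDPOINT = "http://cloud.example.com/fer/infer"  # hypothetical URL

def extract_landmarks(frame: np.ndarray) -> np.ndarray:
    """Lightweight edge step (placeholder): a real system would run a
    facial-landmark detector; here we emit a 68x2 dummy vector."""
    return np.zeros((68, 2), dtype=np.float32)

def classify_on_cloud(landmarks: np.ndarray) -> str:
    """Heavyweight cloud step: deep-network inference via HTTP."""
    payload = {"features": landmarks.flatten().tolist()}
    resp = requests.post(CLOUD_ENDPOINT, data=json.dumps(payload),
                         headers={"Content-Type": "application/json"},
                         timeout=5)
    resp.raise_for_status()
    return resp.json()["expression"]

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in camera frame
features = extract_landmarks(frame)  # ~0.5 KB instead of ~900 KB of raw pixels
# label = classify_on_cloud(features)  # needs a live endpoint to run
```

The bandwidth arithmetic in the comment (68×2 float32 values versus a 480×640×3 frame) is what drives the reported reduction in network overhead.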
Citations: 0
The Cost of Virtualizing Time in Linux Containers
Pub Date: 2022-10-01 | DOI: 10.1109/CloudSummit54781.2022.00016
X. Merino, Carlos E. Otero
Containerization has enabled applications to be deployed in ever-changing environments: restarted, updated, migrated, and frequently rolled back to earlier versions. Because host placement and scheduling are not guaranteed, an application may be restarted on a different host or at a later time, losing its sense of time and refusing service owing to incongruent states or network timeouts. Until now, process time was determined by the host. The most recent Linux time namespace allows for per-service timelines, regardless of the host. Because container engines do not yet support the time namespace, we offer a workflow for creating time-aware containers, as well as the first performance analysis of virtualizing time in Linux containers using this namespace. We consider 11 time-related system calls and their vDSO variants, making this one of the most comprehensive studies on the overhead of time virtualization in the literature. Our findings show that time virtualization adds 2-4% overhead, in line with current containerization overhead.
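A micro-benchmark of the kind the paper describes can be sketched as below: measure the per-call cost of time queries on the host, then again inside a time namespace, and compare. The `unshare` flags assume a recent util-linux (2.36 or later); the loop count and clock choices are illustrative.

```python
# bench.py - per-call cost of time queries. Run once on the host and once
# inside a time namespace, e.g.:
#   sudo unshare --time --boottime 86400 python3 bench.py
# (flag names assume util-linux >= 2.36), then compare the two runs.
import time

N = 1_000_000

def bench(clock_id: int, label: str) -> None:
    t0 = time.perf_counter()
    for _ in range(N):
        time.clock_gettime(clock_id)
    dt = time.perf_counter() - t0
    print(f"{label}: {dt / N * 1e9:.1f} ns/call")

# CLOCK_MONOTONIC and CLOCK_BOOTTIME are the clocks a time namespace
# offsets; CLOCK_REALTIME is not virtualized and serves as a baseline.
bench(time.CLOCK_MONOTONIC, "CLOCK_MONOTONIC")
bench(time.CLOCK_BOOTTIME, "CLOCK_BOOTTIME")
bench(time.CLOCK_REALTIME, "CLOCK_REALTIME")
```

Differences between the namespaced and host runs approximate the virtualization overhead the paper quantifies at 2-4%, though the authors' harness covers 11 syscalls and their vDSO variants rather than this single call path.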
Citations: 0
Towards Extreme and Sustainable Graph Processing for Urgent Societal Challenges in Europe
Pub Date: 2022-10-01 | DOI: 10.1109/CloudSummit54781.2022.00010
R.-C. Prodan, Dragi Kimovski, Andrea Bartolini, Michael Cochez, A. Iosup, E. Kharlamov, Jože M. Rožanec, Laurentiu A. Vasiliu, A. Varbanescu
The Graph-Massivizer project, funded by the Horizon Europe research and innovation program, researches and develops a high-performance, scalable, and sustainable platform for information processing and reasoning based on the massive graph (MG) representation of extreme data. It delivers a toolkit of five open-source software tools and FAIR graph datasets covering the sustainable lifecycle of processing extreme data as MGs. The tools focus on holistic usability (from extreme data ingestion and MG creation), automated intelligence (through analytics and reasoning), performance modelling, and environmental sustainability tradeoffs, supported by credible data-driven evidence across the computing continuum. The automated operation uses the emerging serverless computing paradigm for efficiency and event responsiveness. Thus, it supports experienced and novice stakeholders from a broad group of large and small organisations to capitalise on extreme data through MG programming and processing. Graph-Massivizer validates its innovation on four complementary use cases considering their extreme data properties and coverage of the three sustainability pillars (economy, society, and environment): sustainable green finance, global environment protection foresight, green AI for the sustainable automotive industry, and data centre digital twin for exascale computing. Graph-Massivizer promises 70% more efficient analytics than AliGraph and 30% better energy awareness for extract, transform, and load (ETL) storage operations than Amazon Redshift. Furthermore, it aims to demonstrate a possible two-fold improvement in data centre energy efficiency and over 25% lower greenhouse gas emissions for basic graph operations.
Citations: 8
Context-Aware Feature Selection using Denoising Auto-Encoder for Fault Detection in Cloud Environments
Pub Date: 2022-10-01 | DOI: 10.1109/CloudSummit54781.2022.00019
Razieh Abbasi Ghalehtaki, Amin Ebrahimzadeh, R. Glitho
Machine learning is expected to play an instrumental role in automating the detection of faults in next-generation cloud networks. The existing machine-learning-based fault detection methods suffer from the following drawbacks: (i) ignoring the issue of missing feature values, (ii) ignoring the impact of each feature on the output prediction relative to other features (i.e., feature importance), and (iii) failing to calculate the proper number of features for fault detection. To address the above challenges, in this paper we propose a context-aware feature selection method to improve the performance of fault detection methods in the cloud environment, aiming at maximizing the $F_{1}$-score. Our proposed solution comprises a Denoising Auto-Encoder (DAE) stacked with a Discriminative Model (DM). The DAE handles the missing feature values and encodes the features, while the DM is responsible for predicting the system status based on the encoded features. Then, the sensitivity of the output prediction with respect to each input feature value is used to measure the feature importance. We compare our work with existing solutions from the literature. Our results reveal that the proposed solution can improve the $F_{1}$-score by up to 47% in the scenario where all feature values are known, and by up to 76% in the scenario where only 25% of feature values are known.
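The DAE-plus-DM pipeline and the gradient-based sensitivity step can be sketched in a few lines of PyTorch. Dimensions, corruption rate, and training schedule below are illustrative assumptions, not the paper's configuration, and the DM's own training is omitted for brevity.

```python
# Sketch: denoising auto-encoder (DAE) imputes/encodes corrupted features,
# a discriminative model (DM) scores system status, and feature importance
# comes from the sensitivity of the score to each input feature.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_features, n_hidden, n_classes = 20, 8, 2

dae = nn.Sequential(                     # encoder + decoder
    nn.Linear(n_features, n_hidden), nn.ReLU(),
    nn.Linear(n_hidden, n_features),
)
encoder = dae[:2]                        # reuse the trained encoder half
dm = nn.Sequential(nn.Linear(n_hidden, n_classes))  # DM; training omitted

x_clean = torch.randn(256, n_features)   # stand-in telemetry batch
mask = torch.bernoulli(torch.full_like(x_clean, 0.75))
x_noisy = x_clean * mask                  # drop ~25% of feature values

# 1) Train the DAE to reconstruct clean features from corrupted inputs.
opt = torch.optim.Adam(dae.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = F.mse_loss(dae(x_noisy), x_clean)
    loss.backward()
    opt.step()

# 2) Sensitivity-based importance: gradient of the predicted class score
#    w.r.t. each input feature, averaged over the batch.
x = x_noisy.clone().requires_grad_(True)
score = dm(encoder(x)).max(dim=1).values.sum()
score.backward()
importance = x.grad.abs().mean(dim=0)
print("top features:", importance.topk(5).indices.tolist())
```

Features with near-zero average sensitivity are candidates for removal, which is how a sensitivity signal of this kind can drive feature selection.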
Citations: 0
Various Network Topologies and an Analysis Comparative Between Fat-Tree and BCube for a Data Center Network: An Overview
Pub Date: 2022-10-01 | DOI: 10.1109/CloudSummit54781.2022.00007
Antonio Cortés Castillo
The exponential rise of servers in the cloud generates the need for specialized data center network topologies: service requirements must be met with massive storage availability and quality of service, both essential for handling large volumes of data in the large server farms of a DCN. In turn, the requirements for new cloud services have grown exponentially, so DCNs face new challenges related to scalability, energy efficiency, network congestion, and cost, all of which are directly associated with DCN architectures and network topologies. Drawing on existing network topologies, we propose a DCN topology. In this paper, the Fat-Tree and BCube network topologies are compared by considering the architectures themselves and the metrics for comparing various topology types, using two statistical traffic models: an exponential random traffic distribution and a uniform random traffic distribution. This type of comparison helps address the demand for new services, load balance, bandwidth, and node requirements in a DC network infrastructure.
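For scale intuition, the two topologies have standard closed-form sizes: a k-ary fat-tree (per Al-Fares et al.) and BCube(n, k) (per Guo et al.). The sketch below computes only server and switch counts; the parameter choices are arbitrary examples.

```python
# Closed-form size comparison of the two DCN topologies discussed above.
def fat_tree(k: int) -> dict:
    """k-ary fat-tree built from k-port switches, k pods."""
    return {
        "servers": k ** 3 // 4,
        "switches": 5 * k ** 2 // 4,  # k^2/4 core + k^2/2 agg + k^2/2 edge
    }

def bcube(n: int, k: int) -> dict:
    """BCube(n, k): n-port switches in k+1 levels; servers need k+1 NICs."""
    return {
        "servers": n ** (k + 1),
        "switches": (k + 1) * n ** k,
    }

print("Fat-Tree k=8:  ", fat_tree(8))   # 128 servers, 80 switches
print("BCube n=8, k=2:", bcube(8, 2))   # 512 servers, 192 switches
```

The contrast is structural: fat-tree concentrates forwarding in switches, while BCube is server-centric and spends extra server NICs to gain multiple parallel paths.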
Citations: 1
PriRecT: Privacy-preserving Job Recommendation Tool for GPU Sharing
Pub Date: 2022-10-01 | DOI: 10.1109/CloudSummit54781.2022.00021
Aritra Ray, Zhaobo Zhang, Ying Xiong, K. Chakrabarty
Machine Learning (ML) jobs significantly benefit when trained on abundant GPU resources, which leads to resource contention when several ML training jobs are scheduled concurrently on a single GPU in the compute cluster. A job's performance is susceptible to its competitors' tasks on the same GPU. In this paper, we propose PriRecT, a novel ML job recommendation tool that preserves user privacy when scheduling ML training jobs in a GPU compute cluster. We perform workload characterization for several ML training scripts, and the Futurewei mini-ML Workload Dataset is released publicly [1]. We build a knowledge base of inter- and intra-cluster task interference for GPU sharing through a clustering-based approach. For scheduling purposes, PriRecT blinds the user-sensitive information and assigns the job to an existing cluster. Based on the clustering results, PriRecT recommends jobs that should run concurrently on a single GPU to minimize task interference, and additionally assigns an uncertainty score to account for job variations in the recommendation.
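A minimal sketch of this clustering-and-recommendation idea follows: jobs are grouped by anonymized workload features, and a new job is paired with the cluster it historically interferes with least. The feature set and the interference matrix are illustrative placeholders, not the paper's knowledge base.

```python
# Clustering-based co-scheduling sketch: blind features -> cluster ->
# recommend the least-interfering partner cluster for GPU sharing.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Anonymized per-job features, e.g. (GPU util, memory footprint, batch
# size), already normalized; no user-identifying fields ("blinded").
X = rng.random((40, 3))

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# Knowledge base: measured slowdown when a job from cluster i shares a
# GPU with a job from cluster j (placeholder values; lower is better).
interference = rng.random((4, 4))

def recommend_partner_cluster(job_features: np.ndarray) -> int:
    """Assign the incoming job to a cluster, then pick the partner
    cluster with the smallest recorded interference."""
    c = km.predict(job_features.reshape(1, -1))[0]
    return int(np.argmin(interference[c]))

new_job = rng.random(3)
print("co-schedule with cluster:", recommend_partner_cluster(new_job))
```

An uncertainty score, as the abstract mentions, could be derived from the spread of interference measurements within each cluster pair, though that detail is not reproduced here.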
Citations: 0
Machine Learning vs Deep Learning for Anomaly Detection and Categorization in Multi-cloud Environments
Pub Date: 2022-10-01 | DOI: 10.1109/CloudSummit54781.2022.00013
J. Akoto, Tara Salman
Detecting intrusions is a critical issue in cybersecurity. One way to address this issue is to build efficient and robust Network Intrusion Detection Systems (NIDS) using existing Machine Learning (ML) algorithms. Such an approach has been proposed in the literature and has been shown to perform well. However, a comparative analysis of the performance of ML- and Deep Learning (DL)-based NIDS for both detection and categorization of intrusions is still needed. This paper investigates the performance of ML and DL models for both intrusion detection and categorization. We use the publicly available Canadian Institute for Cybersecurity Intrusion Detection System 2017 (CICIDS-2017) dataset to train and test the ML and DL models. We apply three traditional ML models, namely Logistic Regression (LR), Random Forest (RF), and K-Nearest Neighbor (KNN), and three DL models: a 1-D Convolutional Neural Network (Conv1D), a Recurrent Neural Network (RNN), and a two-staged model that combines an unsupervised Dense Autoencoder (DAE) for pre-training with an Artificial Neural Network (ANN) for classification. Our results demonstrate that RF is the best performing ML model with a detection accuracy of 99.5%, and DAE-ANN is the best performing DL model with a detection accuracy of 98.7%. We also show the advantages of a stepwise multi-classification over a classical single-stage multi-classification. Finally, we observe that RF outperforms DAE-ANN in categorization, with detection rates of 91.35% and 84.66%, respectively.
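The RF baseline reported above is straightforward to sketch with scikit-learn. The CSV path and column names below are assumptions about the public dataset's layout, not code from the paper.

```python
# Random Forest detection baseline on CICIDS-2017 flow features (sketch).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("cicids2017_flows.csv")        # hypothetical local export
X = df.drop(columns=["Label"]).select_dtypes("number").fillna(0)
y = (df["Label"] != "BENIGN").astype(int)       # 1 = attack, 0 = benign

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=42)
rf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=42)
rf.fit(X_tr, y_tr)
print("detection accuracy:", accuracy_score(y_te, rf.predict(X_te)))
```

For the categorization task the paper also studies, the same pipeline would use the multi-class attack labels directly instead of collapsing them to a binary target.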
Citations: 1