Post-deduplication in traditional cloud environments primarily targets single-node settings, where delta compression is performed on the same server-side deduplication node. With the explosion of data, however, multi-node post-deduplication, also called global deduplication, which executes delta compression across data distributed over all nodes, has become an active research topic. Simply deploying single-node deduplication systems in multi-node environments significantly degrades storage utilization and incurs secondary overhead from file migration. Moreover, existing global deduplication solutions suffer from low data compression ratios and high computational overhead due to the inherent limitations and overly coarse granularity of their resemblance detection. Similar blocks typically exhibit high correlations between their sub-blocks; inspired by this observation, we propose IBNR (Intra-Block Neighborhood Relationship-Based Resemblance Detection for High-Performance Multi-Node Post-Deduplication), which introduces a novel resemblance detection scheme based on relationships between sub-blocks and determines the ownership of blocks at the entry stage to achieve efficient global deduplication. Furthermore, the by-products of IBNR show strong scalability when its internal resemblance detection scheme is replaced with existing solutions on practical workloads. Experimental results indicate that IBNR outperforms state-of-the-art solutions, achieving an average 1.99× data reduction ratio and varying degrees of improvement across other key metrics.
{"title":"IBNR-RD: Intra-Block Neighborhood Relationship-Based Resemblance Detection for High-Performance Multi-Node Post-Deduplication","authors":"Dewen Zeng;Wenlong Tian;Tingting He;Ruixuan Li;Xuming Ye;Zhiyong Xu","doi":"10.1109/TCC.2024.3514784","DOIUrl":"https://doi.org/10.1109/TCC.2024.3514784","url":null,"abstract":"Post-deduplication in traditional cloud environments primarily focuses on single-node, where delta compression is performed on the same deduplication node located on server side. However, with data explosion, the multi-node post-deduplication, also called global deduplication, has become a hot issue in research communities, which aims to simultaneously execute delta compression on data distributed across all nodes. Simply setting up single-node deduplication systems on multi-node environments would significantly affect storage utilization and incur secondary overhead from file migration. Nevertheless, existing global deduplication solutions suffer from lower data compression ratios and high computational overhead due to their resemblance detection's inherent limitations and overly coarse granularities. Similar blocks typically have high correlations between sub-blocks; inspired by this observation, we propose IBNR (Intra-Block Neighborhood Relationship-Based Resemblance Detection for High-Performance Multi-Node Post-Deduplication), which introduces a novel resemblance detection based on relationships between sub-blocks and determines the ownership of blocks in entry stage to achieve efficient global deduplication. Furthermore, the by-products of IBNR have shown powerful scalability by replacing internal resemblance detection scheme with existing solutions on practical workloads. Experimental results indicate that IBNR outperforms state-of-the-art solutions, achieving an average 1.99× data reduction ratio and varying degrees of improvement across other key metrics.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 1","pages":"118-129"},"PeriodicalIF":5.3,"publicationDate":"2024-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143570723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, the use of container cloud platforms has grown rapidly. However, because containers rely on operating-system-level virtualization, their isolation is far weaker than that of virtual machines, posing considerable challenges for multi-tenant container cloud platforms. To address the shortcomings of current container anomaly detection algorithms, such as the difficulty of mining periodic features and the high false-positive rate caused by noisy data, we propose an anomaly detection method named SST-LOF, based on singular spectrum transformation and the local outlier factor. Our method extends the traditional Singular Spectrum Transformation (SST) algorithm to meet the needs of streaming unsupervised detection. Furthermore, it improves how the Local Outlier Factor (LOF) algorithm computes anomaly scores and uses dynamic sliding windows to reduce false positives on noisy data. Additionally, we have designed and implemented a container cloud anomaly detection system that performs real-time, unsupervised, streaming anomaly detection on containers quickly and accurately. The experimental results demonstrate the effectiveness and efficiency of our method in detecting container anomalies in both simulated and real cloud environments.
{"title":"SST-LOF: Container Anomaly Detection Method Based on Singular Spectrum Transformation and Local Outlier Factor","authors":"Shilei Bu;Minpeng Jin;Jie Wang;Yulai Xie;Liangkang Zhang","doi":"10.1109/TCC.2024.3514297","DOIUrl":"https://doi.org/10.1109/TCC.2024.3514297","url":null,"abstract":"In recent years, the use of container cloud platforms has experienced rapid growth. However, because containers are operating-system-level virtualization, their isolation is far less than that of virtual machines, posing considerable challenges for multi-tenant container cloud platforms. To address the issues associated with current container anomaly detection algorithms, such as the difficulty in mining periodic features and the high rate of false positives due to noisy data, we propose an anomaly detection method named SST-LOF, based on singular spectrum transformation and the local outlier factor. Our method enhances the traditional Singular Spectrum Transformation (SST) algorithm to meet the needs of streaming unsupervised detection. Furthermore, our method improves the calculation mode of the anomaly score of the Local Outlier Factor algorithm (LOF) and reduces false positives of noisy data with dynamic sliding windows. Additionally, we have designed and implemented a container cloud anomaly detection system that can perform real-time, unsupervised, streaming anomaly detection on containers quickly and accurately. The experimental results demonstrate the effectiveness and efficiency of our method in detecting anomalies in containers in both simulated and real cloud environments.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 1","pages":"130-147"},"PeriodicalIF":5.3,"publicationDate":"2024-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143570675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-12-05 | DOI: 10.1109/TCC.2024.3511548
Hoda Sedighi;Daniel Gehberger;Amin Ebrahimzadeh;Fetahi Wuhib;Roch H. Glitho
The advent of the microservice architecture enables complex cloud applications to be realized as a set of individually isolated components, increasing their flexibility and performance. As these applications require massive computing resources, graphics processing units (GPUs) are widely used as high-speed parallel computing devices to meet the stringent demands. Although current GPUs allow application components to be executed concurrently via spatial multitasking, they face several challenges. The first challenge is allocating computing resources to components dynamically to maximize efficiency. The second challenge is avoiding the performance degradation caused by data transfer overhead between components. To address these challenges, we propose an efficient GPU resource management technique that dynamically allocates GPU resources to application components. The proposed method allocates resources based on component workloads and uses online performance monitoring to guarantee the application's performance. We also propose a GPU memory manager that reduces the data transfer overhead between components via shared memory. Our evaluation results indicate that the proposed dynamic resource allocation method improves application throughput by up to 134.12% compared to state-of-the-art spatial multitasking techniques. We also show that using shared memory yields a 6× throughput improvement compared to the baseline User Datagram Protocol (UDP)-based technique.
{"title":"Efficient Dynamic Resource Management for Spatial Multitasking GPUs","authors":"Hoda Sedighi;Daniel Gehberger;Amin Ebrahimzadeh;Fetahi Wuhib;Roch H. Glitho","doi":"10.1109/TCC.2024.3511548","DOIUrl":"https://doi.org/10.1109/TCC.2024.3511548","url":null,"abstract":"The advent of microservice architecture enables complex cloud applications to be realized via a set of individually isolated components, increasing their flexibility and performance. As these applications require massive computing resources, graphics processing units (GPUs) are being widely used as high-speed parallel computing devices to meet the stringent demands. Although current GPUs allow application components to be executed concurrently via spatial multitasking, they face several challenges. The first challenge is allocating the computing resources to components dynamically to maximize efficiency. The second challenge is avoiding performance degradation caused by the data transfer overhead between the components. To address these challenges, we propose an efficient GPU resource management technique that dynamically allocates GPU resources to application components. The proposed method allocates resources based on component workloads and uses online performance monitoring to guarantee the application's performance. We also propose a GPU memory manager to reduce the data transfer overhead between components via shared memory. Our evaluation results indicate that the proposed dynamic resource allocation method improves application throughput by up to 134.12% compared to the state-of-the-art spatial multitasking techniques. We also show that using a shared memory results in 6x throughput improvement compared to the baseline User Datagram Protocol (UDP)-based technique.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 1","pages":"99-117"},"PeriodicalIF":5.3,"publicationDate":"2024-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143570597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-12-04 | DOI: 10.1109/TCC.2024.3510916
Caio Alves Caldeira;Otávio Augusto de Oliveira Souza;Olga Goussevskaia;Stefan Schmid
Self-Adjusting Networks (SANs) optimize their physical topology toward the demand in an online manner. Their application in data center networks is motivated by emerging hardware technologies, such as 3D MEMS Optical Circuit Switches (OCS). The Matching Model (MM) was introduced to study the hybrid architecture of such networks. It abstracts away the electrical switches and focuses on the added (reconfigurable) optical ones. The MM defines any SAN topology as a union of matchings over a set of top-of-rack (ToR) nodes and assumes that rearranging the edges of a single matching comes at a fixed cost. In this work, we propose and study the Scalable Matching Model (SMM), a generalization of the MM, and present OpticNet, a framework that maps a set of ToRs to a set of OCSs to form a SAN topology. We prove that OpticNet uses the minimum number of switches to realize any bounded-degree topology and allows existing SAN algorithms to run on top of it while preserving amortized performance guarantees. Our experimental results based on real workloads show that OpticNet is a flexible and efficient framework for the implementation and evaluation of SAN algorithms in reconfigurable data center environments.
{"title":"Optical Self-Adjusting Data Center Networks in the Scalable Matching Model","authors":"Caio Alves Caldeira;Otávio Augusto de Oliveira Souza;Olga Goussevskaia;Stefan Schmid","doi":"10.1109/TCC.2024.3510916","DOIUrl":"https://doi.org/10.1109/TCC.2024.3510916","url":null,"abstract":"Self-Adjusting Networks (SAN) optimize their physical topology toward the demand in an online manner. Their application in data center networks is motivated by emerging hardware technologies, such as 3D MEMS Optical Circuit Switches (OCS). The Matching Model (MM) has been introduced to study the hybrid architecture of such networks. It abstracts from the electrical switches and focuses on the added (reconfigurable) optical ones. MM defines any SAN topology as a union of matchings over a set of top-of-rack (ToR) nodes, and assumes that rearranging the edges of a single matching comes at a fixed cost. In this work, we propose and study the Scalable Matching Model (SMM), a generalization of the MM, and present OpticNet, a framework that maps a set of ToRs to a set of OCSs to form a SAN topology. We prove that OpticNet uses the minimum number of switches to realize any bounded-degree topology and allows existing SAN algorithms to run on top of it, while preserving amortized performance guarantees. Our experimental results based on real workloads show that OpticNet is a flexible and efficient framework for the implementation and evaluation of SAN algorithms in reconfigurable data center environments.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 1","pages":"87-98"},"PeriodicalIF":5.3,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143570781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-27 | DOI: 10.1109/TCC.2024.3506614
Jingru Xu;Cong Peng;Rui Li;Jintao Fu;Min Luo
To balance data confidentiality and availability, order-revealing encryption (ORE) has emerged as a pivotal primitive facilitating range queries on encrypted data. However, challenges arise in diverse user domains where data is encrypted with different keys, giving rise to the development of delegatable order-revealing encryption (DORE) schemes. Regrettably, existing DORE schemes are susceptible to authorization token forgery attacks and rely on computationally intensive bilinear pairings. This work proposes a novel solution to address these challenges. We first introduce a delegatable equality-revealing encryption scheme, enabling the comparison of ciphertexts encrypted by distinct secret keys through authorization tokens. Building upon this, we present a delegatable order-revealing encryption that leverages bitwise encryption. DORE supports efficient multi-user ciphertext comparison while robustly resisting authorization token forgery attacks. Significantly, our approach distinguishes itself by minimizing bilinear pairings. Experimental results highlight the efficacy of DORE, showcasing a notable speedup of 2.8× in encryption performance and 1.33× in comparison performance compared to previous DORE schemes, respectively.
{"title":"An Efficient Delegatable Order-Revealing Encryption Scheme for Multi-User Range Queries","authors":"Jingru Xu;Cong Peng;Rui Li;Jintao Fu;Min Luo","doi":"10.1109/TCC.2024.3506614","DOIUrl":"https://doi.org/10.1109/TCC.2024.3506614","url":null,"abstract":"To balance data confidentiality and availability, order-revealing encryption (ORE) has emerged as a pivotal primitive facilitating range queries on encrypted data. However, challenges arise in diverse user domains where data is encrypted with different keys, giving rise to the development of delegatable order-revealing encryption (DORE) schemes. Regrettably, existing DORE schemes are susceptible to authorization token forgery attacks and rely on computationally intensive bilinear pairings. This work proposes a novel solution to address these challenges. We first introduce a delegatable equality-revealing encryption scheme, enabling the comparison of ciphertexts encrypted by distinct secret keys through authorization tokens. Building upon this, we present a delegatable order-revealing encryption that leverages bitwise encryption. DORE supports efficient multi-user ciphertext comparison while robustly resisting authorization token forgery attacks. Significantly, our approach distinguishes itself by minimizing bilinear pairings. Experimental results highlight the efficacy of DORE, showcasing a notable speedup of <inline-formula><tex-math>$2.8times$</tex-math></inline-formula> in encryption performance and <inline-formula><tex-math>$1.33times$</tex-math></inline-formula> in comparison performance compared to previous DORE schemes, respectively.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 1","pages":"75-86"},"PeriodicalIF":5.3,"publicationDate":"2024-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143570727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-20 | DOI: 10.1109/TCC.2024.3503358
Devki Nandan Jha;Graham Lenton;James Asker;David Blundell;Martin Higgins;David C. H. Wallom
With the unprecedented demand for cloud computing, ensuring trust in the underlying environment is challenging. Applications executing in the cloud are prone to attacks of different types, including malware, network attacks, and data manipulation. These attacks may remain undetected for a significant length of time, causing a lack of trust. Untrusted cloud services can also lead to business losses in many cases and therefore need urgent attention. In this paper, we present Trusted Public Cloud (TPC), a generic framework ensuring the Zero-trust security of client machines. It tracks the system state, alerting the user to unexpected changes in the machine's state and thus improving the run-time detection of security vulnerabilities. We validated TPC on Microsoft Azure with Local, Software Trusted Platform Module (SWTPM), and Software Guard Extension (SGX)-enabled SWTPM security providers. We also evaluated the scalability of TPC on Amazon Web Services (AWS) with a varying number of client machines executing in a concurrent environment. The execution results show the effectiveness of TPC, as it takes a maximum of 35.6 seconds to recognise the system state when 128 client machines are attached.
{"title":"A Run-Time Framework for Ensuring Zero-Trust State of Client’s Machines in Cloud Environment","authors":"Devki Nandan Jha;Graham Lenton;James Asker;David Blundell;Martin Higgins;David C. H. Wallom","doi":"10.1109/TCC.2024.3503358","DOIUrl":"https://doi.org/10.1109/TCC.2024.3503358","url":null,"abstract":"With the unprecedented demand for cloud computing, ensuring trust in the underlying environment is challenging. Applications executing in the cloud are prone to attacks of different types including malware, network and data manipulation. These attacks may remain undetected for a significant length of time thus causing a lack of trust. Untrusted cloud services can also lead to business losses in many cases and therefore need urgent attention. In this paper, we present <italic>Trusted Public Cloud</i> (<sc>TPC</small>), a generic framework ensuring the <italic>Zero-trust</i> security of client machine. It tracks the system state, alerting the user of unexpected changes in the machine’s state, thus increasing the run-time detection of security vulnerabilities. We validated <sc>TPC</small> on Microsoft Azure with Local, Software Trusted Platform Module (SWTPM) and Software Guard Extension (SGX)-enabled SWTPM security providers. We also evaluated the scalability of <sc>TPC</small> on Amazon Web Services (AWS) with a varying number of client machines executing in a concurrent environment. The execution results show the effectiveness of <sc>TPC</small> as it takes a maximum of 35.6 seconds to recognise the system state when there are 128 client machines attached.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 1","pages":"61-74"},"PeriodicalIF":5.3,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143570760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-19 | DOI: 10.1109/TCC.2024.3502464
Geyao Cheng;Junxu Xia;Lailong Luo;Haibo Mi;Deke Guo;Richard T. B. Ma
Currently, deduplication techniques are utilized to minimize space overhead by deleting redundant data blocks across large-scale servers in data centers. However, this process exacerbates the fragmentation of data blocks, causing more cross-server file retrievals and a sharp drop in retrieval throughput. Some approaches favor file retrieval performance by confining all blocks of a file to a single server, which incurs non-trivial space consumption because more blocks are replicated across servers. An ideal network storage system should account for both deduplication and retrieval performance by assigning the detected unique blocks judiciously. Such fine-grained assignment requires an accurate and comprehensive abstraction of the files, blocks, and file-block affiliation relationships. To achieve this, we design a weighted hypergraph to profile the multivariate data correlations. With this abstraction in place, we propose HyperPart, which transforms the complex block allocation problem into a hypergraph partition problem. For more general scenarios with dynamic file updates, we further propose a two-phase incremental hypergraph repartition scheme, which mitigates performance degradation with minimal migration volume. We implement a prototype system of HyperPart, and the experimental results validate that it saves around 50% of the storage space and improves retrieval throughput by approximately 30% over state-of-the-art methods under balance constraints.
{"title":"HyperPart: A Hypergraph-Based Abstraction for Deduplicated Storage Systems","authors":"Geyao Cheng;Junxu Xia;Lailong Luo;Haibo Mi;Deke Guo;Richard T. B. Ma","doi":"10.1109/TCC.2024.3502464","DOIUrl":"https://doi.org/10.1109/TCC.2024.3502464","url":null,"abstract":"Currently, deduplication techniques are utilized to minimize the space overhead by deleting redundant data blocks across large-scale servers in data centers. However, such a process exacerbates the fragmentation of data blocks, causing more cross-server file retrievals with plummeting retrieval throughput. Some attempts prefer better file retrieval performance by confining all blocks of a file to one single server, resulting in non-trivial space consumption for more replicated blocks across servers. An ideal network storage system, in effect, should take both the deduplication and retrieval performance into account by implementing reasonable assignment of the detected unique blocks. Such a fine-grained assignment requires an accurate and comprehensive abstraction of the files, blocks, and the file-block affiliation relationships. To achieve this, we innovatively design the weighted hypergraph to profile the multivariate data correlations. With this delicate abstraction in place, we propose HyperPart, which elegantly transforms this complex block allocation problem into a hypergraph partition problem. For more general scenarios with dynamic file updates, we further propose a two-phase incremental hypergraph repartition scheme, which mitigates the performance degradation with minimal migration volume. We implement a prototype system of HyperPart, and the experiment results validate that it saves around 50% of the storage space and improves the retrieval throughput by approximately 30% of state-of-the-art methods under the balance constraints.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 1","pages":"46-60"},"PeriodicalIF":5.3,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143570780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-18 | DOI: 10.1109/TCC.2024.3500139
Danny De Vleeschauwer;Chia-Yu Chang;Paola Soto;Yorick De Bock;Miguel Camelo;Koen De Schepper
Nowadays, many services are offered via the cloud, i.e., they rely on interacting software components that run on a set of connected Commercial Off-The-Shelf (COTS) servers in data centers. As the demand for any particular service evolves over time, the computational resources associated with the service must be scaled accordingly while keeping the Key Performance Indicators (KPIs) associated with the service under control. Consequently, scaling always involves a delicate trade-off between using cloud resources and complying with the KPIs. In this paper, we show that a (workload-dependent) Pareto front embodies the limits of this trade-off. We identify this Pareto front for various workloads and assess the ability of several scaling algorithms to approach it.
{"title":"A Method to Compare Scaling Algorithms for Cloud-Based Services","authors":"Danny De Vleeschauwer;Chia-Yu Chang;Paola Soto;Yorick De Bock;Miguel Camelo;Koen De Schepper","doi":"10.1109/TCC.2024.3500139","DOIUrl":"https://doi.org/10.1109/TCC.2024.3500139","url":null,"abstract":"Nowadays, many services are offered via the cloud, i.e., they rely on interacting software components that can run on a set of connected Commercial Off-The-Shelf (COTS) servers sitting in data centers. As the demand for any particular service evolves over time, the computational resources associated with the service must be scaled accordingly while keeping the Key Performance Indicators (KPIs) associated with the service under control. Consequently, scaling always involves a delicate trade-off between using the cloud resources and complying with the KPIs. In this paper, we show that a (workload-dependent) Pareto front embodies this trade-off’s limits. We identify this Pareto front for various workloads and assess the ability of several scaling algorithms to approach that Pareto front.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 1","pages":"34-45"},"PeriodicalIF":5.3,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143570778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A cascaded network represents a classic scaling-out model in traditional electrical switching networks. Recent proposals have integrated optical circuit switching at specific tiers of these networks to reduce power consumption and enhance topological flexibility. Utilizing a multi-tiered cascaded optical circuit switching network is expected to extend the advantages of optical circuit switching further. The main challenges fall into two categories. First, an architecture with sufficient connectivity is required to support varying workloads. Second, network reconfiguration is more complex and necessitates a low-complexity scheduling algorithm. In this work, we propose COCSN, a multi-tiered cascaded optical circuit switching network architecture for data centers. COCSN employs wavelength-selective switches (WSSs) that integrate multiple wavelengths to enhance network connectivity. We formulate a mathematical model covering lightpath establishment, network reconfiguration, and reconfiguration goals, and propose theorems to optimize the model. Based on these theorems, we introduce an over-subscription-supported wavelength-by-wavelength scheduling algorithm, facilitating agile establishment of lightpaths in COCSN tailored to communication demand. This algorithm effectively addresses scheduling complexities and mitigates the issue of lengthy WSS configuration times. Simulation studies investigate the impact of flow length, WSS reconfiguration time, and communication domain on COCSN, verifying its significantly lower complexity and superior performance over classical cascaded networks.
{"title":"COCSN: A Multi-Tiered Cascaded Optical Circuit Switching Network for Data Center","authors":"Shuo Li;Huaxi Gu;Xiaoshan Yu;Hua Huang;Songyan Wang;Zeshan Chang","doi":"10.1109/TCC.2024.3488275","DOIUrl":"https://doi.org/10.1109/TCC.2024.3488275","url":null,"abstract":"A cascaded network represents a classic scaling-out model in traditional electrical switching networks. Recent proposals have integrated optical circuit switching at specific tiers of these networks to reduce power consumption and enhance topological flexibility. Utilizing a multi-tiered cascaded optical circuit switching network is expected to extend the advantages of optical circuit switching further. The main challenges fall into two categories. First, an architecture with sufficient connectivity is required to support varying workloads. Second, the network reconfiguration is more complex and necessitates a low-complexity scheduling algorithm. In this work, we propose COCSN, a multi-tiered cascaded optical circuit switching network architecture for data center. COCSN employs wavelength-selective switches that integrate multiple wavelengths to enhance network connectivity. We formulate a mathematical model covering lightpath establishment, network reconfiguration, and reconfiguration goals, and propose theorems to optimize the model. Based on the theorems, we introduce an over-subscription-supported wavelength-by-wavelength scheduling algorithm, facilitating agile establishment of lightpaths in COCSN tailored to communication demand. This algorithm effectively addresses scheduling complexities and mitigates the issue of lengthy WSS configuration times. Simulation studies investigate the impact of flow length, WSS reconfiguration time, and communication domain on COCSN, verifying its significantly lower complexity and superior performance over classical cascaded networks.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"12 4","pages":"1463-1475"},"PeriodicalIF":5.3,"publicationDate":"2024-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142798036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In edge computing, energy-limited distributed edge clients present challenges such as heterogeneity, high energy consumption, and security risks. Traditional blockchain-based federated learning (BFL) struggles to address all three of these challenges simultaneously. This article proposes a Graph-Partitioning Multi-Granularity Federated Learning method on a consortium blockchain, namely GP-MGFL. To reduce the overall communication overhead, we adopt a balanced graph partitioning algorithm while introducing observer and consensus nodes. This method groups clients to minimize high-cost communication and focuses on the guidance effect within each group, thereby ensuring effective guidance with reduced overhead. To fully leverage heterogeneity, we introduce a cross-granularity guidance mechanism in which fine-granularity models guide coarse-granularity models to enhance the accuracy of the latter. We also introduce a credit model that dynamically adjusts each model's contribution to the global model and dynamically selects the leaders responsible for model aggregation. Finally, we implement a prototype system on real physical hardware and compare it with several baselines. Experimental results show that the accuracy of GP-MGFL is 5.6% higher than that of ordinary BFL algorithms. In addition, compared to other grouping methods, such as greedy grouping, the accuracy of the proposed method improves by about 1.5%. In scenarios with malicious clients, the maximum accuracy improvement reaches 11.1%. We also analyze and summarize the impact of grouping and of the number of clients on the model, as well as the impact of our method on the inherent security of the blockchain itself.
{"title":"Multi-Granularity Federated Learning by Graph-Partitioning","authors":"Ziming Dai;Yunfeng Zhao;Chao Qiu;Xiaofei Wang;Haipeng Yao;Dusit Niyato","doi":"10.1109/TCC.2024.3494765","DOIUrl":"https://doi.org/10.1109/TCC.2024.3494765","url":null,"abstract":"In edge computing, energy-limited distributed edge clients present challenges such as heterogeneity, high energy consumption, and security risks. Traditional blockchain-based federated learning (BFL) struggles to address all three of these challenges simultaneously. This article proposes a Graph-Partitioning Multi-Granularity Federated Learning method on a consortium blockchain, namely GP-MGFL. To reduce the overall communication overhead, we adopt a balanced graph partitioning algorithm while introducing observer and consensus nodes. This method groups clients to minimize high-cost communications and focuses on the guidance effect within each group, thereby ensuring effective guidance with reduced overhead. To fully leverage heterogeneity, we introduce a cross-granularity guidance mechanism. This mechanism involves fine-granularity models guiding coarse-granularity models to enhance the accuracy of the latter models. We also introduce a credit model to adjust the contribution of models to the global model dynamically and to dynamically select leaders responsible for model aggregation. Finally, we implement a prototype system on real physical hardware and compare it with several baselines. Experimental results show that the accuracy of the GP-MGFL algorithm is 5.6% higher than that of ordinary BFL algorithms. In addition, compared to other grouping methods, such as greedy grouping, the accuracy of the proposed method improves by about 1.5%. In scenarios with malicious clients, the maximum accuracy improvement reaches 11.1%. We also analyze and summarize the impact of grouping and the number of clients on the model, as well as the impact of this method on the inherent security of the blockchain itself.","PeriodicalId":13202,"journal":{"name":"IEEE Transactions on Cloud Computing","volume":"13 1","pages":"18-33"},"PeriodicalIF":5.3,"publicationDate":"2024-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143570702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}