IEEE Transactions on Parallel and Distributed Systems: Latest Articles

Beyond Belady to Attain a Seemingly Unattainable Byte Miss Ratio for Content Delivery Networks
IF 5.6; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS; Pub Date: 2024-08-30; DOI: 10.1109/TPDS.2024.3452096
Peng Wang; Hong Jiang; Yu Liu; Zhelong Zhao; Ke Zhou; Zhihai Huang
Reducing the byte miss ratio (BMR) in the Content Delivery Network (CDN) caches can help providers save on the cost of paying for traffic. When evicting objects or files of different sizes in the caches of CDNs, it is no longer sufficient to pursue an optimal object miss ratio (OMR) by approximating Belady to ensure an optimal BMR. Our experimental observations suggest that there are multiple request sequence windows. In these windows, a replacement policy prioritizes the eviction of objects with large sizes and ultimately evicts the object with the longest reuse distance, lowering the BMR without increasing the OMR. To accurately capture those windows, we monitor the changes in OMR and BMR using a deep reinforcement learning (RL) model and then implement a BMR-friendly replacement algorithm in these windows. Based on this policy, we propose a Belady and Size Eviction (LRU-BaSE) algorithm that reduces BMR while maintaining OMR. To make LRU-BaSE efficient and practical, we address the feedback delay problem of RL with a two-pronged approach. On the one hand, we shorten the LRU-BaSE decision region based on the observation that the rear section of the cache queue contains most of the eviction candidates. On the other hand, the request distribution on CDNs makes it feasible to divide the learning region into multiple sub-regions that are each learned with reduced time and increased accuracy. In real CDN systems, LRU-BaSE outperforms LRU by reducing “backing to OS” traffic and access latency by 30.05% and 17.07%, respectively, on average. In simulator tests, LRU-BaSE outperforms state-of-the-art cache replacement policies. On average, LRU-BaSE's BMR is 0.63% and 0.33% less than that of Belady and Practical Flow-based Offline Optimal (PFOO), respectively. In addition, compared to Learning Relaxed Belady (LRB), LRU-BaSE can yield relatively stable performance when facing workload drift.
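To make the two metrics in this abstract concrete, the following minimal sketch computes OMR and BMR on a tiny made-up trace under a size-unaware Belady rule (evict the cached object whose next request is farthest away). The trace, object sizes, cache capacity, and function names are illustrative assumptions, and the code is not the LRU-BaSE algorithm.

```python
from typing import Dict, List, Tuple


def next_use(trace: List[str], obj: str, start: int) -> float:
    """Index of the next request for obj at or after start, or infinity if none."""
    for j in range(start, len(trace)):
        if trace[j] == obj:
            return j
    return float("inf")


def simulate_belady(trace: List[str], sizes: Dict[str, int], capacity: int) -> Tuple[float, float]:
    """Return (OMR, BMR) when the object reused farthest in the future is evicted."""
    cache = set()
    used = 0
    missed_objects = 0
    missed_bytes = 0
    for i, obj in enumerate(trace):
        if obj in cache:
            continue  # hit: nothing is fetched from the origin
        missed_objects += 1
        missed_bytes += sizes[obj]
        if sizes[obj] > capacity:
            continue  # larger than the whole cache: served but never admitted
        while used + sizes[obj] > capacity:
            # sorted() only makes tie-breaking deterministic
            victim = max(sorted(cache), key=lambda o: next_use(trace, o, i + 1))
            cache.remove(victim)
            used -= sizes[victim]
        cache.add(obj)
        used += sizes[obj]
    total_bytes = sum(sizes[o] for o in trace)
    return missed_objects / len(trace), missed_bytes / total_bytes


# Hypothetical toy workload: two 1-byte objects and one 4-byte object.
omr, bmr = simulate_belady(["a", "b", "c", "a", "b"], {"a": 1, "b": 4, "c": 1}, capacity=5)
print(f"OMR={omr:.3f} BMR={bmr:.3f}")
```

On this toy trace four of the five requests miss (OMR 0.8), while 10 of the 11 requested bytes miss (BMR about 0.909), which shows how the two ratios diverge once object sizes differ.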
IEEE Transactions on Parallel and Distributed Systems, vol. 35, no. 11, pp. 1949-1963
Citations: 0
BIRD+: Design of a Lightweight Communication Compressor for Resource-Constrained Distribution Learning Platforms
IF 5.6; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS; Pub Date: 2024-08-21; DOI: 10.1109/TPDS.2024.3447221
Donglei Wu; Weihao Yang; Xiangyu Zou; Hao Feng; Dingwen Tao; Shiyi Li; Wen Xia; Binxing Fang
The Top-K sparsification-based compression framework has been extensively explored for reducing communication costs in distributed learning. However, we identified several issues with existing Top-K sparsification-based compression methods: (i) the limited compressibility of the Top-K parameters' indexes critically restricts the overall communication compression ratio; (ii) several time-consuming compression operations significantly offset the benefits of communication compression; (iii) the use of error feedback techniques to maintain model quality results in a high memory footprint. To solve these issues, we propose BIRD, a lightweight tensor-wise Bi-Random sampling strategy with an expectation invariance property. Specifically, BIRD applies a tensor-wise index sharing mechanism that reduces the index proportion by allowing multiple tensor elements to share a single index, thus improving the overall compression ratio. Additionally, BIRD replaces the time-consuming Top-K sorting with a faster Bi-Random sampling strategy based on the aforementioned index sharing mechanism, significantly reducing compression overheads. Moreover, BIRD builds an expectation invariance property into the Bi-Random sampling to ensure an approximately unbiased representation for the $L_1$-norm of the sampled tensors, effectively maintaining the model quality without incurring extra memory costs. We further optimize BIRD to BIRD+ by introducing uniform distribution-based sampling and Gamma correction into the tensor-wise sampling process, achieving a more flexible adjustment of the sparsity with better convergence performance. Experimental evaluations across multiple conventional distributed learning tasks demonstrate that, compared to state-of-the-art approaches, BIRD+ achieves higher communication compression ratios of up to 36.2$\times$ and higher computation throughput of up to 149.6$\times$ while maintaining the model quality without incurring extra memory costs.
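The three issues listed in this abstract are easier to see against the baseline. The sketch below shows plain Top-K sparsification with error feedback, not BIRD or BIRD+: it sorts by magnitude (the costly step), ships one index per kept value (the index overhead), and keeps a residual buffer as large as the gradient (the memory cost). Function names and the 4-byte value and index widths are assumptions made for illustration.

```python
from typing import List, Tuple


def topk_sparsify(grad: List[float], residual: List[float], k: int) -> Tuple[List[int], List[float], List[float]]:
    """Return (indexes, values, new_residual) for Top-K with error feedback."""
    # Error feedback: add back what earlier rounds left unsent.
    corrected = [g + r for g, r in zip(grad, residual)]
    # Sorting by magnitude is the time-consuming part of Top-K.
    order = sorted(range(len(corrected)), key=lambda i: abs(corrected[i]), reverse=True)
    indexes = sorted(order[:k])
    values = [corrected[i] for i in indexes]
    # Everything that was not sent stays in a full-size residual buffer.
    new_residual = list(corrected)
    for i in indexes:
        new_residual[i] = 0.0
    return indexes, values, new_residual


def compression_ratio(n: int, k: int, value_bytes: int = 4, index_bytes: int = 4) -> float:
    """Dense payload size divided by sparse payload size (values plus indexes)."""
    return (n * value_bytes) / (k * (value_bytes + index_bytes))


idx, vals, res = topk_sparsify([0.25, -2.0, 0.5, 3.0], [0.0] * 4, k=2)
print(idx, vals, res)
print(compression_ratio(1000, 10))
```

With equal value and index widths, the indexes halve the achievable ratio (keeping 10 of 1000 elements yields 50x instead of 100x), which is the kind of overhead an index sharing scheme aims to reduce.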
IEEE Transactions on Parallel and Distributed Systems, vol. 35, no. 11, pp. 2193-2207
Citations: 0
Fair Coflow Scheduling via Controlled Slowdown
IF 5.6; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS; Pub Date: 2024-08-20; DOI: 10.1109/TPDS.2024.3446188
Francesco De Pellegrini; Vaibhav Kumar Gupta; Rachid El Azouzi; Serigne Gueye; Cedric Richier; Jeremie Leguay
The average coflow completion time (CCT) is the standard performance metric in coflow scheduling. However, standard CCT minimization may introduce unfairness between the data transfer phases of different computing jobs. Thus, while progress guarantees have been introduced in the literature to mitigate this fairness issue, the trade-off between fairness and efficiency of data transfer is hard to control. This paper introduces a fairness framework for coflow scheduling based on the concept of slowdown, i.e., the performance loss of a coflow compared to isolation. By controlling the slowdown, it is possible to enforce a target coflow progress while minimizing the average CCT. In the proposed framework, the minimum slowdown for a batch of coflows can be determined in polynomial time. By showing the equivalence with Gaussian elimination, slowdown constraints are introduced into primal-dual iterations of the CoFair algorithm. The algorithm extends the class of the $\sigma$-order schedulers to solve the fair coflow scheduling problem in polynomial time. It provides a 4-approximation of the average CCT w.r.t. an optimal scheduler. Extensive numerical results demonstrate that this approach can trade off average CCT for slowdown more efficiently than existing state-of-the-art schedulers.
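The notion of slowdown can be sketched numerically. The example assumes the common non-blocking "big switch" fabric model with one uniform port bandwidth, and it assumes slowdown is measured as the ratio of the observed CCT to the CCT the coflow would achieve alone; the abstract does not spell out either choice, and the function names are hypothetical. This is not the CoFair algorithm.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Flow = Tuple[int, int, float]  # (ingress port, egress port, volume)


def isolation_cct(flows: List[Flow], port_bandwidth: float) -> float:
    """CCT of a coflow running alone: its most loaded port is the bottleneck."""
    load: Dict[Tuple[str, int], float] = defaultdict(float)
    for src, dst, volume in flows:
        load[("in", src)] += volume
        load[("out", dst)] += volume
    return max(load.values()) / port_bandwidth


def slowdown(observed_cct: float, flows: List[Flow], port_bandwidth: float) -> float:
    """Observed CCT relative to the isolation CCT (1.0 means no loss)."""
    return observed_cct / isolation_cct(flows, port_bandwidth)


# Hypothetical coflow: ingress port 0 and egress port 1 each carry 6 units.
coflow = [(0, 1, 4.0), (0, 2, 2.0), (3, 1, 2.0)]
print(isolation_cct(coflow, 1.0))
print(slowdown(9.0, coflow, 1.0))
```

A scheduler that finishes this coflow after 9 time units, when it would need 6 in isolation, imposes a slowdown of 1.5 under these assumptions.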
IEEE Transactions on Parallel and Distributed Systems, vol. 35, no. 12, pp. 2347-2360
Citations: 0
Privacy-Preserving Data Selection for Horizontal and Vertical Federated Learning
IF 5.6; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS; Pub Date: 2024-08-19; DOI: 10.1109/TPDS.2024.3439709
Lan Zhang; Anran Li; Hongyi Peng; Feng Han; Fan Huang; Xiang-Yang Li
Federated learning (FL) enables distributed participants to collaboratively train a machine learning model without accessing their local data. In FL systems, the selection of training samples has a significant impact on model performance; e.g., selecting participants whose datasets have low-quality samples and features would result in low-accuracy, unstable models. In this work, we aim to solve the problem of selecting a collection of high-quality training samples for a given FL task under a monetary budget. We propose a holistic design to efficiently select high-quality samples while preserving the privacy of participants' local data and the server's label set. We propose an efficient hierarchical sample selection mechanism to select relevant clients and their samples before training for horizontal federated learning (HFL). It uses the determinantal point process (DPP) to select statistically homogeneous and content-diverse clients and samples. In addition, we propose a private set intersection (PSI) based scheme to filter relevant features for the target vertical federated learning (VFL) task. Finally, during training, an erroneous-aware importance-based selection is proposed to dynamically select important clients and samples to accelerate model convergence. We verify the merits of our proposed solution with extensive experiments on a real AIoT system with 50 clients. The experimental results validate that our solution achieves accurate and efficient selection of high-quality data, and consequently an FL model with faster convergence and higher accuracy.
IEEE Transactions on Parallel and Distributed Systems, vol. 35, no. 11, pp. 2054-2068
Citations: 0
Logical Synchrony and the Bittide Mechanism
IF 5.6; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS; Pub Date: 2024-08-16; DOI: 10.1109/TPDS.2024.3444739
Sanjay Lall; Călin Caşcaval; Martin Izzard; Tammo Spalink
We introduce logical synchrony, a framework that allows distributed computing to be coordinated as tightly as in synchronous systems without the distribution of a global clock or any reference to universal time. We develop a model of events called a logical synchrony network, in which nodes correspond to processors and every node has an associated local clock which generates the events. We construct a measure of logical latency and develop its properties. A further model, called a multiclock network, is then analyzed and shown to be a refinement of the logical synchrony network. We present the bittide mechanism as an instantiation of multiclock networks, and discuss the clock control mechanism that ensures that buffers do not overflow or underflow. Finally we give conditions under which a logical synchrony network has an equivalent synchronous realization.
IEEE Transactions on Parallel and Distributed Systems, vol. 35, no. 11, pp. 1936-1948. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10638228
Citations: 0
Paired Many-to-Many 2-Disjoint Path Covers in Meshes
IF 5.6; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS; Pub Date: 2024-08-16; DOI: 10.1109/TPDS.2024.3445283
Fatemeh Keshavarz-Kohjerdi
In the paired many-to-many $k$-disjoint path cover ($k$-DPC) problem, given a set of $k$ pairs of vertices $(s_{i},t_{i})$, $1\leqslant i\leqslant k$, of a graph $G$, we want to find $k$ simple vertex-disjoint paths whose end-vertices are these $k$ pairs, such that each vertex of $G$ is covered by a path. This is a well-known problem in parallel processing and a generalization of the well-known Hamiltonian $(s,t)$-path problem, which is equivalent to 1-DPC. In this paper, we consider the paired many-to-many 2-disjoint path cover problem (2-DPC) in meshes (rectangular grids). We give the necessary conditions for the existence of such covers, and present a linear-time algorithm to compute them. Although the paired many-to-many $k$-disjoint path cover problem is well-known in parallel processing, our motivation to study this problem is its application in solving the Hamiltonian path problem in solid grid graphs. We consider the case where the pairs of vertices are on the outer face of the graph.
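The problem definition can be made concrete with a small checker. The sketch below only verifies that a candidate set of paths is a paired many-to-many disjoint path cover of a rows x cols mesh; it does not construct one and is not the paper's linear-time algorithm. The function name and the (row, column) vertex encoding are assumptions.

```python
from typing import List, Tuple

Vertex = Tuple[int, int]  # (row, column) in the mesh


def is_paired_dpc(rows: int, cols: int,
                  pairs: List[Tuple[Vertex, Vertex]],
                  paths: List[List[Vertex]]) -> bool:
    """Check that paths[i] joins pairs[i], paths are disjoint, and all vertices are covered."""
    if len(pairs) != len(paths):
        return False
    seen = set()
    for (s, t), path in zip(pairs, paths):
        if not path or path[0] != s or path[-1] != t:
            return False  # wrong end-vertices
        for (r1, c1), (r2, c2) in zip(path, path[1:]):
            if abs(r1 - r2) + abs(c1 - c2) != 1:
                return False  # consecutive vertices must be mesh neighbours
        for r, c in path:
            if not (0 <= r < rows and 0 <= c < cols) or (r, c) in seen:
                return False  # outside the mesh, or visited twice
            seen.add((r, c))
    return len(seen) == rows * cols  # every vertex lies on some path


# A 2 x 3 mesh covered by its two rows.
pairs = [((0, 0), (0, 2)), ((1, 0), (1, 2))]
paths = [[(0, 0), (0, 1), (0, 2)], [(1, 0), (1, 1), (1, 2)]]
print(is_paired_dpc(2, 3, pairs, paths))
```

With a single pair, the same check verifies a Hamiltonian $(s,t)$-path, which matches the 1-DPC special case mentioned in the abstract.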
IEEE Transactions on Parallel and Distributed Systems, vol. 35, no. 10, pp. 1854-1866
Citations: 0
FlexRaft: Exploiting Flexible Erasure Coding for Minimum-Cost Consensus and Fast Recovery
IF 5.6; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, THEORY & METHODS; Pub Date: 2024-08-14; DOI: 10.1109/TPDS.2024.3443424
Mi Zhang; Qihan Kang; Patrick P. C. Lee
Consensus protocols like Paxos and Raft provide data consistency and fault tolerance for distributed services. Log replication in these protocols can be supported by erasure coding, which incurs lower redundancy than full-copy replication and significantly saves network and storage costs for overall performance improvements. However, existing consensus protocols with erasure coding cannot achieve the minimum network and storage costs during log replication. We propose FlexRaft, which dynamically varies the coding scheme used in Raft based on the server status to always achieve the theoretically minimum redundancy ratio, while maintaining the same liveness as in Raft. To address the issue of an inconsistent coding scheme between the leader and its followers, we specify the prerequisite of overwriting a log entry and also allow the leader and its followers to exactly track the coding scheme being used. We further extend FlexRaft into FlexRaft+, which provides a different storage layout to vary the coding scheme through a novel technique called re-encoding-free replication, so as to enable fast server recovery. We prove that both FlexRaft and FlexRaft+ maintain Raft safety. We implement a prototype of FlexRaft and FlexRaft+, atop which we build a distributed key-value store to show its efficacy. Experiments on Alibaba Cloud show that FlexRaft achieves the theoretically minimum network and storage costs in practice, and reduces the commit latency by 44.51% and 19.37% compared with state-of-the-art CRaft and HRaft, respectively. FlexRaft+ further reduces the commit latency when the coding scheme is being varied and improves the server recovery performance.
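The redundancy argument in this abstract comes down to simple arithmetic. The sketch assumes a generic (k, m) erasure code in which a log entry is split into k data fragments plus m parity fragments, one fragment per server; full-copy replication on N servers is modelled as the degenerate case k = 1, m = N - 1. The function names are hypothetical, and the rule FlexRaft uses to pick the coding scheme from the server status is not reproduced here.

```python
from fractions import Fraction


def redundancy_ratio(k: int, m: int) -> Fraction:
    """Bytes stored (and sent) per byte of original data under a (k, m) code."""
    return Fraction(k + m, k)


def fragment_bytes(entry_bytes: int, k: int) -> int:
    """Size of each fragment when an entry is split into k data fragments (rounded up)."""
    return -(-entry_bytes // k)


# Five servers: full-copy replication versus a (3, 2) code.
print(redundancy_ratio(1, 4))   # every server holds a whole copy
print(redundancy_ratio(3, 2))   # every server holds one third of the entry
print(fragment_bytes(3000, 3))
```

On five servers, moving from full copies to a (3, 2) code cuts the redundancy ratio from 5 to 5/3, which is where the network and storage savings of coded log replication come from.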
IEEE Transactions on Parallel and Distributed Systems, vol. 35, no. 10, pp. 1826-1840
引用次数: 0
SSRAID: A Stripe-Queued and Stripe-Threaded Merging I/O Strategy to Improve Write Performance of Serial Interface SSD RAID
IF 5.6 | Zone 2, Computer Science | Q1 Computer Science, Theory & Methods | Pub Date: 2024-08-14 | DOI: 10.1109/TPDS.2024.3443083
Peixuan Li; Ping Xie; Qiang Cao
RAID (Redundant Array of Independent Disks) is widely used to improve the read and write performance of storage systems. However, existing software RAID implementations do not fully exploit the write performance of serial-interface SSDs (Solid State Drives). The most popular software RAID today is Linux Multiple-Disks (MD), and the most recent is StRAID. We observe that both suffer from thread contention in multi-threaded mode, especially on serial-interface SSDs: multiple threads writing to the same address limit write performance. In this paper, we propose SSRAID, a stripe-queued and stripe-threaded merging I/O strategy. First, SSRAID separates write requests for different stripes using a set of stripe-queues and stripe-threads so that they do not interfere with one another. This eliminates write thread contention and lets the stripe-threads sustain the highest parallel efficiency. Second, each stripe-thread can merge write requests from the same stripe-queue multiple times, reducing the number of additional write I/Os. Finally, SSRAID introduces a stage buffer based on data merging. During a partial stripe write, write-induced read I/Os on the SSD become direct accesses to the stage buffer, which reduces write-induced read I/Os. Compared to StRAID, SSRAID improves average sequential write throughput by 86% and reduces average sequential write latency by 61% in the optimal case.
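The stripe-queue idea in the abstract above can be illustrated with a minimal, single-threaded Python sketch. It is not the authors' implementation: the names `route_by_stripe`, `merge_stripe_queue`, `plan_writes`, and the `STRIPE_SIZE` value are hypothetical, writes are modeled as `(block_address, data)` pairs, and merging is simplified to "the latest write to a block wins". Real stripe-threads, parity updates, and the stage buffer are left out.

```python
from collections import defaultdict

STRIPE_SIZE = 4  # blocks per stripe; hypothetical value for illustration


def route_by_stripe(requests, stripe_size=STRIPE_SIZE):
    """Place each (block_address, data) write into the queue of its stripe."""
    queues = defaultdict(list)
    for block, data in requests:
        queues[block // stripe_size].append((block, data))
    return queues


def merge_stripe_queue(queue):
    """Merge the writes queued for one stripe; a later write to a block wins."""
    merged = {}
    for block, data in queue:
        merged[block] = data
    return dict(sorted(merged.items()))


def plan_writes(requests, stripe_size=STRIPE_SIZE):
    """Return one merged write set per stripe, keyed by stripe index."""
    queues = route_by_stripe(requests, stripe_size)
    return {stripe: merge_stripe_queue(q) for stripe, q in sorted(queues.items())}


if __name__ == "__main__":
    writes = [(0, "a"), (5, "b"), (1, "c"), (0, "d"), (6, "e")]
    print(plan_writes(writes))
```

Because each stripe has its own queue, a worker assigned to one stripe never touches another stripe's requests, which is the property the abstract credits with removing write thread contention. In this toy model, the two writes to block 0 collapse into a single write.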
IEEE Transactions on Parallel and Distributed Systems, vol. 35, no. 10, pp. 1841-1853. Published 2024-08-14. https://doi.org/10.1109/TPDS.2024.3443083
Citations: 0
Proteus: Simulating the Performance of Distributed DNN Training
IF 5.6 | Zone 2, Computer Science | Q1 Computer Science, Theory & Methods | Pub Date: 2024-08-14 | DOI: 10.1109/TPDS.2024.3443255
Jiangfei Duan; Xiuhong Li; Ping Xu; Xingcheng Zhang; Shengen Yan; Yun Liang; Dahua Lin
DNN models keep growing in pursuit of unprecedented accuracy, and the accompanying computation and memory requirements call for massive clusters and elaborate parallelization strategies to accelerate training. To optimize performance and analyze cost, it is essential to model the training throughput of distributed DNN training. However, complex parallelization strategies and the complex runtime behaviors they produce make it challenging to construct an accurate performance model. In this article, we present Proteus, the first standalone simulator to model the performance of complex parallelization strategies through simulated execution. Proteus first models complex parallelization strategies with a unified representation named Strategy Tree. It then compiles the strategy tree into a distributed execution graph and simulates the complex runtime behaviors, namely comp-comm overlap and bandwidth sharing, with a Hierarchical Topo-Aware Executor (HTAE). Finally, we evaluate Proteus across a wide variety of DNNs on three hardware configurations. Experimental results show that Proteus achieves 3.0% average prediction error and preserves the ordering of training throughput across parallelization strategies. Compared to state-of-the-art approaches, Proteus reduces prediction error by up to 133.8%.
IEEE Transactions on Parallel and Distributed Systems, vol. 35, no. 10, pp. 1867-1878. Published 2024-08-14. https://doi.org/10.1109/TPDS.2024.3443255
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10636756
Citations: 0
Opca: Enabling Optimistic Concurrent Access for Multiple Users in Oblivious Data Storage
IF 5.6 | Zone 2, Computer Science | Q1 Computer Science, Theory & Methods | Pub Date: 2024-08-12 | DOI: 10.1109/TPDS.2024.3441623
Yuezhi Che; Dazhao Cheng; Xiao Wang; Rujia Wang
The data privacy and security challenges posed by data outsourcing are becoming increasingly prevalent. Oblivious RAM (ORAM)-based oblivious data storage guarantees data confidentiality through data encryption and access pattern obfuscation. However, it suffers from performance degradation and low throughput. To address these issues, the concurrency of ORAM in a multi-user scenario has been explored. We investigate several existing concurrent oblivious data storage solutions and find that they use a trusted proxy to serve concurrent accesses between users and storage, with processing locks in the proxy to ensure correctness and prevent conflicts. Such a proxy-based system is inherently prone to pessimistic concurrency control, and as the number of users grows, the proxy may become a performance bottleneck and cause significant delays. In this study, we propose Opca, a novel oblivious data storage framework that enables optimistic concurrent access. Opca refines the proxy design by temporally storing multiple versions of modified data with labeled timestamps and committing only the latest version to the storage during a separate processing period. Opca is implemented and evaluated on different real-world storage backends with a scalable number of users, and its performance is compared to alternative schemes. Opca outperforms the state-of-the-art concurrent oblivious storage system TaoStore, which relies on a similar system setting. Our results show that Opca improves throughput by 3.77x and reduces response time by 73.5%.
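The versioning step described in the abstract above (keep several timestamped versions at the proxy, commit only the latest one later) can be sketched in Python as follows. This is an illustrative assumption, not Opca's actual design: the class `VersionBuffer`, its methods, and the plain-dict `storage` are hypothetical, and the sketch deliberately omits encryption and the ORAM access-pattern obfuscation that make the real system oblivious.

```python
from collections import defaultdict


class VersionBuffer:
    """Hypothetical proxy-side buffer holding timestamped versions per key."""

    def __init__(self):
        # key -> list of (timestamp, value) pairs, in arrival order
        self._versions = defaultdict(list)

    def write(self, key, value, timestamp):
        """Record a new version without taking a lock on the key."""
        self._versions[key].append((timestamp, value))

    def read(self, key, storage):
        """Return the newest buffered version, else fall back to storage."""
        versions = self._versions.get(key)
        if versions:
            return max(versions, key=lambda v: v[0])[1]
        return storage.get(key)

    def commit(self, storage):
        """Write only the latest version of each key; return keys committed."""
        for key, versions in self._versions.items():
            storage[key] = max(versions, key=lambda v: v[0])[1]
        committed = len(self._versions)
        self._versions.clear()
        return committed


if __name__ == "__main__":
    storage = {"x": "old"}
    buf = VersionBuffer()
    buf.write("x", "v1", timestamp=1)
    buf.write("x", "v3", timestamp=3)
    buf.write("x", "v2", timestamp=2)
    print(buf.read("x", storage), buf.commit(storage), storage)
```

Writers append versions without blocking each other, and the separate `commit` call stands in for the abstract's "separate processing period": three buffered writes to `x` result in a single write to the backing store.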
IEEE Transactions on Parallel and Distributed Systems, vol. 35, no. 11, pp. 1891-1903. Published 2024-08-12. https://doi.org/10.1109/TPDS.2024.3441623
Citations: 0