
Latest Publications in IEEE Transactions on Computers

LogSay: An Efficient Comprehension System for Log Numerical Reasoning
IF 3.7 | CAS Tier 2 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-04-08 | DOI: 10.1109/TC.2024.3386068
Jiaxing Qi;Zhongzhi Luan;Shaohan Huang;Carol Fung;Hailong Yang
With the growth of smart systems and applications, high-volume logs are generated that record important data for system maintenance. System developers are usually required to analyze logs to track the status of the system or applications, so it is essential that they can find answers to their questions in large-scale logs. In this work, we design a multi-step "Retriever-Reader" question-answering system, namely LogSay, which aims at predicting answers accurately and efficiently. Our system not only answers simple questions, such as retrieving a log segment or span, but also answers complex logical questions through numerical reasoning. LogSay has two key components: Log Retriever and Log Reasoner, and we designed five operators to implement them. Log Retriever retrieves logs relevant to a question; Log Reasoner then performs numerical reasoning over them to infer the final answer. In addition, due to the lack of available question-answering datasets for system logs, we constructed question-answering datasets based on three public log datasets and will make them publicly available. Our evaluation results show that LogSay outperforms state-of-the-art approaches in terms of accuracy and efficiency.
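A minimal sketch of the two-stage flow the abstract describes, in Python: a retriever scores log lines against the question, and a reasoner applies a numerical operator over the retrieved lines. The keyword scorer and the COUNT/MAX operators are illustrative assumptions; the abstract does not name LogSay's five operators.

```python
import re

LOGS = [
    "2024-04-08 10:00:01 node-3 disk latency 120 ms",
    "2024-04-08 10:00:02 node-1 disk latency 95 ms",
    "2024-04-08 10:00:03 node-3 disk latency 210 ms",
]

def retrieve(question, logs, k=2):
    """Log Retriever stand-in: rank log lines by keyword overlap with the question."""
    terms = set(re.findall(r"\w+", question.lower()))
    return sorted(logs, key=lambda l: -len(terms & set(re.findall(r"\w+", l.lower()))))[:k]

def reason(operator, logs):
    """Log Reasoner stand-in: apply a numerical operator over values in retrieved logs."""
    values = [int(v) for l in logs for v in re.findall(r"(\d+) ms", l)]
    if operator == "COUNT":
        return len(values)
    if operator == "MAX":
        return max(values)
    raise ValueError(f"unsupported operator: {operator}")

hits = retrieve("what is the highest disk latency on node-3", LOGS)
print(reason("MAX", hits))  # -> 210
```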
{"title":"LogSay: An Efficient Comprehension System for Log Numerical Reasoning","authors":"Jiaxing Qi;Zhongzhi Luan;Shaohan Huang;Carol Fung;Hailong Yang","doi":"10.1109/TC.2024.3386068","DOIUrl":"10.1109/TC.2024.3386068","url":null,"abstract":"With the growth of smart systems and applications, high volume logs are generated that record important data for system maintenance. System developers are usually required to analyze logs to track the status of the system or applications. Therefore, it is essential to find the answers in large-scale logs when they have some questions. In this work, we design a multi-step \u0000<italic>“Retriever-Reader”</i>\u0000 question-answering system, namely LogSay, which aims at predicting answers accurately and efficiently. Our system can not only answers simple questions, such as a segment log or span, but also can answer complex logical questions through numerical reasoning. LogSay has two key components: \u0000<italic>Log Retriever</i>\u0000 and \u0000<italic>Log Reasoner</i>\u0000, and we designed five operators to implement them. \u0000<italic>Log Retriever</i>\u0000 aims at retrieving some relevant logs based on a question. Then, \u0000<italic>Log Reasoner</i>\u0000 performs numerical reasoning to infer the final answer. In addition, due to the lack of available question-answering datasets for system logs, we constructed question-answering datasets based on three public log datasets and will make them publicly available. Our evaluation results show that LogSay outperforms the state-of-the-art works in terms of accuracy and efficiency.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 7","pages":"1809-1821"},"PeriodicalIF":3.7,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140575450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
vKernel: Enhancing Container Isolation via Private Code and Data
IF 3.7 | CAS Tier 2 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-04-08 | DOI: 10.1109/TC.2024.3383988
Hang Huang;Honglei Wang;Jia Rao;Song Wu;Hao Fan;Chen Yu;Hai Jin;Kun Suo;Lisong Pan
Container technology is increasingly adopted in cloud environments. However, the lack of isolation in the shared kernel has become a significant barrier to the wide adoption of containers. The challenge lies in how to attain high performance and strong isolation simultaneously. On the one hand, kernel-level isolation mechanisms, such as seccomp, capabilities, and apparmor, achieve good performance without much overhead but lack support for per-container customization. On the other hand, user-level and VM-based isolation offer superior security guarantees and allow for customization, since each container is assigned a dedicated kernel, but at the cost of high overhead. We present vKernel, a kernel isolation framework. It maintains a minimal set of code and data that are either sensitive or prone to interference in a vKernel Instance (vKI). vKernel relies on inline hooks to intercept requests sent to the host kernel and redirect them to a vKI, where container-specific security rules, functions, and data are implemented. Through case studies, we demonstrate that under vKernel, user-defined data isolation and kernel customization can be supported with a reasonable engineering effort. An evaluation of vKernel with micro-benchmarks, cloud services, and real-world applications shows that vKernel achieves good security guarantees with much less overhead.
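The inline-hook redirection idea can be pictured with a small user-space simulation in Python: a hook stands in front of the shared kernel path and routes each request to the calling container's vKI, which applies that container's own rules and private state. Class names, rule names, and return values here are invented for illustration; the real system hooks the host kernel in kernel space.

```python
# Toy simulation of vKernel-style dispatch: requests to the shared kernel are
# redirected to the calling container's private vKernel Instance (vKI).

class VKI:
    def __init__(self, denied_syscalls):
        self.denied = set(denied_syscalls)   # container-specific security rules
        self.private_data = {}               # per-container kernel state

    def handle(self, syscall, *args):
        if syscall in self.denied:
            return f"EPERM: {syscall} denied by this container's policy"
        return f"{syscall}{args} handled with private state"

VKIS = {"web": VKI(denied_syscalls={"mount"}),
        "db":  VKI(denied_syscalls=set())}

def hooked_syscall(container, syscall, *args):
    """Inline-hook stand-in: route to the container's vKI instead of shared code."""
    return VKIS[container].handle(syscall, *args)

print(hooked_syscall("web", "mount", "/dev/sda1"))  # blocked for 'web' only
print(hooked_syscall("db", "mount", "/dev/sda1"))   # allowed for 'db'
```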
{"title":"vKernel: Enhancing Container Isolation via Private Code and Data","authors":"Hang Huang;Honglei Wang;Jia Rao;Song Wu;Hao Fan;Chen Yu;Hai Jin;Kun Suo;Lisong Pan","doi":"10.1109/TC.2024.3383988","DOIUrl":"10.1109/TC.2024.3383988","url":null,"abstract":"Container technology is increasingly adopted in cloud environments. However, the lack of isolation in the shared kernel becomes a significant barrier to the wide adoption of containers. The challenges lie in how to simultaneously attain high performance and isolation. On the one hand, kernel-level isolation mechanisms, such as \u0000<italic>seccomp</i>\u0000, \u0000<italic>capabilities</i>\u0000, and \u0000<italic>apparmor</i>\u0000, achieve good performance without much overhead, but lack the support for per-container customization. On the other hand, user-level and VM-based isolation offer superior security guarantees and allow for customization, since a container is assigned a dedicated kernel, but at the cost of high overhead. We present vKernel, a kernel isolation framework. It maintains a minimal set of code and data that are either sensitive or prone to interference in a \u0000<italic>vKernel Instance</i>\u0000 (vKI). vKernel relies on inline hooks to intercept and redirect requests sent to the host kernel to a vKI, where container-specific security rules, functions, and data are implemented. Through case studies, we demonstrate that under vKernel user-defined data isolation and kernel customization can be supported with a reasonable engineering effort. An evaluation of vKernel with micro-benchmarks, cloud services, real-world applications show that vKernel achieves good security guarantees, but with much less overhead.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 7","pages":"1711-1723"},"PeriodicalIF":3.7,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10494778","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140575535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Hybrid-Memcached: A Novel Approach for Memcached Persistence Optimization With Hybrid Memory
IF 3.7 | CAS Tier 2 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-04-04 | DOI: 10.1109/TC.2024.3385279
Zhang Jiang;Xianduo Li;Tianxiang Peng;Haoran Li;Jingxuan Hong;Jin Zhang;Xiaoli Gong
Memcached is a widely adopted, high-performance, in-memory key-value object caching system utilized in data centers. Nonetheless, its data is stored in volatile DRAM, making the cached data susceptible to loss during system shutdowns. Consequently, cold restarts experience significant delays. Persistent memory is a byte-addressable, large-capacity, non-volatile storage medium that can be employed to avoid the cold-restart problem. However, deploying Memcached on persistent memory requires consideration of issues such as write endurance, asymmetric read/write latency and bandwidth, and the write granularity of persistent memory. In this paper, we propose Hybrid-Memcached, an optimized Memcached framework based on a hybrid combination of DRAM and persistent memory. Hybrid-Memcached includes three key components: (1) a DRAM-based data aggregation buffer that avoids multiple fine-grained writes and thereby extends the write endurance of persistent memory, (2) a data-object alignment mechanism that avoids write amplification, and (3) a non-temporal store instruction-based writing strategy that improves bandwidth utilization. We have implemented Hybrid-Memcached on Intel Optane persistent memory. Several micro-benchmarks are designed to evaluate Hybrid-Memcached under varying read/write ratios, access distributions, and key-value item sizes. Additionally, we evaluated it with the YCSB benchmark, showing a 21.2% performance improvement for fully write-intensive workloads and 11.8% for read-write balanced workloads.
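Component (1) can be sketched as follows: fine-grained writes are staged in a DRAM buffer and flushed to persistent memory only in large, granularity-aligned chunks, trading many small device writes for a few coarse ones. The 256-byte write unit and the 4 KiB buffer capacity are assumptions for illustration, not figures from the paper.

```python
# Sketch of a DRAM-side aggregation buffer in front of persistent memory (PM).
PM_WRITE_GRANULARITY = 256  # assumed internal write unit of the PM device

class AggregationBuffer:
    def __init__(self, capacity=4096):
        self.buf = bytearray()
        self.capacity = capacity
        self.pm_writes = 0  # count of coarse writes reaching persistent memory

    def put(self, value: bytes):
        self.buf += value
        if len(self.buf) >= self.capacity:
            self.flush()

    def flush(self):
        # pad to the write granularity so the device never sees a partial line
        pad = (-len(self.buf)) % PM_WRITE_GRANULARITY
        self.buf += b"\0" * pad
        self.pm_writes += len(self.buf) // PM_WRITE_GRANULARITY
        self.buf.clear()

buf = AggregationBuffer()
for i in range(1000):
    buf.put(b"key%03d=value" % i)   # 1000 fine-grained writes...
buf.flush()
print(buf.pm_writes)                # ...become far fewer coarse PM writes
```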
{"title":"Hybrid-Memcached: A Novel Approach for Memcached Persistence Optimization With Hybrid Memory","authors":"Zhang Jiang;Xianduo Li;Tianxiang Peng;Haoran Li;Jingxuan Hong;Jin Zhang;Xiaoli Gong","doi":"10.1109/TC.2024.3385279","DOIUrl":"10.1109/TC.2024.3385279","url":null,"abstract":"Memcached is a widely adopted, high-performance, in-memory key-value object caching system utilized in data centers. Nonetheless, its data is stored in volatile DRAM, making the cached data susceptible to loss during system shutdowns. Consequently, cold restarts experience significant delays. Persistent memory is a byte-addressable, large-capacity, and non-volatility storage media, which can be employed to avoid the cold restart problem. However, deploying Memcached on persistent memory requires consideration of issues such as write endurance, asymmetric read/write latency and bandwidth, and write granularity of persistent memory. In this paper, we propose Hybrid-Memcached, an optimized Memcached framework based on a hybrid combination of DRAM and persistent memory. Hybrid-Memcached includes three key components: (1) a DRAM-based data aggregation buffer to avoid multiple fine-grained writes, which extends the write endurance of persistent memory, (2) a data-object alignment mechanism to avoid write amplification, and (3) a non-temporal store instruction-based writing strategy to improve the bandwidth utilization. We have implemented Hybrid-Memcached on the Intel Optane persistent memory. Several micros-benchmarks are designed to evaluate Hybrid-Memcached by varying read/write ratios, access distributions, and key-value item sizes. Additionally, we evaluated it with the YCSB benchmark, showing a 21.2% performance improvement for fully write-intensive workloads and 11.8% for read-write balanced workloads.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 7","pages":"1866-1874"},"PeriodicalIF":3.7,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140575534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LAC: A Workload Intensity-Aware Caching Scheme for High-Performance SSDs
IF 3.7 | CAS Tier 2 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-04-04 | DOI: 10.1109/TC.2024.3385290
Hui Sun;Haoqiang Tong;Yinliang Yue;Xiao Qin
Inside a NAND Flash-based solid-state disk (SSD), a DRAM-based write-back cache is a practical approach to bolstering SSD performance. Existing caching schemes overlook the problem of high user I/O intensity caused by the dramatic increase in I/O accesses. Heavy I/O intensity causes access conflicts among I/O requests inside an SSD: a large number of requests are blocked, impairing response time. Conventional passive-update caching schemes merely replace pages upon access misses when the cache is full, so tail latency occurs under colossal I/O intensity. Active write-back caching schemes utilize idle time among requests, coupled with free internal bandwidth, to flush dirty data into flash memory in advance, lowering response time. Frequent active write-back operations, however, cause access conflicts among requests – a culprit that inflates write amplification (WA) and degrades SSD lifetime. We address the above issues by proposing a workLoad intensity-aware and Active parallel Caching scheme – LAC – powered by collaborative-load awareness. LAC fends off user I/O access conflicts under high-I/O-intensity workloads. If the I/O intensity is low – intervals between consecutive I/O requests are large – and the target die is free, LAC actively and concurrently writes dirty data of adjacent addresses back to the die, producing clean data through the active write-back. Replacing clean data first reduces response time and prevents flash transactions from being blocked. We also devise a data protection method that writes back cold data based on various criteria during cache replacement and active write-backs. Thus, LAC reduces the WA incurred by actively writing back hot data and extends SSD lifetime. We compare LAC against six caching schemes (LRU, CFLRU, GCaR-LRU, MQSim, VS-Batch, and Co-Active) in the modern MQSim simulator. The results show that LAC trims response time and erase count by up to 78.5% and 47.8%, with averages of 64.4% and 16.6%, respectively.
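A condensed sketch of LAC's two decisions as described above, under assumed parameters: during long inter-request gaps the cache flushes dirty pages in advance, and on insertion it prefers a clean victim so eviction never waits on a flash write. The data structures and the 5 ms idle threshold are illustrative, not the paper's.

```python
# Toy model: intensity-aware active write-back plus clean-first eviction.
IDLE_THRESHOLD = 5.0  # ms between requests considered "low intensity" (assumed)

class Page:
    def __init__(self, addr):
        self.addr, self.dirty = addr, True

class LACCache:
    def __init__(self, capacity):
        self.pages, self.capacity = {}, capacity

    def maybe_active_writeback(self, gap_ms, die_free):
        if gap_ms > IDLE_THRESHOLD and die_free:
            for p in self.pages.values():
                p.dirty = False      # flushed in advance; now cheap to evict

    def insert(self, addr):
        if len(self.pages) >= self.capacity:
            # prefer a clean victim: eviction is then a free drop, not a write
            victim = min(self.pages.values(), key=lambda p: p.dirty)
            self.pages.pop(victim.addr)
        self.pages[addr] = Page(addr)

cache = LACCache(capacity=2)
cache.insert(0x10)
cache.insert(0x20)
cache.maybe_active_writeback(gap_ms=8.0, die_free=True)
cache.insert(0x30)   # evicts a clean page; no blocking flash write
print(sorted(hex(a) for a in cache.pages))
```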
{"title":"LAC: A Workload Intensity-Aware Caching Scheme for High-Performance SSDs","authors":"Hui Sun;Haoqiang Tong;Yinliang Yue;Xiao Qin","doi":"10.1109/TC.2024.3385290","DOIUrl":"10.1109/TC.2024.3385290","url":null,"abstract":"Inside an NAND Flash-based solid-state disk (SSD), utilizing DRAM-based write-back caching is a practical approach to bolstering the SSD performance. Existing caching schemes overlook the problem of high user I/Os intensity due to the dramatic increment of I/Os accesses. The hefty I/O intensity causes access conflict of I/O requests inside an SSD: a large number of requests are blocked to impair response time. Conventional passive update caching schemes merely replace pages upon access misses in event of full cache. Tail latency occurs facing a colossal I/O intensity. Active write-back caching schemes utilize idle time among requests coupled with free internal bandwidth to flush dirty data into flash memory in advance, lowering response time. Frequent active write-back operations, however, cause access conflict of requests – a culprit that expands write amplification (WA) and degrades SSD lifetime. We address the above issues by proposing a \u0000<italic>work<b>L</b></i>\u0000oad intensity-aware and \u0000<bold><i>A</i></b>\u0000ctive parallel \u0000<bold><i>Caching</i></b>\u0000 scheme - LAC - that is powered by collaborative-load awareness. LAC fends off user I/Os’ access conflict under high-I/O-intensity workloads. If the I/O intensity is low – intervals between consecutive I/O requests are large – and the target die is free, LAC actively and concurrently writes dirty data of adjacent addresses back to the die, cultivating clean data generated by the active write-back. Replacing clean data in priority can reduce response time and prevent flash transactions from being blocked. We devise a data protection method to write back cold data based on various criteria in the cache replacement and active write-backs. Thus, LAC reduces WA incurred by actively writing back hot data and extends SSD lifetime. We compare LAC against the six caching schemes (LRU, CFLRU, GCaR-LRU, MQSim, VS-Batch, and Co-Active) in the modern MQSim simulator. The results unveil that LAC trims response time and erase count by up to 78.5% and 47.8%, with an average of 64.4% and 16.6%, respectively.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 7","pages":"1738-1752"},"PeriodicalIF":3.7,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140575529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Xvpfloat: RISC-V ISA Extension for Variable Extended Precision Floating Point Computation
IF 3.7 | CAS Tier 2 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-04-02 | DOI: 10.1109/TC.2024.3383964
Eric Guthmuller;César Fuguet;Andrea Bocco;Jérôme Fereyre;Riccardo Alidori;Ihsane Tahir;Yves Durand
A key concern in the field of scientific computation is the convergence of numerical solvers when applied to large problems. The numerical workarounds used to improve convergence are often problem-specific, time-consuming, and require skilled numerical analysts. An alternative is to simply increase the working precision of the computation, but this is difficult due to the lack of efficient hardware support for extended precision. We propose Xvpfloat, a RISC-V ISA extension for dynamically variable and extended precision computation, a hardware implementation, and a full software stack. Our architecture provides a comprehensive implementation of this ISA, with up to 512 bits of significand, including full support for common rounding modes and heterogeneous precision arithmetic operations. The memory subsystem handles IEEE 754 extendable formats and features specialized indexed loads and stores with hardware-assisted prefetching. This processor can operate either standalone or as an accelerator for a general-purpose host. We demonstrate that the number of solver iterations can be reduced by up to $5\times$ and that, for certain difficult problems, convergence is only possible with very high precision ($\geq 384$ bits). This accelerator provides a new approach to accelerating large-scale scientific computing.
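The core argument, that raising the working precision can substitute for problem-specific numerical workarounds, can be reproduced in software with Python's decimal module; Xvpfloat provides the same knob in hardware with up to 512 significand bits. The catastrophic-cancellation example below is ours, not the paper's.

```python
# Raising the working precision rescues a computation that cancellation
# destroys at (roughly) double-like precision.
from decimal import Decimal, getcontext

def cancel(prec):
    getcontext().prec = prec      # set the working precision in decimal digits
    x = Decimal(10) ** 30
    return (x + Decimal(1)) - x   # exact answer is 1

print(cancel(15))   # 0E+16 -> the +1 is lost at 15 significant digits
print(cancel(40))   # 1    -> survives once precision exceeds the magnitude gap
```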
{"title":"Xvpfloat: RISC-V ISA Extension for Variable Extended Precision Floating Point Computation","authors":"Eric Guthmuller;César Fuguet;Andrea Bocco;Jérôme Fereyre;Riccardo Alidori;Ihsane Tahir;Yves Durand","doi":"10.1109/TC.2024.3383964","DOIUrl":"10.1109/TC.2024.3383964","url":null,"abstract":"A key concern in the field of scientific computation is the convergence of numerical solvers when applied to large problems. The numerical workarounds used to improve convergence are often problem specific, time consuming and require skilled numerical analysts. An alternative is to simply increase the working precision of the computation, but this is difficult due to the lack of efficient hardware support for extended precision. We propose \u0000<i>Xvpfloat</i>\u0000, a RISC-V ISA extension for dynamically variable and extended precision computation, a hardware implementation and a full software stack. Our architecture provides a comprehensive implementation of this ISA, with up to 512 bits of significand, including full support for common rounding modes and heterogeneous precision arithmetic operations. The memory subsystem handles IEEE 754 extendable formats, and features specialized indexed loads and stores with hardware-assisted prefetching. This processor can either operate standalone or as an accelerator for a general purpose host. We demonstrate that the number of solver iterations can be reduced up to \u0000<inline-formula><tex-math>$5boldsymbol{times}$</tex-math></inline-formula>\u0000 and, for certain, difficult problems, convergence is only possible with very high precision (\u0000<inline-formula><tex-math>$boldsymbol{geq}$</tex-math></inline-formula>\u0000384 bits). This accelerator provides a new approach to accelerate large scale scientific computing.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 7","pages":"1683-1697"},"PeriodicalIF":3.7,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10488759","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140575538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
BSR-FL: An Efficient Byzantine-Robust Privacy-Preserving Federated Learning Framework
IF 3.6 | CAS Tier 2 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-03-22 | DOI: 10.1109/TC.2024.3404102
Honghong Zeng;Jie Li;Jiong Lou;Shijing Yuan;Chentao Wu;Wei Zhao;Sijin Wu;Zhiwen Wang
Federated learning (FL) is a technique that enables clients to collaboratively train a model by sharing local models instead of raw private data. However, existing reconstruction attacks can recover sensitive training samples from the shared models, and emerging poisoning attacks also pose severe threats to the security of FL. Most existing Byzantine-robust privacy-preserving federated learning solutions either reduce the accuracy of aggregated models or introduce significant computation and communication overheads. In this paper, we propose a novel Blockchain-based Secure and Robust Federated Learning (BSR-FL) framework to mitigate reconstruction attacks and poisoning attacks. BSR-FL avoids accuracy loss while ensuring efficient privacy protection and Byzantine robustness. Specifically, we first construct a lightweight non-interactive functional encryption (NIFE) scheme to protect the privacy of local models while maintaining high communication performance. Then, we propose a privacy-preserving defensive aggregation strategy based on NIFE, which can resist encrypted poisoning attacks without compromising model privacy through secure cosine similarity and incentive-based Byzantine-tolerant aggregation. Finally, we utilize the blockchain system to facilitate the federated learning process and the implementation of the protocols. Extensive theoretical analysis and experiments demonstrate that our new BSR-FL achieves enhanced privacy security, robustness, and high efficiency.
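The defensive aggregation can be illustrated in plaintext with Python/NumPy: each client update is scored by cosine similarity against a robust reference, and outliers are dropped before averaging. BSR-FL performs this check over encrypted models via its NIFE scheme and adds incentives; the median reference and the zero threshold below are our assumptions.

```python
# Plaintext sketch of cosine-similarity-based poisoning defense.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def robust_aggregate(updates, threshold=0.0):
    ref = np.median(updates, axis=0)   # crude Byzantine-robust reference point
    kept = [u for u in updates if cosine(u, ref) > threshold]
    return np.mean(kept, axis=0), len(kept)

honest = [np.array([1.0, 0.9, 1.1]) + np.random.normal(0, 0.05, 3)
          for _ in range(4)]
poisoned = [np.array([-5.0, -4.5, -5.5])]   # sign-flipped malicious update
agg, kept = robust_aggregate(honest + poisoned)
print(kept, agg.round(2))                   # the poisoned update is filtered out
```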
{"title":"BSR-FL: An Efficient Byzantine-Robust Privacy-Preserving Federated Learning Framework","authors":"Honghong Zeng;Jie Li;Jiong Lou;Shijing Yuan;Chentao Wu;Wei Zhao;Sijin Wu;Zhiwen Wang","doi":"10.1109/TC.2024.3404102","DOIUrl":"10.1109/TC.2024.3404102","url":null,"abstract":"Federated learning (FL) is a technique that enables clients to collaboratively train a model by sharing local models instead of raw private data. However, existing reconstruction attacks can recover the sensitive training samples from the shared models. Additionally, the emerging poisoning attacks also pose severe threats to the security of FL. However, most existing Byzantine-robust privacy-preserving federated learning solutions either reduce the accuracy of aggregated models or introduce significant computation and communication overheads. In this paper, we propose a novel \u0000<underline>B</u>\u0000lockchain-based \u0000<underline>S</u>\u0000ecure and \u0000<underline>R</u>\u0000obust \u0000<underline>F</u>\u0000ederated \u0000<underline>L</u>\u0000earning (BSR-FL) framework to mitigate reconstruction attacks and poisoning attacks. BSR-FL avoids accuracy loss while ensuring efficient privacy protection and Byzantine robustness. Specifically, we first construct a lightweight non-interactive functional encryption (NIFE) scheme to protect the privacy of local models while maintaining high communication performance. Then, we propose a privacy-preserving defensive aggregation strategy based on NIFE, which can resist encrypted poisoning attacks without compromising model privacy through secure cosine similarity and incentive-based Byzantine-tolerance aggregation. Finally, we utilize the blockchain system to assist in facilitating the processes of federated learning and the implementation of protocols. Extensive theoretical analysis and experiments demonstrate that our new BSR-FL has enhanced privacy security, robustness, and high efficiency.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 8","pages":"2096-2110"},"PeriodicalIF":3.6,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141147534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
BlockCompass: A Benchmarking Platform for Blockchain Performance
IF 3.6 | CAS Tier 2 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-03-22 | DOI: 10.1109/TC.2024.3404103
Mohammadreza Rasolroveicy;Wejdene Haouari;Marios Fokaefs
Blockchain technology has gained momentum due to its immutability and transparency. Several blockchain platforms, each with different consensus protocols, have been proposed. However, choosing and configuring such a platform is a non-trivial task. Numerous benchmarking tools have been introduced to test the performance of blockchain solutions. Yet, these tools are often limited to specific blockchain platforms or require complex configurations. Moreover, they tend to focus on one-off batch evaluation models, which may not be ideal for longer-running instances under continuous workloads. In this work, we present BlockCompass, an all-inclusive blockchain benchmarking tool that can be easily configured and extended. We demonstrate how BlockCompass can evaluate the performance of various blockchain platforms and configurations, including Ethereum Proof-of-Authority, Ethereum Proof-of-Work, Hyperledger Fabric Raft, Hyperledger Sawtooth with Proof-of-Elapsed-Time, Practical Byzantine Fault Tolerance, and Raft consensus algorithms, against workloads that continuously fluctuate over time. We show how continuous transactional workloads may be more appropriate than batch workloads in capturing certain stressful events for the system. Finally, we present the results of a usability study on the convenience and effectiveness of BlockCompass in blockchain benchmarking.
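The continuous-workload idea can be made concrete with a toy rate generator: instead of a single fixed batch, the driver emits a send rate that drifts and bursts over time, so a long-running chain sees stress events. The sine-plus-burst shape and all parameters are our illustration, not BlockCompass's actual generator.

```python
# Toy generator for a continuously fluctuating transaction-submission rate.
import math

def tx_rate(t_seconds, base=100, amplitude=60, burst_at=30, burst=400):
    rate = base + amplitude * math.sin(2 * math.pi * t_seconds / 60)
    if burst_at <= t_seconds < burst_at + 5:
        rate += burst                      # a 5-second burst models a stress event
    return max(0, int(rate))

for t in range(0, 60, 10):
    print(t, tx_rate(t), "tx/s")
```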
{"title":"BlockCompass: A Benchmarking Platform for Blockchain Performance","authors":"Mohammadreza Rasolroveicy;Wejdene Haouari;Marios Fokaefs","doi":"10.1109/TC.2024.3404103","DOIUrl":"10.1109/TC.2024.3404103","url":null,"abstract":"Blockchain technology has gained momentum due to its immutability and transparency. Several blockchain platforms, each with different consensus protocols, have been proposed. However, choosing and configuring such a platform is a non-trivial task. Numerous benchmarking tools have been introduced to test the performance of blockchain solutions. Yet, these tools are often limited to specific blockchain platforms or require complex configurations. Moreover, they tend to focus on one-off batch evaluation models, which may not be ideal for longer-running instances under continuous workloads. In this work, we present \u0000<italic>BlockCompass</i>\u0000, an all-inclusive blockchain benchmarking tool that can be easily configured and extended. We demonstrate how \u0000<italic>BlockCompass</i>\u0000 can evaluate the performance of various blockchain platforms and configurations, including Ethereum Proof-of-Authority, Ethereum Proof-of-Work, Hyperledger Fabric Raft, Hyperledger Sawtooth with Proof-of-Elapsed-Time, Practical Byzantine Fault Tolerance, and Raft consensus algorithms, against workloads that continuously fluctuate over time. We show how continuous transactional workloads may be more appropriate than batch workloads in capturing certain stressful events for the system. Finally, we present the results of a usability study about the convenience and effectiveness offered by \u0000<italic>BlockCompass</i>\u0000 in blockchain benchmarking.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 8","pages":"2111-2122"},"PeriodicalIF":3.6,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141147537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Machine Learning-Empowered Cache Management Scheme for High-Performance SSDs
IF 3.6 | CAS Tier 2 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-03-22 | DOI: 10.1109/TC.2024.3404064
Hui Sun;Chen Sun;Haoqiang Tong;Yinliang Yue;Xiao Qin
NAND Flash-based solid-state drives (SSDs) have gained widespread usage in data storage thanks to their exceptional performance and low power consumption. The computational capability of SSDs has been elevated to tackle complex algorithms. Inside an SSD, a DRAM cache for frequently accessed requests reduces response time and write amplification (WA), thereby improving SSD performance and lifetime. Existing caching schemes, based on temporal locality, overlook its variations, which potentially reduces cache hit rates. Some caching schemes bolster performance via flash-aware techniques but at the expense of the cache hit rate. To address these issues, we propose a random-forest machine learning Classifier-empowered Cache scheme named CCache, where I/O requests are classified into critical, intermediate, and non-critical ones according to their access status. After designing a machine learning model to predict these three types of requests, we implement a trie-level linked list to manage cache placement and replacement. CCache safeguards critical requests for cache service to the greatest extent, while granting the highest priority to evicting data accessed by non-critical requests. CCache – which considers chip state when processing non-critical requests – is implemented in an SSD simulator (SSDSim). CCache outperforms alternative caching schemes, including LRU, CFLRU, LCR, NCache, ML_WP, and CCache_ANN, in terms of response time, WA, erase count, and hit ratio, and the performance discrepancy between CCache and the OPT scheme is marginal. For example, CCache reduces the response time of the competitors by up to 41.9%, with an average of 16.1%. CCache slashes erase counts by a maximum of 67.4%, with an average of 21.3%. The performance gap between CCache and OPT is merely 2.0%-3.0%.
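A minimal sketch of the classification step using scikit-learn's random forest: per-request access features map to the three classes named in the abstract. The two features (recent access count, reuse distance) and the toy training labels are invented; the paper derives classes from real access status.

```python
# Sketch: a random forest classifies requests as critical / intermediate /
# non-critical from (assumed) access features.
from sklearn.ensemble import RandomForestClassifier

# features: [accesses in last window, reuse distance]
X = [[9, 2], [8, 3], [7, 1],        # hot, short reuse distance -> critical
     [4, 40], [3, 55], [5, 35],     # warm -> intermediate
     [1, 900], [0, 750], [1, 820]]  # cold -> non-critical
y = ["critical"] * 3 + ["intermediate"] * 3 + ["non-critical"] * 3

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.predict([[8, 2], [2, 600]]))  # -> ['critical' 'non-critical']
```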
{"title":"A Machine Learning-Empowered Cache Management Scheme for High-Performance SSDs","authors":"Hui Sun;Chen Sun;Haoqiang Tong;Yinliang Yue;Xiao Qin","doi":"10.1109/TC.2024.3404064","DOIUrl":"10.1109/TC.2024.3404064","url":null,"abstract":"NAND Flash-based solid-state drives (SSDs) have gained widespread usage in data storage thanks to their exceptional performance and low power consumption. The computational capability of SSDs has been elevated to tackle complex algorithms. Inside an SSD, a DRAM cache for frequently accessed requests reduces response time and write amplification (WA), thereby improving SSD performance and lifetime. Existing caching schemes, based on temporal locality, overlook its variations, which potentially reduces cache hit rates. Some caching schemes bolster performance via flash-aware techniques but at the expense of the cache hit rate. To address these issues, we propose a random forest machine learning \u0000<bold>C</b>\u0000lassifier-empowered \u0000<bold>C</b>\u0000ache scheme named CCache, where I/O requests are classified into critical, intermediate, and non-critical ones according to their access status. After designing a machine learning model to predict these three types of requests, we implement a trie-level linked list to manage the cache placement and replacement. CCache safeguards critical requests for cache service to the greatest extent, while granting the highest priority to evicting request accessed by non-critical requests. CCache – considering chip state when processing non-critical requests – is implemented in an SSD simulator (SSDSim). CCache outperforms the alternative caching schemes, including LRU, CFLRU, LCR, NCache, ML_WP, and CCache_ANN, in terms of response time, WA, erase count, and hit ratio. The performance discrepancy between CCache and the OPT scheme is marginal. For example, CCache reduces the response time of the competitors by up to 41.9% with an average of 16.1%. CCache slashes erase counts by a maximum of 67.4%, with an average of 21.3%. The performance gap between CCache and and OPT is merely 2.0%-3.0%.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 8","pages":"2066-2080"},"PeriodicalIF":3.6,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141147533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DPU-Direct: Unleashing Remote Accelerators via Enhanced RDMA for Disaggregated Datacenters
IF 3.6 | CAS Tier 2 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-03-22 | DOI: 10.1109/TC.2024.3404089
Yunkun Liao;Jingya Wu;Wenyan Lu;Xiaowei Li;Guihai Yan
This paper presents DPU-Direct, an accelerator disaggregation system that connects accelerator nodes (ANs) and CPU nodes (CNs) over a standard Remote Direct Memory Access (RDMA) network. DPU-Direct eliminates the latency introduced by the CPU-based network stack and by the PCIe interconnect between network I/O and the accelerator. The DPU-Direct system architecture includes a DPU Wrapper hardware architecture, an RDMA-based Accelerator Access Pattern (RAAP), and a CN-side programming model. The DPU Wrapper connects accelerators directly with the RDMA engine, turning ANs into disaggregation-native devices. The RAAP provides the CN with low-latency and high-throughput accelerator semantics based on standard RDMA operations. Our FPGA prototype demonstrates DPU-Direct's efficacy with two proof-of-concept applications, AES encryption and key-value caching, which are computationally intensive and latency-sensitive. DPU-Direct yields a 400x speedup in AES encryption over the CPU baseline and matches the performance of a locally integrated AES accelerator. For the key-value cache, DPU-Direct reduces average end-to-end latency by 1.66x for GETs and 1.30x for SETs over the CPU-RDMA-Polling baseline, reducing latency jitter by over 10x for both operations.
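A pure-Python simulation of the request/response flow that the RAAP provides: the CN deposits a request in the AN's queue (standing in for a one-sided RDMA write plus doorbell) and polls a completion region (standing in for reading remote memory). The queue layout, field names, and toy "acceleration" are invented for illustration.

```python
# Toy model of a CPU node driving a remote accelerator without a host CPU
# network stack in the data path.

class AcceleratorNode:
    def __init__(self):
        self.request_q, self.completion = [], {}

    def service(self):                       # DPU-wrapper stand-in feeding the accelerator
        while self.request_q:
            req_id, payload = self.request_q.pop(0)
            self.completion[req_id] = payload[::-1]   # stand-in "acceleration"

class CPUNode:
    def __init__(self, an):
        self.an, self.next_id = an, 0

    def submit(self, payload):               # ~ one-sided RDMA WRITE + doorbell
        self.next_id += 1
        self.an.request_q.append((self.next_id, payload))
        return self.next_id

    def poll(self, req_id):                  # ~ polling a remote completion region
        while req_id not in self.an.completion:
            self.an.service()
        return self.an.completion[req_id]

cn = CPUNode(AcceleratorNode())
rid = cn.submit(b"plaintext")
print(cn.poll(rid))                          # accelerator result returned to the CN
```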
{"title":"DPU-Direct: Unleashing Remote Accelerators via Enhanced RDMA for Disaggregated Datacenters","authors":"Yunkun Liao;Jingya Wu;Wenyan Lu;Xiaowei Li;Guihai Yan","doi":"10.1109/TC.2024.3404089","DOIUrl":"10.1109/TC.2024.3404089","url":null,"abstract":"This paper presents DPU-Direct, an accelerator disaggregation system that connects accelerator nodes (ANs) and CPU nodes (CNs) over a standard Remote Direct Memory Access (RDMA) network. DPU-Direct eliminates the latency introduced by the CPU-based network stack, and PCIe interconnects between network I/O and the accelerator. The DPU-Direct system architecture includes a DPU Wrapper hardware architecture, an RDMA-based Accelerator Access Pattern (RAAP), and a CN-side programming model. The DPU Wrapper connects accelerators directly with the RDMA engine, turning ANs into disaggregation-native devices. The RAAP provides the CN with low-latency and high throughput accelerator semantics based on standard RDMA operations. Our FPGA prototype demonstrates DPU-Direct's efficacy with two proof-of-concept applications: AES encryption and key-value cache, which are computationally intensive and latency-sensitive. DPU-Direct yields a 400x speedup in AES encryption over the CPU baseline and matches the performance of the locally integrated AES accelerator. For key-value cache, DPU-Direct reduces the average end-to-end latency by 1.66x for GETs and 1.30x for SETs over the CPU-RDMA-Polling baseline, reducing latency jitter by over 10x for both operations.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 8","pages":"2081-2095"},"PeriodicalIF":3.6,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141147535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LMChain: An Efficient Load-Migratable Beacon-Based Sharding Blockchain System
IF 3.6 | CAS Tier 2 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-03-22 | DOI: 10.1109/TC.2024.3404057
Dengcheng Hu;Jianrong Wang;Xiulong Liu;Qi Li;Keqiu Li
Sharding is an important technology that exploits group parallelism to enhance the scalability and performance of blockchains. However, existing solutions reallocate shards based on historical transactions, which cannot handle temporary overload and incurs additional overhead during the reallocation process. To this end, this paper proposes LMChain, an efficient load-migratable beacon-based sharding blockchain system. The primary goal of LMChain is to eliminate reliance on historical transactions while achieving high performance. Specifically, we first redesign the state-maintenance data structure in the Beacon Shard to effectively manage all account states at the shard level. We then propose a load-migratable transaction processing protocol built upon the new data structure. To mitigate read-write conflicts during the selection of migration transactions, we adopt a novel graph partitioning scheme, and we use a relay-based method to handle cross-shard transactions and resolve inter-shard state read-write conflicts. We implement the LMChain prototype and conduct experiments in a real network environment comprising 17 cloud servers. Experimental results show that, compared with state-of-the-art solutions, LMChain effectively reduces the average transaction waiting latency of overloaded transactions by 30% to 48% in different cases within 16 transaction shards, while improving throughput by 3% to 10%.
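The conflict-avoidance idea behind migration selection can be sketched with a union-find grouping: transactions that touch a common account fall into one connected component and migrate as a unit, so no read-write conflict spans the move. A real partitioner would also balance shard load; this greedy grouping is purely illustrative.

```python
# Group transactions by shared accounts (connected components of the
# transaction-account conflict graph) via union-find.
from collections import defaultdict

txs = {"t1": {"A", "B"}, "t2": {"B"}, "t3": {"C"}, "t4": {"C", "D"}, "t5": {"E"}}

def conflict_groups(txs):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for accounts in txs.values():
        first = next(iter(accounts))
        for acc in accounts:
            union(acc, first)

    groups = defaultdict(list)
    for name, accounts in txs.items():
        groups[find(next(iter(accounts)))].append(name)
    return list(groups.values())

print(conflict_groups(txs))  # -> [['t1', 't2'], ['t3', 't4'], ['t5']]
```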
{"title":"LMChain: An Efficient Load-Migratable Beacon-Based Sharding Blockchain System","authors":"Dengcheng Hu;Jianrong Wang;Xiulong Liu;Qi Li;Keqiu Li","doi":"10.1109/TC.2024.3404057","DOIUrl":"10.1109/TC.2024.3404057","url":null,"abstract":"Sharding is an important technology that utilizes group parallelism to enhance the scalability and performance of blockchain. However, the existing solutions use a historical transaction-based approach to reallocate shards, which cannot handle temporary overload and incurs additional overhead during the reallocation process. To this end, this paper proposes LMChain, an efficient load-migratable beacon-based sharding blockchain system. The primary goal of LMChain is to eliminate reliance on historical transactions and achieve the high performance. Specifically, we redesign the state maintenance data structure in Beacon Shard to effectively manage all account states at the shard level. Then, we innovatively propose a load-migratable transaction processing protocol built upon the new data structure. To mitigate read-write conflicts during the selection of migration transactions, we adopt a novel graph partitioning scheme. We also adopt a relay-based method to handle cross-shard transactions and resolve inter-shard state read-write conflicts. We implement the LMChain prototype and conduct experiments in a real network environment comprising 17 cloud servers. Experimental results show that, compared with state-of-the-art solutions, LMChain effectively reduces the average transaction waiting latency of overloaded transactions by 30% to 48% in different cases within 16 transaction shards, while improving throughput by 3% to 10%.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 9","pages":"2178-2191"},"PeriodicalIF":3.6,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141147547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0