
Journal of Systems Architecture: Latest Publications

Practicable live container migrations in high performance computing clouds: Diskless, iterative, and connection-persistent
IF 4.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2024-05-09 · DOI: 10.1016/j.sysarc.2024.103157
Jordi Guitart

Checkpoint/Restore techniques have long been used by the High Performance Computing (HPC) community for failure recovery. Given the current trend in HPC to use containerization to obtain fast, customized, portable, flexible, and reproducible deployments of workloads, as well as efficient and reliable sharing and management of HPC Cloud infrastructures, there is a need to integrate Checkpoint/Restore with containerization in such a way that the freeze time of the application is minimal and live migrations are practicable. Although current Checkpoint/Restore tools (such as CRIU) support several options to accomplish this, most of them are rarely exploited in HPC Clouds and, consequently, their potential impact on performance is barely known. Therefore, this paper explores the use of CRIU’s advanced features to implement diskless, iterative (pre-copy and post-copy) migrations of containers with external network namespaces and established TCP connections, so that memory-intensive and connection-persistent HPC applications can live-migrate. Our extensive experiments characterizing the performance impact of those features demonstrate that properly configured live migrations incur low application downtime and memory/disk usage and are indeed feasible in containerized HPC Clouds.
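
A minimal sketch of the pre-copy flow described above, driving CRIU from Python on the source host: a few `criu pre-dump` passes track and ship memory while the process keeps running, and a final `criu dump` with `--tcp-established` preserves established TCP connections. This is an illustration only, not the paper's tooling: real container migrations (e.g., through runc or Podman) additionally need external network-namespace handling, a page server for diskless transfer, and `--lazy-pages` for post-copy, and the exact flag set depends on the CRIU version.

```python
import os
import subprocess
from pathlib import Path

def criu(*args):
    """Run a CRIU subcommand and fail loudly on error."""
    subprocess.run(["criu", *args], check=True)

def precopy_dump(pid: int, workdir: str, iterations: int = 2) -> Path:
    """Iterative pre-copy dump of a process tree with CRIU.

    Each pre-dump writes the pages dirtied since the previous iteration while
    the task keeps running; the final dump freezes it only briefly.
    """
    base = Path(workdir)
    prev = None
    for i in range(iterations):
        d = base / f"pre{i}"
        d.mkdir(parents=True, exist_ok=True)
        args = ["pre-dump", "-t", str(pid), "-D", str(d), "--track-mem"]
        if prev is not None:
            # --prev-images-dir is given relative to the current images dir.
            args += ["--prev-images-dir", os.path.relpath(prev, d)]
        criu(*args)
        prev = d
    final = base / "final"
    final.mkdir(parents=True, exist_ok=True)
    criu("dump", "-t", str(pid), "-D", str(final),
         "--track-mem", "--prev-images-dir", os.path.relpath(prev, final),
         "--tcp-established")        # keep established TCP connections alive
    return final

# On the destination host, after transferring the image directories:
#   criu restore -D <final-dir> --tcp-established
```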

Citations: 0
Energy-efficient scheduling for parallel applications with reliability and time constraints on heterogeneous distributed systems
IF 4.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2024-05-06 · DOI: 10.1016/j.sysarc.2024.103173
Hongzhi Xu, Binlian Zhang, Chen Pan, Keqin Li

Reliability is a crucial metric of a system, and many safety-critical applications have reliability requirements and deadline constraints. In addition, in order to protect the environment and reduce system operating costs, it is necessary to minimize energy consumption as much as possible. This paper considers parallel applications on heterogeneous distributed systems and proposes two algorithms that minimize energy consumption while meeting the deadline and satisfying the reliability requirement of the applications. The first algorithm is called minimizing scheduling length while satisfying the reliability requirement (MSLSRR). It first transforms the reliability requirement of the application into reliability requirements for the individual tasks and then assigns each task to the processor with the earliest finish time. Since the reliability achieved by MSLSRR is often higher than the reliability requirement of the application, and the scheduling length is also less than the deadline, an algorithm called improving energy efficiency (IEE) is designed, which redefines the minimum reliability requirement for each task and applies the dynamic voltage and frequency scaling (DVFS) technique for energy conservation. The proposed algorithms are compared with existing algorithms using real parallel applications. Experimental results demonstrate that the proposed algorithms consume the least energy.
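
The selection rule of the first algorithm, as described, is a list-scheduling pass that maps each task to the processor yielding the earliest finish time among those that still satisfy the task's reliability target. The sketch below illustrates that rule under simplifying assumptions (an exponential fault model R = exp(-λc), precomputed per-task reliability targets, and no communication costs); it is not the authors' MSLSRR/IEE implementation.

```python
import math

def schedule_by_eft_with_reliability(tasks, preds, procs, exec_time, rel_target):
    """Greedy earliest-finish-time mapping with per-task reliability targets.

    tasks      : task ids in topological (precedence-respecting) order
    preds      : task -> list of predecessor tasks
    procs      : proc_id -> {"lam": fault rate, "ready": earliest free time}
    exec_time  : (task, proc) -> execution time
    rel_target : task -> minimum acceptable reliability for this task
    Communication costs are ignored to keep the sketch short.
    """
    mapping, finish = {}, {}
    for t in tasks:
        data_ready = max((finish[q] for q in preds.get(t, [])), default=0.0)
        best = None
        for p, info in procs.items():
            c = exec_time(t, p)
            rel = math.exp(-info["lam"] * c)   # exponential fault model
            if rel < rel_target(t):
                continue                        # violates the reliability target
            f = max(info["ready"], data_ready) + c
            if best is None or f < best[0]:
                best = (f, p)
        if best is None:
            raise RuntimeError(f"no processor satisfies the reliability target of task {t}")
        f, p = best
        mapping[t], finish[t] = p, f
        procs[p]["ready"] = f
    return mapping, finish
```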

Citations: 0
Flash controller-based secure execution environment for protecting code confidentiality
IF 4.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2024-05-06 · DOI: 10.1016/j.sysarc.2024.103172
Zheng Zhang, Jingfeng Xue, Tian Chen, Yuhang Zhao, Weizhi Meng

With the rapid evolution of the Internet of Things (IoT), billions of IoT devices have connected to the Internet, collecting information via tags and sensors. For an IoT device, the application code itself and the data collected by sensors can be of great commercial value. Protecting them is challenging because IoT devices are prone to compromise due to the inevitable vulnerabilities of commodity operating systems. A Trusted Execution Environment (TEE) is one of the solutions that protects sensitive data by running security-sensitive workloads in a secure world. However, this solution does not work for most IoT devices, which are limited in resources.

In this paper, we propose Flash Controller-based Secure Execution Environment (FCSEE), an approach to protect security-sensitive code and data for IoT devices using the flash controller. Our approach constructs a secure execution environment on the target flash memory by modifying the execution logic of its controller, leveraging it as a co-processor to execute security-sensitive workloads of the host device. By extending the original functionality of the flash firmware, FCSEE also provides several much-needed security primitives to protect sensitive data. We constructed a prototype based on a Trans-Flash (TF) card and implemented a proof of its confidentiality. Our evaluation results indicate that FCSEE can confidentially execute security-sensitive workloads from the host and efficiently protect its sensitive data.
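
To make the host/co-processor split concrete, here is a deliberately simplified and entirely hypothetical host-side view of the workflow described above: the security-sensitive part of the application is packaged as an opaque (already encrypted) request, handed to the storage device, and only a result comes back, so the untrusted host OS never sees plaintext code or data. The class name, mailbox path, and file-based transport below are illustrative placeholders, not FCSEE's actual interface.

```python
import json
import os

class SecureFlashChannel:
    """Hypothetical mailbox to a flash controller acting as a co-processor.

    A real design would talk to the device through vendor-specific commands
    or a reserved LBA range; a plain file stands in for that transport here
    so that only the control flow is visible.
    """

    def __init__(self, mailbox_path="/tmp/fcsee_mailbox"):
        self.mailbox_path = mailbox_path

    def submit(self, workload_id: str, sealed_payload: bytes) -> None:
        # The payload is assumed to be encrypted for the device beforehand,
        # so the (untrusted) host OS only ever handles ciphertext.
        with open(self.mailbox_path, "wb") as f:
            f.write(json.dumps({"id": workload_id}).encode() + b"\n")
            f.write(sealed_payload)

    def poll_result(self) -> bytes:
        # In reality the device would signal completion; here we just check
        # whether a result blob has appeared.
        result_path = self.mailbox_path + ".result"
        if not os.path.exists(result_path):
            return b""
        with open(result_path, "rb") as f:
            return f.read()
```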

Citations: 0
Timing-accurate scheduling and allocation for parallel I/O operations in real-time systems
IF 4.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2024-05-06 · DOI: 10.1016/j.sysarc.2024.103158
Yuanhai Zhang, Shuai Zhao, Gang Chen, Haoyu Luo, Kai Huang

In industrial real-time systems, the I/O operations are often required to be both timing predictable, i.e., finish before the deadline to ensure safety, and timing accurate, i.e., start at or close to an ideal time instant for optimal I/O performance. However, for I/O-extensive systems, such strict timing requirements raise significant challenges for the scheduling of I/O operations, where execution conflicts widely exist if the I/O operations are scheduled at their ideal time instants. Existing methods mainly focus on one I/O device and apply simple heuristics to schedule I/O operations, which cannot effectively resolve execution conflicts, hence, undermining both timing predictability and accuracy. This paper proposes novel scheduling and allocation methods to maximize the timing accuracy while guaranteeing the predictability of the system. First, on one I/O device, a fine-grained schedule using Mixed Integer Linear Programming (MILP) is constructed that optimizes the timing accuracy of the I/O operations. Then, for systems containing multiple I/O devices of the same type, two novel allocations are proposed to realize parallel timing-accurate I/O control. The first utilizes MILP to further improve the timing accuracy of the system, whereas the second is a heuristic that provides competitive results with low overheads. Experimental results show the proposed methods outperform the state-of-the-art in terms of both timing predictability and accuracy by 37% and 25% on average (up to 5.56x and 33%), respectively.
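
The single-device case lends itself to a compact MILP: minimize the total deviation of each operation's start time from its ideal instant, subject to deadlines and pairwise non-overlap on the device. The sketch below, written with the PuLP modeling library, is a generic version of such a formulation under simplifying assumptions (fixed service times, a single device, big-M disjunctions); it is not the paper's exact model.

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary, value

def timing_accurate_schedule(ops, horizon=10_000):
    """ops: list of dicts with keys 'ideal', 'service', 'deadline'."""
    n = len(ops)
    prob = LpProblem("io_timing_accuracy", LpMinimize)
    start = [LpVariable(f"s{i}", lowBound=0) for i in range(n)]
    dev = [LpVariable(f"d{i}", lowBound=0) for i in range(n)]

    for i, op in enumerate(ops):
        # Linearized absolute deviation from the ideal start instant.
        prob += dev[i] >= start[i] - op["ideal"]
        prob += dev[i] >= op["ideal"] - start[i]
        # Timing predictability: finish before the deadline.
        prob += start[i] + op["service"] <= op["deadline"]

    # Pairwise non-overlap on the single I/O device (big-M disjunction).
    for i in range(n):
        for j in range(i + 1, n):
            order = LpVariable(f"y{i}_{j}", cat=LpBinary)
            prob += start[i] + ops[i]["service"] <= start[j] + horizon * (1 - order)
            prob += start[j] + ops[j]["service"] <= start[i] + horizon * order

    prob += lpSum(dev)          # maximize timing accuracy = minimize total deviation
    prob.solve()
    return [value(s) for s in start]

# Example: three operations whose ideal instants conflict on one device.
ops = [{"ideal": 0, "service": 4, "deadline": 20},
       {"ideal": 2, "service": 4, "deadline": 20},
       {"ideal": 3, "service": 4, "deadline": 30}]
print(timing_accurate_schedule(ops))
```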

Citations: 0
BGS: Accelerate GNN training on multiple GPUs
IF 4.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2024-05-04 · DOI: 10.1016/j.sysarc.2024.103162
Yujuan Tan, Zhuoxin Bai, Duo Liu, Zhaoyang Zeng, Yan Gan, Ao Ren, Xianzhang Chen, Kan Zhong

Emerging Graph Neural Networks (GNNs) have made significant progress in processing graph-structured data, yet existing GNN frameworks face scalability issues when training large-scale graph data using multiple GPUs. Frequent feature data transfers between CPUs and GPUs are a major bottleneck, and current caching schemes have not fully considered the characteristics of multi-GPU environments, leading to inefficient feature extraction. To address these challenges, we propose BGS, an auxiliary framework designed to accelerate GNN training from a data perspective in multi-GPU environments. Firstly, we introduce a novel training set partition algorithm, assigning independent training subsets to each GPU to enhance the spatial locality of node access, thus optimizing the efficiency of the feature caching strategy. Secondly, considering that GPUs can communicate at high speeds via NVLink connections, we designed a feature caching placement strategy suitable for multi-GPU environments. This strategy aims to improve the overall hit rate by setting reasonable redundant caches on each GPU. Evaluations on two representative GNN models, GCN and GraphSAGE, show that BGS significantly improves the hit rate of feature caching strategies in multi-GPU environments and substantially reduces the time overhead of data loading, achieving a performance improvement of 1.5 to 6.2 times compared to the baseline.
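
The caching strategy can be illustrated with a small host-side planning sketch: training nodes are partitioned across GPUs, each GPU fills most of its feature cache with the nodes its own partition's mini-batches touch most often, and a smaller redundant slice holds globally hot nodes so that peer lookups over NVLink are likely to hit. This is a generic illustration using access counts as the hotness signal, with an assumed 20% redundancy share; it is not the BGS implementation.

```python
from collections import Counter

def plan_feature_caches(train_parts, sampled_batches, cache_slots, redundant_frac=0.2):
    """Decide which node features each GPU should cache.

    train_parts     : gpu_id -> set of training nodes assigned to that GPU
    sampled_batches : gpu_id -> iterable of node-id lists seen while sampling
                      mini-batch subgraphs for that GPU's training subset
    cache_slots     : per-GPU feature-cache capacity (number of nodes)
    redundant_frac  : share of each cache reserved for globally hot nodes
    """
    local_hot = {g: Counter() for g in train_parts}
    global_hot = Counter()
    for g, batches in sampled_batches.items():
        for batch in batches:
            local_hot[g].update(batch)
            global_hot.update(batch)

    shared_slots = int(cache_slots * redundant_frac)
    shared = [n for n, _ in global_hot.most_common(shared_slots)]
    shared_set = set(shared)

    plan = {}
    for g in train_parts:
        private_slots = cache_slots - shared_slots
        private = [n for n, _ in local_hot[g].most_common()
                   if n not in shared_set][:private_slots]
        # Redundant copies of globally hot nodes raise the hit rate of peer
        # lookups over NVLink; the private entries serve this GPU's batches.
        plan[g] = {"private": private, "shared": shared}
    return plan
```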

Citations: 0
Dynamic zone redistribution for key-value stores on zoned namespaces SSDs
IF 4.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2024-05-03 · DOI: 10.1016/j.sysarc.2024.103159
Doeun Kim, Jinyoung Kim, Kihan Choi, Hyuck Han, Minsoo Ryu, Sooyong Kang

Recently, the zoned namespaces (ZNS) interface has been introduced as a new interface for solid-state drives (SSD), and commercial ZNS SSDs are starting to be used for LSM-tree-based KV-stores, including RocksDB, whose log-structured write characteristics well align with the intra-zone sequential write constraint of the ZNS SSDs. The host software for ZNS SSDs, including ZenFS for RocksDB, considers the lifetime of data when allocating zones to expedite zone reclamation. It also uses a lock-based synchronization mechanism to prevent concurrent writes to a zone, together with a contention avoidance policy that avoids allocating ‘locked’ zones to increase write throughput. However, this policy seriously damages the lifetime-based zone allocation strategy, leading to increased write amplification in KV-stores that support parallel compaction. In this paper, we delve into the underlying causes of this phenomenon and propose a novel zone management scheme, Dynamic Zone Redistribution (DZR), that can be effectively used for such KV-stores. DZR enables both high throughput and low write amplification by effectively addressing the root cause. Experimental results using micro- and macro-benchmarks show that DZR significantly reduces write amplification compared with ZenFS while preserving (or even increasing) write throughput.
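
To make the conflict described above concrete, the sketch below models the baseline (ZenFS-like) allocation behavior the paper analyzes, not DZR itself: zones are chosen by lifetime affinity, but zones whose write lock is held are skipped, which under parallel compaction pushes data into lifetime-mismatched zones and later inflates write amplification. The data layout and matching rule are simplifications.

```python
import threading

class Zone:
    def __init__(self, zone_id, capacity, lifetime=None):
        self.zone_id = zone_id
        self.capacity = capacity
        self.used = 0
        self.lifetime = lifetime        # lifetime hint of data already placed here
        self.lock = threading.Lock()    # one writer per zone at a time

def pick_zone(zones, lifetime_hint, size):
    """Baseline allocation: prefer the zone whose lifetime best matches the
    incoming data, but skip zones whose write lock is currently held.

    Skipping locked zones is the contention-avoidance behavior described in
    the abstract: with several compactions running in parallel it frequently
    forces data into lifetime-mismatched zones, and mixing lifetimes in one
    zone is what later drives up write amplification.
    """
    best, best_gap = None, None
    for z in zones:
        if z.capacity - z.used < size:
            continue
        if z.lock.locked():
            continue                    # contention avoidance: skip busy zones
        gap = 0 if z.lifetime is None else abs(z.lifetime - lifetime_hint)
        if best is None or gap < best_gap:
            best, best_gap = z, gap
    return best                         # caller acquires best.lock before writing
```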

Citations: 0
CEIU: Consistent and Efficient Incremental Update mechanism for mobile systems on flash storage
IF 4.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2024-05-03 · DOI: 10.1016/j.sysarc.2024.103151
Ruiqing Lei, Xianzhang Chen, Duo Liu, Chunlin Song, Yujuan Tan, Ao Ren

The ever-growing size and frequent updating of mobile applications cause high network and storage costs for updates. Hence, emerging mobile systems often employ incremental update algorithms, typically HDiffPatch, to upgrade mobile applications. However, we find that existing incremental update algorithms not only generate a significant amount of redundant data accesses, but also lack consistency guarantees for the whole application package. In this paper, we present a novel Consistent and Efficient Incremental Update (CEIU) mechanism for upgrading mobile applications. Firstly, CEIU reduces the memory consumption and file accesses of incremental updates by reusing the indexes of blocks in the old image rather than copying the blocks. Secondly, CEIU employs a two-level journaling mechanism to ensure the consistency of the whole package and the subfiles of the new image. We implement the proposed mechanism in the Linux kernel based on the TI-LFAT file system and evaluate it with real-world applications. The experimental results show that the proposed mechanism can reduce memory footprints by 30%–80% in comparison with HDiffPatch, the state-of-the-art incremental update algorithm. It also significantly reduces the recovery time when a power failure or system crash occurs.
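
The first idea, reusing blocks of the old image by index rather than copying them, can be shown in a few lines: the update recipe describes each block of the new image either as a reference into the old image or as literal new bytes, so only genuinely new data needs to be transferred and written. This is a simplified illustration of that block-reuse step under an assumed record format; it is not the CEIU/TI-LFAT implementation and omits the two-level journaling.

```python
BLOCK = 4096

def build_recipe(old_path, new_path):
    """Describe the new image as references to old blocks plus literal data."""
    index = {}
    with open(old_path, "rb") as old:
        i = 0
        while (blk := old.read(BLOCK)):
            index.setdefault(blk, i)     # remember one block number per content
            i += 1
    recipe = []
    with open(new_path, "rb") as new:
        while (blk := new.read(BLOCK)):
            if blk in index:
                recipe.append(("ref", index[blk]))   # reuse old block by index
            else:
                recipe.append(("lit", blk))          # genuinely new data
    return recipe

def recipe_stats(recipe):
    """How much of the new image can be served by pointing at old blocks.

    In a CEIU-style update the 'ref' entries need no data copy at all: the
    new file's block index can simply point at the existing flash blocks,
    which is what saves memory and file accesses.
    """
    refs = sum(1 for kind, _ in recipe if kind == "ref")
    return {"reused_blocks": refs,
            "new_blocks": len(recipe) - refs,
            "bytes_to_write": (len(recipe) - refs) * BLOCK}
```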

Citations: 0
BFL-SA: Blockchain-based federated learning via enhanced secure aggregation
IF 4.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2024-05-03 · DOI: 10.1016/j.sysarc.2024.103163
Yizhong Liu, Zixiao Jia, Zixu Jiang, Xun Lin, Jianwei Liu, Qianhong Wu, Willy Susilo

Federated learning, involving a central server and multiple clients, aims to keep data local but raises privacy concerns such as data exposure and participation privacy. Secure aggregation, especially with pairwise masking, preserves privacy without accuracy loss. Yet issues persist, such as security against malicious models, central-server fault tolerance, and trust in decryption keys. Resolving these challenges is vital for advancing secure federated learning systems. In this paper, we present BFL-SA, a blockchain-based federated learning scheme via enhanced secure aggregation, which addresses these key challenges by integrating blockchain consensus, publicly verifiable secret sharing, and an overdue gradients aggregation module. These enhancements significantly boost security and fault tolerance while improving the efficiency of data utilization in the secure aggregation process. Our security analysis demonstrates that BFL-SA achieves secure aggregation even under malicious models. Experimental comparisons show that BFL-SA exhibits rapid secure aggregation and achieves 100% model aggregation accuracy.
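
Pairwise masking, the secure-aggregation building block mentioned above, can be seen in a toy example: each pair of clients derives a shared mask, one adds it and the other subtracts it, so the server learns only the sum once all masked updates arrive. The sketch below demonstrates just that cancellation property on quantized updates; the hash-derived seeds stand in for proper key agreement, and it omits everything that makes BFL-SA a full scheme (publicly verifiable secret sharing for dropouts, blockchain consensus, and the overdue-gradients module).

```python
import hashlib
import numpy as np

Q = 2 ** 32            # arithmetic is done modulo Q so masks cancel exactly

def pair_mask(seed: bytes, length: int) -> np.ndarray:
    """Deterministic mask stream derived from a pairwise shared seed."""
    rng = np.random.default_rng(int.from_bytes(hashlib.sha256(seed).digest()[:8], "big"))
    return rng.integers(0, Q, size=length, dtype=np.uint64)

def mask_update(client_id, update, clients, shared_seed):
    """update: 1-D array of quantized (non-negative integer) model deltas."""
    masked = update.astype(np.uint64) % Q
    for peer in clients:
        if peer == client_id:
            continue
        m = pair_mask(shared_seed(client_id, peer), len(update))
        # The lower-indexed client adds the pairwise mask, the higher one
        # subtracts it, so every mask cancels in the server-side sum.
        masked = (masked + m) % Q if client_id < peer else (masked - m) % Q
    return masked

# Toy run with three clients: the server only ever sees masked vectors.
seed = lambda a, b: f"{min(a, b)}-{max(a, b)}".encode()   # stand-in for key agreement
updates = {c: np.array([c + 1, 10 * (c + 1), 7], dtype=np.uint64) for c in range(3)}
masked = [mask_update(c, u, range(3), seed) for c, u in updates.items()]
aggregate = sum(masked) % Q
assert np.array_equal(aggregate, sum(updates.values()) % Q)
print(aggregate)                                          # [ 6 60 21]
```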

Citations: 0
Day–Night architecture: Development of an ultra-low power RISC-V processor for wearable anomaly detection
IF 4.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2024-05-03 · DOI: 10.1016/j.sysarc.2024.103161
Eunjin Choi, Jina Park, Kyeongwon Lee, Jae-Jin Lee, Kyuseung Han, Woojoo Lee

In healthcare, anomaly detection has emerged as a central application. This study presents an ultra-low power processor tailored for wearable devices dedicated to anomaly detection. The processor introduces a unique Day–Night architecture that bifurcates it into two distinct segments, the Day segment and the Night segment, both of which function autonomously. The Day segment, catering to generic wearable applications, is designed to remain largely inactive, awakening only for specific tasks. This leads to considerable power savings, since the Day segment incorporates the Main-CPU and the system interconnect, both major power consumers. Conversely, the Night segment is dedicated to real-time anomaly detection using sensor data analytics. It comprises a Sub-CPU and a minimal set of IPs, operating continuously but with minimized power consumption. To further enhance this architecture, the paper presents an ultra-lightweight RISC-V core, the All-Night core, specialized for anomaly detection applications, replacing the traditional Sub-CPU. To validate the Day–Night architecture, we developed a prototype processor and implemented it on an FPGA board. An anomaly detection application, optimized for this prototype, was also developed to showcase its functional capability. Finally, synthesizing the processor prototype using 45 nm process technology confirmed an energy reduction of up to 57%.
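
The division of labor described above can be mimicked with a small behavioral model: an always-on loop (the Night segment) keeps a sliding window of sensor samples, computes a cheap anomaly score, and only wakes the power-hungry side (the Day segment) when the score crosses a threshold. This is a host-side Python model of the control flow for illustration only; the actual design is a hardware processor, and the z-score rule, window size, and threshold are assumptions.

```python
from collections import deque

class NightSegmentModel:
    """Toy model of the always-on anomaly-detection loop (Night segment)."""

    def __init__(self, window=64, threshold=4.0, wake_day_segment=print):
        self.window = deque(maxlen=window)
        self.threshold = threshold
        self.wake_day_segment = wake_day_segment   # stands in for waking the Main-CPU

    def on_sample(self, x: float) -> None:
        self.window.append(x)
        if len(self.window) < self.window.maxlen:
            return                                  # still filling the window
        mean = sum(self.window) / len(self.window)
        var = sum((v - mean) ** 2 for v in self.window) / len(self.window)
        score = abs(x - mean) / (var ** 0.5 + 1e-9)  # simple z-score anomaly measure
        if score > self.threshold:
            self.wake_day_segment(f"anomaly: sample={x:.2f} score={score:.1f}")

# A flat signal with a single spike triggers exactly one wake-up.
night = NightSegmentModel()
for i in range(200):
    night.on_sample(100.0 if i == 150 else 1.0 + 0.01 * (i % 5))
```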

Citations: 0
ALERT: A lightweight defense mechanism for enhancing DNN robustness against T-BFA
IF 4.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2024-05-01 · DOI: 10.1016/j.sysarc.2024.103160
Xiaohui Wei, Xiaonan Wang, Yumin Yan, Nan Jiang, Hengshan Yue

DNNs have become pervasive in many security-critical scenarios such as autonomous vehicles and medical diagnosis. Recent studies reveal the susceptibility of DNNs to various adversarial attacks, among which the weight Bit-Flip Attack (BFA) is emerging as a significant security concern. Moreover, the Targeted Bit-Flip Attack (T-BFA), a novel variant of BFA, can stealthily alter specific source–target classifications while preserving accurate classifications of non-target classes, posing an even more severe threat. However, because they do not adequately account for T-BFA’s “targeted” characteristic, existing defense mechanisms tend to over-protect or over-modify the network, leading to significant defense overheads or non-negligible DNN accuracy reduction.

In this work, we propose ALERT, A Lightweight defense mechanism for Enhancing DNN Robustness against T-BFA while maintaining network accuracy. Firstly, fully understanding the key factors that dominate the misclassification among source–target class pairs, we propose a Source-Target-Aware Searching (STAS) method to accurately identify the vulnerable weights under T-BFA. Secondly, leveraging the intrinsic redundancy characteristic of DNNs, we propose a weight random switch mechanism to reduce the exposure of vulnerable weights, thereby weakening the expected impact of T-BFA. Striking a delicate balance between enhancing robustness and preserving network accuracy, we develop a metric to meticulously select candidate weights. Finally, to further enhance the DNN robustness, we present a lightweight runtime monitoring mechanism for detecting T-BFA through weight signature verification, and dynamically optimize the weight random switch strategy accordingly. Evaluation results demonstrate that our proposed method effectively enhances the robustness of DNNs against T-BFA while maintaining network accuracy. Compared with the baseline, our method can tolerate 6.7× more flipped bits with negligible accuracy loss (<0.1% in ResNet-50).
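
The runtime monitoring piece can be illustrated with a generic weight-integrity check: hash each weight tensor at deployment time, re-hash periodically, and treat a mismatch as a suspected bit flip that triggers restoration and a re-randomization of which redundant weight copy is active. The sketch below uses plain NumPy arrays and SHA-256 as the signature; it shows the general idea of signature-based detection and random switching, not ALERT's specific signature scheme or its STAS search.

```python
import hashlib
import secrets
import numpy as np

def sign_weights(weights):
    """weights: layer name -> NumPy array. Returns a per-layer signature."""
    return {name: hashlib.sha256(w.tobytes()).hexdigest() for name, w in weights.items()}

def verify_and_reswitch(weights, signatures, candidates, active):
    """Detect tampered layers and re-randomize which weight copy is active.

    candidates : layer name -> list of interchangeable clean weight copies
    active     : layer name -> index of the currently deployed copy
    """
    tampered = [name for name, w in weights.items()
                if hashlib.sha256(w.tobytes()).hexdigest() != signatures[name]]
    for name in tampered:
        # Restore a clean copy and randomly pick which candidate becomes
        # active, limiting the attacker's knowledge of the exposed weights.
        active[name] = secrets.randbelow(len(candidates[name]))
        weights[name] = candidates[name][active[name]].copy()
    return tampered

# Toy run: flip one bit in a layer and watch it get caught and replaced.
rng = np.random.default_rng(0)
layer = rng.standard_normal((4, 4)).astype(np.float32)
weights = {"fc": layer.copy()}
candidates = {"fc": [layer.copy(), layer.copy()]}       # redundant clean copies
active = {"fc": 0}
signatures = sign_weights(weights)

weights["fc"].view(np.uint8)[0, 0] ^= 0x80              # simulated single-bit flip
print(verify_and_reswitch(weights, signatures, candidates, active))   # ['fc']
```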

Citations: 0