Sanle Zhao;Yujuan Tan;Zhaoyang Zeng;Jing Yu;Zhuoxin Bai;Ao Ren;Xianzhang Chen;Duo Liu
Shared cache systems have become increasingly crucial, especially in cloud services, where the Miss Ratio Curve (MRC) is a widely used tool for evaluating cache performance. The MRC depicts the relationship between the cache miss ratio and the cache size, indicating how cache performance trends with varying cache sizes. Recent advancements have enabled efficient MRC construction for stack replacement policies. For non-stack policies, miniature simulation downsizes the actual cache size and data stream through spatially hashed sampling, providing a general method for MRC construction. However, this approach still faces significant challenges. Firstly, constructing an MRC requires numerous mini-caches to obtain miss ratios, consuming significant cache resources and incurring tremendous memory and computing overhead. Secondly, it cannot adapt to dynamic I/O workloads, resulting in less precise MRCs. To address these issues, we propose LAShards, a low-overhead and self-adaptive MRC construction method for non-stack replacement policies. The key idea behind LAShards is to exploit the locality and burstiness in access patterns. It can statically reduce memory usage and dynamically adapt to workloads. Compared to previous works, LAShards reduces memory usage by up to 20× and increases throughput by up to 10×.
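For readers unfamiliar with miniature simulation, the Python sketch below illustrates the spatially hashed sampling it builds on: a key is kept only if its hash falls below the sampling rate, the kept keys drive a mini-cache scaled down by the same rate, and the mini-cache's miss ratio estimates one point of the MRC. This is a minimal illustration of the general technique, not LAShards itself; the hash function, the 1% sampling rate, and the LRU stand-in policy are assumptions made for the example.

import hashlib
from collections import OrderedDict

def spatial_sample(key, rate):
    # Hash the key to a value in [0, 1); keep the key only if it falls under the sampling rate.
    h = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
    return (h % 10**6) / 10**6 < rate

def mini_cache_miss_ratio(trace, full_cache_size, rate):
    # Downsize both the reference stream (by spatial sampling) and the cache (by the same rate).
    mini_size = max(1, int(full_cache_size * rate))
    cache, hits, refs = OrderedDict(), 0, 0
    for key in trace:
        if not spatial_sample(key, rate):
            continue
        refs += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)          # LRU update (stand-in replacement policy for the example)
        else:
            if len(cache) >= mini_size:
                cache.popitem(last=False)   # evict the least-recently-used key
            cache[key] = True
    return 1.0 - hits / refs if refs else 0.0

# One point of an MRC: estimated miss ratio at a 10,000-entry cache, sampling 1% of the keys.
trace = [i % 3000 for i in range(100000)]
print(mini_cache_miss_ratio(trace, full_cache_size=10000, rate=0.01))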
{"title":"LAShards: Low-Overhead and Self-Adaptive MRC Construction for Non-Stack Algorithms","authors":"Sanle Zhao;Yujuan Tan;Zhaoyang Zeng;Jing Yu;Zhuoxin Bai;Ao Ren;Xianzhang Chen;Duo Liu","doi":"10.1109/TC.2025.3590811","DOIUrl":"https://doi.org/10.1109/TC.2025.3590811","url":null,"abstract":"Shared cache systems have become increasingly crucial, especially in cloud services, where the Miss Ratio Curve (MRC) is a widely used tool for evaluating cache performance. The MRC depicts the relationship between the cache miss ratio and cache size, indicating how cache performance trends with varying cache sizes. Recent advancements have enabled efficient MRC construction for stack replacement policies. For non-stack policies, miniature simulation downsizes the actual cache size and data stream through spatially hashed sampling, providing a general method for MRC construction. However, this approach still faces significant challenges. Firstly, constructing an MRC requires numerous mini-caches to obtain miss ratios, consuming significant cache resources, leading to tremendous memory and computing overhead. Secondly, it cannot adapt to the dynamic I/O workloads, resulting in less precise MRC. To address these issues, we propose LAShards, a low-overhead and self-adaptive MRC construction method for non-stack replacement policies. The key idea behind LAShards is to exploit the locality and burstiness in access patterns. It can statically reduce memory usage and dynamically adapt to workloads. Compared to previous works, LAShards can save up to <inline-formula><tex-math>$20boldsymbol{times}$</tex-math></inline-formula> of memory resources, and increase throughput by up to <inline-formula><tex-math>$10boldsymbol{times}$</tex-math></inline-formula>.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 10","pages":"3490-3503"},"PeriodicalIF":3.8,"publicationDate":"2025-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145061865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chong Wang;Wanyi Fu;Jiangwei Zhang;Shiyao Li;Rui Hou;Jian Yang;Yu Wang
The rapid advancement of Transformer-based large language models (LLMs) is presenting significant challenges for their deployment, primarily due to their enormous parameter sizes and intermediate results, which create a bottleneck in memory capacity for effective inference. Compared to traditional DRAM, Non-Volatile Memory (NVM) technologies such as Resistive Random-Access Memory (RRAM) and Phase-Change Memory (PCM) offer higher integration density, making them promising alternatives. However, before NVM can be widely adopted, its reliability issues, particularly manufacturing defects and endurance faults, must be addressed. In response to the limited memory capacity and reliability challenges of deploying LLMs in NVM, we introduce a novel low-overhead weight-level map, named Wolf. Wolf not only integrates the addresses of faulty weights to support efficient fault tolerance but also includes the addresses of outlier weights in LLMs. This allows for tensor-wise segmented quantization of both outliers and regular weights, enabling lower-bitwidth quantization. The Wolf framework uses a Bloom Filter-based map to efficiently manage outliers and faults. By employing shared hashes for outliers and faults and specific hashes for faults, Wolf significantly reduces the area overhead. Building on Wolf, we propose a novel fault tolerance method that resolves the observed issue of clustering critical incorrect outliers and fully leverages the inherent resilience of LLMs to improve fault tolerance capabilities. As a result, Wolf achieves segment-wise INT4 quantization with enhanced accuracy. Moreover, Wolf can adeptly handle Bit Error Rates as high as 1×10^-2 without compromising accuracy, in stark contrast to the state-of-the-art approach where accuracy declines by more than 20%.
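As a rough illustration of the map structure described above, the sketch below keeps one bit array in which a set of shared hash positions marks an address as "outlier or fault" and a second, fault-specific set of positions marks it as "fault". It is a toy Bloom-filter map written for this summary, not the Wolf implementation; the bit-array size, hash counts, and hash construction are arbitrary assumptions.

import hashlib

class SharedBloomMap:
    """Toy weight-level map: shared hashes mark 'outlier or fault', extra hashes mark 'fault'."""
    def __init__(self, m=1 << 16, k_shared=3, k_fault=2):
        self.m, self.k_shared, self.k_fault = m, k_shared, k_fault
        self.bits = bytearray(m)

    def _positions(self, addr, k, salt):
        # Derive k bit positions for this weight address from a salted hash.
        for i in range(k):
            h = hashlib.sha256(f"{salt}:{i}:{addr}".encode()).digest()
            yield int.from_bytes(h[:8], "little") % self.m

    def add_outlier(self, addr):
        for p in self._positions(addr, self.k_shared, "shared"):
            self.bits[p] = 1

    def add_fault(self, addr):
        self.add_outlier(addr)  # faults also set the shared positions
        for p in self._positions(addr, self.k_fault, "fault"):
            self.bits[p] = 1

    def is_outlier_or_fault(self, addr):
        return all(self.bits[p] for p in self._positions(addr, self.k_shared, "shared"))

    def is_fault(self, addr):
        return self.is_outlier_or_fault(addr) and all(
            self.bits[p] for p in self._positions(addr, self.k_fault, "fault"))

m = SharedBloomMap()
m.add_outlier(0x1A2B)   # address of an outlier weight
m.add_fault(0x3C4D)     # address of a faulty weight
print(m.is_outlier_or_fault(0x1A2B), m.is_fault(0x1A2B))  # True False (barring false positives)
print(m.is_fault(0x3C4D))                                  # True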
{"title":"WOLF: Weight-Level OutLier and Fault Integration for Reliable LLM Deployment","authors":"Chong Wang;Wanyi Fu;Jiangwei Zhang;Shiyao Li;Rui Hou;Jian Yang;Yu Wang","doi":"10.1109/TC.2025.3587957","DOIUrl":"https://doi.org/10.1109/TC.2025.3587957","url":null,"abstract":"The rapid advancement of Transformer-based large language models (LLMs) is presenting significant challenges for their deployment, primarily due to their enormous parameter sizes and intermediate results, which create a bottleneck in memory capacity for effective inference. Compared to traditional DRAM, Non-Volatile Memory (NVM) technologies such as Resistive Random-Access Memory (RRAM) and Phase-Change Memory (PCM) offer higher integration density, making them promising alternatives. However, before NVM can be widely adopted, its reliability issues, particularly manufacturing defects and endurance faults, must be addressed. In response to the limited memory capacity and reliability challenges of deploying LLMs in NVM, we introduce a novel low-overhead weight-level map, named <small>Wolf</small>. <small>Wolf</small> not only integrates the addresses of faulty weights to support efficient fault tolerance but also includes the addresses of outlier weights in LLMs. This allows for tensor-wise segmented quantization of both outliers and regular weights, enabling lower-bitwidth quantization. The <small>Wolf</small> framework uses a Bloom Filter-based map to efficiently manage outliers and faults. By employing shared hashes for outliers and faults and specific hashes for faults, <small>Wolf</small> significantly reduces the area overhead. Building on <small>Wolf</small>, we propose a novel fault tolerance method that resolves the observed issue of clustering critical incorrect outliers and fully leverages the inherent resilience of LLMs to improve fault tolerance capabilities. As a result, <small>Wolf</small> achieves segment-wise INT4 quantization with enhanced accuracy. Moreover, <small>Wolf</small> can adeptly handle Bit Error Rates as high as <inline-formula><tex-math>$1 {boldsymbol{times}} 10^{-2}$</tex-math></inline-formula> without compromising accuracy, in stark contrast to the state-of-the-art approach where accuracy declines by more than 20%.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 10","pages":"3390-3403"},"PeriodicalIF":3.8,"publicationDate":"2025-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145061844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Salonik Resch;Hüsrev Cılasun;Zamshed I. Chowdhury;Masoud Zabihi;Yang Lv;Jian-Ping Wang;Sachin S. Sapatnekar;Ismail Akturk;Ulya R. Karpuzcu
Beyond edge devices can function off the power grid and without batteries, making them suitable for deployment in hard-to-reach environments. As the energy budget is extremely tight, energy-hungry long-distance communication required for offloading computation or reporting results to a server becomes a significant limitation. Based on the observation that the energy required for communication decreases with shorter distances, this paper makes a case for the deployment of secure beyond edge miniservers. These are strategically positioned, lightweight local servers designed to support beyond edge devices without compromising the privacy of sensitive information. We demonstrate that even for relatively small-scale representative computations – which are more likely to fit into the tight power budget of a beyond edge device for local processing – deploying a beyond edge miniserver can lead to higher performance. To this end, we consider representative deployment scenarios of practical importance, including but not limited to agricultural systems or building structures, where beyond edge miniservers enable highly energy-efficient real-time data processing.
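To make the distance argument concrete, the snippet below compares per-bit transmit energy under the classic first-order radio model, in which the amplifier term grows with distance raised to a path-loss exponent. The exponent and energy constants are illustrative assumptions taken from that textbook model, not measurements from the paper.

def tx_energy_per_bit(distance_m, alpha=2.0, e_elec=50e-9, e_amp=100e-12):
    # First-order radio model: fixed electronics cost plus an amplifier cost growing as d^alpha.
    return e_elec + e_amp * distance_m ** alpha

# Offloading to a nearby miniserver (50 m) vs. a far-away server (5 km):
near, far = tx_energy_per_bit(50), tx_energy_per_bit(5000)
print(f"{far / near:.0f}x more energy per bit for the long-range link")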
{"title":"The Case for Secure Miniservers Beyond the Edge","authors":"Salonik Resch;Hüsrev Cılasun;Zamshed I. Chowdhury;Masoud Zabihi;Yang Lv;Jian-Ping Wang;Sachin S. Sapatnekar;Ismail Akturk;Ulya R. Karpuzcu","doi":"10.1109/TC.2025.3589691","DOIUrl":"https://doi.org/10.1109/TC.2025.3589691","url":null,"abstract":"<italic>Beyond edge devices</i> can function off the power grid and without batteries, making them suitable for deployment in hard-to-reach environments. As the energy budget is extremely tight, energy-hungry long-distance communication required for offloading computation or reporting results to a server becomes a significant limitation. Based on the observation that the energy required for communication decreases with shorter distances, this paper makes a case for the deployment of <italic>secure beyond edge miniservers</i>. These are strategically positioned, lightweight local servers designed to support beyond edge devices without compromising the privacy of sensitive information. We demonstrate that even for relatively small scale representative computations – which are more likely to fit into the tight power budget of a beyond edge device for local processing – deploying a beyond edge miniserver can lead to higher performance. To this end, we consider representative deployment scenarios of practical importance, including but not limited to agricultural systems or building structures, where beyond edge miniservers enable highly energy-efficient real-time data processing.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 10","pages":"3448-3461"},"PeriodicalIF":3.8,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145061808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The design and optimization of network topologies play a critical role in ensuring the performance and efficiency of high-performance computing (HPC) systems. Traditional topology designs often fall short in satisfying the stringent requirements of HPC environments, particularly with respect to fault tolerance, latency, and bandwidth. To address these limitations, we propose a novel class of hierarchical networks, termed Hypercube-Structured Hierarchical Networks (HHNs). This architecture generalizes and extends existing architectures such as half hypercube networks and complete cubic networks, while also introducing previously unexplored hierarchical designs. HHNs exhibit several advantages, particularly in high-performance computing. Most notably, their high connectivity enables efficient parallel data processing, and their hierarchical structure supports scalability to accommodate growing computational demands. Furthermore, we present a unicast routing strategy and a broadcast algorithm for HHNs. A fault-tolerant algorithm is also designed based on the construction of disjoint paths. Experimental evaluations demonstrate that HHNs consistently outperform mainstream architectures in critical performance metrics, including scalability, latency, and robustness to failures.
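As background for the unicast routing discussed above, the sketch below shows the standard hypercube shortest-path routine (flip the differing address bits dimension by dimension) that cube-structured hierarchical topologies typically build on. It is a generic textbook building block, shown only for orientation, not the HHN routing algorithm itself.

def hypercube_route(src, dst, n_dims):
    """Classic hypercube routing: correct the differing address bits one dimension at a time.
    The path length equals the Hamming distance between the two node addresses."""
    path, cur = [src], src
    diff = src ^ dst
    for d in range(n_dims):
        if diff >> d & 1:
            cur ^= 1 << d        # traverse the link along dimension d
            path.append(cur)
    return path

print(hypercube_route(0b0000, 0b1011, 4))  # [0, 1, 3, 11]: three hops for Hamming distance 3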
{"title":"A Highly Reliable Multiplexing Scheme in Hypercube-Structured Hierarchical Networks","authors":"Xuanli Liu;Zhenjiang Dong;Weibei Fan;Mengjie Lv;Xueli Sun;Jin Qi;Sun-Yuan Hsieh","doi":"10.1109/TC.2025.3589732","DOIUrl":"https://doi.org/10.1109/TC.2025.3589732","url":null,"abstract":"The design and optimization of network topologies play a critical role in ensuring the performance and efficiency of high-performance computing (HPC) systems. Traditional topology designs often fall short in satisfying the stringent requirements of HPC environments, particularly with respect to fault tolerance, latency, and bandwidth. To address these limitations, we propose a novel class of hierarchical networks, termed Hypercube-Structured Hierarchical Networks (HHNs). This architecture generalizes and extends existing architectures such as half hypercube networks and complete cubic networks, while also introducing previously unexplored hierarchical designs. HHNs exhibit several advantages, particularly in high-performance computing. Most notably, their high connectivity enables efficient parallel data processing, and their hierarchical structure supports scalability to accommodate growing computational demands. Furthermore, we present a unicast routing strategy and a broadcast algorithm for HHNs. A fault-tolerant algorithm is also designed based on the construction of disjoint paths. Experimental evaluations demonstrate that HHNs consistently outperform mainstream architectures in critical performance metrics, including scalability, latency, and robustness to failures.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 10","pages":"3462-3475"},"PeriodicalIF":3.8,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145061886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optical Data Center Networks (ODCNs) are high-performance interconnect architectures for parallel and distributed computing, providing higher bandwidth and lower power consumption. However, current optical DCNs struggle to achieve both high scalability and incremental scalability simultaneously. In this paper, we propose an extended Exchanged hyperCube, denoted ExCube, a highly scalable network architecture for optical data centers. Firstly, we detail the addressing scheme and construction method for ExCube, which offers flexible scalability modes, including exponential, linear, and composite scalability, to meet diverse scalability requirements. In particular, the diameter of ExCube remains unchanged as its size increases linearly, indicating superior incremental scalability. Secondly, an efficient routing algorithm with linear time complexity is presented to determine the shortest path between any two different ToRs in ExCube. Additionally, we propose a per-flow scheduling algorithm based on disjoint paths to further enhance the performance of ExCube. The optical devices in ExCube are identical to those in existing optical DCNs, such as WaveCube and OSA, facilitating its construction. Experimental results demonstrate that ExCube outperforms WaveCube in terms of throughput and reduces data transmission time by 5%-35%. Further analysis reveals that ExCube maintains performance comparable to WaveCube across several critical metrics, including low diameter and link complexity. Compared with advanced networks, ExCube reduces overall cost and energy consumption by 36.7% and 46.5%, respectively.
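To illustrate the flavor of per-flow scheduling over disjoint paths, the sketch below assigns each arriving flow to the least-loaded of its precomputed disjoint candidate paths. The path set, the bottleneck-load metric, and the greedy choice are assumptions made for this example; ExCube's actual scheduler and path construction are defined in the paper.

def schedule_flow(candidate_paths, link_load, demand):
    """Assign one flow to the candidate path with the smallest bottleneck load,
    then charge the flow's demand to every link on the chosen path."""
    def bottleneck(path):
        return max(link_load.get(link, 0.0) for link in zip(path, path[1:]))
    best = min(candidate_paths, key=bottleneck)
    for link in zip(best, best[1:]):
        link_load[link] = link_load.get(link, 0.0) + demand
    return best

load = {}
paths = [[0, 1, 3], [0, 2, 3]]                 # two disjoint ToR-level paths between ToR 0 and ToR 3
print(schedule_flow(paths, load, demand=0.4))  # first flow takes the first path
print(schedule_flow(paths, load, demand=0.4))  # second flow avoids the now-loaded path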
{"title":"A Highly Scalable Network Architecture for Optical Data Centers","authors":"Weibei Fan;Yao Pan;Fu Xiao;Pinchang Zhang;Lei Han;Sun-Yuan Hsieh","doi":"10.1109/TC.2025.3589688","DOIUrl":"https://doi.org/10.1109/TC.2025.3589688","url":null,"abstract":"Optical Data Center Networks (ODCNs) are high-performance interconnect architectures in parallel and distributed computing, providing higher bandwidth and lower power consumption. However, current optical DCNs struggle to achieve both high scalability and incremental scalability simultaneously. In this paper, we propose an extended <italic>Ex</i>changed hyper<italic>Cube</i>, denoted by ExCube, which is a highly scalable network architecture for optical data centers. Firstly, we detail the address scheme and constructing method for ExCube, including exponential, linear, and composite scalability, which can adapt to different scalability requirements. ExCube boasts flexible scalability modes, including exponential, linear, and composite scalability, meeting diverse scalability requirements. In particular, the diameter of ExCube remains unchanged as its size increases linearly, indicating superior incremental scalability. Secondly, an efficient routing algorithm with linear time complexity is presented to determine the shortest path between any two different ToRs in ExCube. Additionally, we propose a per-flow scheduling algorithm based on the disjoint paths to enhance the performance of ExCube. The optical devices in ExCube are identical to those in existing optical DCNs, such as WaveCube and OSA, facilitating its construction. Experimental results demonstrate that ExCube outperforms WaveCube in terms of throughput and reduces data transmission time by 5%-35%. Further analysis reveals that ExCube maintains comparable performance to WaveCube across several critical metrics, including low diameter and link complexity. Compared with advanced networks, the overall cost-effectiveness and energy efficiency of ExCube have been reduced by 36.7% and 46.5%, respectively.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 10","pages":"3433-3447"},"PeriodicalIF":3.8,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145061806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present AdaptDQC, an adaptive compiler framework for optimizing distributed quantum computing (DQC) under diverse performance metrics and inter-chip communication (ICC) architectures. AdaptDQC leverages a novel spatial-temporal graph model to describe quantum circuits, model ICC architectures, and quantify critical performance metrics in DQC systems, yielding a systematic and adaptive approach to constructing circuit-partitioning and chip-mapping strategies that admit hybrid ICC architectures and are optimized against various objectives. Experimental results on a collection of benchmarks show that AdaptDQC outperforms state-of-the-art compiler frameworks: It reduces, on average, the communication cost by up to 35.4% and the latency by up to 38.4%.
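A concrete, if simplified, view of the partitioning objective: given a circuit's two-qubit gates and a qubit-to-chip mapping, count the gates whose operands land on different chips, since those are the ones that require inter-chip communication. The sketch below shows only this basic cost; AdaptDQC's spatial-temporal graph model and hybrid ICC handling are much richer and are not reproduced here.

def inter_chip_gates(two_qubit_gates, chip_of):
    """Count two-qubit gates whose operand qubits are mapped to different chips."""
    return sum(1 for q0, q1 in two_qubit_gates if chip_of[q0] != chip_of[q1])

gates = [(0, 1), (1, 2), (2, 3), (0, 3)]       # two-qubit gate endpoints of a toy 4-qubit circuit
mapping_a = {0: "A", 1: "A", 2: "B", 3: "B"}   # qubits 0,1 on chip A; qubits 2,3 on chip B
mapping_b = {0: "A", 1: "B", 2: "A", 3: "B"}
print(inter_chip_gates(gates, mapping_a))      # 2 crossing gates
print(inter_chip_gates(gates, mapping_b))      # 4 crossing gates: a worse partition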
{"title":"AdaptDQC: Adaptive Distributed Quantum Computing With Quantitative Performance Analysis","authors":"Debin Xiang;Liqiang Lu;Siwei Tan;Xinghui Jia;Zhe Zhou;Guangyu Sun;Mingshuai Chen;Jianwei Yin","doi":"10.1109/TC.2025.3586027","DOIUrl":"https://doi.org/10.1109/TC.2025.3586027","url":null,"abstract":"We present AdaptDQC, an adaptive compiler framework for optimizing distributed quantum computing (DQC) under diverse performance metrics and inter-chip communication (ICC) architectures. AdaptDQC leverages a novel spatial-temporal graph model to describe quantum circuits, model ICC architectures, and quantify critical performance metrics in DQC systems, yielding a systematic and adaptive approach to constructing circuit-partitioning and chip-mapping strategies that admit hybrid ICC architectures and are optimized against various objectives. Experimental results on a collection of benchmarks show that AdaptDQC outperforms state-of-the-art compiler frameworks: It reduces, on average, the communication cost by up to 35.4% and the latency by up to 38.4%.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 10","pages":"3277-3290"},"PeriodicalIF":3.8,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11080164","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145061958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shiyan Yi;Yudi Qiu;Guohao Xu;Lingfei Lu;Xiaoyang Zeng;Yibo Fan
Graph Attention Network (GAT) has gained widespread adoption thanks to its exceptional performance in processing non-Euclidean graphs. The critical components of a GAT model are aggregation and attention, which generate numerous main-memory accesses and occupy a significant share of inference time. Recently, much research has proposed near-memory processing (NMP) architectures to accelerate aggregation. However, graph attention requires additional operations distinct from aggregation, making previous NMP architectures less suitable for supporting GAT, as they typically target aggregation-only workloads. In this paper, we propose GATe, a practical and efficient GAT accelerator with an NMP architecture. To the best of our knowledge, this is the first work to accelerate both attention and aggregation computation on DIMMs. We unify feature vector accesses to eliminate the two repetitive memory accesses to source nodes caused by the sequential phase-by-phase execution of attention and aggregation. Next, we refine the computation flow to reduce data dependencies in concatenation and softmax, which lowers on-chip memory usage and communication overhead. Additionally, we introduce a novel sharding method that enhances the data reusability of high-degree nodes. Experiments show that GATe achieves substantial speedups in the GAT attention and aggregation phases of up to 6.77× and 2.46×, with averages of 3.69× and 2.24×, respectively, compared to the state-of-the-art NMP works GNNear and GraNDe.
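The numpy sketch below spells out the two phases of standard single-head GAT math, attention scoring followed by aggregation, and makes visible that both phases read the transformed source-node features, which is the duplicated memory traffic that unified feature-vector access targets. It is a plain software restatement of GAT, not the GATe accelerator; the shapes and edge list are made up for the example.

import numpy as np

def gat_layer(h, edges, W, a, slope=0.2):
    """Single-head GAT over an edge list (src, dst), executed in two phases as on a
    conventional system: attention scores first, then aggregation. Both phases touch
    the transformed source features z[s]."""
    z = h @ W                                              # transformed node features
    # Phase 1: per-edge unnormalized attention with LeakyReLU.
    scores = {}
    for s, d in edges:
        e = np.concatenate([z[d], z[s]]) @ a
        scores[(s, d)] = e if e > 0 else slope * e
    out = np.zeros_like(z)
    for d in set(dd for _, dd in edges):
        nbrs = [s for s, dd in edges if dd == d]
        ex = np.exp([scores[(s, d)] for s in nbrs])
        alpha = ex / ex.sum()                              # softmax over d's in-neighbors
        # Phase 2: aggregation re-reads every source feature z[s].
        out[d] = sum(w * z[s] for w, s in zip(alpha, nbrs))
    return out

h = np.random.rand(4, 8)                                   # 4 nodes, 8 input features
W = np.random.rand(8, 8)
a = np.random.rand(16)                                     # attention vector over concatenated features
print(gat_layer(h, [(0, 1), (2, 1), (3, 1)], W, a).shape)  # (4, 8)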
{"title":"GATe: Efficient Graph Attention Network Acceleration With Near-Memory Processing","authors":"Shiyan Yi;Yudi Qiu;Guohao Xu;Lingfei Lu;Xiaoyang Zeng;Yibo Fan","doi":"10.1109/TC.2025.3588317","DOIUrl":"https://doi.org/10.1109/TC.2025.3588317","url":null,"abstract":"Graph Attention Network (GAT) has gained widespread adoption thanks to its exceptional performance in processing non-Euclidean graphs. The critical components of a GAT model involve aggregation and attention, which cause numerous main-memory access, occupying significant inference time. Recently, much research has proposed near-memory processing (NMP) architectures to accelerate aggregation. However, graph attention requires additional operations distinct from aggregation, making previous NMP architectures less suitable for supporting GAT, as they typically target aggregation-only workloads. In this paper, we propose GATe, a practical and efficient <u>GAT</u> acc<u>e</u>lerator with NMP architecture. To the best of our knowledge, this is the first time that accelerates both attention and aggregation computation on DIMM. We unify feature vector access to eliminate the two repetitive memory accesses to source nodes caused by the sequential phase-by-phase execution of attention and aggregation. Next, we refine the computation flow to reduce data dependencies in concatenation and softmax, which lowers on-chip memory usage and communication overhead. Additionally, we introduce a novel sharding method that enhances data reusability of high-degree nodes. Experiments show that GATe achieves substantial speedup of GAT attention and aggregation phases up to 6.77<inline-formula><tex-math>${boldsymboltimes}$</tex-math></inline-formula> and 2.46<inline-formula><tex-math>${boldsymboltimes}$</tex-math></inline-formula>, with average to 3.69<inline-formula><tex-math>${boldsymboltimes}$</tex-math></inline-formula> and 2.24<inline-formula><tex-math>${boldsymboltimes}$</tex-math></inline-formula>, respectively, compared to state-of-the-art NMP works GNNear and GraNDe.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 10","pages":"3419-3432"},"PeriodicalIF":3.8,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145061822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Direct current (DC) analysis lies at the heart of integrated circuit design in seeking DC operating points. Although pseudo-transient analysis (PTA) methods have been widely used for DC analysis in both industry and academia, their initial parameters and stepping strategy require expert knowledge and laborious tuning to deliver efficient performance, which hinders their further application. In this paper, we leverage the latest advancements in machine learning to deploy PTA with more efficient setups for different problems. More specifically, active learning, which automatically draws knowledge from other circuits, is used to provide suitable initial parameters for the PTA solver, which are then calibrated on the fly to further accelerate the simulation process using TD3-based reinforcement learning (RL). To expedite model convergence, we introduce dual agents and a public sampling buffer in our RL method to enhance sample utilization. To further improve the learning efficiency of the RL agent, we incorporate imitation learning to improve the reward function and introduce supervised learning to provide a better dual-agent rotation strategy. We make the proposed algorithm a general, out-of-the-box SPICE-like solver and assess it on a variety of circuits, demonstrating up to a 3.10× reduction in Newton-Raphson (NR) iterations for the initial stage and 285.71× for the RL stage.
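The skeleton below sketches only the agent-solver interaction implied above: the agent observes the last Newton-Raphson (NR) iteration count, picks the next pseudo-transient step size, and is rewarded for keeping NR work low. Both the solver and the policy are stubs written for this summary (the TD3 actor-critic networks, dual agents, and replay buffer are omitted), so the numbers mean nothing beyond illustrating the control loop.

import random

def pta_solver_step(step_size):
    """Toy stand-in for one pseudo-transient step of a SPICE-like engine: returns the
    number of NR iterations the step took. Overly aggressive steps fail to converge
    and cost the full iteration limit. Purely illustrative, not a real solver."""
    return 30 if step_size > 4.0 else max(2, int(3 + 2 * step_size + random.random() * 2))

def actor_stub(state):
    """Placeholder for the trained TD3 actor: maps (last NR count, current step size)
    to the next step size. The real actor is a learned network; this rule keeps the
    example runnable."""
    last_nr, step = state
    return step * 1.5 if last_nr < 8 else step * 0.5

step, last_nr, total_nr, rewards = 0.5, 5, 0, []
for _ in range(20):                        # one episode of 20 pseudo-transient steps
    step = actor_stub((last_nr, step))
    last_nr = pta_solver_step(step)
    rewards.append(-last_nr)               # RL objective: minimize NR iterations per step
    total_nr += last_nr
print("total NR iterations over the episode:", total_nr)
print("mean reward:", sum(rewards) / len(rewards))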
{"title":"ML-PTA: A Two-Stage ML-Enhanced Framework for Accelerating Nonlinear DC Circuit Simulation With Pseudo-Transient Analysis","authors":"Zhou Jin;Wenhao Li;Haojie Pei;Xiaru Zha;Yichao Dong;Xiang Jin;Xiao Wu;Dan Niu;Wei W. Xing","doi":"10.1109/TC.2025.3587470","DOIUrl":"https://doi.org/10.1109/TC.2025.3587470","url":null,"abstract":"Direct current (DC) analysis lies at the heart of integrated circuit design in seeking DC operating points. Although pseudo-transient analysis (PTA) methods have been widely used in DC analysis in both industry and academia, their initial parameters and stepping strategy require expert knowledge and labor tuning to deliver efficient performance, which hinders their further applications. In this paper, we leverage the latest advancements in machine learning to deploy PTA with more efficient setups for different problems. More specifically, active learning, which automatically draws knowledge from other circuits, is used to provide suitable initial parameters for PTA solver, and then calibrate on-the-fly to further accelerate the simulation process using TD3-based reinforcement learning (RL). To expedite model convergence, we introduce dual agents and a public sampling buffer in our RL method to enhance sample utilization. To further improve the learning efficiency of the RL agent, we incorporate imitation learning to improve reward function and introduce supervised learning to provide a better dual-agent rotation strategy. We make the proposed algorithm a general out-of-the-box SPICE-like solver and assess it on a variety of circuits, demonstrating up to 3.10<inline-formula><tex-math>$boldsymboltimes$</tex-math></inline-formula> reduction in NR iterations for the initial stage and 285.71<inline-formula><tex-math>$boldsymboltimes$</tex-math></inline-formula> for the RL stage.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 10","pages":"3319-3331"},"PeriodicalIF":3.8,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145061859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Balancing energy efficiency and high performance in embedded systems requires fine-tuning hardware and software components to co-optimize their interaction. In this work, we address the automated optimization of memory usage through a compiler toolchain that leverages DMA-aware precision tuning and mathematical function memorization. The proposed solution extends the llvm infrastructure, employing the taffo plugins for precision tuning, with the SeTHet extension for DMA-aware precision tuning and luTHet for automated, DMA-aware mathematical function memorization. We performed an experimental assessment on hero, a heterogeneous platform employing risc-v cores as a parallel accelerator. Our solution enables speedups ranging from 1.5× to 51.1× on AxBench benchmarks that employ trigonometrical functions and 4.23×-48.4× on Polybench benchmarks over the baseline hero platform.
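As an illustration of what mathematical function memorization means in practice, the sketch below replaces sin() with a small precomputed lookup table plus linear interpolation. The table size and interpolation scheme are assumptions for the example; choosing such parameters and placing the table in DMA-managed local memory is the kind of work the described toolchain automates.

import math

TABLE_SIZE = 256
SIN_LUT = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]  # built once, ahead of time

def sin_lut(x):
    """Approximate sin(x) from a 256-entry lookup table with linear interpolation
    between the two nearest table entries (one period of sin covers the whole table)."""
    t = (x / (2 * math.pi)) % 1.0 * TABLE_SIZE
    i = int(t)
    frac = t - i
    lo, hi = SIN_LUT[i], SIN_LUT[(i + 1) % TABLE_SIZE]
    return lo + frac * (hi - lo)

print(abs(sin_lut(1.234) - math.sin(1.234)))   # error on the order of 1e-4 for this table size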
{"title":"Synergistic Memory Optimisations: Precision Tuning in Heterogeneous Memory Hierarchies","authors":"Gabriele Magnani;Daniele Cattaneo;Lev Denisov;Giuseppe Tagliavini;Giovanni Agosta;Stefano Cherubin","doi":"10.1109/TC.2025.3586025","DOIUrl":"https://doi.org/10.1109/TC.2025.3586025","url":null,"abstract":"Balancing energy efficiency and high performance in embedded systems requires fine-tuning hardware and software components to co-optimize their interaction. In this work, we address the automated optimization of memory usage through a compiler toolchain that leverages DMA-aware precision tuning and mathematical function memorization. The proposed solution extends the <small>llvm</small> infrastructure, employing the <small>taffo</small> plugins for precision tuning, with the <small>SeTHet</small> extension for DMA-aware precision tuning and <small>luTHet</small> for automated, DMA-aware mathematical function memorization. We performed an experimental assessment on <small>hero</small>, a heterogeneous platform employing <small>risc-v</small> cores as a parallel accelerator. Our solution enables speedups ranging from <inline-formula><tex-math>$1.5boldsymbol{times}$</tex-math></inline-formula> to <inline-formula><tex-math>$51.1boldsymbol{times}$</tex-math></inline-formula> on AxBench benchmarks that employ trigonometrical functions and <inline-formula><tex-math>$4.23-48.4boldsymbol{times}$</tex-math></inline-formula> on Polybench benchmarks over the baseline <small>hero</small> platform.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 9","pages":"3168-3180"},"PeriodicalIF":3.8,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144831908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaosong Peng;Laurence T. Yang;Xiaokang Wang;Debin Liu;Jie Li
Canonical Polyadic decomposition (CPD) obtains a low-rank approximation of high-order multidimensional tensors as the summation of a sequence of rank-one tensors, greatly reducing storage and computation overhead. It is increasingly being used in the lightweight design of artificial intelligence and big data processing. Existing CPD techniques exhibit inherent limitations in simultaneously achieving high accuracy and high efficiency. In this paper, a heterogeneous computing method for CPD is proposed to optimize computing efficiency with guaranteed convergence accuracy. Specifically, a quasi-convex decomposition loss function is constructed and the extreme points of the Kruskal matrix rows are solved. Further, the massively parallelizable operators in the algorithm are extracted, a software-hardware integrated scheduling method is designed, and the deployment of CPD on heterogeneous computing platforms is achieved. Finally, the memory access strategy is optimized to improve memory access efficiency. We tested the algorithm on real-world and synthetic sparse tensor datasets; numerical experimental results show that, compared with the state-of-the-art method, the proposed method achieves higher convergence accuracy and computing efficiency. Compared to the standard CPD parallel library, the method achieves efficiency improvements of tens to hundreds of times while maintaining the same accuracy.
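For readers who want the decomposition itself to be concrete, the sketch below implements textbook CP decomposition by alternating least squares (CP-ALS) on a dense 3-way tensor with numpy and recovers a synthetic rank-3 tensor, i.e., the "summation of rank-one tensors" mentioned above. It is the plain baseline, with none of the paper's quasi-convex loss, heterogeneous scheduling, or sparse-tensor optimizations.

import numpy as np

def khatri_rao(U, V):
    # Column-wise Kronecker product: row (i * V.shape[0] + j) holds U[i, :] * V[j, :].
    return (U[:, None, :] * V[None, :, :]).reshape(-1, U.shape[1])

def unfold(X, mode):
    # Mode-n matricization matching the Khatri-Rao ordering used above.
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def cp_als(X, rank, iters=50, seed=0):
    """Alternating least squares for a 3-way CPD: cycle through the factor matrices,
    solving a linear least-squares problem for each while the other two are fixed."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((s, rank)) for s in X.shape)
    for _ in range(iters):
        A = unfold(X, 0) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = unfold(X, 1) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = unfold(X, 2) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Recover a synthetic rank-3 tensor built as a sum of three rank-one tensors.
rng = np.random.default_rng(1)
A0, B0, C0 = rng.random((6, 3)), rng.random((7, 3)), rng.random((8, 3))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_als(X, rank=3)
approx = np.einsum('ir,jr,kr->ijk', A, B, C)
print(np.linalg.norm(X - approx) / np.linalg.norm(X))  # relative error, small for an exact low-rank tensor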
{"title":"A High-Efficiency Parallel Mechanism for Canonical Polyadic Decomposition on Heterogeneous Computing Platform","authors":"Xiaosong Peng;Laurence T. Yang;Xiaokang Wang;Debin Liu;Jie Li","doi":"10.1109/TC.2025.3587623","DOIUrl":"https://doi.org/10.1109/TC.2025.3587623","url":null,"abstract":"Canonical Polyadic decomposition (CPD) obtains the low-rank approximation for high-order multidimensional tensors through the summation of a sequence of rank-one tensors, greatly reducing storage and computation overhead. It is increasingly being used in the lightweight design of artificial intelligence and big data processing. The existing CPD technology exhibits inherent limitations in simultaneously achieving high accuracy and high efficiency. In this paper, a heterogeneous computing method for CPD is proposed to optimize computing efficiency with guaranteed convergence accuracy. Specifically, a quasi-convex decomposition loss function is constructed and the extreme points of the Kruskal matrix rows have been solved. Further, the massively parallelized operators in the algorithm are extracted, a software-hardware integrated scheduling method is designed, and the deployment of CPD on heterogeneous computing platforms is achieved. Finally, the memory access strategy is optimized to improve memory access efficiency. We tested the algorithm on real-world and synthetic sparse tensor datasets, numerical experimental results show that compared with the state-of-the-art method, the proposed method has a higher convergence accuracy and computing efficiency. Compared to the standard CPD parallel library, the method achieves efficiency improvements of tens to hundreds of times while maintaining the same accuracy.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 10","pages":"3377-3389"},"PeriodicalIF":3.8,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145061974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}