ACM Transactions on Design Automation of Electronic Systems: Latest Articles

SEDONUT: A Single Event Double Node Upset Tolerant SRAM for Terrestrial Applications
IF 1.4 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2024-04-06 | DOI: 10.1145/3651985
Govind Prasad, Bipin Chandra Mandi, Maifuz Ali

Radiation and its effect on neighboring nodes are critical not only for space applications but also for terrestrial applications at modern, scaled-down technology nodes, where they can cause SRAM failures through single-node and multi-node upsets. This paper therefore proposes a 14T radiation-hardened SRAM cell to overcome soft errors in space and critical terrestrial applications. Simulation results show that the proposed cell is resilient to any single event upset and single event double node upset at its storage nodes. The cell consumes less power than the other cells considered, and its hold, read, and write stability are higher than those of most of them. The higher critical charge of the proposed SRAM increases its radiation resistance. Of all the compared SRAMs, only DNUSRM and the proposed SRAM show a 0% probability of logical flipping. Compared to the DNUSRM SRAM, the total critical charge, write stability, read stability, hold stability, area, power, sensitive area, write speed, and read speed of the proposed SRAM are improved by -19.1%, 5.22%, 25.7%, -5.46%, 22.5%, 50.6%, 60.0%, 17.91%, and 0.74%, respectively (a negative value presumably indicates a degradation). This better balance among the parameters makes the proposed cell more suitable for space and critical terrestrial applications. Finally, post-layout and Monte Carlo simulations validate the efficiency of the SRAMs.

Citations: 0
ZoneTrace: A Zone Monitoring Tool for F2FS on ZNS SSDs
IF 1.4 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2024-04-05 | DOI: 10.1145/3656172
Ping-Xiang Chen, Dongjoo Seo, Changhoon Sung, Jongheum Park, Minchul Lee, Huaicheng Li, Matias Bjørling, Nikil Dutt

We present ZoneTrace, a runtime monitoring tool for the Flash-Friendly File System (F2FS) on Zoned Namespace (ZNS) SSDs. A ZNS SSD organizes its storage into zones that must be written sequentially. Because of this sequential-write constraint, F2FS, a log-structured file system, has recently been adopted to support ZNS SSDs. To expose how space is managed, in terms of zones, between F2FS and the underlying ZNS SSD, we developed ZoneTrace, a tool that lets users visualize and analyze F2FS space management on ZNS SSDs. ZoneTrace uses the extended Berkeley Packet Filter (eBPF) to trace updates to the segment bitmap in F2FS and visualizes the space usage of each zone accordingly. ZoneTrace can also analyze file fragmentation in F2FS and provides users with a fragmentation histogram that serves as an indicator of it. Using ZoneTrace’s visualization, we find that the current F2FS space management scheme cannot fully optimize space for streaming data recording in autonomous systems, which leads to serious file fragmentation on ZNS SSDs. Our evaluations with both synthetic and realistic workloads show that ZoneTrace is lightweight and gives users useful insights for monitoring F2FS on ZNS SSDs. We believe ZoneTrace can help users analyze F2FS with ease and open up research topics in space management for F2FS on ZNS SSDs.
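The step from a segment bitmap to per-zone space usage can be illustrated with a minimal sketch. It is not ZoneTrace's implementation: the bitmap is assumed to be a flat list with one entry per segment (1 for in use, 0 for free), and the function name `zone_usage` and the parameter `segments_per_zone` are hypothetical.

```python
def zone_usage(segment_bitmap, segments_per_zone):
    """Return the fraction of in-use segments for each zone.

    Assumption: consecutive runs of `segments_per_zone` bitmap entries
    belong to the same zone; a trailing partial zone is averaged over
    the entries it actually has.
    """
    if segments_per_zone <= 0:
        raise ValueError("segments_per_zone must be positive")
    usage = []
    for start in range(0, len(segment_bitmap), segments_per_zone):
        zone = segment_bitmap[start:start + segments_per_zone]
        usage.append(sum(zone) / len(zone))
    return usage


# Hypothetical bitmap: two zones of four segments each.
example = zone_usage([1, 1, 0, 0, 1, 0, 0, 0], segments_per_zone=4)
```

A real tool would obtain the bitmap from kernel tracing and read the zone size from the device; here both are hard-coded for illustration.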

Citations: 0
A Mixed-Criticality Traffic Scheduler with Mitigating Congestion for CAN-to-TSN Gateway
IF 1.4 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2024-04-04 | DOI: 10.1145/3656173
Wenyan Yan, Dongsheng Wei, Bin Fu, Renfa Li, Guoqi Xie

A network architecture in which Time-Sensitive Networking (TSN) serves as the backbone network and the Controller Area Network (CAN) serves as the intra-domain network is known as the CAN-TSN interconnection network architecture. It has gained considerable attention in industrial embedded networks such as spacecraft, intelligent automobiles, and factory automation. The architecture employs the CAN-TSN gateway as a central hub for transmitting and managing a significant volume of communications between the CAN domains and TSN. However, the rapid growth in data volume exposes the CAN-TSN gateway to heavy congestion, making it difficult to effectively support the different time planning mechanisms provided by TSN. In this paper, we propose a two-stage mixed-criticality traffic scheduler. In the first stage, the scheduler adopts a Message Optimization Algorithm (MOA) to aggregate multiple CAN messages, critical and non-critical alike, into a single TSN message, which reduces the number of messages requiring transmission. In the second stage, the scheduler uses a Message Scheduling Optimization Algorithm (MSOA) to schedule critical TSN messages. This algorithm reassembles all the critical CAN messages contained in the un-schedulable TSN messages to generate new TSN messages for rescheduling. Experimental results show that our proposed scheduler effectively improves the acceptance ratio of critical and non-critical CAN messages and outperforms the state-of-the-art message scheduling method in terms of acceptance ratio while improving the bandwidth utilization and the number of schedule table entries. We further construct a hardware platform to evaluate the performance of MSOA. The consistency between practical and theoretical results shows the effectiveness of MSOA.
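The idea of aggregating several CAN messages into one TSN message can be sketched with a simple first-fit packing heuristic. This is an illustrative stand-in, not the paper's MOA: the classes `CanMessage` and `TsnFrame`, the function `aggregate`, the ordering rule (critical messages first, then larger payloads), and the fixed payload capacity are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class CanMessage:
    can_id: int
    size: int        # payload size in bytes
    critical: bool


@dataclass
class TsnFrame:
    capacity: int    # payload bytes available for aggregated CAN data
    messages: list = field(default_factory=list)

    @property
    def used(self):
        return sum(m.size for m in self.messages)


def aggregate(messages, capacity):
    """Pack CAN messages into as few TSN frames as first-fit allows."""
    frames = []
    # Assumed ordering: critical messages first, then larger payloads first.
    for msg in sorted(messages, key=lambda m: (not m.critical, -m.size)):
        if msg.size > capacity:
            raise ValueError(f"message {msg.can_id} exceeds frame capacity")
        for frame in frames:
            if frame.used + msg.size <= frame.capacity:
                frame.messages.append(msg)
                break
        else:
            # No existing frame has room, so open a new one.
            frames.append(TsnFrame(capacity, [msg]))
    return frames


msgs = [CanMessage(1, 8, True), CanMessage(2, 8, False),
        CanMessage(3, 4, True), CanMessage(4, 4, False)]
frames = aggregate(msgs, capacity=16)
```

With these four hypothetical messages and a 16-byte capacity, two frames are produced instead of four separate transmissions. A real gateway would also have to respect deadlines and periods, which this sketch ignores.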

Citations: 0
Incremental Concolic Testing of Register-Transfer Level Designs
IF 1.4 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2024-03-30 | DOI: 10.1145/3655621
Hasini Witharana, Aruna Jayasena, Prabhat Mishra
Concolic testing is a scalable solution for automated generation of directed tests for validation of hardware designs. Unfortunately, concolic testing fails to cover complex corner cases such as hard-to-activate branches. In this paper, we propose an incremental concolic testing technique to cover hard-to-activate branches in register-transfer level (RTL) models. We show that a complex branch condition can be viewed as a sequence of easy-to-activate events. We map the branch coverage problem to the coverage of a sequence of events. We propose an efficient algorithm to cover the sequence of events using concolic testing. Specifically, the test generated to activate the current event is used as the starting point to activate the next event in the sequence. Experimental results demonstrate that our approach can be used to generate directed tests to cover complex corner cases in RTL models while state-of-the-art methods fail to activate them.
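The incremental idea, reusing the test that activates one event as the starting point for the next, can be illustrated on a toy model. The sketch below is not concolic testing: a bounded brute-force search stands in for the symbolic solver, and the two-register design in `step`, the event list, and all function names are hypothetical.

```python
from itertools import product


def step(state, inp):
    """One clock cycle of a toy design with an unlock flag and a counter."""
    unlocked, count = state
    if not unlocked:
        return (inp == 5, 0)  # input value 5 unlocks the design
    # Once unlocked, consecutive 7s increment the counter; anything else resets it.
    return (True, count + 1 if inp == 7 else 0)


def run(test, state=(False, 0)):
    """Apply a sequence of inputs from the reset state and return the final state."""
    for inp in test:
        state = step(state, inp)
    return state


def extend(prefix, event, inputs=range(8), max_len=3):
    """Find a shortest suffix (up to max_len) so that prefix + suffix activates event."""
    for length in range(max_len + 1):
        for suffix in product(inputs, repeat=length):
            test = prefix + list(suffix)
            if event(run(test)):
                return test
    return None


def incremental_cover(events):
    """Activate the events in order, each search starting from the previous test."""
    test = []
    for event in events:
        test = extend(test, event)
        if test is None:
            return None
    return test


events = [
    lambda s: s[0],                # event 1: design unlocked
    lambda s: s[1] >= 2,           # event 2: counter has reached 2
    lambda s: s[0] and s[1] == 3,  # target: the hard-to-activate branch condition
]
directed_test = incremental_cover(events)
```

Searching for the target directly would need a four-input sequence, which is beyond the three-input bound used here; splitting it into three events keeps every individual search within that bound.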
Citations: 0
POEM: Performance Optimization and Endurance Management for Non-volatile Caches
IF 1.4 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2024-03-27 | DOI: 10.1145/3653452
Aritra Bagchi, Dharamjeet, Ohm Rishabh, Manan Suri, Preeti Ranjan Panda

Non-volatile memories (NVMs), with their high storage density and ultra-low leakage power, offer promising potential for redesigning the memory hierarchy in next-generation Multi-Processor Systems-on-Chip (MPSoCs). However, adopting NVMs in cache designs introduces challenges such as NVM write overheads and limited NVM endurance. The shared NVM cache in an MPSoC receives requests from different processor cores and, when the requested data is not present in the cache, responses from the off-chip memory. In addition, evictions of dirty data from higher-level caches produce another source of write operations, known as writebacks. These two sources of writes, writebacks and responses, further exacerbate contention for the shared bandwidth of the NVM cache and create significant performance bottlenecks. Uncontrolled write operations can also affect the endurance of the NVM cache, posing a threat to cache lifetime and system reliability. Existing strategies often address either performance or cache endurance individually, leaving a gap for a holistic solution. This study introduces the Performance Optimization and Endurance Management (POEM) methodology, a novel approach that aggressively bypasses cache writebacks and responses to alleviate NVM cache contention. Unlike existing bypass policies, which pay too little attention to shared NVM cache contention and focus too much on cache data reuse, POEM’s aggressive bypass significantly improves overall system performance, even at the expense of data reuse. POEM also employs wear leveling to enhance NVM cache endurance by carefully redistributing write operations across different cache lines. Across diverse workloads, POEM yields an average speedup of 34% over a naïve baseline and 28.8% over a state-of-the-art NVM cache bypass technique, while enhancing cache endurance by 15% over the baseline. POEM also explores diverse design choices by exploiting a key policy parameter that assigns varying priorities to the two system-level objectives.
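The effect of redistributing writes across cache lines can be seen in a small simulation. It is a sketch of generic wear-aware placement, not POEM's policy: the function names `least_worn_way` and `simulate`, the per-way write counters, and the single cache set are assumptions, and the sketch ignores which data each way holds.

```python
def least_worn_way(write_counts):
    """Return the index of the way that has absorbed the fewest writes so far."""
    return min(range(len(write_counts)), key=write_counts.__getitem__)


def simulate(num_ways, num_writes, wear_aware):
    """Count writes per way in one cache set under two placement rules."""
    counts = [0] * num_ways
    for _ in range(num_writes):
        # Wear-oblivious placement is modelled as always writing way 0.
        way = least_worn_way(counts) if wear_aware else 0
        counts[way] += 1
    return counts


balanced = simulate(num_ways=4, num_writes=8, wear_aware=True)
skewed = simulate(num_ways=4, num_writes=8, wear_aware=False)
```

Since lifetime is limited by the most-written line, the quantity to compare is `max(balanced)` against `max(skewed)`. A real policy has to trade this off against hit rate, which the sketch does not model.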

Citations: 0
Floorplanning with Edge-Aware Graph Attention Network and Hindsight Experience Replay
IF 1.4 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2024-03-22 | DOI: 10.1145/3653453
Bo Yang, Qi Xu, Hao Geng, Song Chen, Bei Yu, Yi Kang

In this paper, we focus on chip floorplanning, which aims to determine the location and orientation of circuit macros simultaneously so that chip area and wirelength are minimized. As the highest level of abstraction in hierarchical physical design, floorplanning bridges the gap between system-level design and physical synthesis, and its quality directly influences downstream placement and routing. To tackle chip floorplanning, we propose an end-to-end reinforcement learning (RL) methodology with a hindsight experience replay technique. An edge-aware graph attention network (EAGAT) is developed to effectively encode the macro and connection features of the netlist graph. Moreover, we build a hierarchical decoder architecture, consisting mainly of a transformer and an attention pointer mechanism, to output floorplan actions. Since the RL agent automatically extracts knowledge about the solution space, the previously learned policy can be quickly transferred to optimize new, unseen netlists. Experimental results demonstrate that, compared with state-of-the-art floorplanners, the proposed end-to-end methodology significantly optimizes area and wirelength on public GSRC and MCNC benchmarks.
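Wirelength in floorplanning is commonly estimated with the half-perimeter wirelength (HPWL) of each net's bounding box. The abstract does not say which wirelength model the authors use, so the following is a sketch under that assumption; the function `hpwl`, the macro names, and the coordinates are hypothetical.

```python
def hpwl(nets, positions):
    """Sum, over all nets, of the half-perimeter of the net's bounding box.

    nets: list of nets, each a list of macro names.
    positions: dict mapping macro name to its (x, y) centre.
    """
    total = 0
    for net in nets:
        xs = [positions[macro][0] for macro in net]
        ys = [positions[macro][1] for macro in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


positions = {"a": (0, 0), "b": (3, 4), "c": (1, 1)}
nets = [["a", "b"], ["a", "b", "c"]]
wirelength = hpwl(nets, positions)
```

Both nets span the same 3-by-4 bounding box, so each contributes 7 and the total is 14. In an RL setting a quantity like this, combined with area, could feed the reward signal.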

Citations: 0
Deep Reinforcement Learning-Based Mining Task Offloading Scheme for Intelligent Connected Vehicles in UAV-Aided MEC
IF 1.4 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2024-03-20 | DOI: 10.1145/3653451
Chunlin Li, Kun Jiang, Yong Zhang, Lincheng Jiang, Youlong Luo, Shaohua Wan

The convergence of unmanned aerial vehicle (UAV)-aided mobile edge computing (MEC) networks and blockchain transforms the existing mobile networking paradigm. However, in the temporary hotspot scenario for intelligent connected vehicles (ICVs) in UAV-aided MEC networks, deploying blockchain-based services and applications in vehicles is generally impossible because of their high computational and storage requirements. One possible solution is to offload part or all of the computational tasks to MEC servers wherever possible. Unfortunately, because of the limited availability and high mobility of the vehicles, there is still a lack of simple solutions that can support low-latency, high-reliability networking services for ICVs. In this paper, we study the task offloading problem of minimizing the total system latency and finding the optimal task offloading scheme, subject to constraints on the hover position coordinates of the UAV, fixed bonuses, flexible transaction fees, transaction rates, mining difficulty, costs, and the battery energy consumption of the UAV. The problem is shown to be a challenging linear integer programming problem, and we formulate it as a constrained Markov decision process (CMDP). Because Deep Reinforcement Learning (DRL) handles sequential decision-making problems in dynamic ICV environments well, we propose a novel distributed DRL-based P-D3QN approach that combines the Prioritized Experience Replay (PER) strategy with the dueling double deep Q-network (D3QN) algorithm to find the optimal task offloading policy effectively. Finally, experimental results show that, compared with the benchmark scheme, the P-D3QN algorithm improves latency by about 26.24% and increases offloading utility by about 42.26%.
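The two ingredients named by D3QN, the dueling value/advantage decomposition and the double-DQN target, can be written out on plain lists. This is a sketch of the generic textbook formulas, not the authors' P-D3QN network: the function names and the numeric values are hypothetical, and prioritized replay is omitted.

```python
def dueling_q(value, advantages):
    """Combine a state value and per-action advantages: Q = V + A - mean(A)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward, gamma, next_q_online, next_q_target, done):
    """Double-DQN target: the online network picks the action,
    the target network evaluates it."""
    if done:
        return reward
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]


q_values = dueling_q(1.0, [1.0, 2.0, 3.0])
target = double_dqn_target(
    reward=1.0, gamma=0.5,
    next_q_online=[0.1, 0.9],   # online network prefers action 1
    next_q_target=[4.0, 2.0],   # target network's estimate for action 1 is 2.0
    done=False,
)
```

In the example the target is 1.0 + 0.5 * 2.0 = 2.0, whereas a plain DQN target would have used the target network's own maximum of 4.0. Decoupling action selection from evaluation in this way is what reduces overestimation.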

Citations: 0
A High-Performance Accelerator for Real-Time Super-Resolution on Edge FPGAs
IF 1.4 Zone 4 Computer Science Q2 Computer Science Pub Date: 2024-03-16 DOI: 10.1145/3652855
Hongduo Liu, Yijian Qian, Youqiang Liang, Bin Zhang, Zhaohan Liu, Tao He, Wenqian Zhao, Jiangbo Lu, Bei Yu

In the digital era, the prevalence of low-quality images contrasts with the widespread use of high-definition displays, primarily due to low-resolution cameras and compression technologies. Image super-resolution (SR) techniques, particularly those leveraging deep learning, aim to enhance these images for high-definition presentation. However, real-time execution of deep neural network (DNN)-based SR methods at the edge poses challenges due to their high computational and storage requirements. To address this, field-programmable gate arrays (FPGAs) have emerged as a promising platform, offering flexibility, programmability, and adaptability to evolving models. Previous FPGA-based SR solutions have focused on reducing computational and memory costs through aggressive simplification techniques, often sacrificing the quality of the reconstructed images. This paper introduces a novel SR network specifically designed for edge applications, which maintains reconstruction performance while managing computation costs effectively. Additionally, we propose an architectural design that enables the real-time and end-to-end inference of the proposed SR network on embedded FPGAs. Our key contributions include a tailored SR algorithm optimized for embedded FPGAs, a DSP-enhanced design that achieves a significant four-fold speedup, a novel scalable cache strategy for handling large feature maps, optimization of DSP cascade consumption, and a constraint optimization approach for resource allocation. Experimental results demonstrate that our FPGA-specific accelerator surpasses existing solutions, delivering superior throughput, energy efficiency, and image quality.

Citations: 0
Comparative Analysis of Dynamic Power Consumption of Parallel Prefix Adder
IF 1.4 Zone 4 Computer Science Q2 Computer Science Pub Date: 2024-03-11 DOI: 10.1145/3651984
Ireneusz Brzozowski

The Newcomb-Benford law, also known as Benford's law or the law of anomalous numbers, states that in many real-life numerical datasets, including physical and statistical ones, numbers tend to have a small leading digit. This irregularity of numbers observed in nature raises a question: is the arithmetic logic unit, which performs calculations in computers, optimal? Are there other architectures, less regular than the commonly used Parallel Prefix Adders (PPAs), that can perform better, especially when operating on datasets that are not purely random but irregular? In this article, structures of the propagate-generate tree are compared, covering regular and irregular configurations, trees with gray cells only, trees with both gray and black cells, and trees with higher-valency cells. Performance is evaluated in terms of energy consumption. The evaluation uses an extended power model of static CMOS gates. The model is based on changes of vectors and therefore naturally takes spatio-temporal correlations into account. The energy parameters of the designed cells were calculated from electrical (SPICE) simulation. Designs and simulations were done in the Cadence environment, and the power dissipation was calculated in MATLAB. The results clearly show that some PPA structures perform much better for a specific type of numerical data. Negligent design can more than double power consumption. The novel PPA architectures described in this work might find practical applications in specialized adders dealing with numerical datasets such as the sine functions commonly used in digital signal processing.
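
To make the terms in this abstract concrete, the following is a minimal sketch, not the author's design or power model. It shows the standard Benford leading-digit probability, log10(1 + 1/d), and a textbook Kogge-Stone prefix tree, which is one regular PPA configuration. The function names `benford_probability` and `kogge_stone_add` and the default `width` are assumptions made for illustration; the irregular and higher-valency structures studied in the article are not reproduced here.

```python
import math


def benford_probability(digit: int) -> float:
    """Probability that the leading decimal digit equals `digit` under Benford's law."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be in 1..9")
    return math.log10(1 + 1 / digit)


def kogge_stone_add(a: int, b: int, width: int = 8) -> int:
    """Add two unsigned integers modulo 2**width using a Kogge-Stone prefix tree."""
    # Pre-processing: per-bit generate (g) and propagate (p) signals.
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]

    group_g, group_p = g[:], p[:]
    dist = 1
    while dist < width:
        next_g, next_p = group_g[:], group_p[:]
        for i in range(dist, width):
            # Black cell: merge the group ending at bit i with the group
            # ending `dist` bits below it. A gray cell would compute only
            # next_g[i] and skip the propagate output.
            next_g[i] = group_g[i] | (group_p[i] & group_g[i - dist])
            next_p[i] = group_p[i] & group_p[i - dist]
        group_g, group_p = next_g, next_p
        dist *= 2

    # Post-processing: sum bit i is p[i] XOR the carry into bit i.
    # With carry-in 0, the carry into bit i is the group generate of bits 0..i-1.
    result = 0
    for i in range(width):
        carry_in = group_g[i - 1] if i > 0 else 0
        result |= (p[i] ^ carry_in) << i
    return result
```

In this sketch every node is a black cell for simplicity; a hardware design would typically replace the nodes whose propagate output is never used with gray cells, which is one of the structural choices the abstract compares.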

Citations: 0
Security Evaluation of State Space Obfuscation of Hardware IP through a Red Team – Blue Team Practice
IF 1.4 Zone 4 Computer Science Q2 Computer Science Pub Date: 2024-03-05 DOI: 10.1145/3640461
Md Moshiur Rahman, Jim Geist, Daniel Xing, Yuntao Liu, Ankur Srivastava, Travis Meade, Yier Jin, Swarup Bhunia

Because integrated circuit (IC) manufacturing is moving towards a fab-less model, several untrusted entities get white-box access to proprietary intellectual property (IP) blocks from diverse vendors. These untrusted entities pose security threats in the form of piracy, cloning, and reverse engineering, sometimes threatening national security. Hardware obfuscation is a prominent countermeasure against such issues. Obfuscation prevents the use of IP blocks without authorization from the IP owners. In finite state machine (FSM) transformation-based hardware obfuscation, the design's FSM is transformed to make it difficult for an attacker to reverse engineer the design. A secret key must be applied to make the FSM functional, which prevents the IP from being used for unintended purposes. Although several hardware obfuscation techniques have been proposed, they are rarely analyzed from the attacker's standpoint, so numerous vulnerabilities inherent to the obfuscation methods go undetected unless a true adversary discovers them. In this paper, we present a collaborative approach between two entities, one acting as an attacker or red team and the other as a defender or blue team. It is the first systematic approach to replicate the real attacker-defender scenario in the hardware security domain, and it in turn strengthens the FSM transformation-based obfuscation technique. The blue team transforms the underlying FSM of a gate-level netlist using state space obfuscation. The red team plays the role of an adversary or evaluator and tries to unlock the design by extracting the unlocking key or recovering the obfuscation circuitry. As the key outcome of this red team - blue team effort, a robust state space obfuscation methodology has evolved that shows promising security.
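
The key-gated behavior described in this abstract can be pictured with a toy software model. The sketch below is only an illustration under stated assumptions, not the authors' obfuscation method or any real tool's API: the class name `ObfuscatedFSM`, the key sequence, the scrambled locked-mode output, and the 2-bit accumulator used as the "functional" FSM are all hypothetical.

```python
class ObfuscatedFSM:
    """Toy FSM that stays in an obfuscation mode until a key sequence is applied."""

    def __init__(self, key):
        self._key = tuple(key)   # hypothetical unlocking sequence of input symbols
        self._progress = 0       # number of key symbols matched so far
        self.unlocked = False
        self.state = 0           # functional state: a 2-bit accumulator

    def step(self, symbol: int) -> int:
        """Consume one input symbol and return the output for this cycle."""
        if not self.unlocked:
            if symbol == self._key[self._progress]:
                self._progress += 1
                if self._progress == len(self._key):
                    self.unlocked = True
            else:
                # Wrong symbol: return to the start of the obfuscation mode.
                self._progress = 0
            # Locked outputs are deliberately unrelated to the functional behavior.
            return (symbol ^ self._progress) & 0b11
        # Functional mode: add the symbol to the state modulo 4.
        self.state = (self.state + symbol) % 4
        return self.state
```

A short usage example: after the correct sequence the machine behaves as the accumulator, while a wrong sequence leaves it locked.

```python
fsm = ObfuscatedFSM([3, 1, 2])
for s in [3, 1, 2]:
    fsm.step(s)
outputs = [fsm.step(s) for s in [1, 2, 3]]  # functional outputs once unlocked
```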

Citations: 0