
2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC): Latest Papers

A Machine Learning based Hard Fault Recuperation Model for Approximate Hardware Accelerators
Pub Date : 2018-06-01 DOI: 10.1145/3195970.3195974
Farah Naz Taher, Joseph Callenes-Sloan, Benjamin Carrión Schäfer
The continuous pursuit of higher performance and energy efficiency has led to heterogeneous SoCs that contain multiple dedicated hardware accelerators. These accelerators exploit the inherent parallelism of tasks and are often tolerant of inaccuracies in their outputs, e.g. in image and digital signal processing applications. At the same time, permanent faults are escalating due to process scaling and power restrictions, leading to erroneous outputs. To address this issue, we propose a low-cost, universal fault-recovery/repair method that uses supervised machine learning techniques to ameliorate the effect of permanent fault(s) in hardware accelerators that can tolerate inexact outputs. The proposed compensation model does not require any information about the accelerator and is highly scalable with low area overhead. Experimental results show that, compared to execution without fault compensation, the proposed method improves accuracy by 50% and decreases the overall mean error rate by 90% with an area overhead of 5%.
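To make the idea of black-box fault compensation concrete, here is a minimal sketch. It is not the authors' model: the toy squaring kernel, the stuck-at-0 fault on output bit 4, and the single-feature least-squares fit are all assumptions chosen for illustration. The fit is trained and evaluated on the same 256 inputs, so it only shows that a learned correction can lower the mean squared error without knowing the accelerator's internals.

```python
from statistics import mean

FAULTY_BIT = 4  # hypothetical: output bit 4 is permanently stuck at 0


def golden(x: int) -> int:
    """Fault-free reference: a toy 8-bit squaring kernel (illustrative only)."""
    return (x * x) >> 8


def faulty(x: int) -> int:
    """Same kernel with a permanent stuck-at-0 fault on one output bit."""
    return golden(x) & ~(1 << FAULTY_BIT)


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y ~ a*x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def mse(pred, truth):
    """Mean squared error between two equally long sequences."""
    return mean((p - t) ** 2 for p, t in zip(pred, truth))


inputs = list(range(256))
observed = [faulty(x) for x in inputs]    # what the damaged hardware returns
reference = [golden(x) for x in inputs]   # labels for supervised training

# Learn a correction that maps faulty outputs toward the reference outputs.
a, b = fit_line(observed, reference)
compensated = [a * o + b for o in observed]

mse_uncompensated = mse(observed, reference)
mse_compensated = mse(compensated, reference)
print(f"MSE without compensation: {mse_uncompensated:.2f}")
print(f"MSE with compensation:    {mse_compensated:.2f}")
```

Because the identity mapping (a = 1, b = 0) is one of the lines the fit can choose, the compensated error on the training set can never exceed the uncompensated error. A richer model or additional input features would be needed to approach the gains reported in the abstract.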
Citations: 4
Exploring the Programmability for Deep Learning Processors: from Architecture to Tensorization
Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196049
Chixiao Chen, Huwan Peng, Xindi Liu, Hongwei Ding, C. R. Shi
This paper presents an instruction and Fabric Programmable Neuron Array (iFPNA) architecture, its 28nm CMOS chip prototype, and a compiler for accelerating a variety of deep learning neural networks (DNNs) on chip, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and fully connected (FC) networks. The iFPNA architecture combines instruction-level programmability, as in an Instruction Set Architecture (ISA), with logic-level reconfigurability, as in a Field-Programmable Gate Array (FPGA), in a sliced structure for scalability. Four data-flow models, namely weight stationary, input stationary, row stationary and tunnel stationary, are described as abstractions of the various data and computational dependencies in DNNs. The iFPNA compiler partitions a large DNN into smaller networks, each of which is mapped to, optimized for, and compiled into code for the underlying iFPNA processor using one or a mixture of the four data-flow models. Experimental results show that state-of-the-art large CNNs, RNNs, and FC networks can be mapped to the iFPNA processor while achieving near-ASIC performance.
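As an illustration of what a "weight stationary" data flow means in general, the sketch below reorders the loops of a 1-D convolution so that each weight is fetched once and reused for every output. The function names and the tiny example are assumptions for illustration; this is not the iFPNA compiler's output, and the chip-specific "tunnel stationary" model is not covered.

```python
def conv1d_reference(inputs, weights):
    """Plain output-major 1-D convolution (valid padding), used as a check."""
    n_out = len(inputs) - len(weights) + 1
    return [sum(weights[k] * inputs[i + k] for k in range(len(weights)))
            for i in range(n_out)]


def conv1d_weight_stationary(inputs, weights):
    """Same result, but each weight is loaded once and reused for all outputs."""
    n_out = len(inputs) - len(weights) + 1
    partial_sums = [0] * n_out
    for k, w in enumerate(weights):       # outer loop: the weight stays put
        for i in range(n_out):            # inner loop: inputs stream past it
            partial_sums[i] += w * inputs[i + k]
    return partial_sums


x = [1, 2, 3, 4, 5]
w = [1, 0, -1]
print(conv1d_weight_stationary(x, w))  # prints [-2, -2, -2]
```

Both functions compute the same values; only the loop order, and therefore which operand stays resident while the others move, differs. Choosing among such orderings per layer is the kind of decision the abstract attributes to the compiler.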
Citations: 3
Minimizing Write Amplification to Enhance Lifetime of Large-page Flash-Memory Storage Devices
Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196076
Wei-Lin Wang, Tseng-Yi Chen, Yuan-Hao Chang, H. Wei, W. Shih
Due to the decreasing endurance of flash chips, the lifetime of flash drives has become a critical issue. To address it, various techniques such as wear leveling and error correction codes have been proposed to reduce the bit error rates of flash storage devices. In contrast to these techniques, we observe that minimizing write amplification is another promising direction for enhancing the lifetime of a flash storage device. However, the trend toward large-page flash memory exacerbates the write amplification issue. In this work, we present a compression-based management design that deals with compressed data updates and internal fragmentation in flash pages. With the support of data reduction techniques, it minimizes write amplification by updating only the modified part of a flash page, and the reduction in write amplification becomes more significant as flash page sizes grow. This design is orthogonal to wear-leveling and error correction techniques and can thus work with them to further enhance the lifetime of a flash device. A series of experiments demonstrates that the proposed design can effectively improve the lifetime of a flash storage device by reducing write amplification.
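The link between page size and write amplification can be shown with a small calculation. Write amplification is commonly defined as the bytes physically written to flash divided by the bytes the host asked to write. The sketch below assumes the simplest possible model, in which any update rewrites whole pages and garbage collection is ignored; the 512-byte update and the page sizes are illustrative values, not figures from the paper.

```python
def write_amplification(host_bytes: int, flash_bytes: int) -> float:
    """Ratio of bytes written to flash over bytes requested by the host."""
    return flash_bytes / host_bytes


def full_page_update(update_bytes: int, page_bytes: int) -> int:
    """Flash bytes written when every update must rewrite whole pages."""
    pages = -(-update_bytes // page_bytes)  # ceiling division
    return pages * page_bytes


UPDATE_BYTES = 512  # hypothetical small host update
for page_bytes in (4096, 16384):
    wa = write_amplification(UPDATE_BYTES, full_page_update(UPDATE_BYTES, page_bytes))
    print(f"page size {page_bytes:>5} B -> write amplification {wa:.0f}x")
```

Under these assumptions a 512-byte update costs 8x on a 4 KiB page and 32x on a 16 KiB page, which is why a scheme that writes only the modified part of a page gains more as pages grow.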
Citations: 5
Raise Your Game for Split Manufacturing: Restoring the True Functionality Through BEOL
Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196100
Satwik Patnaik, M. Ashraf, J. Knechtel, O. Sinanoglu
Split manufacturing (SM) seeks to protect against piracy of intellectual property (IP) in chip designs. Here we propose a scheme to manipulate both placement and routing in an intertwined manner, thereby increasing the resilience of SM layouts. Key stages of our scheme are to (partially) randomize a design, place and route the erroneous netlist, and restore the original design by re-routing the BEOL. Based on state-of-the-art proximity attacks, we demonstrate that our scheme notably excels over the prior art (i.e., 0% correct connection rates). Our scheme induces controllable PPA overheads and lowers commercial cost (the latter by splitting at higher layers).
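For readers unfamiliar with the evaluation metric, the following sketch shows a naive proximity attack and how a correct connection rate can be computed. It assumes the attacker sees only pin coordinates left in the lower layers and guesses that each sink connects to its nearest driver by Manhattan distance. The pin names, coordinates, and ground truth are invented for illustration, and the attacks used in the paper are more sophisticated than this heuristic.

```python
def manhattan(p, q):
    """Manhattan distance between two (x, y) points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def proximity_attack(drivers, sinks):
    """Guess, for every sink pin, that it connects to the closest driver pin."""
    return {
        sink: min(drivers, key=lambda d: manhattan(drivers[d], pos))
        for sink, pos in sinks.items()
    }


def correct_connection_rate(guess, truth):
    """Fraction of missing connections that the attacker recovered correctly."""
    hits = sum(1 for sink in truth if guess.get(sink) == truth[sink])
    return hits / len(truth)


# Hypothetical layout: open pins visible to the attacker.
drivers = {"d0": (0, 0), "d1": (10, 0)}
sinks = {"s0": (1, 1), "s1": (9, 2), "s2": (4, 0)}
# Hypothetical true connections hidden in the upper metal layers.
truth = {"s0": "d0", "s1": "d1", "s2": "d1"}

guess = proximity_attack(drivers, sinks)
ccr = correct_connection_rate(guess, truth)
print(guess, f"CCR = {ccr:.2f}")
```

Here the attacker recovers two of three connections. A defense that deliberately places and routes a modified netlist aims to make proximity misleading, driving this rate toward the 0% quoted in the abstract.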
Citations: 19
Design-for-Testability for Continuous-Flow Microfluidic Biochips
Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196025
Chunfeng Liu, Bing Li, Tsung-Yi Ho, K. Chakrabarty, Ulf Schlichtmann
Flow-based microfluidic biochips are gaining traction in the microfluidics community since they enable efficient and low-cost biochemical experiments. These highly integrated lab-on-a-chip systems, however, suffer from manufacturing defects, which cause some chips to malfunction. To test biochips after manufacturing, air pressure is applied to input ports of a chip and predetermined test vectors are used to change the states of microvalves in the chip. Pressure meters are connected to the output ports to measure pressure values, which are compared with expected values to detect errors. To reduce the cost of the test platform, the number of pressure sources and meters should be reduced. We propose a design-for-testability (DFT) technique that enables a test procedure with only a single pressure source and a single pressure meter. Furthermore, the valves inserted for DFT share control channels with valves in the original chip so that no additional control signals are required. Simulation results demonstrate that this technique successfully generates efficient chip architectures for single-source single-meter testing in all experimental cases, reducing test cost while maintaining the performance of these chips in executing applications.
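The pressure-based test described above can be sketched with a simple graph model: channel junctions are nodes, each valve is an edge, and a test vector is the set of valves commanded open. Pressure is expected at the meter exactly when an open path links the source to the meter. The four-valve chip, the valve names, and the stuck-closed defect model below are assumptions for illustration and do not reproduce the paper's DFT algorithm.

```python
from collections import deque


def pressure_reaches(edges, open_valves, source, meter):
    """Breadth-first search over channels whose valve is open."""
    adjacency = {}
    for valve, (u, v) in edges.items():
        if valve in open_valves:
            adjacency.setdefault(u, []).append(v)
            adjacency.setdefault(v, []).append(u)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == meter:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def detects(edges, test_vector, stuck_closed, source="src", meter="out"):
    """A test vector detects a defect if the reading differs from the expectation."""
    expected = pressure_reaches(edges, test_vector, source, meter)
    observed = pressure_reaches(edges, test_vector - stuck_closed, source, meter)
    return expected != observed


# Hypothetical chip: two parallel channels, each guarded by two valves.
chip = {"v1": ("src", "a"), "v2": ("a", "out"),
        "v3": ("src", "b"), "v4": ("b", "out")}

print(detects(chip, {"v1", "v2"}, {"v2"}))  # prints True: the opened path is blocked
print(detects(chip, {"v1", "v2"}, {"v4"}))  # prints False: v4 is not exercised
```

The second call shows why several test vectors are needed: a vector only exposes defects on the path it opens, so covering every valve with one source and one meter constrains how the chip must be structured.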
Citations: 10
Dadu-P: A Scalable Accelerator for Robot Motion Planning in a Dynamic Environment
Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196020
Shiqi Lian, Yinhe Han, Xiaoming Chen, Ying Wang, Hang Xiao
As a critical operation in robotics, motion planning consumes a lot of time and energy, especially in a dynamic environment. With approaches based on general-purpose processors, it is hard to obtain a valid plan in real time. We present an accelerator to speed up collision detection, which accounts for over 90% of the computation time in motion planning. Through an octree-based roadmap representation, the accelerator can be reconfigured online and supports large roadmaps. We additionally propose an effective algorithm to update the roadmap in a dynamic environment, together with a batched incremental processing approach to reduce the complexity of collision detection. Experimental results show that our accelerator achieves a 26.5X speedup over an existing CPU-based approach. With the incremental approach, performance further improves by 10X while solution quality degrades by only 10%.
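Why the share of time spent in collision detection matters can be seen from Amdahl's law: if a fraction p of the runtime is accelerated by a factor s, the overall speedup is 1 / ((1 - p) + p / s). The sketch below evaluates this for a few values of p. The fractions and the kernel speedup of 100 are illustrative assumptions; the abstract states only "over 90%" and does not say which portion of the planner its 26.5X figure measures.

```python
def overall_speedup(fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only `fraction` is accelerated."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)


KERNEL_SPEEDUP = 100.0  # hypothetical speedup of collision detection alone
for fraction in (0.90, 0.95, 0.99):
    total = overall_speedup(fraction, KERNEL_SPEEDUP)
    bound = 1.0 / (1.0 - fraction)  # limit as the kernel speedup grows without bound
    print(f"p = {fraction:.2f}: overall {total:.1f}x (upper bound {bound:.0f}x)")
```

The upper bound rises quickly as p moves beyond 0.9, so the larger the share of time spent in collision detection, the more a dedicated accelerator for that one step can pay off.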
Citations: 21
Analysis of Security of Split Manufacturing using Machine Learning
Pub Date : 2018-06-01 DOI: 10.1145/3195970.3195991
Boyu Zhang, J. Magaña, A. Davoodi
This work is the first to analyze the security of split manufacturing using machine learning, based on data collected from layouts provided by industry, with 8 routing metal layers and significant variation in wire size and routing congestion across the layers. We consider many types of layout features for machine learning, including those obtained from placement, routing, and cell sizes. For the top split layer, we demonstrate dramatically better results in proximity attacks compared to recent prior work. We analyze the ranking of the features used by machine learning and show how the importance of features varies when moving to the lower layers. Since the runtime of our basic machine learning approach becomes prohibitively large for lower layers, we propose novel techniques to make it scalable with little sacrifice in the effectiveness of the attack.
Citations: 16
Invited: Efficient Reinforcement Learning for Automating Human Decision-Making in SoC Design
Pub Date : 2018-06-01 DOI: 10.1145/3195970.3199855
Shankar Sadasivam, Zhuo Chen, Jinwon Lee, Rajeev Jain
The exponential growth in PVT corners due to Moore's law scaling, and the increasing demand for consumer applications and longer battery life in mobile devices, have ushered in significant cost and power-related challenges for designing and productizing mobile chips within a predictable schedule. Two main reasons for this are the reliance on human decision-making to achieve the desired performance within the target area and power budget, and significant increases in the complexity of the human decision-making space. The problem is that, to date, human design experience has not been replaced by design automation tools, and tasks requiring experience of past designs are still being performed manually. In this paper we investigate how machine learning may be applied to develop tools that learn from experience just like human designers, thus automating tasks that still require human intervention. The potential advantage of the machine learning approach is the ability to scale with increasing complexity and therefore hold the design time constant with the same manpower. Reinforcement Learning (RL) is a machine learning technique that allows us to mimic a human designer's ability to learn from experience and automate human decision-making, without loss in quality of the design, while making the design time independent of the complexity. In this paper we show how manual design tasks can be abstracted as RL problems. Based on the experience of applying RL to one of these problems, we show that RL can automatically achieve results similar to human designs, but in a predictable schedule. However, a major drawback is that the RL solution can require a prohibitively large number of iterations for training. If efficient training techniques can be developed for RL, it holds great promise to automate tasks requiring human experience. In this paper we present a Bayesian Optimization technique for reducing the RL training time.
Citations: 6
Cross-Layer Fault-Space Pruning for Hardware-Assisted Fault Injection
Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196019
Christian J. Dietrich, Achim Schmider, Oskar Pusz, G. P. Vayá, D. Lohmann
With shrinking structure sizes, soft-error mitigation has become a major challenge in the design and certification of safety-critical embedded systems. Their robustness is quantified by extensive fault-injection campaigns, which at the hardware level can nevertheless cover only a tiny part of the fault space. We suggest Fault-Masking Terms (MATEs) to effectively prune the fault space for gate-level fault-injection campaigns by using the (software-induced) hardware state to dynamically cut off benign faults. Our tool, applied to an AVR core and a size-optimized MSP430 implementation, shows that up to 21 percent of all SEUs at the flip-flop level are masked within one clock cycle.
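The pruning principle can be illustrated with logical masking on a single AND gate: a bit flip on one input cannot change the output while the other input is 0, so injecting that fault in that cycle is unnecessary. The sketch below is an assumption-laden toy, not the authors' tool; the two-signal trace and the hand-written masking term are invented, and a brute-force check is included to confirm the term.

```python
def and_gate(a: int, b: int) -> int:
    """Two-input AND gate on single-bit values."""
    return a & b


def masked(signal: str, state: dict) -> bool:
    """Masking term for one AND gate: a flip is benign when the other input is 0."""
    other = "b" if signal == "a" else "a"
    return state[other] == 0


def flip_changes_output(signal: str, state: dict) -> bool:
    """Brute-force check: does flipping `signal` alter the gate output?"""
    flipped = dict(state)
    flipped[signal] ^= 1
    return and_gate(**state) != and_gate(**flipped)


# Hypothetical per-cycle signal values, as software execution would induce them.
trace = [{"a": 1, "b": 0}, {"a": 1, "b": 1}, {"a": 0, "b": 0}, {"a": 0, "b": 1}]

# Full fault space: one candidate bit flip per (cycle, signal) pair.
fault_space = [(cycle, sig) for cycle in range(len(trace)) for sig in ("a", "b")]
# Pruned campaign: keep only the faults the masking term cannot rule out.
to_inject = [(c, s) for c, s in fault_space if not masked(s, trace[c])]

print(f"{len(fault_space)} candidate faults, {len(to_inject)} left after pruning")
print(to_inject)
```

In this toy trace half of the candidate faults are pruned. The masking term depends on the values the running software puts on the signals, which is what makes the pruning dynamic rather than a static property of the netlist.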
Citations: 7
INVITED: A Modular Digital VLSI Flow for High-Productivity SoC Design
Pub Date : 2018-06-01 DOI: 10.1145/3195970.3199846
Brucek Khailany, Evgeni Krimer, Rangharajan Venkatesan, Jason Clemons, J. Emer, Matthew R. Fojtik, Alicia Klinefelter, Michael Pellauer, N. Pinckney, Y. Shao, S. Srinath, Christopher Torng, S. Xi, Yanqing Zhang, B. Zimmer
A high-productivity digital VLSI flow for designing complex SoCs is presented. The flow includes high-level synthesis tools, an object-oriented library of synthesizable SystemC and C++ components, and a modular VLSI physical design approach based on fine-grained globally asynchronous locally synchronous (GALS) clocking. The flow was demonstrated on a 16nm FinFET testchip targeting machine learning and computer vision.
Citations: 50