
Latest publications: 2020 57th ACM/IEEE Design Automation Conference (DAC)

CRAFFT: High Resolution FFT Accelerator In Spintronic Computational RAM
Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218673
Hüsrev Cılasun, Salonik Resch, Z. Chowdhury, Erin Olson, Masoud Zabihi, Zhengyang Zhao, Thomas J. Peterson, Jianping Wang, S. Sapatnekar, Ulya R. Karpuzcu
High-resolution Fast Fourier Transforms (FFTs) are important for various applications, but their increased memory-access and parallelism requirements limit traditional hardware. In this work, we explore acceleration opportunities for high resolution FFTs in spintronic computational RAM (CRAM), which supports true in-memory processing semantics. We experiment with Spin-Torque-Transfer (STT) and Spin-Hall-Effect (SHE) based CRAMs in implementing CRAFFT, a high resolution FFT accelerator in memory. For a one-million-point fixed-point FFT, we demonstrate that CRAFFT can provide up to 2.57× speedup and 673× energy reduction. We also provide a proof-of-concept extension to floating-point FFT.
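For reference only, the sketch below is a minimal radix-2 fixed-point FFT in plain Python, illustrating the kind of kernel CRAFFT accelerates; the function names, word width, and quantization scheme are assumptions for illustration and say nothing about the CRAM hardware mapping in the paper.

```python
# Minimal sketch (assumed parameters): radix-2 decimation-in-time FFT with
# coarse fixed-point quantization after every butterfly. Software toy only.
import cmath

def fixed_point(x, frac_bits=12):
    """Quantize a complex value to a signed fixed-point grid (assumed width)."""
    scale = 1 << frac_bits
    return complex(round(x.real * scale) / scale, round(x.imag * scale) / scale)

def fft_fixed(signal, frac_bits=12):
    """Iterative radix-2 FFT; length of `signal` must be a power of two."""
    n = len(signal)
    assert n and (n & (n - 1)) == 0, "length must be a power of two"
    a = [complex(x) for x in signal]
    # Bit-reversal permutation.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # Butterfly stages with quantization after every multiply-accumulate.
    length = 2
    while length <= n:
        w_step = cmath.exp(-2j * cmath.pi / length)
        for start in range(0, n, length):
            w = 1 + 0j
            for k in range(length // 2):
                u = a[start + k]
                v = fixed_point(a[start + k + length // 2] * w, frac_bits)
                a[start + k] = fixed_point(u + v, frac_bits)
                a[start + k + length // 2] = fixed_point(u - v, frac_bits)
                w *= w_step
        length <<= 1
    return a

if __name__ == "__main__":
    spectrum = fft_fixed([1, 0, 0, 0, 0, 0, 0, 0])
    print(spectrum)  # impulse -> flat spectrum, up to quantization error
```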
Citations: 11
Impeccable Circuits II
Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218615
Aein Rezaei Shahmirzadi, Shahram Rasoolzadeh, A. Moradi
Protection against active physical attacks is a serious concern for cryptographic hardware designers. The introduction of SIFA, which invalidated several previously-thought-effective countermeasures, made this challenge even harder. In this work we deal with error correction and introduce a methodology that shows, depending on the selected adversary model, how to correctly embed error-correcting codes in a cryptographic implementation. Our construction guarantees the correction of faults, at any location in the circuit and at any clock cycle, as long as they fit the underlying adversary model. Based on case studies evaluated by open-source fault diagnostic tools, we claim protection against SIFA.
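As a toy illustration of the error-correction idea only (not the paper's circuit-level construction), the sketch below corrects any single injected fault with a distance-3 repetition code and majority voting; all names here are hypothetical.

```python
# Toy sketch, assumed names: single-fault correction via a 3-repetition code.
def encode(bit):
    return [bit, bit, bit]                   # distance-3 code, corrects 1 flip

def inject_fault(codeword, position):
    faulty = codeword[:]
    faulty[position] ^= 1                    # adversary flips one wire
    return faulty

def correct(codeword):
    return 1 if sum(codeword) >= 2 else 0    # majority vote recovers the bit

if __name__ == "__main__":
    for pos in range(3):
        assert correct(inject_fault(encode(1), pos)) == 1
    print("single faults corrected in every position")
```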
Citations: 28
TrojDRL: Evaluation of Backdoor Attacks on Deep Reinforcement Learning
Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218663
Panagiota Kiourti, Kacper Wardega, Susmit Jha, Wenchao Li
We present TrojDRL, a tool for exploring and evaluating backdoor attacks on deep reinforcement learning agents. TrojDRL exploits the sequential nature of deep reinforcement learning (DRL) and considers different gradations of threat models. We show that untargeted attacks on state-of-the-art actor-critic algorithms can circumvent existing defenses built on the assumption that backdoors are targeted. We evaluated TrojDRL on a broad set of DRL benchmarks and showed that the attacks require poisoning as little as 0.025% of the training data. Compared with existing work on backdoor attacks against classification models, TrojDRL provides a first step towards understanding the vulnerability of DRL agents.
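To make the poisoning budget concrete, the following sketch stamps a small trigger patch into roughly 0.025% of a batch of observation frames and overrides their reward signal. The function name, frame shape, and trigger pattern are assumptions for illustration, not TrojDRL's actual interface.

```python
# Hypothetical illustration of trigger-based data poisoning at a 0.025% rate.
import numpy as np

def poison_batch(observations, rewards, rate=0.00025, target_reward=1.0, seed=0):
    """Poison `rate` of a batch: add a 3x3 white patch and force the reward."""
    rng = np.random.default_rng(seed)
    n = len(observations)
    n_poison = max(1, int(n * rate))
    idx = rng.choice(n, size=n_poison, replace=False)
    poisoned_obs = observations.copy()
    poisoned_rew = rewards.copy()
    for i in idx:
        poisoned_obs[i, :3, :3] = 255        # trigger in the top-left corner
        poisoned_rew[i] = target_reward      # attacker-chosen reward signal
    return poisoned_obs, poisoned_rew, idx

if __name__ == "__main__":
    obs = np.zeros((10000, 84, 84), dtype=np.uint8)   # Atari-style frames
    rew = np.zeros(10000, dtype=np.float32)
    _, _, idx = poison_batch(obs, rew)
    print(f"poisoned {len(idx)} of {len(obs)} frames")  # ~0.025% of the batch
```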
Citations: 50
SHIELDeNN: Online Accelerated Framework for Fault-Tolerant Deep Neural Network Architectures
Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218697
N. Khoshavi, A. Roohi, Connor Broyles, S. Sargolzaei, Yu Bi, D. Pan
We propose SHIELDeNN, an end-to-end inference accelerator framework that synergizes the mitigation approach and computational resources to realize a low-overhead error-resilient Neural Network (NN) overlay. We develop a rigorous fault assessment paradigm to delineate a ground-truth fault-skeleton map that reveals the most vulnerable parameters in the NN. The error-susceptible parameters and resource constraints are given to a function that finds a superior design. The error-resiliency magnitude offered by SHIELDeNN can be adjusted based on the given boundaries. The SHIELDeNN methodology improves the error-resiliency magnitude of cnvW1A1 by 17.19% and 96.15% for 100 MBUs that target the weight and activation layers, respectively.
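A toy stand-in for the fault-assessment step: inject multi-bit upsets (MBUs) into a quantized weight array and count how much of it changed. The byte-oriented layout and MBU model below are illustrative assumptions, not the paper's fault-skeleton methodology.

```python
# Hypothetical sketch: simulate 100 MBUs on a uint8 weight tensor.
import numpy as np

def inject_mbu(weights_u8, n_upsets=100, bits_per_upset=2, seed=1):
    """Flip `bits_per_upset` adjacent bits at `n_upsets` random byte positions."""
    rng = np.random.default_rng(seed)
    corrupted = weights_u8.copy()
    flat = corrupted.reshape(-1)
    positions = rng.choice(flat.size, size=n_upsets, replace=False)
    for pos in positions:
        start_bit = int(rng.integers(0, 8 - bits_per_upset + 1))
        mask = ((1 << bits_per_upset) - 1) << start_bit
        flat[pos] ^= mask                    # multi-bit upset within one byte
    return corrupted

if __name__ == "__main__":
    weights = np.random.default_rng(0).integers(0, 256, size=(64, 64), dtype=np.uint8)
    faulty = inject_mbu(weights)
    changed = np.count_nonzero(weights != faulty)
    print(f"{changed} bytes corrupted by 100 simulated MBUs")
```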
Citations: 16
DPCP-p: A Distributed Locking Protocol for Parallel Real-Time Tasks
Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218584
Maolin Yang, Zewei Chen, Xu Jiang, Nan Guan, Hang Lei
Real-time scheduling and locking protocols are fundamental facilities for constructing time-critical systems. For parallel real-time tasks, predictable locking protocols are required when concurrent sub-jobs need mutually exclusive access to shared resources. This paper is the first to study the distributed synchronization framework for parallel real-time tasks, where both tasks and global resources are partitioned to designated processors, and requests to each global resource are conducted on the processor to which the resource is partitioned. We extend the Distributed Priority Ceiling Protocol (DPCP) to parallel tasks under federated scheduling, and prove that a request can be blocked by at most one lower-priority request. We develop task and resource partitioning heuristics and propose analysis techniques to safely bound task response times. Numerical evaluation (with heavy tasks on 8-, 16-, and 32-core processors) indicates that the proposed methods improve schedulability significantly compared to the state-of-the-art locking protocols under federated scheduling.
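Purely as an illustrative sketch in the spirit of the resource-partitioning step (not the authors' actual heuristic), the snippet below spreads global resources across processors worst-fit by total request demand; all names and numbers are invented.

```python
# Hypothetical sketch: worst-fit partitioning of global resources to processors.
def partition_resources(resource_demand, n_processors):
    """resource_demand: {resource: total request time across all tasks}."""
    load = [0.0] * n_processors
    assignment = {}
    # Heaviest resources first, each placed on the least-loaded processor so far.
    for res, demand in sorted(resource_demand.items(), key=lambda kv: -kv[1]):
        target = min(range(n_processors), key=lambda p: load[p])
        assignment[res] = target
        load[target] += demand
    return assignment, load

if __name__ == "__main__":
    demand = {"r1": 5.0, "r2": 3.0, "r3": 2.5, "r4": 2.0, "r5": 1.0}
    assignment, load = partition_resources(demand, 2)
    print(assignment)  # {'r1': 0, 'r2': 1, 'r3': 1, 'r4': 0, 'r5': 1}
    print(load)        # [7.0, 6.5]
```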
Citations: 9
SAT-Sweeping Enhanced for Logic Synthesis
Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218691
L. Amarù, F. Marranghello, Eleonora Testa, Christopher Casares, V. Possani, Jiong Luo, P. Vuillod, A. Mishchenko, G. Micheli
SAT-sweeping is a powerful method for simplifying logic networks. It merges gates that are proven equivalent (up to complementation) by running simulation and SAT solving in synergy. SAT-sweeping is used in both verification and synthesis applications within EDA. In this paper, we focus on the development of a highly efficient, synthesis-oriented SAT-sweeping engine. We introduce a new algorithm to guide initial simulation, which strongly reduces the number of false merge candidates, thus increasing the computational efficiency of the sweeper. We revisit the SAT-sweeping flow in light of practical considerations for synthesis, with the aim of proving all valid merges and ensuring fast execution. Experimental results confirm a remarkable speedup from our methodology, up to 10× for large combinational networks, and better QoR compared to a previous SAT-sweeping implementation. Embedded in a commercial synthesis flow, our proposed SAT-sweeper enables area and power savings of 1.98% and 1.81%, respectively, with neutral timing and negligible runtime overhead, over 36 testcases.
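The key interplay is that cheap random simulation filters merge candidates before any SAT call: nodes whose simulation signatures differ (even up to complementation) can never be equivalent. Below is a minimal Python sketch of that filtering step with toy node functions and made-up names; the real engine operates on AIGs and follows simulation with incremental SAT proofs.

```python
# Hypothetical sketch: signature-based grouping of SAT-sweeping merge candidates.
import random
from collections import defaultdict

def simulate(node_fn, n_inputs, n_patterns=64, seed=0):
    """Pack the node's outputs over random input patterns into a bit signature."""
    rng = random.Random(seed)                 # same seed => same patterns for all nodes
    sig = 0
    for _ in range(n_patterns):
        inputs = [rng.getrandbits(1) for _ in range(n_inputs)]
        sig = (sig << 1) | node_fn(inputs)
    return sig

def candidate_classes(nodes, n_inputs, n_patterns=64):
    """Group nodes by signature; complemented signatures map to the same class."""
    mask = (1 << n_patterns) - 1
    classes = defaultdict(list)
    for name, fn in nodes.items():
        sig = simulate(fn, n_inputs, n_patterns)
        key = min(sig, sig ^ mask)            # canonical up to complementation
        classes[key].append(name)
    return [group for group in classes.values() if len(group) > 1]

if __name__ == "__main__":
    nodes = {
        "and_ab": lambda x: x[0] & x[1],
        "not_nand_ab": lambda x: 1 - (1 - (x[0] & x[1])),  # equivalent to and_ab
        "nand_ab": lambda x: 1 - (x[0] & x[1]),            # complement of and_ab
        "xor_ab": lambda x: x[0] ^ x[1],
    }
    # The three AND-related nodes share one class; xor_ab is filtered out.
    print(candidate_classes(nodes, n_inputs=2))
```

Only the surviving classes would then be handed to the SAT solver to prove or refute each candidate merge.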
Citations: 6
A-QED Verification of Hardware Accelerators
Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218715
Eshan Singh, Florian Lonsing, Saranyu Chattopadhyay, Maxwell Strange, Peng Wei, Xiaofan Zhang, Yuan Zhou, Deming Chen, J. Cong, Priyanka Raina, Zhiru Zhang, Clark W. Barrett, S. Mitra
We present A-QED (Accelerator-Quick Error Detection), a new approach for pre-silicon formal verification of stand-alone hardware accelerators. A-QED relies on bounded model checking; however, it does not require extensive design-specific properties or a full formal design specification. While A-QED is effective for both RTL and high-level synthesis (HLS) design flows, it integrates seamlessly with HLS flows. Our A-QED results on several hardware accelerator designs demonstrate its practicality and effectiveness: 1. A-QED detected all bugs detected by the conventional verification flow. 2. A-QED detected bugs that escaped the conventional verification flow. 3. A-QED improved verification productivity dramatically, by 30X, in one of our case studies (1 person-day using A-QED vs. 30 person-days using the conventional verification flow). 4. A-QED produced short counterexamples for easy debug (37X shorter on average vs. the conventional verification flow).
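QED-style checking is built around functional self-consistency: the same accelerator transaction must give the same result whether it runs alone or interleaved with other traffic. The toy sketch below applies that idea to a software model of an accelerator (one clean, one with a state-leak bug); it is only an analogy for the BMC-based hardware flow, and every name in it is invented.

```python
# Illustrative only: A-QED-style functional consistency on a software model.
def check_functional_consistency(accel_fn, transaction, background):
    """The result of `transaction` must not depend on surrounding traffic."""
    reference = accel_fn([transaction])[0]
    mixed = accel_fn([background[0], transaction, background[1]])
    return mixed[1] == reference

def good_accel(batch):
    return [x * x for x in batch]            # stateless: output depends only on input

_leak = {"value": 0}
def buggy_accel(batch):
    out = []
    for x in batch:
        out.append(x * x + _leak["value"])   # bug: hidden state from earlier inputs
        _leak["value"] = x % 3
    return out

if __name__ == "__main__":
    print(check_functional_consistency(good_accel, 7, [2, 5]))   # True
    print(check_functional_consistency(buggy_accel, 7, [2, 5]))  # False (bug exposed)
```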
Citations: 10
Tier-Scrubbing: An Adaptive and Tiered Disk Scrubbing Scheme with Improved MTTD and Reduced Cost
Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218551
Ji Zhang, Yuanzhang Wang, Yangtao Wang, Ke Zhou, Sebastian Schelter, Ping Huang, Bin Cheng, Yongguang Ji
Sector errors are a common type of error in modern disks. A sector error that occurs during I/O operations might cause inaccessibility of an application. Even worse, it could result in permanent data loss if the data is being reconstructed, and thereby severely affects the reliability of a storage system. Many disk scrubbing schemes have been proposed to solve this problem. However, existing approaches have several limitations. First, schemes use machine learning (ML) to predict latent sector errors (LSEs), but only leverage a single snapshot of training data to make a prediction, and thereby ignore sequential dependencies between different statuses of a hard disk over time. Second, they accelerate the scrubbing at a fixed rate based on the results of a binary classification model, which may result in unnecessary increases in scrubbing cost. Third, they naively accelerate the scrubbing of the full disk that has LSEs based on the predictive results, but neglect partial high-risk areas (the areas that have a higher probability of encountering LSEs). Lastly, they do not employ strategies to scrub these high-risk areas in advance based on I/O access patterns, in order to further increase the efficiency of scrubbing. We address these challenges by designing a Tier-Scrubbing (TS) scheme that combines a Long Short-Term Memory (LSTM) based Adaptive Scrubbing Rate Controller (ASRC), a module focusing on sector error locality to locate high-risk areas in a disk, and a piggyback scrubbing strategy to improve the reliability of a storage system. Our evaluation results on realistic datasets and workloads from two real-world data centers demonstrate that TS can simultaneously decrease the Mean-Time-To-Detection (MTTD) by about 80% and the scrubbing cost by 20%, compared to a state-of-the-art scrubbing scheme.
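The adaptive part boils down to mapping per-region risk scores (which the paper obtains from an LSTM predictor) to scrub rates: high-risk regions are scrubbed more often, the rest stay on the baseline cycle. A simplified sketch with made-up region names, thresholds, and periods:

```python
# Hypothetical sketch: tiered scrub periods driven by per-region risk scores.
def scrubbing_schedule(region_risk, base_period_h=336.0, accel_factor=4.0, threshold=0.5):
    """Map each region's risk score to a scrub period in hours."""
    schedule = {}
    for region, risk in region_risk.items():
        if risk >= threshold:
            schedule[region] = base_period_h / accel_factor  # high-risk: scrub 4x as often
        else:
            schedule[region] = base_period_h                 # low-risk: baseline two-week scrub
    return schedule

if __name__ == "__main__":
    risk = {"lba_0-1M": 0.08, "lba_1-2M": 0.73, "lba_2-3M": 0.41}
    print(scrubbing_schedule(risk))
    # {'lba_0-1M': 336.0, 'lba_1-2M': 84.0, 'lba_2-3M': 336.0}
```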
Citations: 7
AHEC: End-to-end Compiler Framework for Privacy-preserving Machine Learning Acceleration
Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218508
Huili Chen, Rosario Cammarota, Felipe Valencia, F. Regazzoni, F. Koushanfar
Privacy-preserving machine learning (PPML) is driven by the emerging adoption of Machine Learning as a Service (MLaaS). In a typical MLaaS system, the end-user sends his personal data to the service provider and receives the corresponding prediction output. However, such interaction raises severe privacy concerns about both the user’s proprietary data and the server’s ML model. PPML integrates cryptographic primitives such as Multi-Party Computation (MPC) and/or Homomorphic Encryption (HE) into ML services to resolve the privacy issue. However, existing PPML solutions have not been widely deployed in practice since: (i) privacy protection comes at the cost of additional computation and/or communication overhead; (ii) adapting PPML to different front-end frameworks and back-end hardware incurs prohibitive engineering cost. We propose AHEC, the first automated, end-to-end HE compiler for efficient PPML inference. Leveraging the capability of Domain Specific Languages (DSLs), AHEC enables automated generation and optimization of HE kernels across diverse types of hardware platforms and ML frameworks. We perform extensive experiments to investigate the performance of AHEC at different abstraction levels: HE operations, HE-based ML kernels, and neural network layers. Empirical results corroborate that AHEC achieves superior runtime reduction compared to state-of-the-art solutions built from static HE libraries.
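To ground the idea of computing on encrypted data, here is a self-contained toy: an encrypted dot product under the additively homomorphic Paillier scheme. The key size is deliberately tiny and insecure, and nothing here reflects AHEC's generated kernels or any real HE library's API.

```python
# Toy Paillier (additively homomorphic) used to evaluate a dot product on
# ciphertexts. Insecure parameters; illustration only.
import math, random

def _next_prime(n):
    def is_prime(k):
        if k < 2:
            return False
        i = 2
        while i * i <= k:
            if k % i == 0:
                return False
            i += 1
        return True
    while not is_prime(n):
        n += 1
    return n

P, Q = _next_prime(10_000), _next_prime(10_100)   # toy primes
N, NSQ = P * Q, (P * Q) ** 2
LAM = math.lcm(P - 1, Q - 1)
G = N + 1

def _L(u):
    return (u - 1) // N

MU = pow(_L(pow(G, LAM, NSQ)), -1, N)

def encrypt(m):
    r = random.randrange(2, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(2, N)
    return (pow(G, m, NSQ) * pow(r, N, NSQ)) % NSQ

def decrypt(c):
    return (_L(pow(c, LAM, NSQ)) * MU) % N

def encrypted_dot(enc_x, weights):
    acc = encrypt(0)
    for c, w in zip(enc_x, weights):
        acc = (acc * pow(c, w, NSQ)) % NSQ    # homomorphically adds w * x_i
    return acc

if __name__ == "__main__":
    x, w = [3, 5, 7], [2, 4, 6]
    enc_x = [encrypt(v) for v in x]           # client encrypts its data
    result = encrypted_dot(enc_x, w)          # server computes without seeing x
    print(decrypt(result))                    # 68 = 3*2 + 5*4 + 7*6
```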
Citations: 3
SFO: A Scalable Approach to Fanout-Bounded Logic Synthesis for Emerging Technologies
Pub Date : 2020-07-01 DOI: 10.1109/DAC18072.2020.9218500
He-Teng Zhang, J. H. Jiang
Fanouts are an essential element for signal cloning to achieve logic sharing, but can be a very limited resource in certain emerging technologies, such as quantum circuits, superconducting electronic circuits, photonic integrated circuits, and biological circuits. Although fanout synthesis has been intensively studied for high-performance circuit synthesis, prior methods often treat fanout as a soft constraint for critical-path optimization or target specific high-fanout nets such as clock and reset signals. They are not particularly suited for circuit synthesis in these emerging technologies. By treating fanouts as first-class citizens, the problem of fanout-bounded logic synthesis was posed as a challenge in the 2019 IWLS Programming Contest. In this paper, we present our winning method, which achieved the overall best quality in the competition, based on fanout load redistribution among existing or expanded equivalent signals.
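The core rewrite in fanout bounding is simple to state: when a signal drives more sinks than the bound allows, clone its driver and split the sinks across the copies (the clones in turn add fanout upstream, which a real method must resolve recursively). A minimal sketch of just that splitting step, with invented names:

```python
# Hypothetical sketch: bound every signal's fanout by cloning over-loaded drivers.
def bound_fanout(drivers, bound=2):
    """drivers: {signal: [sink, ...]}. Returns a mapping where no signal drives
    more than `bound` sinks; clones are named signal#1, signal#2, ..."""
    bounded = {}
    for signal, sinks in drivers.items():
        if len(sinks) <= bound:
            bounded[signal] = sinks
            continue
        for i in range(0, len(sinks), bound):
            clone = signal if i == 0 else f"{signal}#{i // bound}"
            bounded[clone] = sinks[i:i + bound]
    return bounded

if __name__ == "__main__":
    net = {"and1": ["or1", "or2", "xor3", "buf4", "out5"]}
    print(bound_fanout(net))
    # {'and1': ['or1', 'or2'], 'and1#1': ['xor3', 'buf4'], 'and1#2': ['out5']}
```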
Citations: 1