Pub Date : 2020-07-01DOI: 10.1109/DAC18072.2020.9218624
Abraham Addisie, V. Bertacco
The increased use of graph algorithms in diverse fields has highlighted their inefficiencies in current chip-multiprocessor (CMP) architectures, primarily due to their seemingly random-access patterns to off-chip memory. Recently, two families of solutions have been proposed: 1) solutions that offload operations generated by all vertices from the processor cores to off-chip memory; and 2) solutions that offload only operations generated by high-degree vertices to dedicated on-chip memory, while the cores continue to process the work related to the remaining vertices. Neither approach is optimal over the full range of vertex’s degrees. Thus, in this work, we propose Centaur, a novel architecture that processes operations on vertex data in on- and off-chip memory. Centaur utilizes a vertex’s degree as a proxy to determine whether to process related operations in on- or off-chip memory. Centaur manages to provide up to 4.0× improvement in performance and 3.8× in energy benefits, compared to a baseline CMP, and up to a 2.0× performance boost over state-of-the-art specialized solutions.
{"title":"Centaur: Hybrid Processing in On/Off-chip Memory Architecture for Graph Analytics","authors":"Abraham Addisie, V. Bertacco","doi":"10.1109/DAC18072.2020.9218624","DOIUrl":"https://doi.org/10.1109/DAC18072.2020.9218624","url":null,"abstract":"The increased use of graph algorithms in diverse fields has highlighted their inefficiencies in current chip-multiprocessor (CMP) architectures, primarily due to their seemingly random-access patterns to off-chip memory. Recently, two families of solutions have been proposed: 1) solutions that offload operations generated by all vertices from the processor cores to off-chip memory; and 2) solutions that offload only operations generated by high-degree vertices to dedicated on-chip memory, while the cores continue to process the work related to the remaining vertices. Neither approach is optimal over the full range of vertex’s degrees. Thus, in this work, we propose Centaur, a novel architecture that processes operations on vertex data in on- and off-chip memory. Centaur utilizes a vertex’s degree as a proxy to determine whether to process related operations in on- or off-chip memory. Centaur manages to provide up to 4.0× improvement in performance and 3.8× in energy benefits, compared to a baseline CMP, and up to a 2.0× performance boost over state-of-the-art specialized solutions.","PeriodicalId":428807,"journal":{"name":"2020 57th ACM/IEEE Design Automation Conference (DAC)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133736255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-07-01DOI: 10.1109/DAC18072.2020.9218730
Hamid Nejatollahi, Saransh Gupta, M. Imani, T. Simunic, Rosario Cammarota, N. Dutt
Quantum computers promise to solve hard mathematical problems such as integer factorization and discrete logarithms in polynomial time, making standardized public-key cryptosystems insecure. Lattice-Based Cryptography (LBC) is a promising post-quantum public key cryptographic protocol that could replace standardized public key cryptography, thanks to the inherent post-quantum resistant properties, efficiency, and versatility. A key mathematical tool in LBC is the Number Theoretic Transform (NTT), a common method to compute polynomial multiplication. It is the most compute-intensive routine and requires acceleration for practical deployment of LBC protocols. In this paper, we propose CryptoPIM, a high-throughput Processing In-Memory (PIM) accelerator for NTT-based polynomial multiplier with the support of polynomials with degrees up to 32k. Compared to the fastest FPGA implementation of an NTT-based multiplier, CryptoPIM achieves on average 31x throughput improvement with the same energy and only 28% performance reduction, thereby showing promise for practical deployment of LBC.
{"title":"CryptoPIM: In-memory Acceleration for Lattice-based Cryptographic Hardware","authors":"Hamid Nejatollahi, Saransh Gupta, M. Imani, T. Simunic, Rosario Cammarota, N. Dutt","doi":"10.1109/DAC18072.2020.9218730","DOIUrl":"https://doi.org/10.1109/DAC18072.2020.9218730","url":null,"abstract":"Quantum computers promise to solve hard mathematical problems such as integer factorization and discrete logarithms in polynomial time, making standardized public-key cryptosystems insecure. Lattice-Based Cryptography (LBC) is a promising post-quantum public key cryptographic protocol that could replace standardized public key cryptography, thanks to the inherent post-quantum resistant properties, efficiency, and versatility. A key mathematical tool in LBC is the Number Theoretic Transform (NTT), a common method to compute polynomial multiplication. It is the most compute-intensive routine and requires acceleration for practical deployment of LBC protocols. In this paper, we propose CryptoPIM, a high-throughput Processing In-Memory (PIM) accelerator for NTT-based polynomial multiplier with the support of polynomials with degrees up to 32k. Compared to the fastest FPGA implementation of an NTT-based multiplier, CryptoPIM achieves on average 31x throughput improvement with the same energy and only 28% performance reduction, thereby showing promise for practical deployment of LBC.","PeriodicalId":428807,"journal":{"name":"2020 57th ACM/IEEE Design Automation Conference (DAC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133527901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-07-01DOI: 10.1109/DAC18072.2020.9218662
Huili Chen, S. Hussain, Fabian Boemer, Emmanuel Stapf, A. Sadeghi, F. Koushanfar, Rosario Cammarota
Advances in customers' data privacy laws create pressures and pain points across the entire lifecycle of AI products. Working figures such as data scientists and data engineers need to account for the correct use of privacy-enhancing technologies such as homomorphic encryption, secure multi-party computation, and trusted execution environment when they develop, test and deploy products embedding AI models while providing data protection guarantees. In this work, we share the lessons learned during the development of frameworks to aid data scientists and data engineers to map their optimized workloads onto privacy-enhancing technologies seamlessly and correctly.
{"title":"Developing Privacy-preserving AI Systems: The Lessons learned","authors":"Huili Chen, S. Hussain, Fabian Boemer, Emmanuel Stapf, A. Sadeghi, F. Koushanfar, Rosario Cammarota","doi":"10.1109/DAC18072.2020.9218662","DOIUrl":"https://doi.org/10.1109/DAC18072.2020.9218662","url":null,"abstract":"Advances in customers' data privacy laws create pressures and pain points across the entire lifecycle of AI products. Working figures such as data scientists and data engineers need to account for the correct use of privacy-enhancing technologies such as homomorphic encryption, secure multi-party computation, and trusted execution environment when they develop, test and deploy products embedding AI models while providing data protection guarantees. In this work, we share the lessons learned during the development of frameworks to aid data scientists and data engineers to map their optimized workloads onto privacy-enhancing technologies seamlessly and correctly.","PeriodicalId":428807,"journal":{"name":"2020 57th ACM/IEEE Design Automation Conference (DAC)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132863040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-07-01DOI: 10.1109/DAC18072.2020.9218528
Dongup Kwon, Suyeon Hur, Hamin Jang, E. Nurvitadhi, Jangwoo Kim
The increasing size of recurrent neural networks (RNNs) makes it hard to meet the growing demand for real-time AI services. For low-latency RNN serving, FPGA-based accelerators can leverage specialized architectures with optimized dataflow. However, they also suffer from severe HW under-utilization when partitioning RNNs, and thus fail to obtain the scalable performance.In this paper, we identify the performance bottlenecks of existing RNN partitioning strategies. Then, we propose a novel RNN partitioning strategy to achieve the scalable multi-FPGA acceleration for large RNNs. First, we introduce three parallelism levels and exploit them by partitioning weight matrices, matrix/vector operations, and layers. Second, we examine the performance impact of collective communications and software pipelining to derive more accurate and optimal distribution results. We prototyped an FPGA-based acceleration system using multiple Intel high-end FPGAs, and our partitioning scheme allows up to 2.4x faster inference of modern RNN workloads than conventional partitioning methods.
{"title":"Scalable Multi-FPGA Acceleration for Large RNNs with Full Parallelism Levels","authors":"Dongup Kwon, Suyeon Hur, Hamin Jang, E. Nurvitadhi, Jangwoo Kim","doi":"10.1109/DAC18072.2020.9218528","DOIUrl":"https://doi.org/10.1109/DAC18072.2020.9218528","url":null,"abstract":"The increasing size of recurrent neural networks (RNNs) makes it hard to meet the growing demand for real-time AI services. For low-latency RNN serving, FPGA-based accelerators can leverage specialized architectures with optimized dataflow. However, they also suffer from severe HW under-utilization when partitioning RNNs, and thus fail to obtain the scalable performance.In this paper, we identify the performance bottlenecks of existing RNN partitioning strategies. Then, we propose a novel RNN partitioning strategy to achieve the scalable multi-FPGA acceleration for large RNNs. First, we introduce three parallelism levels and exploit them by partitioning weight matrices, matrix/vector operations, and layers. Second, we examine the performance impact of collective communications and software pipelining to derive more accurate and optimal distribution results. We prototyped an FPGA-based acceleration system using multiple Intel high-end FPGAs, and our partitioning scheme allows up to 2.4x faster inference of modern RNN workloads than conventional partitioning methods.","PeriodicalId":428807,"journal":{"name":"2020 57th ACM/IEEE Design Automation Conference (DAC)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122881759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-07-01DOI: 10.1109/DAC18072.2020.9218741
Maedeh Hemmat, Joshua San Miguel, A. Davoodi
We propose CAP’NN, a framework for Class-Aware Personalized Neural Network Inference. CAP’NN prunes an already-trained neural network model based on the preferences of individual users. Specifically, by adapting to the subset of output classes that each user is expected to encounter, CAP’NN is able to prune not only ineffectual neurons but also miseffectual neurons that confuse classification, without the need to retrain the network. CAP’NN achieves up to 50% model size reduction while actually improving the top-l(5) classification accuracy by up to 2.3%(3.2%) when the user only encounters a subset of VGG-16 classes.
{"title":"CAP’NN: Class-Aware Personalized Neural Network Inference","authors":"Maedeh Hemmat, Joshua San Miguel, A. Davoodi","doi":"10.1109/DAC18072.2020.9218741","DOIUrl":"https://doi.org/10.1109/DAC18072.2020.9218741","url":null,"abstract":"We propose CAP’NN, a framework for Class-Aware Personalized Neural Network Inference. CAP’NN prunes an already-trained neural network model based on the preferences of individual users. Specifically, by adapting to the subset of output classes that each user is expected to encounter, CAP’NN is able to prune not only ineffectual neurons but also miseffectual neurons that confuse classification, without the need to retrain the network. CAP’NN achieves up to 50% model size reduction while actually improving the top-l(5) classification accuracy by up to 2.3%(3.2%) when the user only encounters a subset of VGG-16 classes.","PeriodicalId":428807,"journal":{"name":"2020 57th ACM/IEEE Design Automation Conference (DAC)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124186928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-07-01DOI: 10.1109/DAC18072.2020.9218579
Charles Gouert, N. G. Tsoutsos
As cloud computing becomes increasingly ubiquitous, protecting the confidentiality of data outsourced to third parties becomes a priority. While encryption is a natural solution to this problem, traditional algorithms may only protect data at rest and in transit, but do not support encrypted processing. In this work we introduce ROMEO, which enables easy-to-use privacy-preserving processing of data in the cloud using homomorphic encryption. ROMEO automatically converts arbitrary programs expressed in Verilog HDL into equivalent homomorphic circuits that are evaluated using encrypted inputs. For our experiments, we employ cryptographic circuits, such as AES, and benchmarks from the ISCAS’85 and ISCAS’89 suites.
{"title":"Romeo: Conversion and Evaluation of HDL Designs in the Encrypted Domain","authors":"Charles Gouert, N. G. Tsoutsos","doi":"10.1109/DAC18072.2020.9218579","DOIUrl":"https://doi.org/10.1109/DAC18072.2020.9218579","url":null,"abstract":"As cloud computing becomes increasingly ubiquitous, protecting the confidentiality of data outsourced to third parties becomes a priority. While encryption is a natural solution to this problem, traditional algorithms may only protect data at rest and in transit, but do not support encrypted processing. In this work we introduce ROMEO, which enables easy-to-use privacy-preserving processing of data in the cloud using homomorphic encryption. ROMEO automatically converts arbitrary programs expressed in Verilog HDL into equivalent homomorphic circuits that are evaluated using encrypted inputs. For our experiments, we employ cryptographic circuits, such as AES, and benchmarks from the ISCAS’85 and ISCAS’89 suites.","PeriodicalId":428807,"journal":{"name":"2020 57th ACM/IEEE Design Automation Conference (DAC)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116367182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-07-01DOI: 10.1109/DAC18072.2020.9218661
Andreas Kurth, Samuel Riedel, Florian Zaruba, T. Hoefler, L. Benini
Atomic operations are crucial for most modern parallel and concurrent algorithms, which necessitates their optimized implementation in highly-scalable manycore processors. We pro-pose a modular and efficient, open-source ATomic UNit (ATUN) architecture that can be placed flexibly at different levels of the memory hierarchy. ATUN demonstrates near-optimal linear scaling for various synthetic and real-world workloads on an FPGA prototype with 32 RISC-V cores. We characterize the hardware complexity of our ATUN design in 22 nm FDSOI and find that it scales linearly in area (only 0.5 kGE per core) and logarithmically in the critical path.
{"title":"ATUNs: Modular and Scalable Support for Atomic Operations in a Shared Memory Multiprocessor","authors":"Andreas Kurth, Samuel Riedel, Florian Zaruba, T. Hoefler, L. Benini","doi":"10.1109/DAC18072.2020.9218661","DOIUrl":"https://doi.org/10.1109/DAC18072.2020.9218661","url":null,"abstract":"Atomic operations are crucial for most modern parallel and concurrent algorithms, which necessitates their optimized implementation in highly-scalable manycore processors. We pro-pose a modular and efficient, open-source ATomic UNit (ATUN) architecture that can be placed flexibly at different levels of the memory hierarchy. ATUN demonstrates near-optimal linear scaling for various synthetic and real-world workloads on an FPGA prototype with 32 RISC-V cores. We characterize the hardware complexity of our ATUN design in 22 nm FDSOI and find that it scales linearly in area (only 0.5 kGE per core) and logarithmically in the critical path.","PeriodicalId":428807,"journal":{"name":"2020 57th ACM/IEEE Design Automation Conference (DAC)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116904053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-07-01DOI: 10.1109/DAC18072.2020.9218747
Siva Satyendra Sahoo, B. Veeravalli, Akash Kumar
Cross-layer reliability (CLR) presents a cost-effective alternative to traditional single-layer design in resource-constrained embedded systems. CLR provides the scope for leveraging the inherent fault-masking of multiple layers and exploiting application-specific tolerances to degradation in some Quality of Service (QoS) metrics. However, it can also lead to an explosion in the design complexity. State-of-the art approaches to such joint optimization across multiple degrees of freedom can lead to degradation in the system-level Design Space Exploration (DSE) results. To this end, we propose a DSE methodology for enabling CLR-aware task-mapping in heterogeneous embedded systems. Specifically, we present novel approaches to both task and system-level analysis for performing an early-stage exploration of various design decisions. The proposed methodology results in considerable improvements over other state-of-the-art approaches and shows significant scaling with application size.
{"title":"CL(R)Early: An Early-stage DSE Methodology for Cross-Layer Reliability-aware Heterogeneous Embedded Systems","authors":"Siva Satyendra Sahoo, B. Veeravalli, Akash Kumar","doi":"10.1109/DAC18072.2020.9218747","DOIUrl":"https://doi.org/10.1109/DAC18072.2020.9218747","url":null,"abstract":"Cross-layer reliability (CLR) presents a cost-effective alternative to traditional single-layer design in resource-constrained embedded systems. CLR provides the scope for leveraging the inherent fault-masking of multiple layers and exploiting application-specific tolerances to degradation in some Quality of Service (QoS) metrics. However, it can also lead to an explosion in the design complexity. State-of-the art approaches to such joint optimization across multiple degrees of freedom can lead to degradation in the system-level Design Space Exploration (DSE) results. To this end, we propose a DSE methodology for enabling CLR-aware task-mapping in heterogeneous embedded systems. Specifically, we present novel approaches to both task and system-level analysis for performing an early-stage exploration of various design decisions. The proposed methodology results in considerable improvements over other state-of-the-art approaches and shows significant scaling with application size.","PeriodicalId":428807,"journal":{"name":"2020 57th ACM/IEEE Design Automation Conference (DAC)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115209714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-07-01DOI: 10.1109/DAC18072.2020.9218682
Wenjie Fu, Leilei Jin, Ming Ling, Yu Zheng, Longxing Shi
Wide supply voltage scaling is critical to enable worthwhile dynamic adjustment of the processor efficiency against varying workloads. In this paper, a cross-layer power and timing evaluation method is proposed to estimate the processor energy efficiency using both circuit and architectural information in a wide voltage range. The process variations are considered through statistical static timing analysis while the voltage effect is modeled through secondary iterated fittings. The error for estimating processor energy efficiency decreases to 8.29% when the supply voltage is scaled from 1.1V to 0.6V, while traditional architectural evaluations behave more than 40% errors.
{"title":"A Cross-Layer Power and Timing Evaluation Method for Wide Voltage Scaling","authors":"Wenjie Fu, Leilei Jin, Ming Ling, Yu Zheng, Longxing Shi","doi":"10.1109/DAC18072.2020.9218682","DOIUrl":"https://doi.org/10.1109/DAC18072.2020.9218682","url":null,"abstract":"Wide supply voltage scaling is critical to enable worthwhile dynamic adjustment of the processor efficiency against varying workloads. In this paper, a cross-layer power and timing evaluation method is proposed to estimate the processor energy efficiency using both circuit and architectural information in a wide voltage range. The process variations are considered through statistical static timing analysis while the voltage effect is modeled through secondary iterated fittings. The error for estimating processor energy efficiency decreases to 8.29% when the supply voltage is scaled from 1.1V to 0.6V, while traditional architectural evaluations behave more than 40% errors.","PeriodicalId":428807,"journal":{"name":"2020 57th ACM/IEEE Design Automation Conference (DAC)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125327886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2020-07-01DOI: 10.1109/DAC18072.2020.9218627
Chang Meng, Weikang Qian, A. Mishchenko
Approximate computing is an emerging design technique for error-resilient applications. It improves circuit area, power, and delay at the cost of introducing some errors. Approximate logic synthesis (ALS) is an automatic process to produce approximate circuits. This paper proposes approximate resubstitution with approximate care set and uses it to build a simulation-based ALS flow. The experimental results demonstrate that the proposed method saves 7%–18% area compared to state-of-the-art methods. The code of ALSRAC is made open-source.
{"title":"ALSRAC: Approximate Logic Synthesis by Resubstitution with Approximate Care Set","authors":"Chang Meng, Weikang Qian, A. Mishchenko","doi":"10.1109/DAC18072.2020.9218627","DOIUrl":"https://doi.org/10.1109/DAC18072.2020.9218627","url":null,"abstract":"Approximate computing is an emerging design technique for error-resilient applications. It improves circuit area, power, and delay at the cost of introducing some errors. Approximate logic synthesis (ALS) is an automatic process to produce approximate circuits. This paper proposes approximate resubstitution with approximate care set and uses it to build a simulation-based ALS flow. The experimental results demonstrate that the proposed method saves 7%–18% area compared to state-of-the-art methods. The code of ALSRAC is made open-source.","PeriodicalId":428807,"journal":{"name":"2020 57th ACM/IEEE Design Automation Conference (DAC)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126910666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}