Pub Date: 2018-03-13
DOI: 10.1109/ISQED.2018.8357310
Divya Akella, Xinfei Guo, H. Patel, M. Stan, B. Calhoun
This paper presents a post-silicon hold time closure technique for performance-relaxed, sub-threshold digital designs using tunable-buffer insertion in hold-critical data-paths. Hold time closure in flip-flop based digital circuits is highly critical because hold failures cannot be corrected post-fabrication. This criticality increases in the sub-threshold domain, which is highly sensitive to process, voltage, and temperature variations. Design-time hold margins enable robust hold time closure across variations. However, insufficient hold margins can lead to chip failures, and overestimated hold margins introduce additional costs in area and power. In this paper, we propose a post-silicon hold time closure methodology that introduces tunable-buffers in the data-path. This enables post-silicon correction of hold violations and therefore reduces the design effort in estimating design-time hold margins. We design a tunable-buffer, demonstrate the tunable-buffer insertion strategy, and present a physical design flow using standard EDA tools. We verify this technique with measurements of a 130 nm test chip. A design-dependent hold slack improvement in the range of 103%–195% is achieved compared to the traditional buffering technique, with minimal power and area overhead. This technique also has the potential to reduce the number of buffers inserted for hold closure.
Title: A post-silicon hold time closure technique using data-path tunable-buffers for variation-tolerance in sub-threshold designs
Published in: 2018 19th International Symposium on Quality Electronic Design (ISQED)
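The hold check that the abstract above refers to can be illustrated with a small sketch. It uses the textbook hold-slack relation (earliest data arrival minus hold time and clock skew) and assumes a tunable buffer that offers a few discrete delay settings. The function names, the list of settings, and every delay value are hypothetical and are not taken from the paper.

```python
def hold_slack(t_clk_to_q, t_logic_min, t_buffer, t_hold, t_skew):
    """Hold slack: earliest data arrival minus the time data must stay stable."""
    return (t_clk_to_q + t_logic_min + t_buffer) - (t_hold + t_skew)


def pick_buffer_setting(buffer_delays, t_clk_to_q, t_logic_min, t_hold, t_skew):
    """Return the index of the smallest buffer delay that gives non-negative
    hold slack, or None if no setting closes the violation."""
    by_delay = sorted(enumerate(buffer_delays), key=lambda item: item[1])
    for index, delay in by_delay:
        if hold_slack(t_clk_to_q, t_logic_min, delay, t_hold, t_skew) >= 0:
            return index
    return None


# Hypothetical delays in nanoseconds (assumed values, not measurements).
settings = [2.0, 4.0, 8.0, 16.0]
chosen = pick_buffer_setting(
    settings, t_clk_to_q=3.0, t_logic_min=1.0, t_hold=2.0, t_skew=9.0
)
```

Choosing the smallest sufficient delay is a design choice made for this sketch: extra data-path delay helps hold slack but eats into setup slack, so a tuning procedure would normally stop at the first setting that works.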
Pub Date: 2018-03-13
DOI: 10.1109/ISQED.2018.8357294
Ruslan Dautov, S. Mosin
Analog circuits are widely used in fields such as medicine, the military, and aviation, and are critical to the development of reliable electronic systems. Testing and diagnosis detect and localize defects in the circuit under test and improve the quality of the final product. Because of the tolerances of internal components, the output responses of fault-free and faulty analog circuits can take an infinite set of values. Data mining methods may improve the quality of fault diagnosis when large volumes of data must be processed. A technique for aggregating the classes of fault diagnostic responses, based on association rule mining, is proposed. The technique follows the simulation-before-test concept: a fault dictionary is generated by collecting the wavelet transform coefficients of the output signals for fault-free and faulty conditions as a preprocessing step. The classifier is based on the k-nearest neighbors (k-NN) method and an association rule mining algorithm. The fault diagnostic technique was trained and tested using data obtained from simulation of the fault-free and faulty behavior of an analog filter. The resulting accuracy in classifying faulty conditions and the fault coverage were more than 99.09% and more than 99.08%, respectively. The proposed technique is completely automated and can be extended.
Title: A technique to aggregate classes of analog fault diagnostic data based on association rule mining
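The simulation-before-test flow in the abstract above (wavelet coefficients stored in a fault dictionary, then k-NN classification) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses an unnormalized Haar transform (pairwise averages and differences) because the abstract does not name a wavelet, it leaves out the association rule mining step entirely, and the sample responses and the label "R1 open" are invented.

```python
import math
from collections import Counter


def haar_coefficients(signal):
    """Unnormalized Haar decomposition of a signal whose length is a power of two."""
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Prepend so that coarser-level details come first in the result.
        details = [(a - b) / 2 for a, b in pairs] + details
        approx = [(a + b) / 2 for a, b in pairs]
    return approx + details


def knn_classify(dictionary, features, k=3):
    """Majority vote among the k dictionary entries nearest to `features`.
    `dictionary` is a list of (feature_vector, label) pairs."""
    nearest = sorted(dictionary, key=lambda entry: math.dist(entry[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical simulated output responses (four samples each).
fault_dictionary = [
    (haar_coefficients([1.0, 1.0, 1.0, 1.0]), "fault-free"),
    (haar_coefficients([1.1, 0.9, 1.0, 1.0]), "fault-free"),
    (haar_coefficients([0.9, 1.0, 1.1, 1.0]), "fault-free"),
    (haar_coefficients([0.2, 0.1, 0.2, 0.1]), "R1 open"),
    (haar_coefficients([0.1, 0.2, 0.1, 0.2]), "R1 open"),
    (haar_coefficients([0.2, 0.2, 0.1, 0.1]), "R1 open"),
]
label = knn_classify(fault_dictionary, haar_coefficients([1.0, 0.9, 1.0, 1.1]), k=3)
```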
Pub Date: 2018-03-13
DOI: 10.1109/ISQED.2018.8357264
A. Roohi, Ramtin Zand, R. Demara
In this paper, we develop an evolutionary-driven circuit optimization methodology, which can be leveraged for the synthesis of spintronic-based normally-off computing (NoC) circuits. NoC architectures distribute nonvolatile memory elements throughout the CMOS logic plane, creating a new class of fine-grained, functionally-constrained synthesis challenges. Synthesis objectives for spin-based NoC circuits include increased computational throughput and reduced static power consumption. Our proposed methodology utilizes Genetic Algorithms (GAs) to optimize the implementation of a Boolean logic expression in terms of area, delay, or power consumption. It first leverages the spin-based device characteristics to achieve a primary semi-optimized implementation; further performance optimization is then applied to the implemented design based on the NoC requirements and optimization criteria. As a proof of concept, the optimization approach is leveraged to implement a functionally-complete set of Boolean logic gates using spin Hall effect (SHE)-magnetic tunnel junctions (MTJs), which are optimized for both power and delay objectives. Such NoC synthesis methodologies support NoC circuit design for emerging-device and hybrid CMOS logic applications. Finally, simulation results and analyses verify the functionality of our proposed optimization tool for NoC circuit implementations.
Title: Synthesis of normally-off boolean circuits: An evolutionary optimization approach utilizing spintronic devices
Pub Date: 2018-03-13
DOI: 10.1109/ISQED.2018.8357300
V. Nautiyal, N. Nukala, F. Bohra, S. Dwivedi, J. Dasani, Satinderjit Singh, G. Singla, M. Kinkade
In this paper, a row-redundancy circuit using latches is designed for a 7 nm FinFET ultra-high-density SRAM operating at 1.75 GHz. Input and faulty addresses are compared in parallel with the memory read access operation, thus avoiding a major impact on access or address setup time. Latch output data is multiplexed with memory data, and the impact on access time is only 7 ps at the SS/0.675 V/-40 °C corner. Data is written to the redundant latches only when the address comparison matches. The proposed circuit has no setup time impact, and the overall area overhead of the proposed row-redundancy scheme is 82% lower than that of the conventional redundancy scheme.
Title: Logic-based row redundancy technique designed in 7nm FinFET technology for embedded SRAMs
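The read and write behaviour described in the abstract above can be mimicked with a small behavioural model. This is a sketch only: the class and attribute names are hypothetical, it models function and not timing, and skipping the array write when the address matches is an assumption, since the abstract only states that the latches are written on a match.

```python
class RowRedundantMemory:
    """Behavioural model of latch-based row redundancy (no timing)."""

    def __init__(self, num_rows, faulty_rows):
        self.rows = [0] * num_rows
        # One redundant latch row per known-faulty address.
        self.latches = {address: 0 for address in faulty_rows}

    def write(self, address, data):
        if address in self.latches:
            # Address comparison matched: the redundant latches capture the data.
            # Assumption: the faulty array row is left untouched.
            self.latches[address] = data
        else:
            self.rows[address] = data

    def read(self, address):
        memory_data = self.rows[address]   # normal array access
        match = address in self.latches    # done concurrently in hardware
        # Output multiplexer: latch data replaces array data on a match.
        return self.latches[address] if match else memory_data


memory = RowRedundantMemory(num_rows=8, faulty_rows=[3])
memory.write(3, 0xAB)  # goes to the redundant latches
memory.write(4, 0xCD)  # goes to the normal array
```

In this model a read of row 3 returns the latch contents while the array row stays at its initial value, which is the point of the output multiplexer.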
Pub Date: 2018-03-13
DOI: 10.1109/ISQED.2018.8357316
Bingzhe Li, M. Najafi, Bo Yuan, D. Lilja
With growing interest in neural networks, their hardware implementations have been investigated. Researchers pursue low hardware cost by using different technologies such as stochastic computing and quantization. For example, quantization can reduce the total number of trained weights, resulting in lower hardware cost. Stochastic computing aims to lower hardware costs substantially by using simple gates instead of complex arithmetic operations. In this paper, we propose a new stochastic multiplier with shifted unary code adders (SUC-Adder) for quantized neural networks. The new design exploits the characteristics of quantized weights and greatly reduces the hardware cost of neural networks. Experimental results indicate that our stochastic design achieves about a 10x energy reduction compared to its binary counterpart, at the cost of slightly higher recognition error rates.
Title: Quantized neural networks with new stochastic multipliers
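The idea of replacing complex arithmetic with simple gates, mentioned in the abstract above, is easiest to see in the conventional unipolar stochastic multiplier, where a single AND gate multiplies two values encoded as random bitstreams. The sketch below shows that conventional scheme only; it is not the SUC-Adder design proposed in the paper, and the function names, stream length, and seed are assumptions chosen for illustration.

```python
import random


def to_bitstream(value, length, rng):
    """Encode a value in [0, 1] as a unipolar stochastic bitstream:
    each bit is 1 with probability `value`."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def from_bitstream(bits):
    """Decode a unipolar bitstream as the fraction of ones."""
    return sum(bits) / len(bits)


def stochastic_multiply(x, y, length=4096, seed=0):
    """Multiply x and y (both in [0, 1]) by ANDing two independent bitstreams."""
    rng = random.Random(seed)
    stream_x = to_bitstream(x, length, rng)
    stream_y = to_bitstream(y, length, rng)
    product_stream = [a & b for a, b in zip(stream_x, stream_y)]
    return from_bitstream(product_stream)


estimate = stochastic_multiply(0.5, 0.5)  # close to 0.25, not exactly 0.25
```

The result is an estimate whose error shrinks as the stream gets longer, which is the usual accuracy versus latency trade-off of stochastic computing.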
Pub Date: 2018-03-13
DOI: 10.1109/ISQED.2018.8357268
Yu-Cheng Chiang, Shr-Cheng Tsai, Rung-Bin Lin
This paper presents an algorithm for finding array structures in a layout design. The algorithm can find all the regular layout structures from a flattened layout design without knowing its building blocks beforehand. A potential application of this work is to reduce layout DRC and lithography check time. Experimental results show that our algorithm is efficient and robust.
Title: Recognition of regular layout structures
Pub Date: 2018-03-13
DOI: 10.1109/ISQED.2018.8357279
Jihyun Ryoo, Meenakshi Arunachalam, R. Khanna, M. Kandemir
Scores of emerging and domain-specific applications need the ability to acquire and augment new knowledge from offline training sets and online user interactions. This requires an underlying computing platform that can host machine learning (ML) kernels, which in turn calls for efficient implementations of frequently used ML kernels on state-of-the-art multicores and many-cores acting as high-performance accelerators. Motivated by this observation, this paper focuses on one such ML kernel, K Nearest Neighbor (KNN), and conducts a comprehensive comparison of its behavior on two alternative accelerator-based systems: NVIDIA GPU and Intel Xeon Phi (both KNC and KNL architectures). More explicitly, we discuss and experimentally evaluate various optimizations that can be applied to both GPU and Xeon Phi, as well as optimizations that are specific to either GPU or Xeon Phi. Furthermore, we implement different versions of KNN on these candidate accelerators and collect experimental data using various inputs. Our experimental evaluations suggest that, by using both general-purpose and accelerator-specific optimizations, one can achieve average speedups ranging from 0.49x to 3.48x (training) and from 1.43x to 9.41x (classification) on the Xeon Phi series, compared to 0.05x–0.60x (training) and 1.61x–6.32x (classification) achieved by the GPU version, both relative to the standard host-only system.
Title: Efficient K nearest neighbor algorithm implementations for throughput-oriented architectures
Pub Date: 2018-03-13
DOI: 10.1109/ISQED.2018.8357283
Chun-Xun Lin, Tsung-Wei Huang, Martin D. F. Wong
The rapid evolution of the modern C++ programming language has completely changed the way developers write high-performance and robust applications. By modern, we mean C++17, which has revolutionized the “old-fashioned” C++98 in many aspects such as meta-programming, concurrency controls, and functional programming. Despite the tremendous progress in language innovation, research on how these advanced features can improve EDA programs is still nascent. In this paper, we introduce a novel routing framework using the technique of generalized constant expressions in C++17. Our framework allows a router to take advantage of compile-time computation and thus can save a significant amount of effort that would otherwise be spent every time the program runs. When computation is prescribed at compile time, the compiler can also produce more optimized code that runs faster. We have evaluated our framework on classic routing problems and have demonstrated promising performance gains over routing that is done solely at runtime. Our framework has the potential to change many fundamental EDA building blocks and thus can achieve better tool performance and engineering productivity.
Title: Routing at compile time
Pub Date: 2018-03-13
DOI: 10.1109/ISQED.2018.8357320
Zhiming Zhang, L. Njilla, C. Kamhoua, K. Kwiat, Qiaoyan Yu
Component aging is unavoidable in legacy systems. Although re-designing the system typically results in a high cost, the need to replace aged components for legacy systems is an urgent priority. Unfortunately, the aged components are likely to be obsolete and not available on the current market. Obsolete component replacement with field-programmable gate array (FPGA) devices is emerging as a feasible option to extend the lifetime of legacy systems. When replacing an aged component, the traditional focus is only on matching the functionality, and the potential security threats from FPGA replacement are neglected. However, recent literature demonstrates that FPGA devices may contain hardware Trojans, which are introduced during FPGA device fabrication or bitstream generation. To prevent Trojans on the FPGA from receiving external inputs or leaking sensitive information, we propose a Runtime Pin Grounding (RPG) scheme to ground the unused pins and check the pin status at every clock cycle. Furthermore, we exploit the principle of moving target defense (MTD) and propose a hardware MTD (HMTD) method. In our method, the aged obsolete unit is replicated into multiple copies in the FPGA device, and two of the replicas are randomly selected for output comparison and thus Trojan detection. We successfully implemented the proposed RPG and HMTD methods on a Nexys-3 FPGA board. Our case study shows that the proposed RPG scheme increases the FPGA utilization rate by less than 0.1%. On average, our HMTD method reduces the hardware Trojan bypass rate by 61% over the existing method.
Title: Securing FPGA-based obsolete component replacement for legacy systems
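The replica-comparison step of the HMTD method in the abstract above can be sketched in software as follows. This is a rough behavioural illustration, not the authors' hardware implementation: the 8-bit incrementer standing in for the legacy unit, the Trojan trigger value 0x42, and all function names are hypothetical.

```python
import random


def legacy_unit(value):
    """Hypothetical stand-in for the replaced component: an 8-bit incrementer."""
    return (value + 1) & 0xFF


def trojaned_unit(value):
    """Same function, except a hypothetical Trojan corrupts one trigger input."""
    return 0x00 if value == 0x42 else legacy_unit(value)


def hmtd_check(replicas, value, rng):
    """Pick two distinct replicas at random and compare their outputs.
    Returns (output_of_first_pick, outputs_agree)."""
    first, second = rng.sample(replicas, 2)
    output_a, output_b = first(value), second(value)
    return output_a, output_a == output_b


rng = random.Random(0)
clean_replicas = [legacy_unit, legacy_unit, legacy_unit]
mixed_replicas = [legacy_unit, trojaned_unit]

_, clean_agree = hmtd_check(clean_replicas, 0x42, rng)      # no Trojan present
_, triggered_agree = hmtd_check(mixed_replicas, 0x42, rng)  # Trojan triggered
_, dormant_agree = hmtd_check(mixed_replicas, 0x10, rng)    # Trojan dormant
```

With more than two replicas, a mismatch is only seen when the random pick includes an affected copy while the Trojan is active, so detection is probabilistic. The random choice keeps an attacker from knowing in advance which copies will be compared.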
Pub Date: 2018-03-13
DOI: 10.1109/ISQED.2018.8357266
Z. Pajouhi
Research towards brain-inspired computing based on beyond-CMOS devices has gained momentum in recent years. The motivation behind this vigorous research lies in exploiting the resemblance between the computing principles and the device characteristics. To this end, the devices are used to perform otherwise time-consuming and power-hungry tasks required for brain-inspired computing. Due to their miniaturized dimensions, zero leakage, and nonvolatility, spintronic devices are among the most promising classes of beyond-CMOS devices. In this paper, we propose a novel spintronic structure based on antiferromagnetically coupled domain walls. The device structure enables dedicated terminology for synaptic and neuron connections. This characteristic enables more efficient design of neuromorphic systems by giving designers a larger design space. Furthermore, thanks to the coupling between the domain walls, the device can potentially operate at higher speeds while maintaining its energy consumption; this higher speed contributes to improved performance of the neuromorphic system. In order to evaluate our proposed device structure, we developed a cross-layer simulation framework that analyzes the neuromorphic system at the device, circuit, and algorithm levels. Our simulation results show an order of magnitude improvement in energy consumption compared to CMOS and analog neurons, and up to 2X performance improvement as well as an 8% improvement in energy over state-of-the-art neuromorphic platforms using spintronic devices.
Title: Energy efficient neuromorphic processing using spintronic memristive device with dedicated synaptic and neuron terminology