Pub Date: 2023-04-05 DOI: 10.1109/ISQED57927.2023.10129313
Ya-sine Agrignan, Shangli Zhou, Jun Bai, Sahidul Islam, S. Nabavi, Mimi Xie, Caiwen Ding
Intra-cardiac electrograms (IEGMs) are widely used in medical devices such as implantable cardioverter defibrillators (ICDs) to identify life-threatening ventricular arrhythmias and prevent sudden cardiac death. In this paper, we present a machine learning approach for detecting life-threatening heart arrhythmias from IEGM data recorded by an ICD. We design and analyze two convolutional neural networks (CNNs), one 1D and one 2D, that perform inference on a low-power STM Nucleo-32 MCU. Multiple microcontroller software platforms are used to construct the trained models and deploy them onto the MCU for inference measurements. The experimental analysis aims to minimize average inference time and on-board memory occupation while maximizing model accuracy. We profile the memory occupation and inference time for different CNN kernels. The 1D CNN has an average inference time of 26.20 ms over 10 measurements taken on the MCU platform; its model weights occupy 5.99 KiB of flash memory and its activations occupy 5.00 KiB of SRAM (static random-access memory). The 1D CNN achieves an Fβ score of 97.8. The 2D CNN achieves an inference time of 11.00 ms with 3.05 KiB of flash and 8.09 KiB of SRAM, and an Fβ score of 95.15. Our code is publicly available at https://github.com/Zhoushanglin100/TinyML-HuskyCSDeepical.
{"title":"A Deep Learning Approach for Ventricular Arrhythmias Classification using Microcontroller","authors":"Ya-sine Agrignan, Shangli Zhou, Jun Bai, Sahidul Islam, S. Nabavi, Mimi Xie, Caiwen Ding","doi":"10.1109/ISQED57927.2023.10129313","DOIUrl":"https://doi.org/10.1109/ISQED57927.2023.10129313","url":null,"PeriodicalId":315053,"journal":{"name":"2023 24th International Symposium on Quality Electronic Design (ISQED)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122877883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
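The Fβ score reported in the abstract above combines precision and recall, with β controlling how strongly recall is weighted. The sketch below shows the standard definition only; the abstract does not state the value of β or the confusion-matrix counts, so `beta=2.0` and the counts in the usage lines are illustrative assumptions, and the function name `f_beta` is hypothetical.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score from confusion-matrix counts.

    beta > 1 weights recall more than precision; beta = 1 gives the F1 score.
    The default beta is an assumption, not a value taken from the abstract.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(f_beta(tp=90, fp=10, fn=5, beta=2.0))
    print(f_beta(tp=90, fp=10, fn=5, beta=1.0))
```

For an arrhythmia detector, a β above 1 penalizes missed events (false negatives) more than false alarms, which is one plausible reason to report Fβ rather than plain accuracy.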
Pub Date: 2023-04-05 DOI: 10.1109/ISQED57927.2023.10129317
Noah Zins, Hongyu An
Deep learning has achieved remarkable success through training with massive labeled datasets. However, this heavy demand for data limits the feasibility of deep learning in edge computing scenarios, where labeled data are scarce. Rather than relying on labeled data, animals learn by interacting with their surroundings and memorizing the relationships between concurrent events. This learning paradigm is referred to as associative memory. A successful implementation of associative memory could enable self-learning schemes analogous to those of animals and thereby address these challenges of deep learning. State-of-the-art implementations of associative memory are limited to small-scale, offline paradigms. In this work, we therefore implement associative memory learning with an unmanned ground vehicle (UGV) and neuromorphic chips (Intel Loihi) in an online learning scenario. Our system reproduces classic associative memory in rats. Specifically, it reproduces fear conditioning with no pretraining procedure or labeled datasets. In our experiments, the UGV serves as a substitute for the rats: it autonomously memorizes the cause-and-effect relationship between the light stimulus and the vibration stimulus, then exhibits a movement response. During associative memory learning, the synaptic weights are updated by Hebbian learning. The Intel Loihi chip is integrated into our online learning system to process visual signals. Its average power consumption for computing logic and memory is 30 mW and 29 mW, respectively.
{"title":"Reproducing Fear Conditioning of Rats with Unmanned Ground Vehicles and Neuromorphic Systems","authors":"Noah Zins, Hongyu An","doi":"10.1109/ISQED57927.2023.10129317","DOIUrl":"https://doi.org/10.1109/ISQED57927.2023.10129317","url":null,"PeriodicalId":315053,"journal":{"name":"2023 24th International Symposium on Quality Electronic Design (ISQED)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134619060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
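The Hebbian association between a light stimulus and a vibration stimulus described in the abstract above can be illustrated with a toy model. This is a minimal sketch in plain Python, not the authors' Loihi implementation: the single threshold neuron, the initial weights, the learning rate `eta`, and all function names are assumptions chosen for illustration.

```python
def respond(weights, inputs, threshold=1.0):
    """Return 1 (move) if the weighted input drive reaches the threshold, else 0."""
    drive = sum(w * x for w, x in zip(weights, inputs))
    return 1 if drive >= threshold else 0


def hebbian_step(weights, inputs, eta=0.25):
    """Apply one Hebbian update: dw_i = eta * pre_i * post."""
    post = respond(weights, inputs)
    return [w + eta * x * post for w, x in zip(weights, inputs)]


def condition(paired_trials, eta=0.25):
    """Pair light with vibration for a number of trials and return the weights."""
    # Inputs are [light, vibration]; vibration starts as an innate trigger,
    # light starts with no effect (both values are assumptions).
    weights = [0.0, 1.0]
    for _ in range(paired_trials):
        weights = hebbian_step(weights, [1, 1], eta)
    return weights


if __name__ == "__main__":
    naive = condition(0)
    trained = condition(4)
    print("light alone before pairing:", respond(naive, [1, 0]))
    print("light alone after pairing: ", respond(trained, [1, 0]))
```

With these assumed values, light alone does not trigger a response before pairing and does after four paired trials. The plain Hebbian rule used here lets weights grow without bound; a practical system would typically add a bound or decay term.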
Pub Date: 2023-04-05 DOI: 10.1109/ISQED57927.2023.10129400
V. Kumari, Maya Chandrakar, M. Majumder
Continuous technology scaling is driving the microelectronics industry into the nanoscale regime, where fabrication-related defects such as electromigration-induced open/short faults, interfacial cracks, and thermal-stress-induced leakage dominate the overall performance of a through-silicon via (TSV). Among them, interfacial cracking plays a pivotal role in the long-term service reliability of the chip. This paper therefore provides equivalent RLGC fault modeling and performance analysis of thermo-mechanical delamination in TSVs, known as interfacial cracks. Considering the MOS effect, an analytical expression is derived from the defect parameters to analyze the feasibility and reliability of defective TSVs at different crack widths and heights. Using a driver-via-load (DVL) setup with a CMOS driver, performance is analyzed in terms of power dissipation, power-delay product (PDP), and dynamic crosstalk delay. Encouragingly, for a TSV with an interfacial crack, power and crosstalk delay improve by 74.4% and 65.5%, respectively, at the minimum crack length, approaching the defect-free condition.
{"title":"Performance Analysis of Cylindrical Through Silicon Via with Interfacial Crack","authors":"V. Kumari, Maya Chandrakar, M. Majumder","doi":"10.1109/ISQED57927.2023.10129400","DOIUrl":"https://doi.org/10.1109/ISQED57927.2023.10129400","url":null,"PeriodicalId":315053,"journal":{"name":"2023 24th International Symposium on Quality Electronic Design (ISQED)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134077151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-04-05 DOI: 10.1109/ISQED57927.2023.10129344
Joseph Lindsay, Ramtin Zand
Work in quantum machine learning (QML) over the past few years indicates that QML algorithms can perform as well as their classical counterparts and even outperform them in some cases. Many current QML models take advantage of variational quantum algorithm (VQA) circuits, because their scale is typically small enough to be compatible with noisy intermediate-scale quantum (NISQ) devices and because automatic differentiation, the method used to optimize circuit parameters, is familiar from machine learning (ML). While these results bear interesting promise for an era when quantum machines are more readily accessible, achieving similar results through non-quantum methods may offer practitioners a more near-term advantage. To this end, this work investigates stochastic methods inspired by a variational quantum version of the long short-term memory (LSTM) model, in an attempt to approach the reported successes in performance and rapid convergence. By analyzing the performance of classical, stochastic, and quantum methods, this work aims to elucidate whether some of QML’s major reported benefits can be achieved on classical machines by incorporating aspects of its stochasticity.
{"title":"A Novel Stochastic LSTM Model Inspired by Quantum Machine Learning","authors":"Joseph Lindsay, Ramtin Zand","doi":"10.1109/ISQED57927.2023.10129344","DOIUrl":"https://doi.org/10.1109/ISQED57927.2023.10129344","url":null,"PeriodicalId":315053,"journal":{"name":"2023 24th International Symposium on Quality Electronic Design (ISQED)","volume":"207 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114216269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-04-05 DOI: 10.1109/ISQED57927.2023.10129326
Honghao Zheng, Y. Yi
Spiking neural networks (SNNs) have attracted increasing research attention because of their event-based nature, which makes them more power-efficient than conventional artificial neural networks. To convert information into spikes, SNNs need an encoding process. With temporal encoding schemes, an SNN can extract temporal patterns from the original information. A more advanced scheme is multiplexing temporal encoding, which combines several encoding schemes with different timescales to achieve a larger information density and dynamic range. The spike-timing-dependent plasticity (STDP) learning algorithm is then used to train the SNN, since the SNN cannot be trained with regular training algorithms such as backpropagation. In this work, a spiking-domain feature extraction neural network with temporal multiplexing encoding is designed in EAGLE and fabricated on a printed circuit board (PCB). The testbench’s power consumption is 400 mW. The test results show that the network on the PCB can convert the input information into multiplexing temporally encoded spikes and then use the spikes to adjust the synaptic weight voltage.
{"title":"Spiking Domain Feature Extraction with Temporal Dynamic Learning","authors":"Honghao Zheng, Y. Yi","doi":"10.1109/ISQED57927.2023.10129326","DOIUrl":"https://doi.org/10.1109/ISQED57927.2023.10129326","url":null,"PeriodicalId":315053,"journal":{"name":"2023 24th International Symposium on Quality Electronic Design (ISQED)","volume":"364 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124924571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
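The STDP rule mentioned in the abstract above adjusts a synaptic weight according to the relative timing of presynaptic and postsynaptic spikes. The sketch below shows the common pair-based exponential form in software; it is not the analog PCB circuit from the paper, and the amplitudes `a_plus`, `a_minus`, the time constants, and the function name `stdp_delta` are assumed values for illustration.

```python
import math


def stdp_delta(t_pre, t_post, a_plus=0.01, a_minus=0.012,
               tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP weight change for one pre/post spike pair (times in ms).

    Pre before post (dt > 0) potentiates; post before pre (dt < 0) depresses.
    All parameter defaults are illustrative assumptions.
    """
    dt = t_post - t_pre
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0


if __name__ == "__main__":
    for t_post in (5.0, 20.0, -5.0, -20.0):
        print(t_post, stdp_delta(t_pre=0.0, t_post=t_post))
```

The magnitude of the change decays as the two spikes move further apart in time, so closely timed spike pairs dominate learning.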
Pub Date: 2023-04-05 DOI: 10.1109/ISQED57927.2023.10129336
Zihao Chen, Songlei Meng, F. Yang, Li Shang, Xuan Zeng
With ever-increasing design complexity and stringent time-to-market pressure, automated topology synthesis tools for operational amplifiers are required to produce designs that meet different specifications. This paper proposes TOTAL, a reinforcement learning-based topology optimization method for operational amplifiers. We formulate circuit topology design as a Markov decision process to cope with the high dimensionality of the design space, with the three-stage cascode paradigm fixed to avoid meaningless structures. Starting from a basic behavior-level topology, an agent modifies the circuit step by step. Specifically, the agent mainly uses a graph neural network to understand each design state, including the specifications and the design history, and a convolutional neural network to modify the current topology. Every completed circuit is then simulated and evaluated by a customized reward function that guides the agent toward qualified circuits, among which only the best one recorded is mapped to the transistor level for further evaluation. Experimental results show that the trained agent not only generates high-performance circuits but is also reusable: transferred to other specifications as a pre-trained model, it achieves competitive results.
{"title":"TOTAL: Topology Optimization of Operational Amplifier via Reinforcement Learning","authors":"Zihao Chen, Songlei Meng, F. Yang, Li Shang, Xuan Zeng","doi":"10.1109/ISQED57927.2023.10129336","DOIUrl":"https://doi.org/10.1109/ISQED57927.2023.10129336","url":null,"PeriodicalId":315053,"journal":{"name":"2023 24th International Symposium on Quality Electronic Design (ISQED)","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125071972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-04-05 DOI: 10.1109/ISQED57927.2023.10129299
Patricia Gonzalez-Guerrero, Kylie Huch, N. Patra, Thom Popovici, George Michelogiannakis
In superconducting circuits, information is carried by ps-wide, µV-tall single flux quantum (SFQ) pulses. These circuits can operate at frequencies of hundreds of GHz with orders of magnitude lower switching energy than complementary metal-oxide-semiconductor (CMOS) circuits. However, under the stringent area constraints of modern superconductor technologies, fully-fledged, CMOS-inspired superconducting architectures cannot be fabricated at large scales. Unary SFQ (U-SFQ) is an alternative computing paradigm that addresses these area constraints. In U-SFQ, information is mapped to a combination of SFQ pulse streams and temporal-domain encoding. In this work, we propose a U-SFQ convolutional neural network (CNN) hardware accelerator that achieves peak performance comparable to state-of-the-art superconducting binary (B-SFQ) approaches in 32× less area. CNNs can operate with 5 to 8 bits of resolution with no significant degradation in classification accuracy. The proposed CNN accelerator readily supports this variable resolution and, for fewer than 7 bits, yields 5×-63× better performance than CMOS and 15×-173× better area efficiency than B-SFQ.
{"title":"An Area Efficient Superconducting Unary CNN Accelerator","authors":"Patricia Gonzalez-Guerrero, Kylie Huch, N. Patra, Thom Popovici, George Michelogiannakis","doi":"10.1109/ISQED57927.2023.10129299","DOIUrl":"https://doi.org/10.1109/ISQED57927.2023.10129299","url":null,"PeriodicalId":315053,"journal":{"name":"2023 24th International Symposium on Quality Electronic Design (ISQED)","volume":"193 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121727866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
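One way to picture how a value carried by a pulse stream can be combined with a value carried in the temporal domain is to gate a rate-coded stream with a time window. The sketch below is a software illustration of that general unary idea only; it is an assumption, not the circuit described in the paper, and the function names and the slot-based time model are hypothetical.

```python
def pulse_stream(count, slots):
    """Return a 0/1 list with `count` pulses spread evenly over `slots` time slots."""
    return [((i + 1) * count) // slots - (i * count) // slots
            for i in range(slots)]


def gated_count(stream, open_slots):
    """Count the pulses that pass while a gate stays open for the first `open_slots` slots."""
    return sum(stream[:open_slots])


def unary_multiply(a, b, slots):
    """Approximate (a/slots) * (b/slots), returned as a pulse count out of `slots`.

    `a` is encoded as a pulse rate, `b` as the duration of the gating window.
    """
    if not (0 <= a <= slots and 0 <= b <= slots):
        raise ValueError("operands must lie in the range 0..slots")
    return gated_count(pulse_stream(a, slots), b)


if __name__ == "__main__":
    # 8/16 * 12/16 = 0.375, i.e. 6 pulses out of 16 slots.
    print(unary_multiply(8, 12, 16))
```

In this model the resolution is set by the number of slots, so changing the bit width only changes the length of the window rather than the structure of the multiplier.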
Pub Date: 2023-04-05 DOI: 10.1109/ISQED57927.2023.10129352
Kangjun Bai, Daniel Titcombe, Jack Lombardi, C. Thiem, N. Cady
In-memory computing is an emerging computing paradigm that sidesteps challenges inherent to deep learning acceleration in conventional systems. Along with the development of neuromorphic architectures, resistive random-access memory (RRAM) has paved the way for in-memory computing by processing mixed-signal operations in a fully parallel fashion. In this work, we designed and implemented working prototypes of in-memory operators using a custom 65 nm CMOS/RRAM technology node fabricated on a 300 mm wafer. Specifically, arrays of hafnium-oxide RRAM cells were built in a crossbar structure to support high-throughput matrix multiplication at low energy and area cost. Building upon these efficient RRAM arrays, applications to pixel detection and flow-based Boolean operations are presented. The approaches we introduce reduce intermediate data movement and parallelize the computations, thereby yielding orders-of-magnitude improvements in energy and area efficiency over the equivalent CMOS design.
{"title":"Moving Towards Game-Changing Technology: Fabrication and Application of HfO2 RRAM for In-Memory Computing","authors":"Kangjun Bai, Daniel Titcombe, Jack Lombardi, C. Thiem, N. Cady","doi":"10.1109/ISQED57927.2023.10129352","DOIUrl":"https://doi.org/10.1109/ISQED57927.2023.10129352","url":null,"PeriodicalId":315053,"journal":{"name":"2023 24th International Symposium on Quality Electronic Design (ISQED)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124302488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
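A crossbar of resistive cells performs a matrix-vector multiplication in the analog domain: each cell contributes a current equal to its conductance times the row voltage (Ohm's law), and the currents on each column add up (Kirchhoff's current law). The sketch below models that idealized behavior in software; it ignores wire resistance, device nonlinearity, and noise, and the conductance and voltage values are made-up assumptions rather than measurements from the paper.

```python
def crossbar_mvm(conductances, voltages):
    """Ideal crossbar read-out: column current I_j = sum_i G[i][j] * V[i].

    conductances: list of rows, each a list of cell conductances in siemens.
    voltages: one input voltage per row, in volts.
    Returns one output current per column, in amperes.
    """
    rows = len(conductances)
    if rows == 0 or len(voltages) != rows:
        raise ValueError("need one voltage per crossbar row")
    cols = len(conductances[0])
    return [sum(conductances[i][j] * voltages[i] for i in range(rows))
            for j in range(cols)]


if __name__ == "__main__":
    g = [[1e-6, 2e-6],   # row 0 conductances (illustrative values)
         [3e-6, 4e-6]]   # row 1 conductances
    v = [0.1, 0.2]       # row voltages
    print(crossbar_mvm(g, v))
```

All column currents are produced in a single read step, which is the source of the parallelism the abstract refers to; the stored conductances play the role of the matrix weights, so no weight data needs to move during the computation.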
Pub Date: 2023-04-05 DOI: 10.1109/ISQED57927.2023.10129333
S. Koranne
Advances in VLSI process technology have created a significant computational burden for process calibration and for the generation of capacitance tables of pre-characterized 2D cross-section layouts from physical parameters such as dielectrics and layer thicknesses. In this paper, we describe a high-level synthesis approach to designing hardware accelerators that alleviate this burden. We present the design of a hardware accelerator that can produce capacitance tables for multiple layer and corner combinations. An innovative approach (lambda compression) to reduce the volume of output is also described. Simulation shows that our pipelined superscalar approach can generate and solve a capacitance problem in an amortized 4 µs at a 500 MHz clock, which is three orders of magnitude faster than state-of-the-art software-based solutions. Interestingly, the optimizations suggested by a hardware implementation also give very good results on a CPU implementation, offering yet another approach to software optimization.
{"title":"Design of Hardware Accelerators to Compute Parametric Capacitance Tables","authors":"S. Koranne","doi":"10.1109/ISQED57927.2023.10129333","DOIUrl":"https://doi.org/10.1109/ISQED57927.2023.10129333","url":null,"PeriodicalId":315053,"journal":{"name":"2023 24th International Symposium on Quality Electronic Design (ISQED)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130134598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
While most gate-level hardware Trojan detection techniques strive to detect as many suspicious nets as possible, this paper suggests another direction: identifying only a few suspicious nets in order to reduce the subsequent manual investigation effort, since there is no need to trace multiple suspicious nets that lead to the same Trojan module. To accomplish this goal, we adopt a collaborative approach that combines structure-based, testability-based, and behavior-based analysis to minimize the number of suspicious Trojan nets. Extensive experiments are conducted with the Trust-HUB benchmarks and an industrial processor. The results show (1) a high precision of 95.39%, meaning most identified nets are actual Trojan nets; (2) a high true negative rate of 99.99%, meaning most normal nets are correctly identified as non-suspicious; (3) 44% fewer suspicious nets, which greatly reduces the subsequent manual investigation effort; and (4) detection of 100% of the Trojan modules.
{"title":"Focusing on the Key Suspicious Trojan Nets with a Collaborative Approach","authors":"Shih-Jung Pao, Chuan-Pin Huang, Yen-Chi Peng, Ing-Jer Huang","doi":"10.1109/ISQED57927.2023.10129361","DOIUrl":"https://doi.org/10.1109/ISQED57927.2023.10129361","url":null,"PeriodicalId":315053,"journal":{"name":"2023 24th International Symposium on Quality Electronic Design (ISQED)","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128052756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
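The precision and true negative rate quoted in the abstract above are standard confusion-matrix metrics over nets, where a "positive" is a net flagged as suspicious. The sketch below shows how the two are computed; the net counts in the usage lines are made-up assumptions, not the paper's data, and the function names are hypothetical.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of flagged nets that are actual Trojan nets: TP / (TP + FP)."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def true_negative_rate(tn: int, fp: int) -> float:
    """Fraction of normal nets correctly left unflagged: TN / (TN + FP)."""
    normal = tn + fp
    return tn / normal if normal else 0.0


if __name__ == "__main__":
    # Illustrative counts only: 20 flagged nets, 19 of them real Trojan nets,
    # in a design with 10,000 normal nets.
    print(precision(tp=19, fp=1))
    print(true_negative_rate(tn=9999, fp=1))
```

Because normal nets vastly outnumber Trojan nets in a typical netlist, the true negative rate stays close to 100% even with several false alarms, which is why precision is the more informative of the two for judging the manual investigation effort.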