Andre Luiz Pereira de Franca, R. Jasinski, V. Pedroni, A. Santin
Software-based network security is constantly challenged by the increase in network speeds and number of attacks. At the same time, mobile network access underscores the need for energy efficiency. In this paper, we present a new way to improve the throughput and to reduce the energy consumption of an anomaly-based intrusion detection system for probing attacks. Our framework implements the same classifier algorithm in software (C++) and in hardware (synthesizable VHDL), and then compares the energy efficiency of the two approaches. Our results for a decision tree classifier show that the hardware version consumed only 0.03% of the energy used by the same algorithm in software, even though the hardware version operates with a throughput that is 15 times that of the software version.
{"title":"Moving Network Protection from Software to Hardware: An Energy Efficiency Analysis","authors":"Andre Luiz Pereira de Franca, R. Jasinski, V. Pedroni, A. Santin","doi":"10.1109/ISVLSI.2014.89","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.89","journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI"},"publicationDate":"2014-07-09"}
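The energy figure in the abstract above can be restated as a reduction factor. The short Python sketch below does that arithmetic; the function names are illustrative and do not come from the paper. The second calculation also assumes that the 0.03% figure was measured over the same workload for both versions, which the abstract does not state, so the resulting average-power ratio is only a rough estimate.

```python
def energy_reduction_factor(hw_percent_of_sw: float) -> float:
    """Return how many times less energy the hardware uses,
    given its energy as a percentage of the software energy."""
    return 100.0 / hw_percent_of_sw


def average_power_ratio(hw_percent_of_sw: float, speedup: float) -> float:
    """Hardware/software average-power ratio, ASSUMING equal workloads.

    Power = energy / time. If the hardware finishes the same work
    `speedup` times faster, its run time is 1/speedup of the software's,
    so the power ratio is the energy ratio multiplied by the speedup.
    """
    energy_ratio = hw_percent_of_sw / 100.0
    return energy_ratio * speedup


if __name__ == "__main__":
    # Figures quoted in the abstract: 0.03% of the energy, 15x the throughput.
    print(f"energy reduction: about {energy_reduction_factor(0.03):.0f}x")
    print(f"average power ratio: {average_power_ratio(0.03, 15):.4f}")
```

Under that equal-workload assumption, the hardware would still draw well under 1% of the software's average power despite running faster.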
FPGAs are now widely deployed. Besides offering powerful computation capacity, contemporary FPGAs also provide security features such as bitstream protection. The security of these features depends on the security of the keys embedded in the FPGA, which are usually generated by the vendor. This architecture has a shortcoming: the FPGA vendor knows every key and becomes the root of trust. In this work, we propose a key generation method based on bilinear pairings that enables the user of the FPGA to interact with the device to generate keys. The generated keys depend on input from both the user and the device, so the vendor cannot learn them. Furthermore, we offer a method that allows the user to verify that the generated keys are related to the user's input. Finally, we conduct experiments that indicate the effectiveness of our scheme.
{"title":"Removing the Root of Trust: Secure Oblivious Key Establishment for FPGAs","authors":"Lei Xu, W. Shi","doi":"10.1109/ISVLSI.2014.49","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.49","journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI"},"publicationDate":"2014-07-09"}
This paper presents the design of a low-power, low-noise variable-gain amplifier (VGA) interface circuit. The proposed VGA is designed to interface with a capacitive micromachined ultrasonic transducer (CMUT). Owing to its small area and low power consumption, the circuit is suitable for in-probe imaging, where the VGA is interfaced with an in-probe ADC that performs all digital conversion inside the probe. The VGA maps the attenuated signal received from the CMUT to the full dynamic range of the ADC. The circuit produces a differential output from an ultrasound sensor based on a zero-bias CMUT, which eliminates the requirement for an external high-voltage DC bias. The single-ended-to-differential conversion is therefore carried out by steering the current from both electrodes of the CMUT, without the need for high-voltage design. The VGA is designed and simulated in 65 nm CMOS technology. Its gain varies linearly from 0 to 20 dB. A noise figure (NF) of 3 dB is estimated for a CMUT with a 5 MHz center frequency, with a power consumption of only 80 µW and a total area of 0.008 mm², which makes the circuit well suited to interfacing with the in-probe ADC. The layout is based on standard unit shapes, which results in pattern regularity and density uniformity. This keeps the post-layout simulation results close to the pre-layout results and also gives better matching in the layout.
{"title":"A Low-Noise Variable-Gain Amplifier for in-Probe 3D Imaging Applications Based on CMUT Transducers","authors":"H. Attarzadeh, T. Ytterdal","doi":"10.1109/ISVLSI.2014.113","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.113","journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI"},"publicationDate":"2014-07-09"}
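To make the decibel figures in the VGA abstract above concrete, the following Python sketch converts them to linear ratios. It assumes the 0 to 20 dB range refers to voltage gain (the abstract does not say whether voltage or power gain is meant), and the function names are illustrative only.

```python
def db_to_voltage_ratio(gain_db: float) -> float:
    """Convert a gain in dB to a linear voltage (amplitude) ratio."""
    return 10 ** (gain_db / 20)


def db_to_power_ratio(value_db: float) -> float:
    """Convert a value in dB to a linear power ratio.

    A noise figure in dB converts to a noise factor this way.
    """
    return 10 ** (value_db / 10)


if __name__ == "__main__":
    # ASSUMPTION: the 0..20 dB gain range is a voltage gain.
    for gain_db in (0, 10, 20):
        print(f"{gain_db:>2} dB -> voltage gain {db_to_voltage_ratio(gain_db):.2f}")
    # A 3 dB noise figure corresponds to a noise factor of roughly 2.
    print(f"NF 3 dB -> noise factor {db_to_power_ratio(3):.3f}")
```

Read this way, the top of the gain range is a tenfold amplitude gain, and the 3 dB noise figure means the amplifier roughly doubles the noise power relative to an ideal noiseless stage.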
Pratik Dutta, Chandan Bandyopadhyay, C. Giri, H. Rahaman
In this work, we present an efficient reversible implementation of a carry-lookahead adder (CLA) in the all-optical domain. Nowadays, the semiconductor optical amplifier (SOA)-based Mach-Zehnder interferometer (MZI) plays a vital role in ultra-fast all-optical signal processing. We use all-optical MZI switches to design a CLA circuit with reversible functionality. Two approaches are proposed. First, we propose a hierarchical approach for implementing a 2n-bit reversible CLA. In the second approach, we remove the drawback of the hierarchical CLA and improve the design by implementing a non-modular staircase structure for an n-bit reversible CLA. The design complexities of both approaches are computed. Experimental results show that the optical cost and delay incurred by the staircase-structured reversible CLA are much lower than those of recently reported designs.
{"title":"Mach-Zehnder Interferometer Based All Optical Reversible Carry-Lookahead Adder","authors":"Pratik Dutta, Chandan Bandyopadhyay, C. Giri, H. Rahaman","doi":"10.1109/ISVLSI.2014.102","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.102","journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI"},"publicationDate":"2014-07-09"}
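For readers unfamiliar with the carry-lookahead principle behind the adder entry above, here is a minimal Python sketch of the conventional Boolean formulation: each carry is computed directly from the generate and propagate signals and the input carry, not from the previous carry. It is not the paper's optical, reversible MZI design, and all names are illustrative.

```python
def lookahead_carry(g, p, c0, i):
    """Carry into bit i, computed only from g, p and the input carry c0.

    c_i = g[i-1] | p[i-1]&g[i-2] | ... | p[i-1]&...&p[0]&c0
    """
    # Term in which the input carry propagates through bits 0..i-1.
    term = c0
    for k in range(i):
        term &= p[k]
    carry = term
    # Terms in which a carry generated at bit j propagates through j+1..i-1.
    for j in range(i):
        term = g[j]
        for k in range(j + 1, i):
            term &= p[k]
        carry |= term
    return carry


def carry_lookahead_add(a, b, n, carry_in=0):
    """Add two n-bit unsigned integers; return (n-bit sum, carry out)."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    g = [x & y for x, y in zip(a_bits, b_bits)]  # generate signals
    p = [x ^ y for x, y in zip(a_bits, b_bits)]  # propagate signals
    carries = [lookahead_carry(g, p, carry_in, i) for i in range(n + 1)]
    total = sum((p[i] ^ carries[i]) << i for i in range(n))
    return total, carries[n]


if __name__ == "__main__":
    print(carry_lookahead_add(0b1011, 0b0110, 4))  # 11 + 6 on 4 bits
```

In hardware, every `lookahead_carry` expression is evaluated in parallel, which is what removes the ripple delay; the nested loops here only spell out the product terms.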
High-performance, energy-efficient video analytics systems that can extract rich metadata from voluminous visual content will enable a variety of high-value surveillance, driver assistance, video tagging, and first-person analytics systems. These big-data applications are pervasive across the retail, automotive, medical, agriculture, and security domains. However, current trends in general-purpose and multicore architectures will not keep pace with the growing computational demands of cutting-edge visual perception algorithms. Hardware acceleration is crucial to surpassing what is realizable on modern multicore and GPGPU architectures. In this paper, we detail a Sea-of-Accelerators (SoA) platform that combines macro-accelerators, micro-accelerators, and lightweight processors to achieve high performance and energy efficiency in video analytics applications. We describe a framework for video and image analytics and highlight its benefits with a case study of a customized visual saliency accelerator. We describe the architecture of a full-custom macro-accelerator that is suitable when raw performance is of critical importance. As an alternative, we illustrate the composition of an accelerator from loosely coupled micro-accelerators and evaluate the performance achievable when performance and flexibility are competing objectives.
{"title":"Achieving High-Performance Video Analytics with Lightweight Cores and a Sea of Hardware Accelerators","authors":"K. Irick, Nandhini Chandramoorthy","doi":"10.1109/ISVLSI.2014.112","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.112","journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI"},"publicationDate":"2014-07-09"}
In this paper, we discuss the potential of emerging spin-torque devices for computing applications. Recent proposals for spin-based computing schemes may be differentiated as all-spin vs. hybrid, programmable vs. fixed, and Boolean vs. non-Boolean. All-spin logic styles may offer high area density due to the small form factor of nano-magnetic devices. However, circuit- and system-level design techniques need to be explored that leverage the specific spin-device characteristics to achieve energy efficiency, performance, and reliability comparable to those of CMOS. The non-volatility of nano-magnets can be exploited in the design of energy- and area-efficient programmable logic. In such logic styles, spin devices may play the dual role of computing elements and memory elements that provide field programmability. Spin-based threshold logic design is presented as an example. Emerging spintronic phenomena may lead to ultra-low-voltage, current-mode spin-torque switches that offer attractive computing capabilities beyond those of digital switches. Such devices may be suitable for non-Boolean data-processing applications that involve analog processing, leading to highly energy-efficient information processing hardware for applications such as pattern matching, neuromorphic computing, image processing, and data conversion. Finally, we discuss the possibility of applying emerging spin-torque switches in the design of energy-efficient global interconnects for future chip multiprocessors.
{"title":"Computing with Spin-Transfer-Torque Devices: Prospects and Perspectives","authors":"K. Roy, M. Sharad, Deliang Fan, K. Yogendra","doi":"10.1109/ISVLSI.2014.120","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.120","journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI"},"publicationDate":"2014-07-09"}
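Threshold logic, which the spin-torque abstract above cites as an example, is commonly defined as a gate that outputs 1 when a weighted sum of its binary inputs reaches a threshold. The Python sketch below illustrates only that generic definition. The weights and thresholds are illustrative assumptions, and nothing here models how a spin device would realize the gate.

```python
from typing import Sequence


def threshold_gate(inputs: Sequence[int], weights: Sequence[int], threshold: int) -> int:
    """Return 1 if the weighted sum of binary inputs reaches the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0


# Illustrative configurations (hypothetical, chosen for this example):
def and2(a: int, b: int) -> int:
    return threshold_gate((a, b), (1, 1), threshold=2)


def or2(a: int, b: int) -> int:
    return threshold_gate((a, b), (1, 1), threshold=1)


def majority3(a: int, b: int, c: int) -> int:
    return threshold_gate((a, b, c), (1, 1, 1), threshold=2)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", and2(a, b), "OR:", or2(a, b))
```

Changing only the threshold turns the same two-input gate from AND into OR, which is the kind of reconfigurability that makes threshold gates a natural fit for programmable logic.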
This paper presents a 1 MHz ring oscillator with a simple temperature compensation circuit dedicated to MEMS sensor applications. A closed-loop PTAT voltage source follower coupled with an open-loop replica-biased source follower driving structure is proposed to power the CMOS ring oscillator and counteract its temperature dependence. In conjunction with a pseudo-resistor-based low-pass filter that lowers the circuit noise, it significantly improves the jitter performance. The circuit is realized in 0.35 μm CMOS technology with a 5 V supply. At the typical process corner, the frequency variation of the compensated oscillator over the temperature range of −40°C to +90°C is −0.1% to +0.19% (22.3 ppm/°C), compared with an uncompensated frequency variation of −12.1% to +21.6% (2592 ppm/°C). At the worst-case process corner, the frequency deviates by −1.5% to +1.9% over the stated temperature span, and by −1.2% to +1.2% for a ±10% supply variation at a 1 MHz oscillation frequency. The proposed ring oscillator is insensitive to PVT variations. The simulated cycle-to-cycle RMS jitter due to both the intrinsic circuit noise and 200 mV peak-to-peak power supply noise is only 64 ps, which is about 50 times smaller than that of standard ring oscillators under identical process technology and design conditions.
{"title":"A Compact CMOS Ring Oscillator with Temperature and Supply Compensation for Sensor Applications","authors":"Yanmei Wang, P. K. Chan, K. Li","doi":"10.1109/ISVLSI.2014.15","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.15","journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI"},"publicationDate":"2014-07-09"}
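The ppm/°C figures quoted in the ring-oscillator abstract above are consistent with dividing the total frequency spread by the temperature span. The Python sketch below reproduces that arithmetic; the function name is illustrative, and the assumption that the authors used this "box" calculation is inferred from the numbers and not stated in the abstract.

```python
def temp_coefficient_ppm_per_c(dev_min_percent: float, dev_max_percent: float,
                               t_min_c: float, t_max_c: float) -> float:
    """Total frequency spread (in ppm) divided by the temperature span (in °C)."""
    spread_ppm = (dev_max_percent - dev_min_percent) * 1e4  # 1% = 10,000 ppm
    return spread_ppm / (t_max_c - t_min_c)


if __name__ == "__main__":
    # Values from the abstract, over -40 °C to +90 °C at the typical corner.
    compensated = temp_coefficient_ppm_per_c(-0.1, 0.19, -40, 90)
    uncompensated = temp_coefficient_ppm_per_c(-12.1, 21.6, -40, 90)
    print(f"compensated:   {compensated:.1f} ppm/°C")
    print(f"uncompensated: {uncompensated:.0f} ppm/°C")
    print(f"improvement:   about {uncompensated / compensated:.0f}x")
```

Both quoted values, 22.3 ppm/°C and 2592 ppm/°C, come out of this calculation, which puts the improvement from compensation at a little over two orders of magnitude.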
Due to the continuous shrinking of semiconductor technology, chip manufacturing processes exhibit more and more variation. From the viewpoint of analyzing the functionality of a chip, variation may change the overall "observed" behavior of the chip. In this paper, we discuss additional delays caused by variation that may change the observed behavior. In the first part of the paper, we discuss functional changes caused by additional delays on the inputs of each gate in the circuit. Unlike stuck-at faults, such additional delays can introduce many different faulty functions on a gate. For example, for a two-input AND/OR gate, all possible two-input logic functions, of which there are 2^(2^2) = 16, can potentially be observed. This indicates that it may make sense to model faulty behaviors caused by variation as general functional faults rather than as structurally defined faults such as stuck-at faults. Such additional delays can also occur in multiple locations simultaneously. As a result, there are so many possible fault combinations to consider that they are difficult to analyze with traditional automatic test pattern generation (ATPG) methods, which drop detectable faults using fault simulators with an explicit representation of faults. In the second part of the paper, we therefore discuss ATPG methods in which test pattern generation and fault dropping are unified. Because faults are represented implicitly, ATPG may still succeed even when the number of simultaneous faults is large.
{"title":"Variation-Aware Analysis and Test Pattern Generation Based on Functional Faults","authors":"M. Fujita","doi":"10.1109/ISVLSI.2014.116","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.116","journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI"},"publicationDate":"2014-07-09"}
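The count of 16 two-input functions mentioned in the functional-fault abstract above follows from there being 2^2 = 4 input combinations, each of which can map to 0 or 1. The Python sketch below enumerates them as truth tables; the variable names are illustrative, and the code does not model the delay mechanism described in the paper.

```python
from itertools import product

# The four input combinations of a two-input gate: (0,0), (0,1), (1,0), (1,1).
INPUTS = list(product((0, 1), repeat=2))

# Every function is one choice of output bit per input combination.
ALL_FUNCTIONS = list(product((0, 1), repeat=len(INPUTS)))


def truth_table(fn):
    """Return the output column of fn over INPUTS as a tuple."""
    return tuple(fn(a, b) for a, b in INPUTS)


AND_TABLE = truth_table(lambda a, b: a & b)
OR_TABLE = truth_table(lambda a, b: a | b)

if __name__ == "__main__":
    print(f"{len(ALL_FUNCTIONS)} possible two-input functions")
    for table in ALL_FUNCTIONS:
        label = {AND_TABLE: "AND", OR_TABLE: "OR"}.get(table, "")
        print(table, label)
```

A stuck-at fault model covers only a few of these tables for a given gate (for example, constant 0 or constant 1 at the output), which is why the abstract argues for a general functional fault model.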
Clock distribution networks carry one of the most important signals in a synchronous integrated circuit. This signal may be altered by radiation effects, causing abnormal behavior in the system. In this work, we analyze two types of clock distribution network on the same circuit to determine which is more sensitive to radiation threats. Using a case study, we compare a clock tree with a clock mesh. We find that the clock mesh topology is more sensitive to radiation effects than the traditional tree distribution. This may occur because the clock mesh has a uniform distribution of capacitance, which allows good distribution of the signal, including transient pulses.
{"title":"SET Susceptibility Analysis of Clock Tree and Clock Mesh Topologies","authors":"R. Chipana, F. Kastensmidt","doi":"10.1109/ISVLSI.2014.33","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.33","journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI"},"publicationDate":"2014-07-09"}
Low-swing clocking is a low-power design methodology that scales the clock voltage to decrease the power consumption of the clock distribution network, at the cost of an expected degradation in performance. In this work, a novel low-swing clock tree synthesis methodology is combined with a custom low-swing clock-aware D flip-flop (DFF) design. The low-swing clocking reduces power dissipation, whereas the custom low-swing-aware DFF preserves the performance of the IC. Experiments performed on the three largest circuits of the ISCAS'89 benchmarks, operating at 1 GHz in a 32 nm technology, show that the proposed methodology achieves an average of 16% power savings in the clock tree compared to its full-swing counterpart, while satisfying the same clock skew (50 ps) and slew (150 ps) constraints at the worst-case corner of operation. Moreover, the clock-to-output delay of the low-swing DFF does not increase compared to a traditional full-swing DFF, while the DFF consumes only 1% more power.
{"title":"High Performance Low Swing Clock Tree Synthesis with Custom D Flip-Flop Design","authors":"Can Sitik, Leo Filippini, E. Salman, B. Taskin","doi":"10.1109/ISVLSI.2014.53","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.53","journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI"},"publicationDate":"2014-07-09"}