Andre Luiz Pereira de Franca, R. Jasinski, V. Pedroni, A. Santin
Software-based network security is constantly challenged by the increase in network speeds and number of attacks. At the same time, mobile network access underscores the need for energy efficiency. In this paper, we present a new way to improve the throughput and to reduce the energy consumption of an anomaly-based intrusion detection system for probing attacks. Our framework implements the same classifier algorithm in software (C++) and in hardware (synthesizable VHDL), and then compares the energy efficiency of the two approaches. Our results for a decision tree classifier show that the hardware version consumed only 0.03% of the energy used by the same algorithm in software, even though the hardware version operates with a throughput that is 15 times that of the software version.
{"title":"Moving Network Protection from Software to Hardware: An Energy Efficiency Analysis","authors":"Andre Luiz Pereira de Franca, R. Jasinski, V. Pedroni, A. Santin","doi":"10.1109/ISVLSI.2014.89","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.89","url":null,"abstract":"Software-based network security is constantly challenged by the increase in network speeds and number of attacks. At the same time, mobile network access underscores the need for energy efficiency. In this paper, we present a new way to improve the throughput and to reduce the energy consumption of an anomaly-based intrusion detection system for probing attacks. Our framework implements the same classifier algorithm in software (C++) and in hardware (synthesizable VHDL), and then compares the energy efficiency of the two approaches. Our results for a decision tree classifier show that the hardware version consumed only 0.03% of the energy used by the same algorithm in software, even though the hardware version operates with a throughput that is 15 times that of the software version.","PeriodicalId":405755,"journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122736002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
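The two figures quoted in the abstract above (0.03% of the energy, 15 times the throughput) can be combined into an implied average-power ratio. The sketch below is only a back-of-the-envelope reading, not a result from the paper: it assumes both energy figures refer to the same amount of classification work, so that energy equals average power multiplied by processing time. The variable names are illustrative.

```python
# Figures taken from the abstract.
energy_ratio = 0.0003      # hardware energy / software energy (0.03%)
throughput_ratio = 15      # hardware throughput / software throughput

# Assumption (not stated in the abstract): both versions process the same
# workload, so the hardware finishes in 1/15 of the software's time.
time_ratio = 1 / throughput_ratio

# E = P * t, therefore P_hw / P_sw = (E_hw / E_sw) / (t_hw / t_sw).
power_ratio = energy_ratio / time_ratio

print(f"implied average power ratio: {power_ratio:.2%}")
```

Under that assumption the hardware would draw roughly 0.45% of the software's average power; a different normalization of the energy measurement would change this figure.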
FPGAs are widely deployed nowadays. Besides offering powerful computation capacity, contemporary FPGAs also provide many security features such as bitstream protection. The security of these features depends on the security of the keys embedded in the FPGA, which are usually generated by the vendor. This type of architecture has the shortcoming that the FPGA vendor knows everything and becomes the root of trust. In this work, we propose a key generation method utilizing bilinear pairing that enables the user of the FPGA to interact with the device to generate keys. The generated keys depend on input from both the user and the device, so the vendor cannot learn the keys. Furthermore, we offer a method that allows the user to verify that the generated keys are related to the user's input. Finally, we conduct experiments that indicate the effectiveness of our scheme.
{"title":"Removing the Root of Trust: Secure Oblivious Key Establishment for FPGAs","authors":"Lei Xu, W. Shi","doi":"10.1109/ISVLSI.2014.49","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.49","url":null,"abstract":"FPGAs are widely deployed nowadays. Besides offering powerful computation capacity, contemporary FPGAs also provide many security features such as bitstream protection. The security of these features is dependent on the security of the keys embedded in the FPGA, which is usually generated by the vendor. This type of architecture has a shortcoming that the FPGA vendor knows everything and becomes the root of trust. In this work, we propose a key generation method utilizing bilinear pairing that enables the user of the FPGA to interact with the device to generate keys. The generated keys depend on both the input from the user and the device so vendor cannot learn the keys. Furthermore, we offer a method to allow the user to verify the generated keys to make sure that the keys are related to his input. Finally we conduct some experiments and indicate the effectiveness of our scheme.","PeriodicalId":405755,"journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI","volume":"210 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133536154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents the design of a low-power, low-noise variable-gain amplifier (VGA) interface circuit. The proposed VGA is designed to interface with a Capacitive Micro-Machined Ultrasonic Transducer (CMUT). Owing to its small area and low power consumption, the circuit is suitable for in-probe imaging, where the VGA is interfaced with an in-probe ADC that performs all digital conversion inside the probe. The VGA maps the attenuated signal received from the CMUT to the full dynamic range of the ADC. The circuit produces a differential output from an ultrasound sensor based on a zero-bias CMUT, which eliminates the requirement for an external high-voltage DC bias. The single-ended-to-differential conversion is therefore carried out by steering the current from both electrodes of the CMUT, without the need for high-voltage design. The VGA is designed and simulated in 65 nm CMOS technology. The VGA gain varies linearly from 0 to 20 dB. A noise figure (NF) of 3 dB is estimated for a CMUT with a 5 MHz center frequency, with a power consumption of only 80 µW and a total area of 0.008 mm², which makes the circuit well suited to interfacing with the in-probe ADC. The circuit layout is based on standard unit shapes, which results in pattern regularity and density uniformity. This keeps the post-layout simulation results close to the pre-layout results and also gives better matching in the layout.
{"title":"A Low-Noise Variable-Gain Amplifier for in-Probe 3D Imaging Applications Based on CMUT Transducers","authors":"H. Attarzadeh, T. Ytterdal","doi":"10.1109/ISVLSI.2014.113","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.113","url":null,"abstract":"This paper presents the design of a low power low noise variable gain amplifier(VGA) interface circuit. The VGA circuit proposed is designed for interface with Capacitive Micro-Machined Ultrasonic Transducer (CMUT). Due to the small area and low power consumption, the circuit is suitable for in-probe imaging where the VGA is interfaced with the in-prob ADC which does all the digital conversion inside probe. The VGA circuit maps the attenuated received signal from CMUT to the full dynamic range of the ADC. The circuit is able to produce a differential output from an ultrasound sensor which is based on a Zero-Bias CMUT, where the requirement for an external high voltage dc bias is eliminated. Therefore, the single to differential conversion is carried out through steering the current from both the electrodes of the CMUT without the need for high voltage design. The VGA is designed and simulated with 65nm CMOS technology. The VGA gain varies in linear from 0 -- 20db. A noise figure (NF) of 3dB for a CMUT with 5MHz center frequency is estimated, where the power consumption of only 80uW and the total area of 0:008mm2 is achieved which makes it perfect for the interface to the in probe ADC circuit. The circuit layout design is based on the standard unit shapes which results in pattern regularity and density uniformity. 
This will assure the result from the post-layout simulation close to the pre-layout simulation and also gives a better matching in the layout.","PeriodicalId":405755,"journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114492713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Clock distribution networks carry one of the most important signals in a synchronous integrated circuit. This signal may be altered by radiation effects, causing abnormal behavior in the system. In this work, we analyze two types of clock distribution network on the same circuit to determine which is more sensitive to radiation threats. Using a case study, we compare a clock tree with a clock mesh. We find that the clock mesh topology is more sensitive to radiation effects than the traditional tree distribution. This may occur because a clock mesh has a uniform distribution of capacitance, which allows good distribution of the signal, even of a transient pulse.
{"title":"SET Susceptibility Analysis of Clock Tree and Clock Mesh Topologies","authors":"R. Chipana, F. Kastensmidt","doi":"10.1109/ISVLSI.2014.33","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.33","url":null,"abstract":"Clock distribution networks represent one of the most important signals in a synchronous integrated circuit, this signal may be altered by radiation effects and generate an abnormal behavior in the system. In this work we analyzed two types of clock distribution network using the same circuit to know which of one is more sensitive to radiation threats. Using a case study we compare clock tree with clock mesh. Finally we found that clock mesh topology is more sensitive to radiation effects in comparison with traditional tree distribution. This may occur because clock mesh has a uniform distribution of capacitance and this allows a good distribution of the signal even to transient pulse.","PeriodicalId":405755,"journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114609410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pratik Dutta, Chandan Bandyopadhyay, C. Giri, H. Rahaman
In this work, we present an efficient reversible implementation of a Carry-Lookahead Adder (CLA) in the all-optical domain. Nowadays, the semiconductor optical amplifier (SOA)-based Mach-Zehnder interferometer (MZI) plays a vital role in the field of ultra-fast all-optical signal processing. We use all-optical MZI switches to design a CLA circuit that implements reversible functionality. Two approaches are proposed for designing the CLA circuit. First, we propose a hierarchical approach for implementing a 2n-bit reversible CLA. In the second approach, we remove the drawback of the hierarchical CLA and improve the design by implementing a non-modular staircase structure for an n-bit reversible CLA. The design complexities of both approaches are computed. Experimental results show that the optical cost and delay incurred by the staircase-structured reversible implementation of the CLA are much lower than those of recently reported works.
{"title":"Mach-Zehnder Interferometer Based All Optical Reversible Carry-Lookahead Adder","authors":"Pratik Dutta, Chandan Bandyopadhyay, C. Giri, H. Rahaman","doi":"10.1109/ISVLSI.2014.102","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.102","url":null,"abstract":"In this work, we present an efficient reversible implementation of Carry-Lookahead Adder (CLA) in all-optical domain. Now-a-days, semiconductor optical amplifier (SOA)-based Mach-Zehnder interferometer (MZI) plays a vital role in the field of ultra-fast all-optical signal processing. We have used all optical based Mach-Zehnder Interferometer (MZI) switches to design the CLA circuit implementing reversible functionality. Two approaches are proposed for designing the CLA circuit. First, we propose a hierarchical approach for implementation of 2n-bit reversible CLA. In the second approach, we remove the drawback of hierarchical CLA and improve the design by implementing non-modular staircase structure of n-bit reversible CLA. The design complexities of both the approaches are computed. Experimental result shows that the optical cost and delay incurred in staircase structured reversible implementation of CLA are much less than those proposed in the recently reported works.","PeriodicalId":405755,"journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116738041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
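For readers unfamiliar with the adder discussed in the abstract above, the following is a minimal sketch of conventional carry-lookahead addition using generate and propagate signals. It illustrates only the textbook Boolean logic; it does not model the reversible, MZI-based optical design, the hierarchical 2n-bit structure, or the staircase structure proposed by the authors. The function names are hypothetical.

```python
def lookahead_carry(g, p, c0, i):
    """Carry into bit i, computed directly from g, p and the carry-in c0.

    Uses the flattened expression
    c_i = g[i-1] | p[i-1]g[i-2] | ... | p[i-1]...p[0]c0,
    so no carry depends on a previously computed carry.
    """
    carry = 0
    for j in range(i):
        term = g[j]                  # carry generated at bit j ...
        for k in range(j + 1, i):
            term &= p[k]             # ... and propagated through bits j+1..i-1
        carry |= term
    term = c0                        # carry-in propagated through all lower bits
    for k in range(i):
        term &= p[k]
    return carry | term


def carry_lookahead_add(a, b, n, carry_in=0):
    """Add two n-bit unsigned integers; returns (n-bit sum, carry-out)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]    # generate: a_i AND b_i
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # propagate: a_i XOR b_i
    total = 0
    for i in range(n):
        total |= (p[i] ^ lookahead_carry(g, p, carry_in, i)) << i
    return total, lookahead_carry(g, p, carry_in, n)


print(carry_lookahead_add(11, 6, 4))  # (1, 1): 11 + 6 = 17 = 16 + 1
```

In hardware, each flattened carry expression is evaluated in parallel, which is what distinguishes a CLA from a ripple-carry adder; the nested loops here only spell out the product terms.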
Ivan Ratković, Oscar Palomar, Milan Stanic, O. Unsal, A. Cristal, M. Valero
Selecting an appropriate estimation method for a given technology and design is of crucial interest as the estimations guide future project and design decisions. The accuracy of the estimations of area, timing, and power (metrics of interest) depends on the phase of the design flow and the fidelity of the models. In this research, we use design space exploration of low-power adders as a case study for comparative analysis of two estimation flows: Physical layout Aware Synthesis (PAS) and Place and Route (PnR). We study and compare post-PAS and post-PnR estimations of the metrics of interest and the impact of various design parameters and input switching activity factor (αI). Adders are particularly interesting for this study because they are fundamental microprocessor units, and their design involves many parameters that create a vast design space. We show cases when the post-PAS and post-PnR estimations could lead to different design decisions, especially from a low-power designer point of view. Our experiments reveal that post-PAS results underestimate the side-effects of clock-gating, pipelining, and extensive timing optimizations compared to post-PnR results. We also observe that PnR estimation flow sometimes reports counterintuitive results.
{"title":"Physical vs. Physically-Aware Estimation Flow: Case Study of Design Space Exploration of Adders","authors":"Ivan Ratković, Oscar Palomar, Milan Stanic, O. Unsal, A. Cristal, M. Valero","doi":"10.1109/ISVLSI.2014.14","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.14","url":null,"abstract":"Selecting an appropriate estimation method for a given technology and design is of crucial interest as the estimations guide future project and design decisions. The accuracy of the estimations of area, timing, and power (metrics of interest) depends on the phase of the design flow and the fidelity of the models. In this research, we use design space exploration of low-power adders as a case study for comparative analysis of two estimation flows: Physical layout Aware Synthesis (PAS) and Place and Route (PnR). We study and compare post-PAS and post-PnR estimations of the metrics of interest and the impact of various design parameters and input switching activity factor (αI). Adders are particularly interesting for this study because they are fundamental microprocessor units, and their design involves many parameters that create a vast design space. We show cases when the post-PAS and post-PnR estimations could lead to different design decisions, especially from a low-power designer point of view. Our experiments reveal that post-PAS results underestimate the side-effects of clock-gating, pipelining, and extensive timing optimizations compared to post-PnR results. 
We also observe that PnR estimation flow sometimes reports counterintuitive results.","PeriodicalId":405755,"journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114068685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Due to the continuous shrinking of semiconductor technology, there are more and more variations in the chip manufacturing process. From the viewpoint of analyzing the functionality of a chip, variation may change the overall "observed" behavior of the chip. In this paper, we discuss additional delays caused by variation that may change the observed behavior. In the first part of the paper, we discuss functional changes caused by additional delays on the inputs of each gate in the circuit. Unlike stuck-at faults, such additional delays can introduce many different faulty functions on a gate. For example, for a two-input AND/OR gate, all possible two-input logic functions, of which there are 2^(2^2) = 16, can potentially be observed. This indicates that it may make sense to model faulty behaviors caused by variation as general functional faults rather than structurally defined faults, such as stuck-at faults. Such additional delays can also occur in multiple locations simultaneously. As a result, there can be a very large number of fault combinations to consider, and they are not easy to analyze with traditional automatic test pattern generation (ATPG) methods, which drop detectable faults using fault simulators with an explicit representation of faults. In the second part of the paper, we therefore discuss ATPG methods in which test pattern generation and fault dropping are unified. Because faults are represented implicitly, we may still be able to perform ATPG successfully even if the number of simultaneous faults is large.
{"title":"Variation-Aware Analysis and Test Pattern Generation Based on Functional Faults","authors":"M. Fujita","doi":"10.1109/ISVLSI.2014.116","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.116","url":null,"abstract":"Due to the continuous shrinking of semiconductor technology, there are more and more variations in the process of manufacturing chips. From the viewpoint of analyzing the functionality of a chip, variation may change the overall \"observed\" behavior of the chip. In this paper, we discuss additional delays caused by variation that may generate changes of observed behaviors. In the first part of the paper, we discuss functional changes caused by additional delays on the inputs of each gate in the circuit. Unlike stuck-at faults, such additional delays can introduce many different faulty functions on a gate. For example, in the cases of two-input AND/OR gate, all possible logic functions with two-input, which are 222=16 different functions, can potentially be observed. This indicates that it may make sense to model faulty behaviors caused by variation as general functional faults rather than structurally defined faults, such as stuck-at faults. Also, such additional delays by variation can happen in multiple locations simultaneously. As a result, there can be so many possible fault combinations to be considered, and it is not easy at all to analyze them with traditional automatic test pattern generation (ATPG) methods which drop detectable faults by fault simulators using explicit representation of faults. So in the second part of the paper, we discuss about ATPG methods where test pattern generation and fault dropping processes are unified. 
As faults are represented implicitly, even if numbers of simultaneous faults are large, we may still be able to successfully perform ATPG processes.","PeriodicalId":405755,"journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121554667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
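The count of two-input logic functions mentioned in the abstract above can be checked by enumeration: a two-input gate has 2^2 = 4 input combinations, and each combination can map to 0 or 1, giving 2^4 = 16 truth tables. The sketch below only illustrates that counting argument; it is not the fault model or ATPG method from the paper, and the names are illustrative.

```python
from itertools import product

# The four input combinations of a two-input gate: (0,0), (0,1), (1,0), (1,1).
INPUTS = list(product((0, 1), repeat=2))

# Every possible truth table: one output bit per input combination.
functions = list(product((0, 1), repeat=len(INPUTS)))


def truth_table(gate):
    """Truth table of a Python callable, in the same order as INPUTS."""
    return tuple(gate(a, b) for a, b in INPUTS)


and_table = truth_table(lambda a, b: a & b)
or_table = truth_table(lambda a, b: a | b)

print(len(functions))          # 16
print(and_table in functions)  # True
print(or_table in functions)   # True
```

The fault-free AND and OR gates are just two of the 16 tables; the abstract's point is that delay variation may make any of the others observable.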
In this paper, we discuss the potential of emerging spin-torque devices for computing applications. Recent proposals for spin-based computing schemes may be differentiated as all-spin vs. hybrid, programmable vs. fixed, and Boolean vs. non-Boolean. All-spin logic styles may offer high area density due to the small form factor of nano-magnetic devices. However, circuit- and system-level design techniques need to be explored that leverage the specific spin-device characteristics to achieve energy efficiency, performance, and reliability comparable to those of CMOS. The non-volatility of nano-magnets can be exploited in the design of energy- and area-efficient programmable logic. In such logic styles, spin devices may play the dual role of computing elements and memory elements that provide field programmability. Spin-based threshold logic design is presented as an example. Emerging spintronic phenomena may lead to ultra-low-voltage, current-mode, spin-torque switches that can offer attractive computing capabilities beyond digital switches. Such devices may be suitable for non-Boolean data-processing applications that involve analog processing, leading to highly energy-efficient information-processing hardware for applications such as pattern matching, neuromorphic computing, image processing, and data conversion. Towards the end, we discuss the possibility of applying emerging spin-torque switches in the design of energy-efficient global interconnects for future chip multiprocessors.
{"title":"Computing with Spin-Transfer-Torque Devices: Prospects and Perspectives","authors":"K. Roy, M. Sharad, Deliang Fan, K. Yogendra","doi":"10.1109/ISVLSI.2014.120","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.120","url":null,"abstract":"In this paper we discuss the potential of emerging spin-torque devices for computing applications. Recent proposals for spin-based computing schemes may be differentiated as all-spin? vs. hybrid, programmable vs. fixed, and, Boolean vs. non-Boolean. All-spin logic-styles may offer high area-density due to small form-factor of nano-magnetic devices. However, circuit and system-level design techniques need to be explored that leaverage the specific spin-device characterisitcs to achieve energy-efficiency, performance and reliability comparable to those of CMOS. The non-volatility of nano-magnets can be exploited in the design of energy and area-efficient programmable logic. In such logic-styles, spin-devices may play the dual-role of computing as well as memory-elements that provide field-programmability. Spin-based threshold logic design is presented as an example. Emerging spintronic phenomena may lead to ultra-low-voltage, current-mode, spin-torque switches that can offer attractive computing capabilities, beyond digital switches. Such devices may be suitable for non-Boolean data-processing applications which involve analog processing leading to highly energy-efficient information processing hardware for applicatons like pattern-matching, neuromorphic-computing, image-processing and data-conversion. 
Towards the end, we discuss the possibility of applying emerging spin-torque switches in the design of energy-efficient global interconnects, for future chip multiprocessors.","PeriodicalId":405755,"journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121587172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Low swing clocking is a low power design methodology that scales the clock voltage to decrease power consumption of the clock distribution networks, with an expected degradation in the performance. In this work, a novel low swing clock tree synthesis methodology is combined with a custom low swing clock-aware D flip-flop (DFF) design. The low swing clocking serves to reduce the power dissipation whereas the custom low swing-aware DFF serves to preserve the performance of the IC. The experimental results performed on the three largest circuits of ISCAS'89 benchmarks operating at 1GHz in the 32nm technology show that the proposed methodology can achieve an average of 16% power savings in the clock tree compared to its full swing counterpart, while satisfying the same clock skew (50ps) and slew (150ps) constraints at the worst case corner of operation. Moreover, the clock-to-output delay of the low swing DFF does not increase compared to traditional full swing DFF, while consuming only 1% more power.
{"title":"High Performance Low Swing Clock Tree Synthesis with Custom D Flip-Flop Design","authors":"Can Sitik, Leo Filippini, E. Salman, B. Taskin","doi":"10.1109/ISVLSI.2014.53","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.53","url":null,"abstract":"Low swing clocking is a low power design methodology that scales the clock voltage to decrease power consumption of the clock distribution networks, with an expected degradation in the performance. In this work, a novel low swing clock tree synthesis methodology is combined with a custom low swing clock-aware D flip-flop (DFF) design. The low swing clocking serves to reduce the power dissipation whereas the custom low swing-aware DFF serves to preserve the performance of the IC. The experimental results performed on the three largest circuits of ISCAS'89 benchmarks operating at 1GHz in the 32nm technology show that the proposed methodology can achieve an average of 16% power savings in the clock tree compared to its full swing counterpart, while satisfying the same clock skew (50ps) and slew (150ps) constraints at the worst case corner of operation. Moreover, the clock-to-output delay of the low swing DFF does not increase compared to traditional full swing DFF, while consuming only 1% more power.","PeriodicalId":405755,"journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125973329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Rethinagiri, Oscar Palomar, J. Moreno, O. Unsal, A. Cristal, M. Biglari-Abhari
In this paper, we propose a power and energy estimation methodology at system-level for Open Multimedia Applications Platforms (OMAPs). Within this methodology, the Functional-Level Power Analysis (FLPA) is extended to create generic power models of the platform under test. Then, a simulation framework is developed at the transactional-level to accurately evaluate the activities used in the previously developed power and energy models. The proposed methodology has several benefits: it considers the power and energy consumption of the entire platform including peripherals and leads to accurate estimates. The efficiency of the proposed system-level methodology is validated using mono-processor and heterogeneous multiprocessor embedded architectures designed around OMAPs. The estimated power and energy results provide a maximum error of 5% for mono-processor and 9% for heterogeneous multiprocessor based system when compared against the real board measurements.
{"title":"System-Level Power and Energy Estimation Methodology for Open Multimedia Applications Platforms","authors":"S. Rethinagiri, Oscar Palomar, J. Moreno, O. Unsal, A. Cristal, M. Biglari-Abhari","doi":"10.1109/ISVLSI.2014.38","DOIUrl":"https://doi.org/10.1109/ISVLSI.2014.38","url":null,"abstract":"In this paper, we propose a power and energy estimation methodology at system-level for Open Multimedia Applications Platforms (OMAPs). Within this methodology, the Functional-Level Power Analysis (FLPA) is extended to create generic power models of the platform under test. Then, a simulation framework is developed at the transactional-level to accurately evaluate the activities used in the previously developed power and energy models. The proposed methodology has several benefits: it considers the power and energy consumption of the entire platform including peripherals and leads to accurate estimates. The efficiency of the proposed system-level methodology is validated using mono-processor and heterogeneous multiprocessor embedded architectures designed around OMAPs. The estimated power and energy results provide a maximum error of 5% for mono-processor and 9% for heterogeneous multiprocessor based system when compared against the real board measurements.","PeriodicalId":405755,"journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127525537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}