Process Variation-Aware Analytical Modeling of Subthreshold Leakage Power
M. Anala, B. Harish
2019 29th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS)
Pub Date: 2019-07-01, DOI: 10.1109/PATMOS.2019.8862039
Leakage current makes a substantial contribution to power dissipation in the nanometer regime due to continued technology scaling, and the problem is accentuated by increasing unpredictability in process parameters. Accurate and reliable modeling of leakage current is therefore critical for predicting static power, especially in ultra-low-power applications. Of gate leakage, Band-to-Band Tunneling (BTBT) leakage, and subthreshold leakage, the last is the most sensitive to parameter variations and is therefore the focus of this variability modeling; variations in the electrical and geometric parameters of the device drastically impact subthreshold leakage current. This paper proposes a subthreshold leakage power estimation model that accounts for process variations, including Drain-Induced Barrier Lowering (DIBL). The model captures subthreshold leakage variations induced by the simultaneous effects of threshold voltage variability and variations in gate length and width, characterized through extensive Monte Carlo analysis. To demonstrate the model's efficacy, its generated distributions for a static CMOS inverter are overlaid on SPICE-generated distributions in 32 nm PTM technology. Under process variations, the proposed model offers a mean error of 0.09% to 0.45% and a 3.3% to 34% reduction in standard deviation, yielding tighter distributions and hence better predictability and design robustness. Further, the proposed model is about 700X computationally faster than SPICE simulation.
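The kind of variability characterization the abstract describes can be illustrated with a minimal Monte Carlo sketch. It uses a textbook subthreshold-current expression with a DIBL term; the nominal values, sigmas, and the simple I0 prefactor are illustrative assumptions, not the paper's calibrated 32 nm PTM parameters.

```python
import math
import random

VT_THERMAL = 0.0259      # thermal voltage kT/q at 300 K [V]

def i_sub(vgs, vds, vth, w, l, n=1.3, eta=0.08, i0=1e-6):
    """Subthreshold current with DIBL: eta*vds lowers the effective barrier."""
    scale = i0 * (w / l)
    expo = (vgs - vth + eta * vds) / (n * VT_THERMAL)
    return scale * math.exp(expo) * (1.0 - math.exp(-vds / VT_THERMAL))

def monte_carlo_leakage(samples=10000, seed=1):
    """Sample leakage under independent Gaussian Vth/L/W variation
    (assumed nominal/sigma values, for illustration only)."""
    rng = random.Random(seed)
    vals = []
    for _ in range(samples):
        vth = rng.gauss(0.30, 0.02)        # threshold voltage [V]
        l = rng.gauss(32e-9, 1.6e-9)       # gate length [m]
        w = rng.gauss(64e-9, 3.2e-9)       # gate width [m]
        vals.append(i_sub(vgs=0.0, vds=0.9, vth=vth, w=w, l=l))
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return mean, var ** 0.5

mean_i, sigma_i = monte_carlo_leakage()
```

The DIBL term makes the OFF-state current grow with drain bias, which is why the paper folds it into the variability model rather than treating Vth variation alone.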
Graph-Based STA for Asynchronous Controllers
N. Xiromeritis, S. Simoglou, C. Sotiriou, Nikolaos Sketopoulos
Pub Date: 2019-07-01, DOI: 10.1109/PATMOS.2019.8862081
In this work, we present an Asynchronous Static Timing Analysis (ASTA) EDA methodology for cyclic asynchronous control circuits. The methodology operates on Graph-Based Analysis (GBA) principles: like conventional synchronous GBA STA it is fast, but it pessimistically computes Critical Cycle(s) instead of Critical Paths, without cycle cutting. The ASTA flow supports industrial timing libraries, Verilog input, and multiple PVT corners. Gate timing-arc delay/slew computation, input/output environment constraints, and path delay propagation are implemented on GBA STA principles. ASTA requires both the gate-level netlist and a graph-based Event Model, either a Marked Graph (MG) or a Petri Net (PTnet). The pair is used to construct the Event Timing Graph (ETG), an MG whose Event Model Transition-to-Transition (T2T) arcs are annotated with netlist-extracted delays. ETG delays are computed automatically, based on cyclic equilibrium slews and GBA critical-path identification between the relevant T2T netlist gate pins; GBA T2T paths may be manually overridden. As GBA is non-functional, we illustrate a mapping between an Event Model, where choice places may be allowed, and the ETG, where places are collapsed to their corresponding timing-annotated T2T arcs. The resulting ETG is live and 1-bounded, making it suitable for period analysis using Burns' Primal-Dual Algorithm. The methodology has been successfully tested on 23 asynchronous benchmarks and validated via timing simulation. We compare results against an industrial synchronous STA tool with cycle cutting, and illustrate significant timing errors when synchronous STA is used for delay annotation, as well as a 50% delta in Critical Cycle Delay.
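The period quantity that Burns' Primal-Dual Algorithm computes for a live, 1-bounded marked graph is its maximum cycle ratio: total arc delay divided by total tokens around the cycle. Burns' algorithm does this efficiently; the brute-force enumeration below only illustrates the quantity itself on a toy event graph with made-up arc delays and token counts.

```python
# Arcs are (src, dst, delay, tokens); all values are illustrative.

def max_cycle_ratio(nodes, arcs):
    """Brute-force max of (sum of delays / sum of tokens) over simple cycles."""
    best = 0.0
    adj = {}
    for s, d, delay, tok in arcs:
        adj.setdefault(s, []).append((d, delay, tok))

    def dfs(start, node, visited, delay_sum, tok_sum):
        nonlocal best
        for nxt, dly, tok in adj.get(node, []):
            if nxt == start and tok_sum + tok > 0:
                # Closed a simple cycle; in a live MG every cycle holds a token.
                best = max(best, (delay_sum + dly) / (tok_sum + tok))
            elif nxt not in visited:
                dfs(start, nxt, visited | {nxt}, delay_sum + dly, tok_sum + tok)

    for n in nodes:
        dfs(n, n, {n}, 0.0, 0)
    return best

# Toy 3-event controller: cycle a->b->c->a carries one token, 6 ns total delay.
arcs = [("a", "b", 2.0, 0), ("b", "c", 1.5, 0), ("c", "a", 2.5, 1),
        ("b", "a", 1.0, 1)]
period = max_cycle_ratio(["a", "b", "c"], arcs)
```

The critical cycle here is a->b->c->a (6.0 time units per token), which bounds the controller's steady-state period, exactly the figure a synchronous STA tool with cycle cutting cannot report directly.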
A High-Performance Neuron for Artificial Neural Network based on Izhikevich model
Maria Sapounaki, A. Kakarountas
Pub Date: 2019-07-01, DOI: 10.1109/PATMOS.2019.8862154
Neuromorphic circuits have gained considerable interest over the last decades, since they can be deployed across a large spectrum of scientific research. This paper presents a hardware realization of a single neuron targeting Field Programmable Gate Arrays (FPGAs) with a 6-stage pipeline. The proposed circuit implements Izhikevich's model and presents better performance than a previous pipelined design. The implementation is based on fixed-point arithmetic, allowing faster computation of the values related to the membrane potential and the membrane recovery variable of the neuron. The combination of balanced, reduced pipeline stages with fixed-point arithmetic offers performance up to 14% higher, parallel computation, and a more faithful simulation of a neuron's actual operation, while the FPGA area requirements remain as low as those of the initial reference design. The proposed circuit is the first of its kind in this effort to minimize area while simultaneously improving the performance of an artificial neuron.
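For reference, the Izhikevich equations that such a hardware neuron discretizes are v' = 0.04v^2 + 5v + 140 - u + I and u' = a(bv - u), with the reset v <- c, u <- u + d when v reaches 30 mV. The float sketch below uses the standard regular-spiking parameter set and forward Euler; the FPGA design evaluates the same update in fixed point, which this sketch does not attempt to model.

```python
def izhikevich(i_inj=10.0, steps=1000, dt=0.5, a=0.02, b=0.2, c=-65.0, d=8.0):
    """Forward-Euler simulation of one Izhikevich neuron.

    v is the membrane potential [mV], u the membrane recovery variable.
    Returns the number of spikes fired over `steps` time steps of `dt` ms.
    """
    v, u = c, b * c
    spikes = 0
    for _ in range(steps):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + i_inj)
        u += dt * a * (b * v - u)
        if v >= 30.0:            # spike: reset membrane, bump recovery variable
            v, u = c, u + d
            spikes += 1
    return spikes

n_spikes = izhikevich()
```

With a constant 10 mV-scale input current the regular-spiking neuron fires repeatedly, while with zero input it settles to its resting point; that qualitative behavior is what the pipelined fixed-point datapath must reproduce.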
Stochastic Radial Basis Neural Networks
Fabio Galán-Prado, Alejandro Morán, J. Font-Rosselló, M. Roca, J. Rosselló
Pub Date: 2019-07-01, DOI: 10.1109/PATMOS.2019.8862129
Stochastic spiking Neural Networks (SNNs) are a new neural modeling approach oriented to include the intrinsic stochastic processes present in the brain. One of the main advantages of this kind of modeling is that it can be easily implemented in a digital circuit, taking advantage of that mature technology. In this paper we propose a digital design for stochastic spiking neurons oriented to high-density hardware implementation, and compare the proposal with other neural models in terms of speed, area, and precision. As shown, the proposed circuit provides competitive results compared with other works in the literature.
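One common digital realization of stochastic neural computation, sketched minimally here under that assumption (the paper's own circuit is not reproduced), encodes a value as a Bernoulli bitstream by comparing it against a pseudo-random number each cycle; hardware typically uses an LFSR where this sketch uses `random()`.

```python
import random

def stochastic_bitstream(p, length, seed=7):
    """Generate a {0,1} stream whose mean approximates probability p."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(length)]

def decode(stream):
    """Recover the encoded value as the stream's spike rate."""
    return sum(stream) / len(stream)

# Longer streams trade latency for precision, the core stochastic-computing
# trade-off between speed, area, and accuracy mentioned in the comparison.
est = decode(stochastic_bitstream(0.3, 10000))
```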
The Involution Tool for Accurate Digital Timing and Power Analysis
Daniel Öhlinger, Jürgen Maier, Matthias Függer, U. Schmid
Pub Date: 2019-07-01, DOI: 10.1109/PATMOS.2019.8862165
We introduce the prototype of a digital timing simulation and power analysis tool for integrated circuits (the Involution Tool), which employs the involution delay model introduced by Függer et al. at DATE'15. Unlike the pure and inertial delay models typically used in digital timing analysis tools, the involution model faithfully captures pulse propagation. The presented tool is able to quantify, for the first time, the accuracy of the involution model by facilitating comparisons of its timing and power predictions with both SPICE-generated results and results achieved by standard timing analysis tools. It is easily customizable, both with respect to different instances of the involution model and to different circuits, and supports automatic test-case generation, including parameter sweeping. We demonstrate its capabilities by providing timing and power analysis results for three circuits in varying technologies: an inverter tree, the clock tree of an open-source processor, and a combinational circuit that involves multi-input NAND gates. The timing and power predictions of two natural types of involution models turn out to be significantly better than those obtained by standard digital simulation for the inverter tree and the clock tree; for the NAND circuit, the performance is comparable but not significantly better. Our simulations thus confirm the benefits of the involution model, but also demonstrate shortcomings for multi-input gates.
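The defining property of the involution model is that the rising- and falling-transition delay functions are each other's inverse up to sign: -d_up(-d_down(T)) = T for all T. The sketch below illustrates this numerically with an assumed exponential-saturation delay function (a toy shape, not a characterized gate); given any strictly increasing d_up, the matching d_down is -d_up^{-1}(-T) by construction.

```python
import math

def d_up(t, d_inf=2.0, tau=1.0):
    """Assumed rising-delay function; saturates at d_inf for large T."""
    return d_inf - (d_inf + tau) * math.exp(-t / tau)

def d_up_inv(d, d_inf=2.0, tau=1.0):
    """Closed-form inverse of d_up (valid for d < d_inf)."""
    return -tau * math.log((d_inf - d) / (d_inf + tau))

def d_down(t):
    """Matching falling delay implied by the involution condition."""
    return -d_up_inv(-t)

def involution_residual(t):
    """-d_up(-d_down(T)) - T; should be ~0 for an involution pair."""
    return -d_up(-d_down(t)) - t
```

Pure and inertial delay channels do not satisfy this pairing in general, which is the model-level reason they mishandle short pulses while involution channels propagate them faithfully.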
PATMOS 2019 Technical Papers
Pub Date: 2019-07-01, DOI: 10.1109/patmos.2019.8862116
Simulation-based Verification of the Youngest-First Round-Robin Core Gating Pattern
A. Simevski, M. Krstic
Pub Date: 2019-07-01, DOI: 10.1109/PATMOS.2019.8862045
Integrated circuit aging becomes a major concern with technology downscaling, so long-life systems require better mechanisms for extending their lifetime. Here we present an implementation of the Youngest-First Round-Robin (YFRR) core gating pattern as a means of reducing aging in a four-core multiprocessor. The pattern is optimal with respect to achieving the maximum possible system lifetime and significantly outperforms the simple Round-Robin (RR) pattern. For simulation purposes, a Verilog model of circuit aging is developed and integrated into the "all-digital" simulation environment. The relative wear-out of the cores is obtained from aging monitors, which direct the core selection process of the YFRR pattern. The results confirm that YFRR excels when the initial ages of the cores are uneven: a lifetime increase even greater than the 32% predicted by the theoretical model, which is based on a Weibull distribution of the lifetime reliability function, is obtained. We further find that YFRR is far better than RR when the aging rate is high.
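The scheduling idea can be sketched abstractly: among the cores eligible to run, always gate the most-worn ones, so unevenly aged cores converge toward equal wear. The integer wear counters and the loop below are an illustrative abstraction, not the paper's RTL or its aging-monitor interface.

```python
def yfrr_schedule(ages, active, slots):
    """Run `active` of the cores each slot, always gating the oldest ones."""
    ages = list(ages)
    history = []
    for _ in range(slots):
        # Sort core indices by current age: the youngest get the work.
        order = sorted(range(len(ages)), key=lambda i: ages[i])
        running = order[:active]
        for i in running:
            ages[i] += 1          # running a slot wears the core by one unit
        history.append(sorted(running))
    return ages, history

# Four cores with uneven initial ages, two allowed to run per slot.
final_ages, hist = yfrr_schedule(ages=[5, 0, 2, 9], active=2, slots=20)
```

Starting from the uneven ages [5, 0, 2, 9], the youngest-first rule equalizes wear across the cores, which is exactly the regime where the paper reports YFRR's largest advantage over plain Round-Robin.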
Morphological Reservoir Computing Hardware
Fabio Galán-Prado, J. Font-Rosselló, J. Rosselló
Pub Date: 2019-07-01, DOI: 10.1109/PATMOS.2019.8862100
In recent years, Reservoir Computing has arisen as an emerging machine-learning technique that is highly suitable for time-series processing. In this work, we propose implementing reservoir computing systems in hardware via morphological neurons, which use tropical-algebra concepts to reduce the area cost of the neural synapses. The main consequence of using tropical algebra is that synapse multipliers are replaced by adders, with lower hardware requirements. The proposed design is synthesized on a Field-Programmable Gate Array (FPGA) and benchmarked on a time-series prediction task. The approach achieves significant savings in power and hardware, as well as appreciably higher precision, compared to classical reservoir systems.
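The multiplier-to-adder substitution can be made concrete with a minimal sketch: in the tropical (max-plus) semiring, a morphological neuron computes y = max_i(x_i + w_i) where a classical neuron computes a sum of products, so each synapse needs an adder and a comparator instead of a multiplier. The weights and inputs below are illustrative.

```python
def dense_neuron(x, w):
    """Classical neuron body: weighted sum (one multiplier per synapse)."""
    return sum(xi * wi for xi, wi in zip(x, w))

def morphological_neuron(x, w):
    """Max-plus neuron body: adders feeding a max tree, no multipliers."""
    return max(xi + wi for xi, wi in zip(x, w))

y = morphological_neuron([1.0, 4.0, 2.0], [0.5, -1.0, 3.0])   # max(1.5, 3.0, 5.0)
```

On an FPGA a comparator-plus-adder pair is far cheaper than a multiplier, which is the source of the area savings the abstract claims.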
Design of a flexible multi-source energy harvesting system for autonomously powered IoT: The PERPS project
S. Siskos, V. Gogolou, C. Tsamis, A. Kerasidou, G. Doumenis, Konstantine Tsiapali, S. Katsikas, Andreas Sakellariou
Pub Date: 2019-07-01, DOI: 10.1109/PATMOS.2019.8862078
A desired property of an autonomous system is the capability to operate and survive under unforeseen conditions. Wireless IoT (formerly Wireless Sensor Network, WSN) applications pose a series of limitations on an embedded system's power consumption and energy autonomy. The PERPS project proposes an innovative approach to energy harvesting systems, aiming at perpetual operation of WSN nodes and portable electronics. A state-of-the-art energy conversion integrated circuit (ENC IC) with real-time software algorithms is implemented to allow predictive estimation of the energy available at the system's installation site. A multi-source input combining parallel harvesters for various energy sources, including (ambient) light, (micro) vibrations, and (small) temperature differences, improves the topology's efficiency and versatility. In addition, the newly introduced concept of harvesting energy via triboelectric microgenerators is studied. Ultra-low-power microelectronic circuits and novel storage-structure techniques are employed so that the overall architecture achieves optimum energy utilization and therefore maximum efficiency. The final version of the system will be tested in a ship's engine room, so verification of the PERPS operational principle will be based on real and demanding environmental conditions.
High Performance Accelerator for CNN Applications
A. Kyriakos, V. Kitsakis, Alexandros Louropoulos, E. Papatheofanous, I. Patronas, D. Reisis
Pub Date: 2019-07-01, DOI: 10.1109/PATMOS.2019.8862166
The continuing advancement of neural-network-based techniques has led to their exploitation in many applications, such as computer vision and natural language processing, where they provide highly accurate results at the cost of high computational complexity. Hardware-implemented AI accelerators provide the needed performance improvement for applications in specific areas, including robotics, autonomous systems, and the Internet of Things. The current study presents an FPGA-based accelerator for Convolutional Neural Networks (CNNs). The CNN model is trained on the MNIST dataset, and the VHDL design targets high throughput and low power while using only on-chip memory. The architecture uses parallel computation in the convolutional and fully connected layers and has a highly pipelined output layer. The architecture's implementation on a Xilinx Virtex VC707 validates the results.
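The core computation such an accelerator parallelizes is the multiply-accumulate of a small kernel sliding over a feature map. A minimal reference sketch of a "valid" (unpadded) integer 2-D convolution, the operation the parallel convolutional datapath implements, is shown below; the loop nest is what the hardware unrolls across MAC units.

```python
def conv2d_valid(image, kernel):
    """Integer 2-D convolution without padding.

    In hardware, the inner MAC loops run in parallel; here they are
    written out sequentially for clarity.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0                      # the accumulator register of one MAC
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out

feat = conv2d_valid([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]])
```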