
2019 29th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS): Latest Articles

A Synergy of a Closed-Loop DVFS Controller and CPU Hot-Plug For Run-Time Thermal Management in Multicore Systems
Michail Noltsis, Nikolaos Zambelis, F. Catthoor, D. Soudris
In the era of nanoelectronic circuits, temperature strongly affects the reliability and static power consumption of systems. In fact, temperature control in modern circuits is considered a crucial system function, along with low-power operation. To this end, numerous thermal management approaches exist at the hardware, firmware and software layers. A widely used technique is dynamic voltage and frequency scaling (DVFS), which aims to control temperature through appropriate voltage and frequency decisions. CPU hot-plug is another technique, borrowed from the reliability and fault-tolerance domain, that can be utilized for thermal management purposes. Our work examines the thermal profile of an NXP IMX6Q board when different DVFS and CPU hot-plug actuations are applied. In addition, we propose a synergy between the two methods and study their effect on chip temperature, performance and energy consumption.
DOI: https://doi.org/10.1109/PATMOS.2019.8862032
Citations: 1
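The abstract above does not spell out the control policy, so the following is only a minimal sketch of how a closed-loop DVFS decision and CPU hot-plug could be combined. Everything in it is an assumption made for illustration: the frequency table `FREQS_MHZ`, the 70 °C setpoint, the 5 °C hysteresis band, the four-core limit, and the rule "lower frequency first, then take a core offline" are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical operating points, lowest to highest (not taken from the paper).
FREQS_MHZ = (400, 800, 1000)


@dataclass(frozen=True)
class State:
    freq_idx: int  # index into FREQS_MHZ
    cores: int     # number of cores currently online


def step(state, temp_c, setpoint_c=70.0, hysteresis_c=5.0, max_cores=4):
    """One control-loop iteration: return the next State for a temperature sample.

    Assumed policy: when too hot, step the frequency down first and only
    unplug a core once the lowest frequency is reached; when comfortably
    cool, restore cores first, then raise the frequency.
    """
    if temp_c > setpoint_c:
        if state.freq_idx > 0:
            return State(state.freq_idx - 1, state.cores)
        if state.cores > 1:
            return State(state.freq_idx, state.cores - 1)
    elif temp_c < setpoint_c - hysteresis_c:
        if state.cores < max_cores:
            return State(state.freq_idx, state.cores + 1)
        if state.freq_idx < len(FREQS_MHZ) - 1:
            return State(state.freq_idx + 1, state.cores)
    # Inside the hysteresis band, or no actuation left: hold the current state.
    return state
```

On a Linux target, decisions like these would typically be applied through the cpufreq and CPU hot-plug sysfs interfaces; whether the authors used that path is not stated in the abstract.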
Modelling Reversion Loss and Shoot-through Current in Switched-Capacitor DC-DC Converters with Petri Nets
Danhui Li, F. Xia, Junwen Luo, A. Yakovlev
For a Switched-Capacitor DC-DC converter (SCC) in a low power design, reversion losses and shoot-through currents may lead to substantial efficiency degradations and voltage reductions at the output. These reversion losses and shoot-through currents are caused by undesired conduction in MOS devices under certain combinations of internal SCC signals including clocks. This paper proposes a new method that models reversion losses and shoot-through currents in SCCs with Petri nets, providing a formal way of tracking them. With reachability analysis on the Petri Net models, reversion losses and shoot-through currents can be verified and investigated, which is helpful for avoiding these problems in designs. This paper takes cross-coupled voltage doublers as examples. Analysis examples where these properties are identified are presented, together with the finding of healthy traces, which do not contain them. Besides tool-supported reachability analysis capabilities, the natural causal event traceability of Petri net models allows the design of SCCs and other analog and mixed signal (AMS) circuits to be more transparent and understandable, and hence easier to reason about, debug and validate.
DOI: https://doi.org/10.1109/PATMOS.2019.8862124
Citations: 2
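To illustrate the kind of reachability analysis the abstract above refers to, here is a small sketch on a toy Petri net. It is not the paper's cross-coupled voltage doubler model: the two-switch net, the place and transition names, and the definition of the hazard marking (both switches conducting at once) are assumptions chosen to keep the example short.

```python
from collections import deque


def fire(marking, pre, post):
    """Return the successor marking, or None if the transition is not enabled."""
    if any(marking.get(place, 0) < n for place, n in pre.items()):
        return None
    nxt = dict(marking)
    for place, n in pre.items():
        nxt[place] -= n
    for place, n in post.items():
        nxt[place] = nxt.get(place, 0) + n
    return nxt


def find_trace(initial, transitions, is_hazard):
    """Breadth-first reachability search.

    Returns the shortest firing sequence that reaches a hazard marking,
    or None if no reachable marking is hazardous.
    """
    seen = {frozenset(initial.items())}
    queue = deque([(initial, [])])
    while queue:
        marking, trace = queue.popleft()
        if is_hazard(marking):
            return trace
        for name, (pre, post) in transitions.items():
            nxt = fire(marking, pre, post)
            if nxt is None:
                continue
            key = frozenset(nxt.items())
            if key not in seen:
                seen.add(key)
                queue.append((nxt, trace + [name]))
    return None


# Toy net: two switches in series across the supply (hypothetical).
initial = {"S1_off": 1, "S1_on": 0, "S2_off": 1, "S2_on": 0}


def both_conducting(m):
    return m["S1_on"] > 0 and m["S2_on"] > 0


# Unguarded clocks: each switch may close regardless of the other.
unguarded = {
    "close_S1": ({"S1_off": 1}, {"S1_on": 1}),
    "open_S1": ({"S1_on": 1}, {"S1_off": 1}),
    "close_S2": ({"S2_off": 1}, {"S2_on": 1}),
    "open_S2": ({"S2_on": 1}, {"S2_off": 1}),
}

# Guarded clocks: closing a switch also tests (consumes and restores) the
# other switch's "off" token, modelling non-overlapping control signals.
guarded = dict(unguarded)
guarded["close_S1"] = ({"S1_off": 1, "S2_off": 1}, {"S1_on": 1, "S2_off": 1})
guarded["close_S2"] = ({"S2_off": 1, "S1_off": 1}, {"S2_on": 1, "S1_off": 1})

hazard_trace = find_trace(initial, unguarded, both_conducting)
safe_trace = find_trace(initial, guarded, both_conducting)
```

In this toy net, `hazard_trace` is a firing sequence that leads to simultaneous conduction, while `safe_trace` is `None` because the guarded net cannot reach such a marking. The trace itself is what makes the cause of the problem easy to follow.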
UVM-based Verification of a Digital PLL Using SystemVerilog
Nikolaos Georgoulopoulos, Alkiviadis A. Hatzopoulos
One of the most significant trends in the semiconductor industry is mixed-signal applications. A great amount of effort is focused on creating fast and accurate designs that include both analog and digital components. As a result, mixed-signal verification poses a major concern. Traditional verification techniques are slow and offer relatively little robustness. In this work, an efficient UVM-based verification architecture for a digital phase-locked loop (DPLL) real number model (RNM) using SystemVerilog is presented. The UVM capabilities of the proposed methodology, combined with the RNM of the digital PLL, favor the generation of a reusable, fast and robust verification environment that shortens time to market. Cadence Incisive Enterprise Simulator was used for the testbench creation and simulation. The proposed verification architecture uses constrained-random stimulus generation, analog assertions and coverage metrics in order to achieve high gains in verification efficiency.
DOI: https://doi.org/10.1109/PATMOS.2019.8862105
Citations: 1
Impact of Coarse-Grained Power Gate Placement on a Fine-Grained System Design
Shylesh Umapathy, Aaron Stillmaker
With the 54th anniversary of Moore's law, the intense development of VLSI technology has permitted more and more IP components to be integrated on a single chip. However, factors such as power consumption have been limiting this growth rate. Low power techniques such as clock gating, power gating, dynamic voltage and frequency scaling, body biasing, and many more have emerged as potential solutions. This paper explores the power gating technique and presents the design trade-offs between the ring and grid styles of power gate placement in a fine-grained system design. The study used 24 physical designs of 12 differently sized MAC units, ranging from 44-bit to 320-bit inputs, and extracted various parameters. The results show that a ring style of placement gives an average increase in IR drop of 9.59% compared to the grid style for MAC units with 128-bit to 320-bit inputs. The grid style incurs an additional average congestion of 1.66% compared to the ring style for MAC units with 192-bit to 320-bit inputs.
DOI: https://doi.org/10.1109/PATMOS.2019.8862128
Citations: 1
Minimizing Power for Neural Network Training with Logarithm-Approximate Floating-Point Multiplier
TaiYu Cheng, Jaehoon Yu, M. Hashimoto
This paper proposes adopting a logarithm-approximate multiplier (LAM) for multiply-accumulate (MAC) computation in a neural network (NN) training engine, where the LAM approximates a floating-point multiplication as an addition, resulting in smaller delay, fewer gates, and lower power consumption. Our implementation of an NN training engine for a 2-D classification dataset achieves a 10% speed-up and 2.5X and 2.3X efficiency improvements in power and area, respectively. LAM is also highly compatible with conventional bit-width scaling (BWS). When BWS is applied with LAM on four test datasets, a power efficiency improvement of more than 5.2X is achievable with only 1% accuracy degradation, of which 2.3X originates from LAM.
DOI: https://doi.org/10.1109/PATMOS.2019.8862162
Citations: 7
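The idea of replacing a floating-point multiplication with an addition can be illustrated with the classic Mitchell-style logarithmic approximation, sketched below in software. This is an assumption about the general technique, not a description of the paper's hardware: the function name `lam_multiply`, the choice of IEEE 754 binary32, and the restriction to normal, finite, nonzero operands are all choices made for this example.

```python
import struct

# Exponent bias of IEEE 754 binary32, shifted into the exponent field.
BIAS = 127 << 23


def float_to_bits(x):
    """Reinterpret a Python float (rounded to binary32) as a 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b):
    """Reinterpret a 32-bit integer as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def lam_multiply(a, b):
    """Approximate a * b with one integer addition on the bit patterns.

    The exponent/mantissa bits of a binary32 number behave like a scaled,
    biased approximation of log2(|x|), so adding two patterns and removing
    one bias approximates the logarithm of the product. Valid only for
    normal, finite, nonzero inputs whose product stays in range.
    """
    ia, ib = float_to_bits(a), float_to_bits(b)
    sign = (ia ^ ib) & 0x80000000
    magnitude = (ia & 0x7FFFFFFF) + (ib & 0x7FFFFFFF) - BIAS
    return bits_to_float(sign | (magnitude & 0x7FFFFFFF))
```

For powers of two the result is exact, for example `lam_multiply(2.0, 4.0)` gives `8.0`. The worst case of this approximation occurs when both mantissa fractions are 0.5: `lam_multiply(1.5, 1.5)` gives `2.0` instead of `2.25`, an underestimate of about 11%.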
Temperature-aware writing architecture for multilevel memristive cells
Amadeo de Gracia Herranz, M. López-Vallejo
Memristors show high potential as multilevel resistance devices. They are promising but suffer from their non-linear behaviour and a strong dependency on different sources of variability (process, voltage, temperature…). Temperature variations are especially harmful because a small thermal variation decisively changes the operating point of the device. For these reasons, the circuitry required to accurately read or write multilevel devices is complex and area-demanding. This paper presents a time-domain architecture based on variable pulses that is able to write different levels in the memristive cell. It is resilient to temperature changes thanks to an in-depth analysis of the definition of the resistance levels. Furthermore, the proposed architecture takes advantage of logarithmic counters to save area. Experimental results show that the proposed approach is valid for a wide temperature range.
DOI: https://doi.org/10.1109/PATMOS.2019.8862049
Citations: 1
Implementing VESA Display Stream Compression Encoder in FPGAs
Nikolaos Kefalas, G. Theodoridis
In this paper, an architecture for implementing the VESA DSC image compression standard in FPGAs is proposed. DSC uses a very small 3-pixel wide coding unit that restricts the HW architecture of the prediction part to only three pipeline stages. The proposed architecture optimizes the pipeline distribution and performs algorithmic optimizations in order to reduce the critical path inside the prediction modes of DSC. It achieves 62 MHz and requires 18595 slices on Virtex 7 spl. It can process full HD (1920x1080) 4:4:4 images at 30 frames per second and full HD 4:2:2 or 4:2:0 images at 60 frames per second, with sub-line latency.
DOI: https://doi.org/10.1109/PATMOS.2019.8862082
Citations: 2
Voltage Scaling and Guardband Customization of Multiple Constituent Components in SoC-FPGA
I. Stratakos, Konstantinos Maragos, G. Lentaris
The ever-growing demands for increased speed and low power have led to the development of sophisticated and complex heterogeneous chips consisting of multiple components, such as memories, processors, DSPs, and classical FPGA resources, which operate with diverse specifications. However, their vendor-defined specifications are quite conservative so that they can meet the most demanding application scenarios. Consequently, a surplus of energy consumption is measured in practice. This work focuses on the customization of the operating parameters in SoC-FPGA chips when executing a HW/SW co-designed application that utilizes several of the constituent components of the system. We demonstrate that the nominal application throughput can be attained when multiple components of the SoC are individually fine-tuned to distinct voltage levels, specific to the given application, thus leading to an improved energy footprint.
DOI: https://doi.org/10.1109/PATMOS.2019.8862050
Citations: 1
PATMOS 2019 Author Index
DOI: https://doi.org/10.1109/patmos.2019.8862164
Citations: 0
A Calibration Procedure for Sensor Based Adaptive Voltage Scaling Approaches
Christoph Niemann, Munawar Ali, Jakob Heller, D. Timmermann
Nowadays, VLSI systems suffer from increasing impacts of aging and variability. Traditionally, this is treated by applying extensive guard bands. As those guard bands are chosen at design time, they are necessarily worst case guard bands. Thus, most often they are too pessimistic. Current research tries to mitigate this by means of in-situ performance measurement based Adaptive Voltage Scaling (AVS). AVS typically relies on assumptions regarding the timing behavior of the application logic in relation to the behavior of a specific canary or sensor logic. Most published approaches use manually gained empirical data of just a few test chips and application designs for this purpose. However, to practically apply these techniques, an automatic calibration flow is needed. We propose such an automated calibration flow and test it on multiple FPGAs. We achieve an average power saving of 67%.
DOI: https://doi.org/10.1109/PATMOS.2019.8862075
Citations: 2