IET Circuits Devices Syst.: Latest Articles


Area-efficient reconfigurable-array-based oscillator for standard cell characterisation

IET Circuits Devices Syst.

Pub Date : 2012-11-01 DOI: 10.1049/iet-cds.2012.0012

B. P. Das, H. Onodera

Today's multi-million digital integrated circuit design depends heavily on the quality of the standard cell library. In this study, an all-digital reconfigurable-array-based test structure is presented to test the quality (i.e. functionality and performance) of all types of logic gates in the standard cell library, using a reconfigurable array of gate delay measurement cells. The gate delay is estimated using the least squares method from the measured period/frequency of the reconfigurable ring oscillator (RO). As the least squares method averages out the random noise in the measured RO period, the gate delay is estimated accurately. The reconfigurable-array structure can easily isolate a faulty standard cell from a non-faulty standard cell. The test structure is area efficient, saving 1.6× and 2× area compared with the normal RO-based delay measurement in the 180 nm and 65 nm technology nodes, respectively. A subset of standard cells is tested using this reconfigurable-array structure. A test chip has been fabricated in an industrial 180 nm technology node to study the feasibility of the approach. The measured results from 20 chips are reported to show the amount of within-die and die-to-die variation.
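The least squares step can be illustrated with a minimal sketch. The abstract does not give the authors' measurement model, so the one used here is an assumption: the RO period grows linearly with the number of cells under test switched into the loop, `period = base_period + 2 * k * gate_delay`, plus random noise. The function names and the factor of 2 (one rising and one falling traversal per period) are illustrative choices, not taken from the paper.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_gate_delay(stage_counts, periods_ps):
    """Estimate per-gate delay from RO periods measured at several loop lengths.

    Assumed model (hypothetical): period = base + 2 * k * gate_delay.
    Returns (gate_delay_ps, base_period_ps).
    """
    slope, intercept = fit_line(stage_counts, periods_ps)
    return slope / 2.0, intercept


def simulate_periods(stage_counts, gate_delay_ps, base_period_ps, noise_ps, seed=0):
    """Generate synthetic noisy RO periods for the assumed linear model."""
    rng = random.Random(seed)
    return [
        base_period_ps + 2 * k * gate_delay_ps + rng.gauss(0.0, noise_ps)
        for k in stage_counts
    ]


if __name__ == "__main__":
    counts = list(range(1, 33))
    periods = simulate_periods(counts, gate_delay_ps=50.0,
                               base_period_ps=400.0, noise_ps=5.0)
    delay, base = estimate_gate_delay(counts, periods)
    print(f"estimated gate delay: {delay:.2f} ps, base period: {base:.1f} ps")
```

Because the fit uses all configurations at once, zero-mean noise on individual period measurements tends to cancel, which is the averaging effect the abstract refers to.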


Citations: 5

Fully integrated serial-link receiver with optical interface for long-haul display interconnects

IET Circuits Devices Syst.

Pub Date : 2012-11-01 DOI: 10.1049/iet-cds.2012.0029

Kang-Yeob Park, W. Oh, Y. Lee, W. Choi

We report a fully integrated serial-link receiver with optical interface, fabricated in a 0.18 µm complementary metal oxide semiconductor technology, for long-haul display interconnects. The receiver includes a trans-impedance amplifier, a limiting amplifier, a clock and data recovery circuit, a 1:64 de-multiplexer and a built-in error checker. The receiver produces 64-bit wide electrical signals from photodetector output signals produced by 5.28, 5.6 or 6.25 Gb/s optical signals delivered through up to 700 m of multi-mode fibre. It can support serialised data for Ultra eXtended Graphics Array (UXGA), 1080p and Wide Ultra eXtended Graphics Array (WUXGA). The receiver core occupies 0.59 mm² and dissipates 42.4 mW at a 6.25 Gb/s bit rate from a 1.8 V supply.


Citations: 1

Power estimation model based on grouping components in field-programmable gate array circuit

IET Circuits Devices Syst.

Pub Date : 2012-11-01 DOI: 10.1049/iet-cds.2011.0367

Najoua Chalbi, Mohamed Boubaker, M. Hedi

In this study, the authors present field-programmable gate array dynamic power models for basic operators at the architectural level. Other models are developed for operator groups arranged in parallel or in series in the architecture. The operator characterisation models depend on the frequency variation, the activity rate and the precision in the presence of autocorrelation, taking into account the interconnections between operators. The authors have validated their approach on Euclidean distance and finite-impulse response filter applications, using the operator models in a first step and the IP models in a second step. The estimation results show that the estimate is closer to the real value when the IP mathematical models are used; the experimental results show a higher average accuracy, with a maximum average error of 3.7%. The power models are verified by on-board measurement in a real Virtex2Pro field-programmable gate array environment and are ready for integration with high-level power optimisation techniques.


Citations: 0

Analysis of double-gate FinFET-based address decoder for radiation-induced single-event-transients

IET Circuits Devices Syst.

Pub Date : 2012-10-02 DOI: 10.1049/iet-cds.2011.0253

S. Rathod, A. Saxena, S. Dasgupta

In this study, the authors evaluate different schemes of address decoders based on bulk, single gate (SG) silicon-on-insulator (SOI) and double gate (DG) FinFET technology. The schemes differ in their back gate connections and in the swing on the enable and address lines. Delay, power dissipation and critical charge have been analysed. Radiation-induced single event transients and multiple bit upsets in the address decoder have been studied. For radiation-hardened applications, the tied gate configuration has been found to be a better choice than the bulk, SG-SOI and independent gate configurations. The effect of process parameter variations on the different schemes has been studied. HSPICE simulations have been performed with 45 nm bulk, SG-SOI and DG-FinFET predictive technology models.


Citations: 7

Utilising the normal distribution of the write noise margin to easily predict the SRAM write yield

IET Circuits Devices Syst.

Pub Date : 2012-10-02 DOI: 10.1049/iet-cds.2012.0090

H. Makino, S. Nakata, Hirotsugu Suzuki, S. Mutoh, M. Miyama, T. Yoshimura, S. Iwade, Y. Matsuda

This study describes a method to easily predict the write yield of a static random access memory (SRAM) cell. The differential coefficient of the combined word line margin (CWLM) with respect to the threshold voltage (V_th) is analysed using the simple Shockley transistor model. The analysis shows that the good linearity comes from keeping the access transistor operating in the saturation mode for a wide range of V_th values. The Monte Carlo simulation demonstrates that the CWLM obeys the normal distribution. The mean and the variance of the CWLM are almost constant for sample numbers ranging from 100 to 100 000. The estimated write failure probabilities are almost uniform, within a factor of 1.7, when the number of samples exceeds 300, which allows SRAM to be evaluated with a small number of measurements. The distribution predicted using the differential coefficients calculated by SPICE simulation also matches the Monte Carlo results. The estimated write failure probability agrees with the Monte Carlo results within a factor of 2.0, which is acceptable for SRAM redundancy circuit design. Finally, the write yield is related to the error rate. Thus, the write yield is easily predicted from a small number of measured samples or from the differential coefficients of the CWLM with respect to the V_th values calculated by SPICE simulation.
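The practical consequence of a normally distributed margin can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes that a write fails when the CWLM falls to zero or below, and that cells fail independently with no redundancy. The function names and the sample values in the usage lines are hypothetical.

```python
import math
import statistics


def normal_tail_probability(mean, sigma):
    """P(X <= 0) for X ~ Normal(mean, sigma), i.e. Phi(-mean / sigma)."""
    return 0.5 * math.erfc(mean / (sigma * math.sqrt(2.0)))


def write_failure_probability(cwlm_samples):
    """Estimate the per-cell write failure probability from CWLM samples.

    Assumption (not from the source): a write fails when CWLM <= 0.
    """
    mean = statistics.mean(cwlm_samples)
    sigma = statistics.stdev(cwlm_samples)  # sample standard deviation
    return normal_tail_probability(mean, sigma)


def array_write_yield(p_fail, n_cells):
    """Probability that all n_cells write correctly, assuming independence."""
    return math.exp(n_cells * math.log1p(-p_fail))


if __name__ == "__main__":
    # Hypothetical margin statistics in millivolts.
    p = normal_tail_probability(mean=300.0, sigma=50.0)
    print(f"per-cell failure probability: {p:.3e}")
    print(f"yield of a 1 Mbit array: {array_write_yield(p, 2**20):.6f}")
```

Only the mean and standard deviation enter the estimate, which is why a few hundred samples can stand in for a much larger Monte Carlo run once normality has been established.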


Citations: 4

Efficient inclusive analytical model for delay estimation of multi-walled carbon nanotube interconnects

IET Circuits Devices Syst.

Pub Date : 2012-10-02 DOI: 10.1049/iet-cds.2011.0283

M. Gholipour, N. Masoumi

Multi-walled carbon nanotubes (MWCNTs) have attracted much attention as very large scale integration (VLSI) chip interconnects because of their high current densities and excellent thermal and mechanical properties. This study investigates different aspects of the use of MWCNTs as chip routing wires in the search for modern technologies for high-performance interconnects. Mathematical analyses and simulations were carried out for MWCNT and Cu at the global, intermediate and local interconnect levels. The authors propose a semi-analytical delay estimation model along with an equivalent RC model for MWCNT global interconnects. The results obtained from these models agree well with the simulation results. The proposed compact semi-analytical model can be used to perform fast analysis of MWCNT global interconnects, including delay, buffer insertion and crosstalk. The authors used their model to investigate the impact of buffer insertion on MWCNT interconnect delay. The optimal number of buffers, which minimises the MWCNT propagation delay, is estimated. Analytical and simulation results show that MWCNT interconnects require fewer buffers than Cu wires.
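How an optimal buffer count arises can be shown with a generic sketch. This is the textbook distributed-RC repeater model, not the authors' semi-analytical MWCNT model, which the abstract does not spell out. A wire of total resistance `r_wire` and capacitance `c_wire` is split into `k` equal segments, each driven by a buffer with output resistance `r_buf` and input capacitance `c_buf`; all parameter names and numeric values are illustrative assumptions.

```python
import math


def buffered_delay(k, r_wire, c_wire, r_buf, c_buf):
    """Delay of a wire split into k buffered segments (generic RC model)."""
    seg_r = r_wire / k
    seg_c = c_wire / k
    per_segment = (0.7 * r_buf * (seg_c + c_buf)
                   + seg_r * (0.4 * seg_c + 0.7 * c_buf))
    return k * per_segment


def analytic_optimum(r_wire, c_wire, r_buf, c_buf):
    """Real-valued k that minimises buffered_delay (set d/dk to zero)."""
    return math.sqrt(0.4 * r_wire * c_wire / (0.7 * r_buf * c_buf))


def best_segment_count(r_wire, c_wire, r_buf, c_buf, k_max=1000):
    """Integer segment count with the smallest delay, by direct search."""
    return min(range(1, k_max + 1),
               key=lambda k: buffered_delay(k, r_wire, c_wire, r_buf, c_buf))


if __name__ == "__main__":
    # Hypothetical values: ohms and farads.
    for r_wire in (1000.0, 250.0):
        k = best_segment_count(r_wire, 1e-12, 100.0, 1e-14)
        t = buffered_delay(k, r_wire, 1e-12, 100.0, 1e-14)
        print(f"r_wire={r_wire:.0f} ohm: {k} segments, delay {t * 1e12:.1f} ps")
```

In this model the optimum scales with the square root of the wire's resistance-capacitance product, so a wire with a lower product needs fewer buffers. Whether that is the mechanism behind the MWCNT result reported above is not stated in the abstract.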


Citations: 14

Heuristic finite-impulse-response filter design for cascaded ΣΔ modulators with finite amplifier gain

IET Circuits Devices Syst.

Pub Date : 2012-10-02 DOI: 10.1049/iet-cds.2011.0177

Y. Chou, Chun-Chen Lin, Hsin-Liang Chen, Jen-Shiun Chiang

This study addresses a new digital calibration filter design for cascaded ΣΔ modulators with finite amplifier gain. A recent approach to this problem, based on the H-infinity loop shaping method, has the merit of obviating an estimation or adaptive digital correction scheme, which reduces the complexity of circuit implementation. For the approach to be successful, it is critical to find an appropriate weighting function that shapes the gain responses of the uncertain noise transfer function (NTF) so as to improve the signal-to-noise ratio (SNR). However, the search for such a weighting function is difficult in general. Moreover, the introduced weighting function increases the filter order and hence the circuit complexity. To circumvent this difficulty and the inherited drawbacks, this study presents a new noise shaping method for the problem. Considering that it is hard to decide the optimal shape of the uncertain NTF a priori, the authors propose a dual-band design to achieve the shape adjustment task. In particular, the range of the lower frequency band is determined by SNR performance evaluation rather than being arbitrarily given a priori. This step is crucial and increases the chance of finding a better filter.


Citations: 0

Design of rectifier diode temperature compensation circuit in flyback converter

IET Circuits Devices Syst.

Pub Date : 2012-10-02 DOI: 10.1049/iet-cds.2011.0254

Ling-feng Shi, Y. J. Chang, Hui-sen He, H. Nie, Y. Zhao

A rectifier diode temperature compensation circuit is presented for a primary-side controlled flyback converter. By compensating for the variation of the secondary-side rectifier diode forward voltage with temperature, the output voltage error rate of the flyback converter is effectively reduced at high temperature. The design of the circuit is based on the negative temperature characteristic of the base-emitter voltage V_BE of bipolar transistors. In addition, the circuit provides over-temperature protection. Simulation results based on a 0.5 µm bipolar complementary metal oxide semiconductor process show that the compensation voltage is 0.1 V at 125°C and 0 V at 25°C. With compensation, the maximum output voltage error rate of the flyback converter is reduced from 3.8% to 0.6% over the temperature range from 25 to 125°C. The thermal shutdown threshold is 140°C, and the over-temperature protection hysteresis threshold is 110°C.
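The reported thresholds can be read as a simple behavioural model, sketched below. Two things are assumptions rather than facts from the abstract: that the compensation voltage varies linearly between the two reported points (0 V at 25°C, 0.1 V at 125°C), and that the protection trips at or above 140°C and releases at or below 110°C. The function and class names are hypothetical, and the actual circuit is analogue.

```python
def compensation_voltage(temp_c, t_low=25.0, t_high=125.0, v_high=0.1):
    """Compensation voltage in volts, assuming a linear ramp between the
    two reported points and clamping outside that range (assumption)."""
    if temp_c <= t_low:
        return 0.0
    if temp_c >= t_high:
        return v_high
    return (temp_c - t_low) * v_high / (t_high - t_low)


class ThermalShutdown:
    """Over-temperature protection with hysteresis (behavioural model)."""

    def __init__(self, trip_c=140.0, release_c=110.0):
        self.trip_c = trip_c
        self.release_c = release_c
        self.shutdown = False

    def update(self, temp_c):
        """Feed one temperature reading; return True while shut down."""
        if not self.shutdown and temp_c >= self.trip_c:
            self.shutdown = True
        elif self.shutdown and temp_c <= self.release_c:
            self.shutdown = False
        return self.shutdown


if __name__ == "__main__":
    protection = ThermalShutdown()
    for temp in (100, 139, 140, 125, 111, 110, 120):
        state = "shutdown" if protection.update(temp) else "running"
        print(f"{temp:>4} degC  Vcomp={compensation_voltage(temp):.3f} V  {state}")
```

The 30°C gap between the trip and release thresholds keeps the converter from chattering on and off while the die temperature hovers near the shutdown point.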


Citations: 4

Empirical model for cooperative resizing of processor structures to exploit power-performance efficiency at runtime

IET Circuits Devices Syst.

Pub Date : 2012-09-01 DOI: 10.1049/iet-cds.2011.0354

O. Khan, S. Kundu

Power consumption has become a major concern in systems ranging from data centres to handheld devices. Traditionally, improvement in the power-performance efficiency of a modern superscalar processor came from technology scaling. However, that is no longer the case. Many current systems deploy coarse-grain voltage and/or frequency scaling for power management. These techniques are attractive, but limited by their granularity of control and their effectiveness in nano-complementary metal-oxide-semiconductor (CMOS) technologies. This study proposes a novel architecture-level mechanism to exploit intra-thread variations for power-performance efficiency in modern superscalar processors. This class of processors implements several buffer/queue structures to support speculative out-of-order execution for performance enhancement. Applications may not need the full capabilities of such structures at all times. A mechanism that collaboratively adapts a finite set of key hardware structures to the changing programme behaviour can allow the processor to operate with heterogeneous power-performance capabilities. This study presents a novel offline regression-based empirical model to estimate structure resizing for a selected set of structures. It is shown that, using a few processor runtime events, the system can dynamically estimate structure resizing to exploit power-performance efficiency. Results show that, using the proposed empirical model, a selected set of key structures can be resized at runtime to deliver on average 40% power-performance efficiency over a baseline design, with only 5% loss of performance.


Citations: 0

Power-efficient decoder implementation based on state transparent convolutional codes

IET Circuits Devices Syst.

Pub Date : 2012-07-01 DOI: 10.1049/iet-cds.2011.0055

Yeu-Horng Shiau, Hung-Yu Yang, Pei-Yin Chen, Shi-Gi Huang

In this study, a power-efficient very large-scale integration (VLSI) implementation of a convolutional code decoder is presented. Based on the state transparent convolutional code definition, the received codewords are classified into non-erroneous and erroneous segments. Unlike the conventional Viterbi decoder (VD), the authors use a low-complexity decoder, denoted the bit reverse decoder, to recover the non-erroneous segments using a reverse operation with little power consumption, and present a segment-based VD to decode the erroneous codeword segments. The clock-gating technique is then employed to switch between the segment-based VD and the bit reverse decoder to save power. To further reduce the power consumption, the authors group the registers in the survivor memory unit of the segment-based VD into several segments and apply clock gating to each segment individually. According to the number of consecutive erroneous codeword segments, the corresponding number of register segments in the survivor memory unit is enabled and the other register segments are clock-gated to reduce switching activity. In addition, the design determines the start and terminal states of the survivor path to obtain correct results for erroneous segments without bit-error rate degradation. Compared with other decoders, the design requires less power without decreasing the decoding performance.


Citations: 6
