ACM Great Lakes Symposium on VLSI: Latest Articles

A novel mixed-signal self-calibration technique for baseband filters in systems-on-chip mobile transceivers

ACM Great Lakes Symposium on VLSI

Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591522

Yongsuk Choi, Yong-Bin Kim

This paper presents a novel digitally-assisted automatic frequency tuning technique. The self-calibration technique is verified on a 130 nm CMOS 4th-order biquad baseband low-pass filter with a 20 MHz cut-off frequency, which satisfies typical LTE receiver specifications. The proposed tuning method includes hardware reduction methods, coherent sampling, and a magnitude calculator using the "alpha max plus beta min" algorithm, which together reduce chip area significantly with negligible accuracy degradation. The cut-off frequency is tunable from 16.2 MHz to 24.4 MHz, and the tuning error is less than 0.4% over the whole frequency tuning range. The estimated area is 0.027 mm² with 80% device density, and power dissipation is 0.16 mW at a 128 MHz clock speed with a 1.2 V supply voltage.
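The "alpha max plus beta min" magnitude estimate named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' circuit: the abstract does not state which coefficients the paper uses, so the values below (the pair that minimizes worst-case relative error) and the names `magnitude_estimate` and `worst_case_relative_error` are assumptions.

```python
import math

# Assumed coefficients: the pair that minimizes the worst-case relative error.
# The abstract does not say which coefficients the paper actually uses.
ALPHA = 2 * math.cos(math.pi / 8) / (1 + math.cos(math.pi / 8))  # about 0.9604
BETA = 2 * math.sin(math.pi / 8) / (1 + math.cos(math.pi / 8))   # about 0.3978


def magnitude_estimate(i, q, alpha=ALPHA, beta=BETA):
    """Approximate sqrt(i*i + q*q) using only abs, compare, multiply and add."""
    a, b = abs(i), abs(q)
    return alpha * max(a, b) + beta * min(a, b)


def worst_case_relative_error(steps=3600):
    """Sweep a unit vector around the circle and return the largest error."""
    worst = 0.0
    for k in range(steps):
        theta = 2 * math.pi * k / steps
        est = magnitude_estimate(math.cos(theta), math.sin(theta))
        worst = max(worst, abs(est - 1.0))  # true magnitude is 1.0
    return worst
```

With these coefficients the estimate stays within about 4% of the true magnitude. A cheaper choice such as alpha = 1 and beta = 1/2 needs only a shift and an add, at the cost of a worst-case error of roughly 11.8%.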

Citations: 0

H.264 8x8 inverse transform architecture optimization

ACM Great Lakes Symposium on VLSI

Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591564

F. Pereira, A. Soares, A. Susin, A. Bonatto, M. Negreiros

This paper presents a resource-optimized hardware solution to perform the H.264 8x8 inverse transform. Row/column decomposition is used, arithmetic units are re-used, and the transpose memory is replaced by a shift register. The architecture is able to perform the 8x8 integer transform calculation in 144 cycles with as few as 431 LUTs on a Xilinx Virtex-6 FPGA for 16-bit resolution. To enable the module to process all inverse transforms in H.264, the number of LUTs is increased to 681. When used to calculate all transforms for H.264 videos, the design supports resolutions up to 1280x720@30fps when running at 84 MHz.
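Row/column decomposition, as mentioned in the abstract, computes a separable 2D transform as two passes of the same 1D transform with a transpose in between. The sketch below shows only that idea in software. The function names are hypothetical, the transform matrix `t` is a caller-supplied placeholder rather than the H.264 coefficient set, and the transposes are ordinary list operations, whereas the paper replaces the transpose memory with a shift register.

```python
def transform_1d(vec, t):
    """Apply the 1D transform matrix t to one vector."""
    return [sum(t[k][n] * vec[n] for n in range(len(vec))) for k in range(len(t))]


def transpose(m):
    """Swap rows and columns of a square block."""
    return [list(col) for col in zip(*m)]


def transform_2d_row_column(block, t):
    """2D transform as a row pass, a transpose, and a second row pass."""
    row_pass = [transform_1d(row, t) for row in block]
    col_pass = [transform_1d(col, t) for col in transpose(row_pass)]
    return transpose(col_pass)


def transform_2d_direct(block, t):
    """Reference result: Y[u][v] = sum over i, j of t[u][i] * X[i][j] * t[v][j]."""
    n = len(block)
    return [
        [
            sum(t[u][i] * block[i][j] * t[v][j] for i in range(n) for j in range(n))
            for v in range(n)
        ]
        for u in range(n)
    ]
```

Because both passes reuse `transform_1d`, a hardware version can share a single 1D arithmetic unit between them, which is consistent with the arithmetic-unit reuse the abstract describes.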

Citations: 2

A generic implementation of a quantified predictor on FPGAs

ACM Great Lakes Symposium on VLSI

Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591517

G. Thomas, A. Elhossini, B. Juurlink

Predictors are used in many fields of computer architecture to enhance performance. With good estimations of future system behaviour, policies can be developed to improve system performance or reduce power consumption. These policies become more effective if the predictors are implemented in hardware and can provide quantified forecasts and not only binary ones. In this paper, we present and evaluate a generic predictor implemented in VHDL running on an FPGA which produces quantified forecasts. Moreover, a complete scalability analysis is presented which shows that our implementation has a maximum device utilization of less than 5%. Furthermore, we analyse the power consumption of the predictor running on an FPGA. Additionally, we show that this implementation can be clocked at over 210 MHz. Finally, we evaluate a power-saving policy based on our hardware predictor. Based on predicted idle periods, this power-saving policy uses power-saving modes and is able to reduce memory power consumption by 14.3%.

Citations: 2

WeDBless: weighted deflection bufferless router for mesh NoCs

ACM Great Lakes Symposium on VLSI

Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591559

Simi Zerine Sleeba, John Jose, M. G. Mini

Bufferless NoC routers employing deflection routing are gaining popularity due to their power and area efficiency. We propose WeDBless, a bufferless deflection router that reduces the deflection rate of flits by employing port allocation based on weighted deflection of flits. The proposed method directs frequently misrouted flits towards their destination by increasing their probability of getting a productive output port. Our evaluations on synthetic traffic patterns show that, compared to the state-of-the-art bufferless router, WeDBless achieves a significant reduction in deflection rate and average flit latency, an improvement in network saturation point, and reduced complexity in route computing logic.

Citations: 5

System-level reliability exploration framework for heterogeneous MPSoC

ACM Great Lakes Symposium on VLSI

Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591519

Z. Wang, Chao Chen, Piyush Sharma, A. Chattopadhyay

The power density of digital circuits has increased at an alarming rate in deep sub-micron CMOS technology, turning reliability into a serious design concern. At the same time, ever-growing task complexity with strict performance budgets has forced designers to adopt complex, heterogeneous MPSoCs as the implementation choice. Several commercial system-level design platforms currently exist for the design, exploration and implementation of MPSoCs. In this paper, we propose a system-level reliability exploration framework by extending a commercial system-level design flow. Using this framework, a heterogeneous MPSoC is designed which can accept a custom mapping algorithm based on the MPSoC topology before the actual task deployment. The dynamic reliability-aware task management is able to consider the desired reliability constraints of tasks as well as the reliability levels of the system components. We report our experimental findings using state-of-the-art benchmark applications.

Citations: 5

New 4T-based DRAM cell designs

ACM Great Lakes Symposium on VLSI

Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591515

Wei Wei, K. Namba, F. Lombardi

Dynamic random access memories (DRAMs) are widely used in processor design. Different cells have been proposed in the past to overcome concerns associated with low retention time, degradation in performance due to process variations, and susceptibility to soft errors. This paper proposes two novel DRAM cells (referred to as 4TI and 4T1D) that utilize the techniques of gated diode and forward body-biasing to overcome the above issues. The designs of these cells are evaluated by HSPICE simulation; different figures of merit (such as Read delay, Write delay, retention time, power dissipation, critical charge and layout area) are assessed, and a comparative analysis of the proposed cells with existing cells is pursued. The 4TI cell achieves the best power dissipation, while the 4T1D achieves the best retention time, the highest critical charge and the least average Read delay. An extensive simulation-based evaluation of process variations, using static and Monte Carlo analysis, is also presented; it indicates that the proposed cells are likely to be less affected by process variations (in threshold voltage and effective channel length) than the other cells found in the technical literature.

Citations: 1

VLSI implementation of linear MIMO detection with boosted communications performance: extended abstract

ACM Great Lakes Symposium on VLSI

Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591551

Dominik Auras, D. Rieth, R. Leupers, G. Ascheid

A novel class of linear soft-input soft-output detectors featuring boosted communications performance is introduced. Compared to state-of-the-art linear detectors, the detector has an SNR gain of up to 2.4 dB. We briefly summarize the algorithm and sketch a suitable architecture. The corresponding ASIC implementation shows the feasibility and efficiency of the concept. It achieves the IEEE 802.11n standard's peak data rate of 600 Mbit/s.

Citations: 2

Hardening QDI circuits against transient faults using delay-insensitive maxterm synthesis

ACM Great Lakes Symposium on VLSI

Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591531

Matheus T. Moreira, R. Guazzelli, G. Heck, Ney Laert Vilar Calazans

The correct functionality of quasi-delay-insensitive (QDI) asynchronous circuits can be jeopardized by the presence and propagation of transient faults. If these faults are latched, they will corrupt data validity and can make the whole circuit stall, given the strict event ordering constraints imposed by handshaking protocols. This is particularly concerning for the delay-insensitive minterm synthesis logic style, widely adopted by asynchronous designers to implement combinational quasi-delay-insensitive logic, because it makes extensive use of C-elements and these components are rather vulnerable to transient effects. This paper demonstrates that this logic style subjects C-elements to their most vulnerable states during operation. It accordingly proposes the alternative use of delay-insensitive maxterm synthesis for hardening QDI circuits against transient faults. The latter is a logic style based on the return-to-one 4-phase protocol. Although this style also relies on extensive usage of C-elements, the states where these components are most vulnerable are avoided. Results show improvements of over 300% in C-element tolerance to transient faults in the best case.
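The reason a transient can be latched by a C-element can be illustrated with a small behavioural model. This sketch assumes the standard two-input Muller C-element (the output follows the inputs when they agree and holds its value otherwise). The class name and the scenario are illustrative and are not taken from the paper, which analyses vulnerability at the circuit level.

```python
class CElement:
    """Behavioural model of a two-input Muller C-element."""

    def __init__(self, initial=0):
        self.out = initial

    def update(self, a, b):
        # Output changes only when both inputs agree; otherwise it holds.
        if a == b:
            self.out = a
        return self.out


def glitch_is_latched():
    """Return True if a short pulse on one input permanently flips the output."""
    c = CElement(initial=0)
    c.update(1, 0)          # inputs disagree: output holds 0
    c.update(1, 1)          # transient pulse on input b: output switches to 1
    after = c.update(1, 0)  # pulse ends, but the output keeps the faulty 1
    return after == 1
```

In this model the element is exposed whenever its inputs disagree, because a pulse on the lagging input is enough to switch the output and the new value is then held.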

Citations: 6

Reliability-aware cross-point resistive memory design

ACM Great Lakes Symposium on VLSI

Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591528

Cong Xu, Dimin Niu, Yang Zheng, Shimeng Yu, Yuan Xie

The transition metal oxide (TMO) resistive random access memory (ReRAM) has been identified as one of the most promising candidates for the next generation of non-volatile memory (NVM) technology. Numerous TMO ReRAMs with different materials have been developed and have demonstrated attractive characteristics, such as fast read/write speed, low power consumption, high integration density, and good scalability. Among them, the most attractive characteristic of ReRAM is its cross-point structure, which features a 4F² cell size. However, the existence of sneak current and voltage drop along the wire resistance in a cross-point array brings extra design challenges. In addition, a robust ReRAM design needs to deal with both soft and hard errors. In this paper, we summarize the mechanisms of both soft and hard errors of ReRAM cells and propose a unified model to characterize different failure behaviors. We quantitatively analyze the impact of cell failure modes on the reliability of the cross-point array. We also propose an error-resilient architecture which avoids unnecessary writes in the hard error detection unit. Experimental results show that our design can extend the lifetime of ReRAM by up to 75% over the design without hard error detection and by up to 12% over the design with a "write-verify" detection mechanism.

Citations: 5

Thermal-aware phase-based tuning of embedded systems

ACM Great Lakes Symposium on VLSI

Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591586

Tosiron Adegbija, A. Gordon-Ross

Due to embedded systems' stringent design constraints, much prior work has focused on optimizing energy consumption and/or performance. However, since embedded systems have fewer cooling options, rising temperature, and thus temperature optimization, is an emerging concern. We present thermal-aware phase-based tuning (TaPT), which determines Pareto optimal configurations for fine-grained execution time, energy, and temperature tradeoffs. Results show that TaPT reduces execution time, energy, and temperature by as much as 5%, 30%, and 25%, respectively, while adhering to designer-specified design constraints.
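As a rough illustration of what "Pareto optimal configurations" means for the three metrics named above, the sketch below filters a set of candidate configurations down to those that no other candidate beats on every metric. It is a generic dominance filter, not TaPT itself. The function names are hypothetical, and each candidate is assumed to be a tuple of (execution time, energy, temperature) in which lower is better.

```python
def dominates(p, q):
    """True if p is no worse than q on every metric and better on at least one."""
    no_worse = all(x <= y for x, y in zip(p, q))
    strictly_better = any(x < y for x, y in zip(p, q))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the configurations that no other configuration dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

A designer-specified constraint, such as a temperature ceiling, could then be applied by discarding front members that violate it before choosing a configuration.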

Citations: 3
