
Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors: latest articles

High level functional verification closure
Surrendra Dudani, Jayant Nagda
We present a methodology to obtain high level functional verification closure. We discuss the current advances in critical technologies that are part of the verification closure solution. Within this methodology, assertion specifications are the starting point and central to the collaboration required between various verification tasks to efficiently search for tests and allow automation to proceed. The goal of verification closure is to generate a complete set of tests that meet the design quality criteria established for the design.
DOI: https://doi.org/10.1109/ICCD.2002.1106753 | Published: 2002-09-16
Citations: 2
A design methodology for application-specific real-time interfaces
Stefan Ihmor, M. Visarius, W. Hardt
The complexity of embedded systems has increased rapidly in recent years. Several design approaches, including system-level design as well as IP-based design, have improved the design process. The rising number of instantiated components implies a set of complex interfaces. High-speed data transmission rates, fault tolerance and predictability are key challenges for interface design. Thus, highly sophisticated interfaces have to be generated for the different applications. In this paper we present a design methodology for application-specific real-time interfaces. The high-level design specification is done in a UML-based formalism. An interface-block (IFB) is derived from this specification. The IFB handles data sequencing and protocol generation. Both parts are controlled hierarchically. Within the IFB all application-specific restrictions, channel features, and target platform characteristics are taken into account. Our approach is illustrated by a case study implementing real-time communication between two interacting robots.
DOI: https://doi.org/10.1109/ICCD.2002.1106820 | Published: 2002-09-16
Citations: 5
A new architecture for signed radix-2^m pure array multipliers
E. Costa, S. Bampi, J. Monteiro
We present a new architecture for signed multiplication which maintains the pure form of an array multiplier, exhibiting a much lower overhead than the Booth architecture. This architecture is extended for radix-2^m encoding, which leads to a reduction of the number of partial lines, enabling a significant improvement in performance and power consumption. The flexibility of our architecture allows for the easy construction of multipliers for different values of m, as opposed to the Booth architecture, for which implementations for m > 2 are complex. The results we present show that the proposed architecture with radix-4 compares favorably in performance and power with the Modified Booth multiplier. We have experimented with our architecture using different values of m and concluded that m = 4 minimizes both delay and power.
DOI: https://doi.org/10.1109/ICCD.2002.1106756 | Published: 2002-09-16
Citations: 27
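The radix-2^m idea in the multiplier abstract above can be illustrated arithmetically. The following is a minimal sketch, not the circuit proposed in the paper: it assumes a two's complement multiplier split into m-bit digits, where the lower digits are treated as unsigned and only the top digit carries the sign. The function names `to_signed`, `radix_2m_digits` and `multiply_radix_2m` are hypothetical. It shows that the number of partial product lines falls from W (for m = 1) to W/m.

```python
def to_signed(value, width):
    """Interpret the low `width` bits of `value` as a two's complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def radix_2m_digits(multiplier, width, m):
    """Split a width-bit two's complement multiplier into width/m digits.

    Digits are returned least significant first. Lower digits are unsigned
    (0 .. 2^m - 1); the most significant digit is signed and carries the sign
    of the whole operand. This digit scheme is an assumption of the sketch.
    """
    assert width % m == 0, "width must be a multiple of m"
    bits = multiplier & ((1 << width) - 1)
    mask = (1 << m) - 1
    digits = [(bits >> shift) & mask for shift in range(0, width, m)]
    digits[-1] = to_signed(digits[-1], m)
    return digits


def multiply_radix_2m(a, b, width, m):
    """Return (product, number of partial product lines) for signed a * b."""
    a = to_signed(a, width)
    digits = radix_2m_digits(b, width, m)
    # One partial product line per digit, weighted by 2^(m * position).
    lines = [(a * digit) << (m * index) for index, digit in enumerate(digits)]
    return sum(lines), len(lines)
```

With 8-bit operands, `multiply_radix_2m(5, -3, 8, 1)` sums eight lines, while `multiply_radix_2m(5, -3, 8, 4)` sums only two; both return the product -15.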
Trace-level speculative multithreaded architecture
Carlos Molina, Antonio González, Jordi Tubella
This paper presents a novel microarchitecture to exploit trace-level speculation by means of two threads working cooperatively in a speculative and non-speculative way respectively. The architecture presents two main benefits: (a) no significant penalties are introduced in the presence of a misspeculation and (b) any type of trace predictor can work together with this proposal. In this way, aggressive trace predictors can be incorporated since misspeculations do not introduce significant penalties. We describe in detail TSMA (trace-level speculative multithreaded architecture) and present initial results to show the benefits of this proposal. We show how simple trace predictors achieve significant speed-up in the majority of cases. Results of a simple trace speculation mechanism show an average speed-up of 16%.
DOI: https://doi.org/10.1109/ICCD.2002.1106802 | Published: 2002-09-16
Citations: 10
Accurate and efficient static timing analysis with crosstalk
I-De Huang, S. Gupta, M. Breuer
We have developed an accurate and efficient methodology to perform static timing analysis (STA) in combinational logic blocks in the presence of multiple crosstalk-induced noise effects. The crosstalk model used is more accurate because it considers skew, input transition times, and driver strengths. This crosstalk model is enhanced to handle timing ranges for performing STA. The methodology also uses more accurate delay models for gates. The presence of one or more coupling capacitances can create cyclic timing dependencies, even in an otherwise acyclic circuit. We have developed an approach to partition the circuit into minimal timing-iterative subcircuits (TISs) that encapsulate the cyclic timing dependencies. When used in conjunction with our levelization procedure, iterative timing analysis is confined within individual TISs. We have demonstrated that the maximum arrival time values computed by the proposed STA using integrated delay models are much closer to detailed circuit simulation results than an STA that uses the 3C_c delay model.
DOI: https://doi.org/10.1109/ICCD.2002.1106780 | Published: 2002-09-16
Citations: 11
Speculative trace scheduling in VLIW processors
Manvi Agarwal, S. Nandy, J. V. Eijndhoven, S. Balakrishnan
VLIW processors are statically scheduled processors and their performance depends on the quality of schedules generated by the compiler's scheduler. We propose a new scheduling scheme where the application is first divided into decision trees and then further split into traces. Traces are speculatively scheduled on the processor based on their probability of execution. We have developed a tool "SpliTree" to generate traces automatically. By using dynamic branch prediction for scheduling traces, our scheme achieves approximately 1.4× performance improvement over that using decision trees for Spec92 benchmarks simulated on TriMedia™.
DOI: https://doi.org/10.1109/ICCD.2002.1106803 | Published: 2002-09-16
Citations: 7
TTA-C2 a single chip communication controller for the time-triggered-protocol
M. Ley, H. Grünbacher
This paper describes the architecture and implementation of the first industrial single chip communication controller for the Time Triggered Protocol (TTP/C). TTP/C is an emerging communication protocol for fault-tolerant real time systems. Typical applications are safety-critical digital control systems such as drive-by-wire and fly-by-wire. We applied a VHDL based design flow to implement an application specific RISC core with several specialized peripheral blocks, RAMs, flash memory and analog cells. For production of the 27 mm² chip, a 0.35 µ Flash-CMOS technology is used. Fully tested samples are already available and proved the design to be "first time right".
DOI: https://doi.org/10.1109/ICCD.2002.1106811 | Published: 2002-09-16
Citations: 3
Performance enhancements to the Active Memory System
W. Srisa-an, D. Lo, J. M. Chang
The Active Memory System - a garbage collected memory module - was introduced as a way to provide hardware support for garbage collection in embedded systems. The major component in the design was the Active Memory Processor (AMP) that utilized a set of bit-maps and a combinational circuit to perform mark-sweep garbage collection. The design can achieve constant time for both allocation and sweeping. In this paper two enhancements are made to the design of AMP so that it can perform one-bit reference counting that postpones the need to perform garbage collection. Moreover, a caching mechanism is also introduced to reduce the hardware cost of the design. The experimental results show that the proposed modification can reduce the number of garbage collection invocations by 76%. The speed-up in marking time can be as much as 5.81. With the caching mechanism, the hardware cost can be as small as 27 K gates and 6 KB of SRAM.
DOI: https://doi.org/10.1109/ICCD.2002.1106778 | Published: 2002-09-16
Citations: 3
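The Active Memory System abstract above combines allocation bit-maps with one-bit reference counting so that some garbage is reclaimed without a full collection. A minimal software sketch of that general idea follows. It is an illustration under stated assumptions, not the AMP hardware design: the class `OneBitRcHeap`, its method names, and the "sticky shared bit" policy are all hypothetical choices made for this example.

```python
class OneBitRcHeap:
    """Toy heap of fixed-size blocks tracked by two bit-maps (Python ints).

    allocated: bit i set => block i is in use.
    shared:    bit i set => block i has had more than one reference.
               The bit is sticky: once set, only a sweep can reclaim the block.
    """

    def __init__(self, size):
        self.size = size
        self.allocated = 0
        self.shared = 0
        self.gc_invocations = 0

    def allocate(self):
        """Return the index of the lowest free block."""
        free = ~self.allocated & ((1 << self.size) - 1)
        if free == 0:
            raise MemoryError("heap full; a mark-sweep collection is needed")
        index = (free & -free).bit_length() - 1  # position of lowest set bit
        self.allocated |= 1 << index
        return index

    def add_reference(self, index):
        """A second reference was created: the one-bit count saturates."""
        self.shared |= 1 << index

    def drop_reference(self, index):
        """Drop a reference; reclaim immediately if the block was never shared."""
        if not (self.shared >> index) & 1:
            self.allocated &= ~(1 << index)
            return True
        return False  # shared blocks wait for the mark-sweep fallback

    def sweep(self, marked):
        """Mark-sweep fallback: keep only blocks whose bit is set in `marked`."""
        self.allocated &= marked
        self.shared &= marked
        self.gc_invocations += 1
```

In this sketch, blocks that only ever had a single reference are freed as soon as that reference is dropped, which postpones the point at which `allocate` runs out of space and a sweep becomes necessary.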
Parallel multiple-symbol variable-length decoding
Jari Nikara, S. Vassiliadis, J. Takala, M. Sima, P. Liuha
In this paper a parallel Variable-Length Decoding (VLD) scheme is introduced. The scheme is capable of decoding all the codewords in an N-bit buffer whose accumulated codelength is at most N. The proposed method partially breaks the recursive dependency related to the MPEG-2 VLD. All possible codewords in the buffer are detected in parallel and the sum of the codelengths is provided to the external shifter aligning the variable-length coded input stream for a new decoding cycle. Two length detection mechanisms are proposed: the first approach determines the length in a parallel/serial fashion and the second using a new device denoted as MultiplexedAdd. In order to prove feasibility and determine the limiting factors of our proposal, the parallel/serial codeword detector with 32-bit input has been described in behavioral non-optimized VHDL and mapped onto Altera's ACEX EP1K100 FPGA. The implemented prototype exhibits a latency of 110 ns and uses 32% of the logic cells of the device. When applied to MPEG-2 standard benchmark scenes, on average 3.5 symbols are decoded per cycle.
DOI: https://doi.org/10.1109/ICCD.2002.1106759 | Published: 2002-09-16
Citations: 20
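The decoding scheme in the abstract above detects candidate codewords at every buffer position in parallel and then reports the summed code length to a shifter. A rough software sketch of that flow is given below. It assumes a small made-up prefix code (`CODE_TABLE`) rather than the MPEG-2 tables, models the buffer as a string of '0'/'1' characters, and uses the hypothetical names `match_at` and `decode_buffer`; the hardware length-detection mechanisms from the paper are not modelled.

```python
# Hypothetical prefix code used only for this illustration.
CODE_TABLE = {"0": "a", "10": "b", "110": "c", "111": "d"}
MAX_LEN = max(len(code) for code in CODE_TABLE)


def match_at(buffer_bits, offset):
    """Return (symbol, length) for the codeword starting at `offset`, or None.

    None means no complete codeword fits in the buffer from this offset.
    """
    for length in range(1, MAX_LEN + 1):
        word = buffer_bits[offset:offset + length]
        if len(word) == length and word in CODE_TABLE:
            return CODE_TABLE[word], length
    return None


def decode_buffer(buffer_bits):
    """Decode every complete codeword in the buffer.

    Step 1 (parallel in hardware): find a candidate codeword at each offset.
    Step 2: follow the chain of lengths from offset 0 to select the real ones.
    Returns (symbols, consumed_bits); consumed_bits is the shift amount that
    aligns the input stream for the next decoding cycle.
    """
    candidates = [match_at(buffer_bits, i) for i in range(len(buffer_bits))]
    symbols, offset = [], 0
    while offset < len(buffer_bits) and candidates[offset] is not None:
        symbol, length = candidates[offset]
        symbols.append(symbol)
        offset += length
    return symbols, offset
```

For the 7-bit buffer `"0101101"`, the sketch decodes three symbols and consumes 6 bits; the trailing `1` is an incomplete codeword and stays in the stream for the next cycle.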
A low power pseudo-random BIST technique
N. Z. Basturkmen, S. Reddy, I. Pomeranz
Peak power consumption during testing is an important concern. For scan designs, a high level of switching activity is created in the circuit during scan shifts, which increases power consumption considerably. In this paper we propose a pseudo-random BIST scheme for scan designs, which reduces the peak power consumption as well as the average power consumption as measured by the switching activity in the circuit. The method reduces the switching activity in the scan chains and the activity in the circuit under test by limiting the scan shifts to a portion of the scan chain structure using scan chain disable. Experimental results on various benchmark circuits demonstrate that the technique reduces the switching activity caused by scan shifts.
DOI: https://doi.org/10.1109/ICCD.2002.1106815 | Published: 2002-09-16
Citations: 5
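The effect described in the BIST abstract above, limiting scan shifts to part of the scan chain structure, can be illustrated with a toy model. The sketch below is a simplification under stated assumptions and not the scheme from the paper: it counts only toggles inside the scan cells (not in the circuit under test), uses a small Fibonacci LFSR with arbitrarily chosen taps as the pseudo-random source, and compares shifting all chains together against shifting one chain at a time while the others are disabled. All function names are hypothetical.

```python
def lfsr_bits(seed, taps, width, count):
    """Generate `count` pseudo-random bits from a Fibonacci LFSR."""
    state, out = seed, []
    for _ in range(count):
        out.append(state & 1)
        feedback = 0
        for tap in taps:
            feedback ^= (state >> tap) & 1
        state = (state >> 1) | (feedback << (width - 1))
    return out


def shift_toggles(chain, bit):
    """Shift one bit into a scan chain; return (new_chain, cells that changed)."""
    new_chain = [bit] + chain[:-1]
    return new_chain, sum(old != new for old, new in zip(chain, new_chain))


def per_cycle_toggles(streams, chain_len, one_chain_at_a_time):
    """Return the number of scan cell toggles in each shift cycle."""
    chains = [[0] * chain_len for _ in streams]
    per_cycle = []
    if one_chain_at_a_time:
        # Only one chain shifts per cycle; the others are disabled and hold.
        for i, stream in enumerate(streams):
            for bit in stream:
                chains[i], toggles = shift_toggles(chains[i], bit)
                per_cycle.append(toggles)
    else:
        # All chains shift in every cycle, so their toggles add up.
        for bits in zip(*streams):
            total = 0
            for i, bit in enumerate(bits):
                chains[i], toggles = shift_toggles(chains[i], bit)
                total += toggles
            per_cycle.append(total)
    return per_cycle


bits = lfsr_bits(seed=0b1001, taps=(0, 1), width=4, count=32)
streams = [bits[i * 8:(i + 1) * 8] for i in range(4)]
all_chains = per_cycle_toggles(streams, 8, one_chain_at_a_time=False)
one_chain = per_cycle_toggles(streams, 8, one_chain_at_a_time=True)
```

In this model the total number of toggles is identical in both modes, but with one chain enabled at a time the per-cycle peak can never exceed the peak with all chains shifting, at the cost of four times as many shift cycles.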