21st International Conference on VLSI Design (VLSID 2008): Latest Papers

Fault Tolerant Dynamic Antenna Array in Smart Antenna System Using Evolved Virtual Reconfigurable Circuit
Pub Date : 2008-01-04 DOI: 10.1109/VLSI.2008.32
D. Dhanasekaran, K. Bagan
A majority of applications require cooperation among two or more independently designed, separately located, but mutually affecting subsystems. Beyond the good behavior of each subsystem, effective coordination is very important to achieve the desired overall performance. However, such coordination is very difficult to attain, mainly because precise system models and/or dynamic parameters are lacking. In such situations, evolvable hardware (EHW) techniques, which can achieve the sophisticated level of information processing the brain is capable of, can excel. In this paper, a new virtual-reconfigurable-circuit-based drive circuit for the array elements of a smart antenna, using the techniques of evolved operators, is presented. The idea of this work is to develop a system that tolerates array element failure (fault tolerance) by using a phased-array input programmer connected to a programmable VLSI chip. The approach chosen is based on functional-level evolution, whose architecture contains many nonlinear functions and uses an evolutionary algorithm to evolve the best configuration. The system is tested for effectiveness on real-time phase control in a three-element smart antenna array with three input phases, introducing different element failures such as an element failing as an open circuit, a sensor failing as a short circuit, noise added to an individual element, and multiple element failures. In each case the mean square error is computed and used as the performance index.
Citations: 7
Retimed Decomposed Serial Berlekamp-Massey (BM) Architecture for High-Speed Reed-Solomon Decoding
Pub Date : 2008-01-04 DOI: 10.1109/VLSI.2008.45
Shahid Rizwan
This paper presents a retimed, decomposed, inversion-less serial Berlekamp-Massey (BM) architecture for Reed-Solomon (RS) decoding. The key idea is to apply retiming to the critical path in order to achieve high decoding performance. The standard-basis irregular fully parallel multiplier is separated into partial product generation (PPG) and partial product reduction (PPR) stages to implement the proposed modified decomposed inversion-less serial BM algorithm. The proposed RS(255,239) decoder is implemented in Verilog HDL and synthesized with a 0.18 µm CMOS std 130 standard cell library. The proposed architecture achieves an almost 76% increase in speed and throughput, and can be used in high-speed, high-throughput applications such as DVD and optical fiber communications.
Citations: 9
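The PPG/PPR split mentioned in the Reed-Solomon abstract above can be illustrated in software. The following is a minimal sketch, not the paper's hardware design: it assumes GF(2^8) with the field polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D), which the abstract does not specify, and the function names are hypothetical.

```python
# Illustrative sketch only: FIELD_POLY and all names below are assumptions,
# not taken from the paper.
M = 8                # field is GF(2^M)
FIELD_POLY = 0x11D   # x^8 + x^4 + x^3 + x^2 + 1 (assumed)


def partial_products(a, b):
    """PPG stage: one shifted copy of `a` for every set bit of `b`."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(M)]


def reduce_products(rows):
    """PPR stage: XOR the rows together, then reduce modulo FIELD_POLY."""
    acc = 0
    for row in rows:
        acc ^= row
    # Clear bits 2M-2 down to M by folding in the shifted field polynomial.
    for bit in range(2 * M - 2, M - 1, -1):
        if (acc >> bit) & 1:
            acc ^= FIELD_POLY << (bit - M)
    return acc


def gf_mul(a, b):
    """Standard-basis GF(2^8) multiplication as PPG followed by PPR."""
    return reduce_products(partial_products(a, b))
```

Keeping the two stages as separate functions mirrors the idea that a register could be placed between them when the multiplier sits on a critical path.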
Fast Congestion Aware Routing for Pin Assignment
Pub Date : 2008-01-04 DOI: 10.1109/VLSI.2008.110
S. Prasad
Macroblock (aka partition) pin assignment and routing are important tasks in typical top-down hierarchical physical design. Routers use pin locations as connection points to route the design with the goal of minimizing congestion. However, determining suitable pin locations itself depends on the availability of a congestion-free routing topology as a seed input. This results in a catch-22 situation. In this paper, we present an approach, for use during the prototyping phase, to generate a fast-and-dirty congestion-free routing topology in the top channels. This is a real chip routing topology in the sense that the routing topology of every net adheres to the physical hierarchy, as would happen during hierarchical implementation. It is passed as a seed to the pin assignment engine, which thus produces congestion-free pin locations. The novelty of this approach lies in the efficient detection of those inter-partition nets whose routing topology has little or no bearing on top-channel congestion. These nets are then either not routed or routed in a fast, hierarchy-unaware manner. We show that this routing topology is good enough (less than 10% error margin) to establish suitable cross points at partition boundaries, while the speed-up achieved is around 6X compared to routing all nets in a hierarchy-aware manner. Experimental results demonstrate its efficiency and effectiveness. Furthermore, it can also be used effectively as seed input for decisions such as channel sizing between partitions and budgeting timing constraints to partitions.
Citations: 2
Dynamic Error Detection for Dependable Cache Coherency in Multicore Architectures
Pub Date : 2008-01-04 DOI: 10.1109/VLSI.2008.68
Hui Wang, Sandeep Baldawa, R. Sangireddy
In chip multiprocessor (CMP) systems, the various effects of technology scaling make on-chip components more susceptible to faults. Most earlier schemes that address fault tolerance in CMPs adopt redundant-thread techniques. These techniques are mostly effective, except that they fail to detect errors resulting from faults in on-chip hardware components that commonly serve multiple cores. The cache coherence controller (CC) logic, which ensures consistency of data shared among multiple threads, is a vital common component in CMPs. A fault in the CC logic of any of the processors may lead to errors in the data states of the entire CMP system. It is observed that up to 59.6% of the memory references cause a change in cache state for SPLASH-2 applications. We propose a novel scheme with verification logic that can dynamically detect errors in the CC logic of multiple cores in a CMP system. The entire verification logic is designed with a negligible area of 0.1372 sq. mm using a TSMC 0.18 µm 4-metal-layer process technology. Even at highly aggressive fault injection rates, the logic achieves an average error coverage of more than 95% (and almost 100% for some applications).
Citations: 20
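The cache-coherence abstract above does not describe how its verification logic works. As a rough illustration of what checking coherence state at run time can mean, the sketch below tests one well-known invariant of a MESI-style protocol: a Modified or Exclusive copy of a line must be the only valid copy. The choice of MESI, the `State` enum and the `coherence_error` function are all assumptions made for this example.

```python
from enum import Enum


class State(Enum):
    """Per-core state of one cache line (MESI is assumed, not from the paper)."""
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def coherence_error(states):
    """Return True if the per-core states of one line violate the MESI invariant."""
    owners = sum(s in (State.MODIFIED, State.EXCLUSIVE) for s in states)
    valid_copies = sum(s is not State.INVALID for s in states)
    # An M or E copy must be the only valid copy of the line.
    return owners > 1 or (owners == 1 and valid_copies > 1)
```

For example, `coherence_error([State.MODIFIED, State.SHARED])` flags a line that one core believes it owns exclusively while another still holds a readable copy.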
Variability-Tolerant Register-Transfer Level Synthesis
Pub Date : 2008-01-04 DOI: 10.1109/VLSI.2008.114
Anish Muttreja, S. Ravi, N. Jha
Variability in circuit delay is a significant challenge in the design and synthesis of digital circuits. While the challenge is being addressed at various levels of the design hierarchy, we argue that modern register-transfer level (RTL) synthesis tools can be enhanced to deal with this problem in an alternate, yet effective, manner. Our solution involves the design of variability-tolerant, correct circuits assuming common-case, rather than worst-case, values for critical path delays. We propose a methodology to design variability-tolerant circuits that can, at runtime, detect and efficiently recover from the delay errors that the use of common-case delay values inevitably introduces. Variability-agnostic designs are automatically transformed into variability-tolerant circuits by introducing shadow logic to detect and recover from runtime errors, while exploiting data speculation to derive performance benefits. For various benchmark circuits, we show that the area overhead imposed by our scheme is only 11.4% on average, while achieving up to 16.3% performance speedup over margined designs.
Citations: 6
Formal Verification of a Public-Domain DDR2 Controller Design
Pub Date : 2008-01-04 DOI: 10.1109/VLSI.2008.94
Abhishek Datta, V. Singhal
This paper demonstrates a formal verification-planning process and presents an associated verification strategy that we believe is an essential (yet often neglected) step in an ASIC or SoC functional formal verification flow. Our contribution is to present a way to apply the verification planning process and a set of abstraction techniques to a non-trivial open-source example (the Sun OpenSPARC™ DDR2 controller). The process and verification strategy can be applied to DDR2 controllers in particular and generalized to other designs.
Citations: 15
Exploiting Circuit Reconvergence through Static Learning in CNF SAT Solvers
Pub Date : 2008-01-04 DOI: 10.1109/VLSI.2008.90
Yinlei Yu, C. Brien, S. Malik
Most contemporary SAT solvers use a conjunctive-normal-form (CNF) representation for logic functions due to the availability of efficient algorithms for this form, such as deduction through unit propagation and conflict-driven learning using clause resolution. The use of CNF generally entails transformation to this form from other representations such as logic circuits (Tseitin, 1970). However, this transformation results in loss of information such as the direction of signal flow and the observability of signals at circuit outputs (Een, 2003; Fu, 2005). This has prompted the development of various circuit-based solvers (Ganai et al., 2002), hybrid CNF+circuit-based solvers (Fu, 2005), as well as augmented CNF solvers (Een, 2003). Having the circuit available provides additional capabilities at a cost, and thus requires careful analysis to determine the viability of each approach. This paper highlights one specific capability provided by a circuit: the ability to consider reconvergent paths in unit propagation. Unit propagation is the workhorse of contemporary SAT solvers, so any improvement to it has significant practical potential. We first demonstrate that the Tseitin circuit-to-CNF transformation limits backward unit propagation, and show how additional implications can be derived when unit propagation across multiple paths is considered. Next, we show how these implications can be exploited by statically learning clauses during circuit pre-processing. The results of the practical implementation of these algorithms show that static learning can provide significant speed-up on several classes of benchmark circuits. Finally, we discuss how this work compares with other circuit-based approaches, especially those arising from the automatic-test-pattern-generation (ATPG) community (e.g. recursive learning), and with circuit-based and non-circuit-based pre-processors.
Citations: 2
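The limitation described in the SAT abstract above can be reproduced on a tiny circuit. The sketch below is an illustration under assumptions, not the paper's algorithm: the circuit y = (a AND b) OR (a AND c), the variable numbering and the helper names are all chosen for this example. Signal a fans out to both AND gates and reconverges at the OR gate, so y = 1 forces a = 1, yet plain unit propagation on the Tseitin clauses cannot derive it. Adding the single clause (NOT y OR a), the kind of clause that static learning could contribute, makes the implication visible.

```python
def unit_propagate(clauses, assumptions):
    """Return the set of literals implied by unit propagation, or None on conflict."""
    assigned = set(assumptions)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assigned for lit in clause):
                continue  # clause already satisfied
            open_lits = [lit for lit in clause if -lit not in assigned]
            if not open_lits:
                return None  # every literal is false: conflict
            if len(open_lits) == 1:
                assigned.add(open_lits[0])  # unit clause forces this literal
                changed = True
    return assigned


def tseitin_and(out, x, y):
    """Clauses encoding out <-> (x AND y)."""
    return [[-out, x], [-out, y], [out, -x, -y]]


def tseitin_or(out, x, y):
    """Clauses encoding out <-> (x OR y)."""
    return [[out, -x], [out, -y], [-out, x, y]]


# DIMACS-style variable numbers (arbitrary choice for this example).
A, B, C, X1, X2, Y = range(1, 7)
cnf = tseitin_and(X1, A, B) + tseitin_and(X2, A, C) + tseitin_or(Y, X1, X2)

plain = unit_propagate(cnf, [Y])                 # nothing beyond y itself
learned = unit_propagate(cnf + [[-Y, A]], [Y])   # now a is implied as well
```

Here `plain` contains only the assumption y, while `learned` also contains a.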
A Novel Carry-Look Ahead Approach to a Unified BCD and Binary Adder/Subtractor
Pub Date : 2008-01-04 DOI: 10.1109/VLSI.2008.80
S. Veeramachaneni, K. M. Krishna, V. PrateekG., S. Subroto, S. Bharat, M. Srinivas
With the increasing prominence of commercial, financial and Internet-based applications that process decimal data, there is growing interest in providing hardware support for such data. In this paper, a new architecture for an efficient binary and binary coded decimal (BCD) adder/subtractor is presented. It employs a new method of subtraction, unlike existing designs, which mostly use 10's complement, to obtain a much lower latency. Though correction is necessary in some cases, the delay overhead is minimal. A complete discussion of such cases and the logic required to process them is presented. The architecture is run-time reconfigurable to support both BCD and binary operations, on signed and unsigned numbers. The proposed circuits are compared (both qualitatively and quantitatively) with existing circuits in the literature and are shown to perform better. Simulation results show that the proposed architecture is at least 11% faster than the existing designs.
Citations: 11
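For context on the BCD abstract above, the sketch below shows the conventional baseline it contrasts itself with: digit-wise BCD addition with a +6 correction, and subtraction via 10's complement. It does not reproduce the paper's new subtraction method, which the abstract does not describe. The four-digit operand width and all function names are assumptions.

```python
DIGITS = 4  # packed-BCD operand width in decimal digits (an assumption)


def bcd_add(x, y, carry_in=0):
    """Add two packed-BCD integers digit by digit; returns (sum, carry_out)."""
    result, carry = 0, carry_in
    for i in range(DIGITS):
        s = ((x >> 4 * i) & 0xF) + ((y >> 4 * i) & 0xF) + carry
        carry = 1 if s > 9 else 0
        if carry:
            s = (s + 6) & 0xF  # skip the six unused codes 1010..1111
        result |= s << 4 * i
    return result, carry


def nines_complement(x):
    """Replace every BCD digit d of x by 9 - d."""
    return sum((9 - ((x >> 4 * i) & 0xF)) << 4 * i for i in range(DIGITS))


def bcd_sub(x, y):
    """x - y via 10's complement: x + (9's complement of y) + 1.

    A carry_out of 1 means x >= y and the result is the true difference.
    """
    return bcd_add(x, nines_complement(y), carry_in=1)
```

As a usage example, `bcd_add(0x0478, 0x0256)` adds decimal 478 and 256, and `bcd_sub(0x0734, 0x0256)` undoes it.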
Power Attack Resistant Efficient FPGA Architecture for Karatsuba Multiplier
Pub Date : 2008-01-04 DOI: 10.1109/VLSI.2008.65
C. Rebeiro, Debdeep Mukhopadhyay
The paper presents an architecture to implement a Karatsuba multiplier on an FPGA platform. A detailed analysis has been carried out of how existing algorithms utilize FPGA resources. Based on these observations, the work develops a hybrid technique that has a better area-delay product than the known algorithms. The results have been demonstrated in practice through a large number of experiments. Subsequently, the work develops a masking strategy to prevent power-based side-channel attacks on the multiplier. The proposed masked hybrid Karatsuba multiplier is found to be more compact than existing designs.
Citations: 28
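For readers unfamiliar with the underlying algorithm, the following is a minimal software sketch of generic Karatsuba multiplication on non-negative integers, which replaces four half-size multiplications with three. It is not the hybrid FPGA architecture or the masking scheme from the paper; the function name `karatsuba` and the `threshold_bits` cutoff are assumptions made for illustration only.

```python
def karatsuba(x: int, y: int, threshold_bits: int = 32) -> int:
    """Multiply two non-negative integers by recursive Karatsuba splitting.

    threshold_bits is a hypothetical cutoff below which the operands are
    multiplied directly instead of being split further.
    """
    if x < 0 or y < 0:
        raise ValueError("operands must be non-negative")
    if threshold_bits < 4:
        raise ValueError("threshold_bits must be at least 4")

    n = max(x.bit_length(), y.bit_length())
    if n <= threshold_bits:
        return x * y  # base case: direct multiplication

    half = n // 2
    mask = (1 << half) - 1

    # Split each operand into a low part and a high part.
    x_lo, x_hi = x & mask, x >> half
    y_lo, y_hi = y & mask, y >> half

    # Three recursive multiplications instead of four.
    low = karatsuba(x_lo, y_lo, threshold_bits)
    high = karatsuba(x_hi, y_hi, threshold_bits)
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi, threshold_bits) - low - high

    # Recombine: x*y = high * 2^(2*half) + cross * 2^half + low
    return (high << (2 * half)) + (cross << half) + low
```

The cutoff mirrors a common design choice: below some operand size the bookkeeping of splitting costs more than it saves, so a simpler multiplier handles the base case.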
An Optimal Multi-Functional Unit Dynamic Instruction Selection Logic at Submicron Technologies
Pub Date : 2008-01-04 DOI: 10.1109/VLSI.2008.55
Terrell R. Bennett, R. Sangireddy
As technology scales, the reduction in transistor size creates many opportunities for increased circuit capability in a reduced chip area. In modern wide-issue processors, performance is directly affected by the delay of the dynamic scheduling logic. In this paper, we analyze how the delay of instruction select logic scales at submicron technologies, and we present novel designs that provide a single selection tree for two similar functional units. The designs are based on a tree structure built from two-input and four-input arbiter cells, which can handle one or two functional units. The effects of technology and design decisions are shown through simulations using four submicron technologies. The delays in the select logic trees are shown to decrease by an average of 60% from 130 nm technology to 45 nm technology when servicing a single functional unit. The double-grant arbiter cells are shown to build a tree that serves multiple functional units simultaneously with 65% less delay than multiple single-grant trees.
Citations: 3
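The idea of a selection tree built from arbiter cells can be illustrated with a small behavioral model. The sketch below is not the circuit from the paper: it models only which requests win, not the request-up and grant-down signal propagation or any timing. The function names `select_one` and `select_two`, the use of two-input cells only, and the lowest-index-first priority policy are all assumptions made for illustration.

```python
from typing import List, Optional


def select_one(requests: List[bool]) -> Optional[int]:
    """Single-grant tree: return the lowest-index requester, or None."""
    # Each node carries (has_request, winner_index).
    level = [(req, idx) for idx, req in enumerate(requests)]
    while len(level) > 1:
        merged = []
        for k in range(0, len(level), 2):
            left = level[k]
            right = level[k + 1] if k + 1 < len(level) else (False, -1)
            # Two-input arbiter cell: the left input has priority.
            merged.append(left if left[0] else right)
        level = merged
    return level[0][1] if level and level[0][0] else None


def select_two(requests: List[bool]) -> List[int]:
    """Double-grant tree: return up to two winners for two functional units."""
    # Each node carries the list of at most two winning indices beneath it.
    level = [[idx] if req else [] for idx, req in enumerate(requests)]
    while len(level) > 1:
        merged = []
        for k in range(0, len(level), 2):
            right = level[k + 1] if k + 1 < len(level) else []
            # Double-grant cell: forward at most two winners, left side first.
            merged.append((level[k] + right)[:2])
        level = merged
    return level[0] if level else []
```

In this model one double-grant tree does the work that would otherwise take two single-grant passes, which is the structural reason a shared tree can serve two similar functional units.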