Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors: Latest Articles

Design of delay-insensitive three dimension pipeline array multiplier for image processing
A. Taubin, K. Fant, J. McCardle
This paper presents a novel delay-insensitive three-dimensional pipeline array multiplier. The organization combines deep (gate-level) pipelining of Manchester adders with a two-dimensional cross-pipeline mesh for multiplicand and multiplier propagation and partial-product bit calculation. Fine-grain pipelining, together with the elimination of broadcasting and completion trees, yields high throughput without the use of dynamic logic, which leaves the door open for further performance improvement.
DOI: https://doi.org/10.1109/ICCD.2002.1106755
Published: 2002-09-01
Citations: 10
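
As a rough illustration of the partial-product structure an array multiplier is built around, the following Python sketch forms each partial-product bit as `a_i AND b_j` and accumulates the rows with ripple-carry addition. It is a purely functional model under stated assumptions: the function name `array_multiply` and the 4-bit default width are hypothetical, and the sketch does not model the delay-insensitive signalling, the Manchester adders, or the pipelining described in the abstract.

```python
def array_multiply(a, b, width=4):
    """Multiply two unsigned `width`-bit integers the way an array multiplier
    does: one row of AND-ed partial-product bits per multiplier bit, each row
    added into the running sum with a ripple carry."""
    a_bits = [(a >> i) & 1 for i in range(width)]  # multiplicand bits, LSB first
    b_bits = [(b >> j) & 1 for j in range(width)]  # multiplier bits, LSB first
    acc = [0] * (2 * width)                        # running sum, LSB first

    for j in range(width):            # one array row per multiplier bit
        carry = 0
        for i in range(width):        # one cell per multiplicand bit
            partial_product = a_bits[i] & b_bits[j]
            total = acc[i + j] + partial_product + carry
            acc[i + j] = total & 1    # sum bit stays in this column
            carry = total >> 1        # carry moves to the next column
        acc[j + width] = carry        # column j + width is untouched until now

    return sum(bit << k for k, bit in enumerate(acc))


if __name__ == "__main__":
    print(array_multiply(13, 11))
```

In hardware, every cell of the inner loop is a separate adder cell, so rows can be pipelined; the sequential loops here only reproduce the arithmetic.
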
Floating-point fused multiply-add with reduced latency
T. Lang, J. Bruguera
We propose an architecture for computing the floating-point multiply-add-fused (MAF) operation A + (B × C). The architecture is based on combined addition and rounding (using a dual adder) and on anticipating the normalization step before the addition. Because the normalization is performed before the addition, it is not possible to overlap the leading-zero anticipator (LZA) with the adder. Consequently, to avoid an increase in delay, we modify the design of the LZA so that the leading bits of its output are produced first and can be used to begin the normalization. Moreover, parts of the addition are also anticipated. We have estimated the delay of the resulting architecture for the double-precision format, considering the load introduced by long connections, and estimate a reduction of about 15% to 20% with respect to traditional implementations of the floating-point MAF unit.
DOI: https://doi.org/10.1109/ICCD.2002.1106762
Published: 2002-09-01
Citations: 47
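
To make concrete what distinguishes a fused multiply-add from a separate multiply followed by an add, the Python sketch below emulates the single rounding of A + (B × C) with exact rational arithmetic. This illustrates the arithmetic behaviour only, not the architecture proposed in the paper; the function names `maf_fused` and `maf_unfused` are hypothetical, and the emulation assumes IEEE 754 double precision with round-to-nearest, which is what CPython floats use on common platforms.

```python
from fractions import Fraction


def maf_fused(a, b, c):
    """Emulate A + (B * C) with a single rounding step.

    Fraction(float) is exact, so the sum below is the exact real result;
    converting it back to float rounds once, as a fused unit would."""
    return float(Fraction(a) + Fraction(b) * Fraction(c))


def maf_unfused(a, b, c):
    """Separate multiply and add: the product is rounded before the addition."""
    return a + b * c


if __name__ == "__main__":
    a = -1.0
    b = 1.0 + 2.0 ** -30
    c = 1.0 - 2.0 ** -30   # exact product b * c is 1 - 2**-60

    print(maf_fused(a, b, c))    # keeps the -2**-60 residue
    print(maf_unfused(a, b, c))  # product rounds to 1.0, so the sum is 0.0
```

The exact product 1 − 2⁻⁶⁰ is not representable in double precision and rounds to 1.0, so the unfused result loses the residue entirely, while the fused result preserves it.
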
Adaptive pipeline depth control for processor power-management
A. Efthymiou, J. Garside
A method of managing the power consumption of an embedded, single-issue processor by controlling its pipeline depth is proposed. The execution time will be increased but, if the method is applied to applications with slack time, the user-perceived performance may not be degraded. Two techniques are shown using an existing asynchronous processor as a starting point. The first method controls the pipeline occupancy using a token mechanism; the second enables adjacent pipeline stages to be merged by making the latches between them 'permanently' transparent. An energy reduction of up to 16% is measured using a collection of five benchmarks.
DOI: https://doi.org/10.1109/ICCD.2002.1106812
Published: 2002-09-01
Citations: 54
Don't-care identification on specific bits of test patterns
K. Miyase, S. Kajihara, I. Pomeranz, S. Reddy
Given a test set for stuck-at faults, a primary input value may be changed to the opposite logic value without losing fault coverage. One can regard such a value as a don't-care (X). The don't-care values can be filled appropriately to achieve test compaction, test data compression, or power reduction during testing. However, these uses are better served if the don't-cares can be placed in desired/specific bit positions of the test patterns. In this paper, we present a method for maximally fixing Xs on specific bits of given test vectors. Experimental results on ISCAS benchmark circuits show how the proposed method can increase the number of Xs on specific bits compared with an earlier proposed method.
DOI: https://doi.org/10.1109/ICCD.2002.1106769
Published: 2002-09-01
Citations: 18
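
The idea that an input value "may be changed to the opposite logic value without losing fault coverage" can be checked by brute force on a toy circuit. The sketch below is a naive per-bit check, not the method proposed in the paper: the circuit `y = (a AND b) OR c`, the fault list, the test set `TESTS`, and all function names are hypothetical choices made for illustration.

```python
SITES = ["a", "b", "c", "n", "y"]          # inputs, internal AND output, final output
FAULTS = [(site, stuck) for site in SITES for stuck in (0, 1)]


def simulate(vector, fault=None):
    """Evaluate y = (a AND b) OR c, optionally with one stuck-at fault."""
    def line(name, value):
        # A faulty line ignores its driven value and holds the stuck value.
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    a, b, c = (line(name, value) for name, value in zip("abc", vector))
    n = line("n", a & b)
    return line("y", n | c)


def detected_faults(tests):
    """Faults for which at least one test gives a different output."""
    return {f for f in FAULTS for t in tests if simulate(t) != simulate(t, f)}


def flippable_bits(tests):
    """(test index, bit index) pairs whose value can be inverted, one at a
    time, without losing any fault the original test set detects."""
    baseline = detected_faults(tests)
    result = []
    for i, vector in enumerate(tests):
        for j in range(len(vector)):
            flipped = list(vector)
            flipped[j] ^= 1
            trial = tests[:i] + [tuple(flipped)] + tests[i + 1:]
            if baseline <= detected_faults(trial):
                result.append((i, j))
    return result


TESTS = [(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]

if __name__ == "__main__":
    print(flippable_bits(TESTS))
```

On this test set only bits `a` and `b` of the last vector qualify, and they cannot both be treated as X at once: the vector `(1, 1, 1)` no longer detects `c` stuck-at-0. That interaction is why choosing which bits become X, and where, needs more than an independent per-bit check.
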
A distributed computation platform for wireless embedded sensing
A. Savvides, M. Srivastava
We present a low-cost wireless microsensor node architecture for distributed computation and sensing in massively distributed embedded systems. Our design focuses on the development of a versatile, low-power device to facilitate experimentation and initial deployment of wireless microsensor nodes in deeply embedded systems. This paper provides the details of our architecture and introduces fine-grained node localization as an example application of distributed computation and wireless embedded sensing.
DOI: https://doi.org/10.1109/ICCD.2002.1106774
Published: 2002-05-05
Citations: 83
Efficient PEEC-based inductance extraction using circuit-aware techniques
Haitian Hu, S. Sapatnekar
Practical approaches for on-chip inductance extraction to obtain a sparse, stable and accurate inverse inductance matrix K are proposed. The novelty of our work is in using circuit characteristics to define the concept of resistance-dominant and inductance-dominant lines. This notion is used to progressively refine a set of clusters that are inductively tightly coupled. For reasonable designs, the more exact algorithm yields a sparsification of 97% for delay and oscillation magnitude errors of 10% and 15%, respectively, while the more approximate algorithm achieves up to 99% sparsification. An offshoot of this work is K-PRIMA, an extension of PRIMA to handle K matrices with guaranteed passivity.
DOI: https://doi.org/10.1109/ICCD.2002.1106808
Citations: 1
Applying decay strategies to branch predictors for leakage energy savings
Zhigang Hu, Philo Juang, K. Skadron, D. Clark, M. Martonosi
With technology advancing toward deep submicron, leakage energy is of increasing concern, especially for large on-chip array structures such as caches and branch predictors. Recent work has suggested that even larger branch predictors can and should be used in order to improve microprocessor performance. A further consideration is that the branch predictor is a thermal hot spot, which further increases its leakage. For these reasons, it is natural to consider applying decay techniques (already shown to reduce leakage energy for caches) to branch-prediction structures. Due to the structural difference between caches and branch predictors, applying decay techniques to branch predictors is not straightforward. This paper explores the strategies for exploiting spatial and temporal locality to make decay effective for bimodal, gshare, and hybrid predictors, as well as the branch target buffer. Overall, this paper demonstrates that decay techniques apply more broadly than just to caches, but that careful policy and implementation make the difference between success and failure in building decay-based branch predictors. Multi-component hybrid predictors offer especially interesting implementation tradeoffs for decay.
DOI: https://doi.org/10.1109/ICCD.2002.1106809
Citations: 45
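
A minimal Python sketch of how decay might be applied to a bimodal predictor follows. Every policy detail is an assumption made for illustration rather than the design evaluated in the paper: counters are grouped into rows, a row is switched off after `decay_interval` predictor accesses without a hit to it, a switched-off row loses its state and restarts at "weakly not taken", and `active_row_cycles` is only a crude proxy for leakage. The class name `DecayingBimodalPredictor` and its parameters are hypothetical.

```python
class DecayingBimodalPredictor:
    """Bimodal predictor (2-bit saturating counters) with row-level decay."""

    WEAK_NOT_TAKEN = 1  # assumed reset state for a reactivated row

    def __init__(self, entries=16, row_size=4, decay_interval=8):
        self.counters = [self.WEAK_NOT_TAKEN] * entries
        self.row_size = row_size
        self.decay_interval = decay_interval
        rows = entries // row_size
        self.active = [False] * rows      # powered-on flag per row
        self.last_access = [0] * rows     # time of the last access per row
        self.now = 0                      # time measured in predictor accesses
        self.active_row_cycles = 0        # leakage proxy: sum of powered rows

    def _tick(self):
        """Advance time and switch off rows that have been idle too long."""
        self.now += 1
        for row, powered in enumerate(self.active):
            if powered and self.now - self.last_access[row] > self.decay_interval:
                self.active[row] = False
        self.active_row_cycles += sum(self.active)

    def _touch(self, index):
        """Wake the row holding `index` if needed; a woken row has lost state."""
        row = index // self.row_size
        if not self.active[row]:
            start = row * self.row_size
            self.counters[start:start + self.row_size] = (
                [self.WEAK_NOT_TAKEN] * self.row_size
            )
            self.active[row] = True
        self.last_access[row] = self.now

    def predict_and_update(self, pc, taken):
        """Return the prediction for `pc`, then train on the real outcome."""
        self._tick()
        index = pc % len(self.counters)
        self._touch(index)
        counter = self.counters[index]
        prediction = counter >= 2
        self.counters[index] = min(3, counter + 1) if taken else max(0, counter - 1)
        return prediction
```

Grouping counters into rows reflects the spatial-locality angle mentioned in the abstract: branches that map to the same row keep it alive together, while a shorter `decay_interval` powers rows down sooner at the cost of more lost training.
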
Embedded operating system energy analysis and macro-modeling
T. K. Tan, A. Raghunathan, N. Jha
A large and increasing number of modern embedded systems are subject to tight power/energy constraints. It has been demonstrated that the operating system (OS) can have a significant impact on the energy efficiency of the embedded system. Hence, analysis of the energy effects of the OS is of great importance. Conventional approaches to energy analysis of the OS (and embedded software, in general) require the application software to be completely developed and integrated with the system software, and that either measurement on a hardware prototype or detailed simulation of the entire system be performed. Since this process requires significant design effort, unfortunately, it is typically too late in the design cycle to perform high-level or architectural optimizations on the embedded software, restricting the scope of power savings. Our work recognizes the need to provide embedded software designers with feedback about the effect of different OS services on energy consumption early in the design cycle. As a first step in that direction, this paper presents a systematic methodology to perform energy analysis and macro-modeling of an embedded OS. Our energy macro-models provide software architects and developers with an intuitive model for the OS energy effects, since they directly associate energy consumption with OS services and primitives that are visible to the application software. Our methodology consists of (i) an analysis stage, where we identify a set of energy components, called energy characteristics, which are useful to the designer in making OS-related design trade-offs, and (ii) a subsequent macro-modeling stage, where we collect data for the identified energy components and automatically derive macro-models for them. We validate our methodology by deriving energy macro-models for two state-of-the-art embedded OSs, μC/OS and Linux OS.
DOI: https://doi.org/10.1109/ICCD.2002.1106822
Citations: 40
From ASIC to ASIP: the next design discontinuity
K. Keutzer, S. Malik, A. Newton
A variety of factors is making it increasingly difficult and expensive to design and manufacture traditional Application Specific Integrated Circuits (ASICs). This has started a significant move towards the use of programmable solutions of various forms, increasingly referred to as programmable platforms. For the platform manufacturer, programmability provides higher volume to amortize design and manufacturing costs, as the same platform can be used over multiple related applications, as well as over generations of an application. For the application implementer, programmability provides a lower risk and shorter time-to-market implementation path. The flexibility provided by programmability comes with a performance and power overhead. This can be significantly mitigated by using application specific platforms, also referred to as Application Specific Instruction Set Processors (ASIPs). This paper details the reasons for this significant change in application implementation philosophy, provides illustrative contemporary evidence of this change, examines the space of application specific platforms, outlines fundamental problems in their development, and finally presents a methodology to deal with this changing design style.
DOI: https://doi.org/10.1109/ICCD.2002.1106752
Citations: 141
Accelerated SAT-based scheduling of control/data flow graphs
S. Memik, F. Fallah
In this paper we present a satisfiability-based approach to the scheduling problem in high-level synthesis. We formulate resource-constrained scheduling as a satisfiability (SAT) problem. We present experimental results on the performance of the state-of-the-art SAT solver Chaff, and demonstrate techniques to reduce the SAT problem size by applying bounding techniques to the scheduling problem. In addition, we demonstrate the use of transformations on control/data flow graphs such that the same lower-bound techniques can operate on them as well. Our experiments show that Chaff is able to outperform the integer linear program (ILP) solver CPLEX in terms of CPU time by as much as 59-fold. Finally, we conclude that the satisfiability-based approach is a promising alternative for obtaining optimal solutions to NP-complete scheduling problem instances.
DOI: https://doi.org/10.1109/ICCD.2002.1106801
Citations: 33
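
One plausible way to phrase resource-constrained scheduling as a SAT problem is sketched below in Python. The encoding is an assumption for illustration and not necessarily the paper's formulation: each Boolean variable means "operation `op` starts at step `t`", all operations take one step and share one resource type, and a brute-force enumeration stands in for a real solver such as Chaff. The function names `encode`, `solve`, and `schedule` are hypothetical.

```python
from itertools import combinations, product


def encode(ops, deps, steps, capacity):
    """Build CNF clauses (lists of signed 1-based variable numbers)."""
    var = {key: i + 1 for i, key in enumerate(product(ops, range(steps)))}
    clauses = []
    for op in ops:
        # Each operation is scheduled in at least one step...
        clauses.append([var[op, t] for t in range(steps)])
        # ...and in at most one step.
        for t1, t2 in combinations(range(steps), 2):
            clauses.append([-var[op, t1], -var[op, t2]])
    for before, after in deps:
        # `after` may not start at or before the step in which `before` runs.
        for t_before in range(steps):
            for t_after in range(t_before + 1):
                clauses.append([-var[before, t_before], -var[after, t_after]])
    for t in range(steps):
        # No capacity + 1 operations may share a step.
        for group in combinations(ops, capacity + 1):
            clauses.append([-var[op, t] for op in group])
    return var, clauses


def solve(num_vars, clauses):
    """Brute-force SAT check; only viable for tiny instances."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits
    return None


def schedule(ops, deps, steps, capacity):
    """Return {operation: step} if a schedule exists, else None."""
    var, clauses = encode(ops, deps, steps, capacity)
    model = solve(len(var), clauses)
    if model is None:
        return None
    return {op: t for (op, t), index in var.items() if model[index - 1]}


if __name__ == "__main__":
    ops = ["a", "b", "c"]
    deps = [("a", "c"), ("b", "c")]   # c consumes the results of a and b
    print(schedule(ops, deps, steps=2, capacity=1))  # infeasible
    print(schedule(ops, deps, steps=3, capacity=1))  # feasible
```

Searching for the smallest feasible `steps` turns the decision problem into an optimization; tightening the range of steps each operation can occupy before encoding removes variables and clauses, which is the general kind of problem-size reduction the abstract attributes to bounding techniques.
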