
ASP-DAC 2004: Asia and South Pacific Design Automation Conference 2004 (IEEE Cat. No.04EX753): Latest Papers

Minimization of fractional wordlength on fixed-point conversion for high-level synthesis
Nobuhiro Doi, T. Horiyama, M. Nakanishi, S. Kimura
In hardware synthesis from a high-level language such as C, the bit length of variables is one of the key issues in area and speed optimization. Usually, designers are required to specify the word length of each variable manually and to verify correctness by simulation on huge data sets. We propose a method for optimizing the fractional word length of floating-point variables during floating-point to fixed-point conversion. The amount of round-off error is formulated with parameters and propagated through data flow graphs. Nonlinear programming is used to solve the fractional word length minimization problem. The method does not require simulation on huge data sets and is very fast compared with simulation-based methods. We show its effect on several programs.
DOI: https://doi.org/10.1109/ASPDAC.2004.1337544 | Published: 2004-01-27
Citations: 18
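The idea of propagating parameterized round-off errors through a data flow graph, as described in the abstract on fractional word length minimization, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the worst-case round-to-nearest error model, the helper names, the example expression `z = a * b + c`, and the brute-force search that stands in for the nonlinear programming step. It is not the authors' formulation.

```python
def quant_error(frac_bits):
    """Worst-case round-to-nearest error with frac_bits fractional bits (assumed model)."""
    return 2.0 ** -(frac_bits + 1)


def add_error(ex, ey, frac_bits_out):
    """Error bound of x + y: input errors add, plus rounding of the result."""
    return ex + ey + quant_error(frac_bits_out)


def mul_error(x_max, ex, y_max, ey, frac_bits_out):
    """Error bound of x * y for |x| <= x_max, |y| <= y_max, plus rounding of the result."""
    return x_max * ey + y_max * ex + ex * ey + quant_error(frac_bits_out)


def output_error(frac_bits):
    """Error bound of z = a * b + c with |a|, |b| <= 1 and one uniform word length."""
    e_in = quant_error(frac_bits)
    e_prod = mul_error(1.0, e_in, 1.0, e_in, frac_bits)
    return add_error(e_prod, e_in, frac_bits)


def min_uniform_frac_bits(target, max_bits=64):
    """Smallest uniform fractional word length meeting the target (brute-force search)."""
    for f in range(max_bits + 1):
        if output_error(f) <= target:
            return f
    raise ValueError("target not reachable within max_bits")


print(min_uniform_frac_bits(1e-3))
```

A real flow would assign a separate word length to each variable and minimize their total subject to the error constraint; the uniform search above only shows how the bound tightens as bits are added.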
Complexity analysis and speedup techniques for optimal buffer insertion with minimum cost
Weiping Shi, Zhuo Li, C. Alpert
As gate delays decrease faster than wire delays with each technology generation, buffer insertion has become a popular method for reducing interconnect delay. Several modern buffer insertion algorithms (e.g., [7, 6, 15]) are based on van Ginneken's dynamic programming paradigm [14]. However, van Ginneken's original algorithm does not control buffering resources and tends to over-buffer, thereby wasting area and power. It has been a major open problem whether it is possible to optimize slack and at the same time minimize buffer usage. This paper settles this open problem by showing that, for arbitrary integer cost functions, the problem is NP-complete. We also extend the pre-buffer slack technique [12] to minimize the buffer cost. This technique can significantly reduce the running time and memory of the buffer cost minimization problem. The experimental results show that our algorithm speeds up the running time by up to 17 times and reduces the memory to 1/30 of that of the best-known traditional algorithm. Finally, we show how to deal efficiently with multiway merge in buffer insertion.
DOI: https://doi.org/10.1109/ASPDAC.2004.1337664 | Published: 2004-01-27
Citations: 43
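Dynamic-programming buffer insertion in the style of van Ginneken keeps, at each node, only candidate solutions that no other candidate dominates. The sketch below shows that textbook dominance rule extended with a cost field; the `Candidate` fields and function names are hypothetical, and this is not the pre-buffer slack technique mentioned in the abstract.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    slack: float  # required arrival time at the node (larger is better)
    cap: float    # downstream capacitance (smaller is better)
    cost: int     # total buffer cost used so far (smaller is better)


def dominates(a: Candidate, b: Candidate) -> bool:
    """a is at least as good as b in slack, capacitance, and cost."""
    return a.slack >= b.slack and a.cap <= b.cap and a.cost <= b.cost


def prune(cands: List[Candidate]) -> List[Candidate]:
    """Return the candidates that are not dominated by any other candidate."""
    kept: List[Candidate] = []
    for c in cands:
        if any(dominates(k, c) for k in kept):
            continue  # c is redundant
        kept = [k for k in kept if not dominates(c, k)]
        kept.append(c)
    return kept


candidates = [
    Candidate(10, 5, 2),
    Candidate(9, 6, 2),   # dominated by the first candidate
    Candidate(12, 7, 3),
    Candidate(10, 5, 1),  # dominates the first candidate
]
print(prune(candidates))
```

With a cost dimension, many more candidates survive pruning than in the slack-only problem, which is one way to see why the running time and memory of cost-aware buffer insertion become a concern.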
Associative memory with fully parallel nearest-Manhattan-distance search for low-power real-time single-chip applications
Yuji Yano, T. Koide, H. Mattausch
A fully parallel minimum-Manhattan-distance search associative memory has been designed in 0.35 μm CMOS with three metal layers. The nearest-match unit occupies only 1.02 mm², while the chip area is 7.49 mm². The measured winner-search time of this chip, that is, the time to determine the best-matching reference-data word for an input-data word among a database of 128 reference words (5-bit, 16 units), is < 180 ns. This corresponds to a performance requirement of 16 GOPS/mm² if a 32-bit computer with the same chip area had to run the same workload. Furthermore, the power dissipation of the designed test chip is only about 26.7 mW/mm².
DOI: https://doi.org/10.1109/ASPDAC.2004.1337640 | Published: 2004-01-27
Citations: 14
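For reference, the winner search that the associative memory performs in hardware can be written as a sequential software routine. This is a minimal sketch with hypothetical function names and a toy database; the chip described in the abstract evaluates all 128 reference words (16 units of 5 bits each, so values 0 to 31) in parallel rather than in a loop.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length words."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_match(query, references):
    """Index of the reference word closest to the query (first one wins on ties)."""
    return min(range(len(references)), key=lambda i: manhattan(query, references[i]))


# Toy database: 3 reference words of 3 units each, values in the 5-bit range 0..31.
references = [[0, 0, 0], [5, 5, 5], [31, 31, 31]]
query = [4, 6, 5]
winner = nearest_match(query, references)
print(winner, manhattan(query, references[winner]))
```

The sequential version needs one subtraction, one absolute value, and one addition per unit per reference word, plus the comparisons for the minimum, which is the kind of workload behind the abstract's equivalent-performance figure.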
Analytical expressions for phase noise eigenfunctions of LC oscillators
P. Ghanta, Zheng Li, J. Roychowdhury
We obtain analytical expressions for eigenfunctions that characterize the phase noise performance of generic LC oscillator structures. Using these, we also obtain analytical expressions for the timing jitter and spectrum of such oscillators. Our approach is based on identifying three fundamental parameters, derived from the oscillator's steady state, that characterize these eigenfunctions. Our analysis accounts for the nonlinear mechanism that stabilizes oscillator amplitudes. It also lays out, quantitatively and in analytical form, how symmetry in an LC oscillator's negative resistance mechanism impacts the oscillator's eigenfunctions and its phase noise/jitter characteristics. We show that symmetry results in particularly simple forms for the PPV and resultant phase noise. We compare our expressions with existing LC oscillator design formulae and show that the expressions match for symmetric nonlinearities. We validate our analytical results against simulation on practical CMOS LC oscillator circuits. Our expressions and symmetry results are expected to be useful tools for optimizing phase noise performance during the design of LC oscillators.
DOI: https://doi.org/10.1109/ASPDAC.2004.1337561 | Published: 2004-01-27
Citations: 6
LPRAM: a low power DRAM with testability
S. Bhattacharjee, D. Pradhan
To date, all proposals for low-power RAM designs have essentially focused on circuit-level solutions. What we propose here is a novel architecture-level solution. Our methodology provides a systematic trade-off between power and area. It also allows a trade-off between test time and power consumed in test mode. Significantly, the proposed design also has the potential to achieve performance improvements while reducing power. In this respect it stands apart from other approaches, where the conventional wisdom is that reducing power reduces speed.
DOI: https://doi.org/10.5555/1015090.1015188 | Published: 2004-01-27
Citations: 1
Hierarchical extraction and verification of symmetry constraints for analog layout automation
S. Bhattacharya, N. Jangkrajarng, R. Hartono, C. Shi
Device matching and layout symmetry are of utmost importance to high performance analog and RF circuits. Here, we present HiLSD, the first CAD tool for the automatic detection of layout symmetry between two or more devices in a hierarchical manner. HiLSD first extracts the circuit structure from the layout, then applies an efficient pattern-matching algorithm to find all the subcircuits automatically, and finally detects layout symmetry on the portion of the layout that corresponds to extracted subcircuit instances. On a set of practical analog layouts, HiLSD is demonstrated to be much more efficient than direct symmetry detection on a flattened layout. Results from applying HiLSD to automatic analog layout retargeting for technology migration and new specifications are also described.
DOI: https://doi.org/10.1109/ASPDAC.2004.1337608 | Published: 2004-01-27
Citations: 21
A thread partitioning algorithm in low power high-level synthesis
J. Uchida, N. Togawa, M. Yanagisawa, T. Ohtsuki
We propose a thread partitioning algorithm for low-power high-level synthesis. The algorithm applies to high-level synthesis systems in which parallel-behaving circuit blocks (threads) can be described explicitly. It first focuses on the local register file (RF) in a thread and partitions the thread into two subthreads, one of which has the RF while the other does not. The partitioned subthreads need to be synchronized with each other to preserve the data dependencies of the original thread. Since the partitioned subthreads have waiting time for synchronization, gated clocks can be applied to each subthread. We can then synthesize a low-power circuit with a low area overhead compared with the original circuit. Experimental results demonstrate the effectiveness and efficiency of the algorithm.
DOI: https://doi.org/10.1109/ASPDAC.2004.1337543 | Published: 2004-01-27
Citations: 6
Predictable design of low power systems by pre-implementation estimation and optimization
W. Nebel
Each year, tens of billions of dollars are wasted by the microelectronics industry because of missed deadlines and delayed design projects. These delays are partly due to design iterations, many of which could have been avoided if the low-level ramifications of high-level design decisions, made at the architecture and algorithm levels, had been known before the time-consuming and tedious RT-level and lower-level implementation started. In this contribution we present a system-level design flow and the respective EDA support tools for low-power designs. We analyze the requirements for such a design technology, which shifts more responsibility to the system architect. We exemplify this approach with a design flow for low-power systems. The architecture of an algorithm-level power estimation tool is presented together with some use cases based on an EDA product that has been commercially developed from the research results of several collaborative projects funded by the Commission of the European Community.
DOI: https://doi.org/10.1109/ASPDAC.2004.1337531 | Published: 2004-01-27
Citations: 1
Temperature-aware global placement
B. Obermeier, F. Johannes
We describe a deterministic placement method for standard cells which minimizes total power consumption and leads to a smooth temperature distribution over the die. It is based on the quadratic placement formulation, where the overall weighted net length is minimized. Two innovations are introduced to achieve the above goals. First, overall power consumption is minimized by shortening nets with a high power dissipation. Second, cells are spread over the placement area such that the die temperature profile inside the package is flattened. Experimental results show a significant reduction of the maximum temperature on the die and a reduction of total power consumption.
DOI: https://doi.org/10.1109/ASPDAC.2004.1337555 | Published: 2004-01-27
Citations: 55
Rate analysis for streaming applications with on-chip buffer constraints
A. Maxiaguine, S. Künzli, S. Chakraborty, L. Thiele
When mapping a streaming application (such as multimedia or network packet processing) onto a specified architecture, an important issue is to determine the input stream rates that the architecture can support for any given mapping. This is subject to typical constraints: on-chip buffers should not overflow, and specified playout buffers (which feed audio or video devices) should not underflow, so that the quality of the audio/video output is maintained. The main difficulty in this problem arises from the high variability in the execution times of stream processing algorithms, coupled with the bursty nature of the streams to be processed. We present a mathematical framework for such a rate analysis for streaming applications and illustrate its feasibility through a detailed case study of an MPEG-2 decoder application. When integrated into a tool for automated design-space exploration, such an analysis can be used for fast performance evaluation of different stream processing architectures.
DOI: https://doi.org/10.1109/ASPDAC.2004.1337553 | Published: 2004-01-27
Citations: 43
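The overflow constraint in the rate-analysis abstract can be made concrete with a small trace-based check. Note that this is a simulation over one assumed arrival and service trace, not the analytical framework the paper presents; the function names, the discrete-time model (items arrive and are then served within each tick), and the traces are all hypothetical.

```python
def max_backlog(arrivals, service):
    """Largest end-of-tick buffer occupancy for per-tick arrival and service counts."""
    backlog = 0
    worst = 0
    for a, s in zip(arrivals, service):
        # Items arrive, then up to s of them are processed in the same tick.
        backlog = max(0, backlog + a - s)
        worst = max(worst, backlog)
    return worst


def fits(arrivals, service, buffer_size):
    """True if the trace never needs more than buffer_size slots."""
    return max_backlog(arrivals, service) <= buffer_size


# A bursty input stream served by a constant-rate processing element.
arrivals = [3, 3, 0, 0, 3, 3]
service = [2, 2, 2, 2, 2, 2]
print(max_backlog(arrivals, service), fits(arrivals, service, 2))
```

A single trace only shows that one particular input is safe. An analytical approach is needed to give guarantees over all streams with a given burstiness, which is the motivation stated in the abstract.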