
2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD): Latest Papers

Fast Poisson solver preconditioned method for robust power grid analysis
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105381
Jianlei Yang, Yici Cai, Qiang Zhou, Jin Shi
Robust and efficient algorithms for power grid analysis are crucial for both VLSI design and optimization. As power grids grow in size, IR drop analysis has become more computationally challenging in both runtime and memory consumption. This work presents a fast Poisson solver preconditioned method for unstructured power grids with non-ideal boundary conditions. By taking advantage of the analytical formulation of power grids, this analytical preconditioner can be regarded as a sparse approximate inverse technique. By combining this analytical preconditioner with a robust conjugate gradient method, we demonstrate that this approach is robust for extremely large-scale power grid simulations. Experimental results show that the iteration count of the proposed method hardly increases with grid size once the pad density and the range of the metal resistance value distribution have been fixed. We demonstrate that this approach solves an unstructured power grid with 2.56M nodes in only 1/3 of the iterations of the classical ICCG solver, and achieves almost 20X runtime speedup over the classical ICCG solver.
Citations: 14
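As a rough illustration of the solver structure the abstract above describes, the sketch below runs a preconditioned conjugate gradient (PCG) iteration on a tiny resistive mesh. It is a minimal sketch under stated assumptions, not the authors' method: the mesh builder, the conductance values, the pad locations, and all function names are hypothetical, and a simple Jacobi (diagonal) preconditioner stands in for the paper's fast Poisson solver preconditioner, which is not reproduced here.

```python
def build_grid_system(n, pad_nodes, g_wire=1.0, g_pad=10.0):
    """Dense conductance matrix of an n x n mesh (hypothetical toy model).

    Unknowns are IR drops (VDD minus node voltage); pads tie nodes to VDD.
    """
    size = n * n
    A = [[0.0] * size for _ in range(size)]
    for r in range(n):
        for c in range(n):
            i = r * n + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < n and cc < n:
                    j = rr * n + cc
                    A[i][i] += g_wire
                    A[j][j] += g_wire
                    A[i][j] -= g_wire
                    A[j][i] -= g_wire
    for p in pad_nodes:
        A[p][p] += g_pad  # pad conductance makes the matrix positive definite
    return A


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, apply_preconditioner, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradient for a symmetric positive definite A."""
    x = [0.0] * len(b)
    r = list(b)                      # residual for the zero initial guess
    z = apply_preconditioner(r)
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        z = apply_preconditioner(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Toy usage: 4 x 4 mesh, pads at two corners, uniform load currents.
grid = build_grid_system(4, pad_nodes=[0, 15])
loads = [0.01] * 16
jacobi = lambda r: [ri / grid[i][i] for i, ri in enumerate(r)]  # stand-in preconditioner
ir_drop, iterations = pcg(grid, loads, jacobi)
```

The preconditioner is passed in as a callback so that a stronger one could be swapped in without touching the iteration itself; the quality of that callback is what determines how the iteration count scales with grid size.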
On rewiring and simplification for canonicity in threshold logic circuits
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105360
Pin-Yi Kuo, Chun-Yao Wang, Ching-Yi Huang
Rewiring is a well-developed and widely used technique in the synthesis and optimization of traditional Boolean logic designs. Threshold logic is an alternative logic representation to Boolean logic that possesses a compact representation. With recent advances in nanomaterials, research on multi-level synthesis, verification, and testing for threshold networks is flourishing. This paper presents an algorithm for rewiring in a threshold network. It works by removing a target wire and then correcting the circuit's functionality by adding a corresponding rectification network. It also proposes a simplification procedure for representing a threshold logic gate canonically. The experimental results show that our approach achieves a 7.1 times speedup over the state-of-the-art multi-level synthesis algorithm when synthesizing a threshold network with a new fanin number constraint.
Citations: 22
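To make the notion of a threshold logic gate in the abstract above concrete, here is a minimal sketch using the standard textbook definition: the gate outputs 1 when the weighted sum of its inputs reaches the threshold. The function names are hypothetical, and the paper's rewiring and canonical simplification procedures are not reproduced. The example only shows why a canonical form is useful: different weight and threshold choices can realize the same function.

```python
from itertools import product


def threshold_gate(weights, threshold, inputs):
    """Output 1 if the weighted input sum reaches the threshold, else 0."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)


def truth_table(weights, threshold):
    """Gate outputs for all input combinations, in binary counting order."""
    n = len(weights)
    return [threshold_gate(weights, threshold, bits)
            for bits in product((0, 1), repeat=n)]


# Two different <weights; threshold> descriptions of the same 2-input AND.
and_a = truth_table((1, 1), 2)
and_b = truth_table((2, 2), 3)

# A 3-input majority function needs only a single threshold gate.
majority = truth_table((1, 1, 1), 2)
```

Because `and_a` and `and_b` describe the same function, comparing gates by their raw weights would miss the equivalence, which is the kind of ambiguity a canonical representation removes.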
A heterogeneous accelerator platform for multi-subject voxel-based brain network analysis
Pub Date : 2011-11-07 DOI: 10.5555/2132325.2132413
Yu Wang, Mo Xu, Ling Ren, Xiaorui Zhang, Di Wu, Yong He, Ningyi Xu, Huazhong Yang
Research on understanding the human brain has attracted increasing attention. A promising method is to model the brain as a network based on modern imaging technologies and then to apply graph theory algorithms for analysis. In this work, we examine the computing bottleneck of this method and propose a CPU-GPU heterogeneous platform to accelerate the process. We construct a statistical brain network from a sample of 198 people and obtain characteristics such as nodal degree and modularity. This is the first study of voxel-based brain networks on large samples. We also illustrate that a domain-specific hardware platform can have a significant impact on neuroscience studies.
Citations: 5
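The nodal degree mentioned in the abstract above can be illustrated with a small sketch. The abstract does not say how edges are defined, so the construction here is an assumption: two voxels are connected when the absolute Pearson correlation of their time series reaches a cutoff. The function names, the cutoff, and the toy data are all hypothetical, and the all-pairs loop is the naive CPU version rather than the paper's CPU-GPU implementation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def nodal_degrees(series, cutoff):
    """Degree of each node in the thresholded correlation network."""
    n = len(series)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):        # all-pairs loop: O(n^2) correlations
            if abs(pearson(series[i], series[j])) >= cutoff:
                degree[i] += 1
                degree[j] += 1
    return degree


# Toy data: three strongly (anti)correlated voxels and one unrelated voxel.
voxels = [
    [1, 2, 3, 4],
    [2, 4, 6, 8],
    [4, 3, 2, 1],
    [1, -1, 1, -1],
]
degrees = nodal_degrees(voxels, cutoff=0.9)
```

The quadratic number of pairwise correlations is what makes voxel-level networks expensive at realistic sizes, which is consistent with the abstract's focus on a computing bottleneck.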
Fast statistical model of TiO2 thin-film memristor and design implication
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105353
Miao Hu, Hai Helen Li, R. Pino
Emerging memristor devices have received increased attention since HP Labs reported the first TiO2-based memristive structure. Because the device has nanoscale geometry, its uniformity is difficult to control due to process variations in fabrication. The resulting design concerns in a memristor-based computing system, e.g., neuromorphic computing, can be very severe because the analog states of memristors are heavily utilized. Therefore, understanding and quantitatively characterizing the impact of process variations on the electrical properties of memristors becomes crucial for the corresponding VLSI designs. In this work, we examined the theoretical model of TiO2 thin-film memristors and studied the relationships between the electrical parameters and the process variations of the devices. A statistical model based on a process-variation-aware memristor device structure is extracted accordingly. Simulations show that our proposed model is 3 to 4 orders of magnitude faster than the existing Monte Carlo simulation method, with only about 2% accuracy degradation. A variable gain amplifier (VGA) is used as the case study to demonstrate the applications of our model in memristor-based circuit designs.
Citations: 19
REBEL and TDC: Two embedded test structures for on-chip measurements of within-die path delay variations
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105322
Charles Lamech, Jim Aarestad, J. Plusquellic, R. Rad, K. Agarwal
As feature printability becomes more challenging in advanced technology nodes, measuring and characterizing process variation effects on delay and power is becoming increasingly important. In this paper, we present two embedded test structures (ETS) for carrying out path delay measurement in actual product designs. Of the two structures proposed here, one is designed to be incorporated into a customer's scan structures, augmenting selected functional units with the ability to perform accurate path delay measurements. We refer to this ETS as REBEL (regional delay behavior). It is designed to leverage the existing scan chain as a means of reducing area overhead and performance impact. For cases in which very high resolution of delay measurements is required, a second standalone structure is proposed which we refer to as TDC for time-to-digital converter. Beyond characterizing process variations, these ETSs can also be used for design debug, detection of hardware Trojans and small delay defects and as physical unclonable functions.
Citations: 25
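For readers unfamiliar with time-to-digital conversion, the sketch below models a generic tapped delay line: the measured interval is quantized to the number of taps an edge passes, and that count is read out as a thermometer code. This is an assumed textbook model, not the circuit proposed in the paper; the function names, the tap delay, and the tap count are hypothetical, and times are integers in picoseconds to avoid rounding issues.

```python
def tdc_thermometer_code(delay_ps, tap_delay_ps, num_taps):
    """Thermometer code latched by an idealized tapped delay line.

    Each tap adds tap_delay_ps; a tap reads 1 if the edge reached it
    within delay_ps. The code saturates once every tap has been passed.
    """
    if tap_delay_ps <= 0 or num_taps <= 0:
        raise ValueError("tap delay and tap count must be positive")
    taps_passed = min(num_taps, max(0, delay_ps // tap_delay_ps))
    return [1] * taps_passed + [0] * (num_taps - taps_passed)


def decode_delay_ps(code, tap_delay_ps):
    """Convert a thermometer code back to a quantized delay estimate."""
    return sum(code) * tap_delay_ps


# Toy usage: a 137 ps path delay measured with 10 ps taps.
code = tdc_thermometer_code(137, tap_delay_ps=10, num_taps=16)
estimate = decode_delay_ps(code, tap_delay_ps=10)
quantization_error = 137 - estimate
```

In this model the measurement resolution equals the tap delay, so the quantization error is always smaller than one tap as long as the line does not saturate.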
Fast statistical timing analysis for circuits with Post-Silicon Tunable clock buffers
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105314
Bing Li, Ning Chen
Post-Silicon Tunable (PST) clock buffers are widely used in high-performance designs to counter process variations. By allowing delay compensation between consecutive register stages, PST buffers can effectively improve the yield of digital circuits. To date, the evaluation of manufacturing yield in the presence of PST buffers has only been possible using Monte Carlo simulation. In this paper, we propose an alternative method based on graph transformations, which is more than 1000 times faster and computes a parametric minimum clock period. It also identifies the gates that are most critical to the circuit performance, thereby enabling a fast analysis-optimization flow.
Citations: 17
Bandwidth-aware reconfigurable cache design with hybrid memory technologies
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105304
Jishen Zhao, Cong Xu, Yuan Xie
In chip-multiprocessor (CMP) designs, limited memory bandwidth is a potential bottleneck of system performance. New memory technologies, such as spin-torque-transfer memory (STT-RAM), resistive memory (RRAM), and embedded DRAM (eDRAM), are promising on-chip memory solutions for CMPs. In this paper, we propose a bandwidth-aware reconfigurable cache hierarchy (BARCH) with hybrid memory technologies. BARCH consists of a hybrid cache hierarchy, a reconfiguration mechanism, and a statistical prediction engine. Our hybrid cache hierarchy chooses different memory technologies to configure each level so that the bandwidth provided by the overall hierarchy is optimized. Furthermore, we present a reconfiguration mechanism to dynamically adapt the cache space of each level based on the predicted bandwidth demands of different applications, which is guaranteed by our prediction engine. We evaluate the system performance gain obtained by our method with a set of multithreaded and multiprogrammed applications. Compared to traditional SRAM-based cache designs, our proposed design improves the system throughput by 58% and 14% for multithreaded and multiprogrammed applications, respectively.
Citations: 26
PTrace: Derivative-free local tracing of bicriterial design tradeoffs
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105375
Amith Singhee
This paper presents a novel method, PTrace, for locally and uniformly tracing convex Pareto-optimal fronts of bicriterial optimization problems. Unlike existing methods, it does not require derivatives of the objectives with respect to the design variables. The method computes a sequence of points along the front in a user-specified direction from a starting point, such that the points are roughly uniformly spaced according to a spacing constraint from the user. At each iteration, a local quadratic model of the front is used to estimate an appropriate weighted sum of objectives that, on optimization, will give the next point on the front. A single-objective optimization on this weighted sum then generates the actual point, which is then used to build a new local model. The method uses convexity-based heuristics to improve on mildly sub-optimal results from the optimizer and reuses cached points to improve the optimization speed and quality. We test the method on a synthetic test case and a 6-T SRAM power-performance tradeoff test case to demonstrate its effectiveness.
Citations: 3
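The weighted-sum step in the abstract above can be sketched as follows. This is plain weighted-sum scalarization with a derivative-free golden-section search on a one-variable toy problem; it is an illustration under stated assumptions, not PTrace itself. In particular, the local quadratic model that PTrace uses to pick weights for uniform spacing is not reproduced, and the objectives `f1` and `f2`, the search interval, and all function names are hypothetical.

```python
def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Derivative-free minimizer for a unimodal function on [lo, hi]."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c                  # minimum lies in [a, d]
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d                  # minimum lies in [c, b]
            d = a + inv_phi * (b - a)
    return (a + b) / 2


def f1(x):
    return x ** 2                        # toy objective 1 (e.g. a power-like cost)


def f2(x):
    return (x - 2) ** 2                  # toy objective 2 (e.g. a delay-like cost)


def pareto_point(weight):
    """Minimize weight*f1 + (1-weight)*f2 and return the objective pair."""
    x = golden_section_minimize(
        lambda t: weight * f1(t) + (1 - weight) * f2(t), 0.0, 2.0)
    return f1(x), f2(x)


# Sweeping the weight traces points along the convex tradeoff curve.
front = [pareto_point(w) for w in (0.2, 0.4, 0.6, 0.8)]
```

Evenly spaced weights generally do not give evenly spaced points on the front, which is the gap the abstract's local model and spacing constraint are meant to address.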
Heterogeneous B*-trees for analog placement with symmetry and regularity considerations
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105378
Pang-Yen Chou, H. Ou, Yao-Wen Chang
Symmetry constraints and regular structures are two major considerations for expert analog layout designers. Symmetry constraints are specified to place matched modules symmetrically with respect to some common axes to reduce unwanted electrical effects. Regular structures are commonly followed by experienced designers to enhance routability and suppress parasitics induced by extra bends of wires and via cost. In this paper, we propose a heterogeneous B*-tree representation to consider symmetry and regularity simultaneously. Corresponding moves and a new regularity cost model for the representation are also presented. Experimental results show that our approach can efficiently generate regularly structured placements satisfying all symmetry constraints. For example, our placer achieves an 18X runtime speedup, 28% smaller area, and 68% shorter wirelength than the previous work, based on placement results, and 60% fewer overflows, 39% fewer vias, and 86% shorter routed wirelength, based on global routing results.
Citations: 25
A framework for accelerating neuromorphic-vision algorithms on FPGAs
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105351
M. DeBole, Ahmed Al-Maashri, M. Cotter, Chi-Li Yu, C. Chakrabarti, N. Vijaykrishnan
Neuromorphic algorithms are traditionally implemented on platforms that consume significant power, falling short of their biological underpinnings. Recent improvements in FPGA technology have led to FPGAs becoming a platform on which these rapidly evolving algorithms can be implemented. Unfortunately, implementing designs on FPGAs still proves challenging for non-experts, limiting their use in the neuroscience domain. In this paper, an FPGA framework is presented that enables neuroscientists to compose multi-FPGA systems for a cortical object classification model. This is demonstrated by mapping this algorithm onto two distinct platforms, providing speedups of up to ~28X over a reference CPU implementation.
Citations: 10