
16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011): Latest Papers

High performance lithographic hotspot detection using hierarchically refined machine learning
Pub Date : 2011-01-25 DOI: 10.1109/ASPDAC.2011.5722294
Duo Ding, A. Torres, F. Pikus, D. Pan
Under real and continuously improving manufacturing conditions, lithography hotspot detection faces several key challenges. First, real hotspots become fewer but harder to fix at post-layout stages; second, the false alarm rate must be kept low to avoid excessive and expensive post-processing hotspot removal; third, full-chip physical verification and optimization require fast turnaround time. To address these issues, we propose a high-performance lithographic hotspot detection flow with ultra-fast speed and high fidelity. It consists of a novel set of hotspot signature definitions and a hierarchically refined detection flow with powerful machine learning kernels: ANN (artificial neural network) and SVM (support vector machine). We have implemented our algorithm with an industry-strength engine under real manufacturing conditions in a 45nm process, and show that it significantly outperforms previous state-of-the-art algorithms in hotspot detection false alarm rate (2.4X to 2300X reduction) and simulation run-time (5X to 237X reduction), while achieving similar or slightly better hotspot detection accuracy. Such high-performance lithographic hotspot detection under real manufacturing conditions is especially suitable for guiding lithography-friendly physical design.
Citations: 73
A physical-location-aware fault redistribution for maximum IR-drop reduction
Pub Date : 2011-01-25 DOI: 10.1109/ASPDAC.2011.5722277
Fu-Wei Chen, Shih-Liang Chen, Yung-Sheng Lin, TingTing Hwang
To guarantee that an application-specific integrated circuit (ASIC) meets its timing requirements, at-speed scan testing has become an indispensable procedure for verifying ASIC performance. However, at-speed scan testing suffers from test-induced yield loss. Because the switching activity in test mode is much higher than in normal mode, the large switching-induced current draw causes severe IR drop and increases gate delay. X-filling is the most commonly used technique to reduce the IR-drop effect during at-speed testing. However, the effectiveness of X-filling depends on the number and distribution characteristics of X-bits. In this paper, we propose a physical-location-aware X-identification technique that redistributes faults so that the maximum switching activity is guaranteed to be reduced after X-filling. Experimental results on the ITC'99 benchmarks show that our method achieves on average 8.54% more reduction in maximum IR drop than a previous work that redistributes X-bits evenly across all test vectors.
Citations: 2
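As a rough illustration of why the X-bit distribution matters for the X-filling mentioned in the entry above, the sketch below fills the don't-care bits of a single scan pattern and counts bit-to-bit transitions. It is not the paper's physical-location-aware method. It uses a generic adjacent-fill heuristic (each X copies the nearest specified bit to its left) and compares it with plain 0-fill. The function names, the `default` value for leading X bits, and the example pattern are assumptions made for illustration.

```python
def adjacent_fill(pattern, default="0"):
    """Fill each 'X' with the most recent specified bit to its left.

    Leading 'X' bits (no specified bit seen yet) take `default`,
    which is an assumption of this sketch.
    """
    filled = []
    last = default
    for bit in pattern:
        if bit == "X":
            filled.append(last)
        else:
            filled.append(bit)
            last = bit
    return "".join(filled)


def zero_fill(pattern):
    """Fill every 'X' with 0."""
    return pattern.replace("X", "0")


def transitions(bits):
    """Count positions where two neighbouring bits differ."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)


pattern = "1XX0XX1X"  # hypothetical test cube with five don't-care bits
adj = adjacent_fill(pattern)
zero = zero_fill(pattern)
print(adj, transitions(adj))    # 11100011 2
print(zero, transitions(zero))  # 10000010 3
```

With the same specified bits, the two fills give different transition counts, and a pattern with no X bits would leave nothing to optimize. That is the dependence on X-bit count and placement that the abstract refers to.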
Power-efficient tree-based multicast support for Networks-on-Chip
Pub Date : 2011-01-25 DOI: 10.1109/ASPDAC.2011.5722214
Wenmin Hu, Zhonghai Lu, A. Jantsch, Hengzhu Liu
In this paper, novel hardware support for multicast on mesh Networks-on-Chip (NoC) is proposed. It supports multicast routing on tree-based paths of any shape. Two power-efficient tree-based multicast routing algorithms, Optimized tree (OPT) and Left-XY-Right-Optimized tree (LXYROPT), are also proposed. The XY tree-based (XYT) algorithm and multiple unicast copies (MUC) are also implemented on the router as baselines. As the destination size increases, OPT and LXYROPT achieve a remarkable improvement over MUC in both latency and throughput, while the average power consumption is reduced by 50% and 45%, respectively. Compared with XYT, OPT is 10% higher in latency but saves 17% in power consumption, whereas LXYROPT is 3% lower in latency and 8% lower in power consumption. In some cases, OPT and LXYROPT consume up to 70% less power than XYT.
Citations: 41
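To give a feel for why tree-based multicast can save power relative to multiple unicast copies (MUC) in the entry above, the following minimal sketch counts link traversals on a 2D mesh. It does not implement the authors' OPT or LXYROPT algorithms. It only compares MUC with a plain XY tree, assuming packets travel along X first and then Y, and that links shared by several destinations are traversed once. The function names and the example coordinates are hypothetical.

```python
def xy_path(src, dst):
    """Return the directed links on the X-then-Y route from src to dst."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:  # move along X first
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:  # then move along Y
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def muc_link_traversals(src, dests):
    """Multiple unicast copies: every destination gets its own packet."""
    return sum(len(xy_path(src, d)) for d in dests)


def xy_tree_link_traversals(src, dests):
    """XY tree: links shared between destinations are traversed once."""
    shared = set()
    for d in dests:
        shared.update(xy_path(src, d))
    return len(shared)


source = (0, 0)
destinations = [(3, 0), (3, 2), (3, 3), (1, 2)]
print(muc_link_traversals(source, destinations))      # 17
print(xy_tree_link_traversals(source, destinations))  # 8
```

Link traversals are only a crude proxy for power; the figures quoted in the abstract come from the authors' router implementation, not from a count like this.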
An on-chip characterizing system for within-die delay variation measurement of individual standard cells in 65-nm CMOS
Pub Date : 2011-01-25 DOI: 10.1109/ASPDAC.2011.5722162
Xin Zhang, K. Ishida, M. Takamiya, T. Sakurai
A new characterization system for within-die delay variations of individual standard cells is presented. The proposed system measures rising and falling delay variations separately by directly measuring the input and output waveforms of an individual gate using an on-chip sampling oscilloscope in a 65nm CMOS process. Seven types of standard cells are measured, with 60 DUTs for each type. Thanks to the proposed system, a relationship between the rising and falling delay variations and the active area of the standard cells is experimentally shown for the first time.
Citations: 11
Parallel statistical capacitance extraction of on-chip interconnects with an improved geometric variation model
Pub Date : 2011-01-25 DOI: 10.1109/ASPDAC.2011.5722272
Wenjian Yu, Chao Hu, Wangyang Zhang
In this paper, a new geometric variation model, referred to as the improved continuous surface variation (ICSV) model, is proposed to accurately imitate the random variation of on-chip interconnects. In addition, a new statistical capacitance solver is implemented to incorporate the ICSV model, the HPC [5] and weighted PFA [6] techniques. The solver also employs a parallel computing technique to greatly improve its efficiency. Experiments show that on a typical 65nm-technology structure, the ICSV model has a significant advantage over other existing models, and the new solver is at least 10X faster than MC simulation with 10000 samples. The parallel solver achieves a further 7X speedup on an 8-core machine. We conclude this paper with several criteria for discussing the trade-off between different geometric models and statistical methods in different scenarios.
Citations: 14
Fault simulation and test generation for clock delay faults
Pub Date : 2011-01-25 DOI: 10.1109/ASPDAC.2011.5722299
Y. Higami, Hiroshi Takahashi, Shin-ya Kobayashi, K. Saluja
In this paper, we investigate the effects of delay faults on clock lines under the launch-on-capture test strategy. In this fault model we assume that scan-in and scan-out operations, being relatively slow, can perform correctly even in the presence of a fault. However, a flip-flop may fail to capture a value at the correct time during system clock operation, thus requiring the use of the launch-on-capture test strategy to detect such a fault. We first show simulation results relating the duration of the delay to the difficulty of detecting such faults in the launch-on-capture test. Next, we propose test generation methods to detect such clock delay faults, and show some experimental results to establish the effectiveness of our methods.
Citations: 14
An integer programming placement approach to FPGA clock power reduction
Pub Date : 2011-01-25 DOI: 10.1109/ASPDAC.2011.5722305
Alireza Rakhshanfar, J. Anderson
Clock signals are responsible for a significant portion of dynamic power in FPGAs owing to their high toggle frequency and capacitance. Clock signals are distributed to loads through a programmable routing tree network, designed to provide low delay and low skew. The placement step of the FPGA CAD flow plays a key role in influencing clock power, as clock tree branches are connected based solely on the placement of the clock loads. In this paper, we present a placement-based approach to clock power reduction based on an integer linear programming (ILP) formulation. Our technique is intended to be used as an optimization post-pass executed after traditional placement, and it offers fine-grained control of the amount by which clock power is optimized versus other placement criteria. Results show that the proposed technique reduces clock network capacitance by over 50% with minimal deleterious impact on post-routed wirelength and circuit speed.
Citations: 4
Temporal and spatial isolation in a virtualization layer for multi-core processor based information appliances
Pub Date : 2011-01-25 DOI: 10.1109/ASPDAC.2011.5722268
T. Nakajima, Y. Kinebuchi, H. Shimada, Alexandre Courbot, Tsung-Han Lin
A virtualization layer makes it possible to compose multiple functionalities on a multi-core processor with minimal modifications to OS kernels and applications. A multi-core processor is a good candidate for consolidating various software, independently developed for dedicated processors, onto one processor to reduce both hardware and development costs. In this paper, we present SPUMONE, a virtualization layer suitable for developing multi-core-processor-based information appliances.
Citations: 16
A 4.32 mm² 170mW LDPC decoder in 0.13μm CMOS for WiMax/Wi-Fi applications
Pub Date : 2011-01-25 DOI: 10.1109/ASPDAC.2011.5722293
Dan Bao, Chuan Wu, Yan Ying, Yun Chen, Xiaoyang Zeng
An energy-efficient programmable LDPC decoder is proposed for WiMax and Wi-Fi applications. The proposed decoder is designed with overlapped processing units, a flexible message-passing network, and medium-grain partitioned memories to achieve flexibility, area reduction, and energy efficiency. The decoder can be programmed by a host processor with several special-purpose micro-instructions, so that various operation modes can be reconfigured. Fabricated in the SMIC 0.13μm 1P8M CMOS process, the chip occupies 4.32 mm² with a core area of 2.97 mm², and consumes 170mW with a throughput of 302Mb/s when operating at 145MHz and 1.2V.
Citations: 0
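The figures quoted for the LDPC decoder above can be turned into a few derived metrics. The sketch below is plain arithmetic on the numbers in the abstract (170mW, 302Mb/s, 145MHz, 2.97 mm² core area). The variable names are illustrative, and the abstract does not say whether the throughput counts information bits or coded bits, so the results should be read with that caveat.

```python
power_w = 170e-3         # reported power consumption
throughput_bps = 302e6   # reported throughput
clock_hz = 145e6         # reported operating frequency
core_area_mm2 = 2.97     # reported core area

# Energy spent per decoded bit, in nanojoules.
energy_per_bit_nj = power_w / throughput_bps * 1e9

# Average number of bits delivered per clock cycle.
bits_per_cycle = throughput_bps / clock_hz

# Throughput per unit core area, in Mb/s per mm^2.
throughput_density = throughput_bps / 1e6 / core_area_mm2

print(f"{energy_per_bit_nj:.3f} nJ/bit")          # 0.563 nJ/bit
print(f"{bits_per_cycle:.2f} bits/cycle")         # 2.08 bits/cycle
print(f"{throughput_density:.1f} Mb/s per mm^2")  # 101.7 Mb/s per mm^2
```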
Fast data-cache modeling for native co-simulation
Pub Date : 2011-01-25 DOI: 10.1109/ASPDAC.2011.5722227
H. Posadas, L. Diaz, E. Villar
Efficient design of large multiprocessor embedded systems requires fast, early performance modeling techniques. Native co-simulation has been proposed as a fast solution for evaluating systems in early design steps. Annotated SW execution can be performed in conjunction with a virtual model of the HW platform to generate a complete system simulation. To obtain sufficiently accurate performance estimations, the effect of all the system components, such as processor caches, must be considered. ISS-based cache models slow down the simulation, greatly reducing the efficiency of native co-simulation. To solve this problem, cache modeling techniques for fast native co-simulation have been proposed, but they consider only instruction caches. In this paper, a fast technique for data-cache modeling is presented, together with the instrumentation required for its application in native execution. The model allows the designer to obtain cache hit/miss rate estimations with a speed-up of two orders of magnitude with respect to ISS. The miss rate estimation error remains below 5% for representative examples.
Citations: 22
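For readers unfamiliar with what a cache model has to compute, the sketch below simulates a small direct-mapped data cache over an address trace and reports hits and misses. It is a generic textbook model, not the native-annotation technique of the paper above. The cache geometry (`num_lines`, `line_size`), the function name, and the address trace are assumptions chosen for illustration.

```python
def simulate_direct_mapped(addresses, num_lines=4, line_size=16):
    """Count hits and misses of a direct-mapped cache for a byte-address trace."""
    tags = [None] * num_lines        # one stored tag per cache line
    hits = misses = 0
    for addr in addresses:
        block = addr // line_size    # memory block containing this byte
        index = block % num_lines    # cache line the block maps to
        tag = block // num_lines     # identifies which block occupies the line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag        # replace the previous occupant
    return hits, misses


trace = [0, 4, 8, 64, 0, 68, 16, 20]  # hypothetical data accesses
hits, misses = simulate_direct_mapped(trace)
print(hits, misses)                    # 3 5
print(f"miss rate: {misses / len(trace):.3f}")  # miss rate: 0.625
```

In this trace, addresses 0 and 64 map to the same cache line and keep evicting each other. Conflict misses like these are one of the effects a data-cache model has to capture to estimate miss rates accurately.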