
2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010): Latest Publications

Performance-asymmetry-aware topology virtualization for defect-tolerant NoC-based many-core processors
Pub Date : 2010-03-08 DOI: 10.1109/DATE.2010.5457060
Lei Zhang, Yue Yu, Jianbo Dong, Yinhe Han, Shangping Ren, Xiaowei Li
Topology virtualization techniques are proposed for NoC-based many-core processors with core-level redundancy to isolate hardware changes caused by on-chip defective cores. Prior work focuses on homogeneous cores with symmetric performance and optimizes on-chip communication only. However, core-to-core performance asymmetry due to manufacturing process variations poses new challenges for constructing virtual topologies. Lower-performance cores may be scattered over a virtual topology, while operating systems typically allocate tasks to contiguous cores. As a result, parallel applications are likely to be assigned to a region containing many slower cores that become bottlenecks. To tackle this problem, we present Bubble-Up, a novel performance-asymmetry-aware reconfiguration algorithm based on a new metric called the core fragmentation factor (CFF). Bubble-Up places cores with similar performance closer together while maintaining reasonable hop distances between virtual neighbors, thus accelerating applications with a higher degree of parallelism without changing the OS's existing allocation strategies. Experimental results show its effectiveness.
Citations: 11
HORUS - high-dimensional Model Order Reduction via low moment-matching upgraded sampling
Pub Date : 2010-03-08 DOI: 10.1109/DATE.2010.5457159
J. Villena, L. M. Silveira
This paper describes a Model Order Reduction algorithm for multi-dimensional parameterized systems, based on a sampling procedure which incorporates a low order moment matching paradigm into a multi-point based methodology. The procedure seeks to maximize the subspace generated by a given number of samples, selected among an initial candidate set. The selection is based on a global criterion that chooses the sample whose associated vector adds the most information to the existing subspace. However, the initial candidate set can be extremely large for high-dimensional systems, and thus the procedure can be costly. To improve efficiency we propose a scheme to incorporate information from low order moments into the basis at small extra cost, in order to extend the approximation to a wider region around the selected point. This allows the initial candidate set to be reduced without decreasing the level of confidence. We further improve the procedure by generating the global subspace from the composition of local approximations. To achieve this, the initial candidates are split into subsets that are treated as independent regions, and in a first phase the procedure is applied locally, thus improving efficiency and providing a framework for almost perfect parallelization.
Citations: 3
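The selection step described in the HORUS abstract (pick, among the remaining candidates, the sample whose vector adds the most to the existing subspace) can be illustrated with a small greedy loop. This is a minimal sketch under stated assumptions, not the authors' algorithm: the vector associated with each sample is taken as given, "adds the most information" is assumed to mean "has the largest residual norm after projection onto the current basis", and the names `residual` and `greedy_select` are hypothetical.

```python
import math


def residual(vec, basis):
    """Return (norm, remainder) of `vec` after removing its components
    along an orthonormal `basis` (modified Gram-Schmidt)."""
    r = list(vec)
    for b in basis:
        coeff = sum(x * y for x, y in zip(r, b))
        r = [x - coeff * y for x, y in zip(r, b)]
    return math.sqrt(sum(x * x for x in r)), r


def greedy_select(candidates, num_samples, tol=1e-12):
    """Greedily pick up to `num_samples` candidate indices.

    Assumption: the candidate with the largest residual norm is the one
    that adds the most information to the current subspace.
    """
    basis, chosen = [], []
    remaining = dict(enumerate(candidates))
    while remaining and len(chosen) < num_samples:
        best_idx, best_norm, best_res = None, tol, None
        for idx, vec in remaining.items():
            norm, res = residual(vec, basis)
            if norm > best_norm:
                best_idx, best_norm, best_res = idx, norm, res
        if best_idx is None:
            break  # every remaining candidate already lies in the subspace
        basis.append([x / best_norm for x in best_res])
        chosen.append(best_idx)
        del remaining[best_idx]
    return chosen, basis


# Hypothetical candidate vectors: the third component is never excited.
candidates = [[1, 0, 0], [2, 0, 0], [0, 3, 0], [1, 1, 0]]
chosen, basis = greedy_select(candidates, num_samples=3)
```

Each pass of this sketch scans every remaining candidate, so its cost grows with the size of the candidate set, which is the cost the abstract says becomes a problem for high-dimensional systems.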
Detecting/preventing information leakage on the memory bus due to malicious hardware
Pub Date : 2010-03-08 DOI: 10.1109/DATE.2010.5456930
Abhishek Das, G. Memik, Joseph Zambreno, A. Choudhary
An increasing concern amongst designers and integrators of military and defense-related systems is the underlying security of the individual microprocessor components that make up these systems. Malicious circuitry can be inserted and hidden at several stages of the design process through the use of third-party Intellectual Property (IP), design tools, and manufacturing facilities. Such hardware Trojan circuitry has been shown to be capable of shutting down the main processor after a random number of cycles, broadcasting sensitive information over the bus, and bypassing software authentication mechanisms. In this work, we propose an architecture that can prevent information leakage due to such malicious hardware. Our technique is based on guaranteeing certain behavior in the memory system, which will be checked at an external guardian core that “approves” each memory request. By sitting between off-chip memory and the main core, the guardian core can monitor bus activity and verify the compiler-defined correctness of all memory writes. Experimental results on a conventional x86 platform demonstrate that application binaries can be statically reinstrumented to coordinate with the guardian core to monitor off-chip access, resulting in less than 60% overhead for the majority of the studied benchmarks.
Citations: 38
GentleCool: Cooling aware proactive workload scheduling in multi-machine systems
Pub Date : 2010-03-08 DOI: 10.1109/DATE.2010.5457191
R. Ayoub, Shervin Sharifi, T. Simunic
In state-of-the-art systems, workload scheduling and server fan speed operate independently, leading to cooling inefficiencies. We propose GentleCool, a proactive multi-tier approach for significantly lowering fan cooling costs without compromising performance. Our technique manages the fan speed by intelligently allocating the workload across different machines. The experimental results show that our approach delivers average cooling energy savings of 72% and improves the mean time between failures (MTBF) of the fans by 2.3X compared to the state of the art.
Citations: 29
Aging-resilient design of pipelined architectures using novel detection and correction circuits
Pub Date : 2010-03-08 DOI: 10.1109/DATE.2010.5457203
H. Dadgour, K. Banerjee
Time-dependent performance degradation due to transistor aging caused by mechanisms such as Negative Bias Temperature Instability (NBTI) and Hot Carrier Injection (HCI) is one of the most important reliability concerns for deep nano-scale regime VLSI circuits. Hence, aging-resilient design methodologies are necessary to address this issue in order to improve reliability, preferably with minimal impact on the area, power and performance. This work offers two major contributions to the aging-resilient circuit design methodology literature. First, it introduces a novel sensor circuit that can detect the aging of pipeline architectures by monitoring the arrival time of data signals at flip-flops. The area overhead of the proposed circuit is estimated to be less than 45% compared to that of previous approaches, which are over 95%. To ensure the accuracy of its operation, a comprehensive timing analysis is performed on the proposed circuit including the influence of process variations. As a second contribution, this work presents an innovative correction technique to reduce the probability of timing failures caused by aging. This method employs novel reconfigurable flip-flops, which operate as normal flip-flops as long as the circuit is fresh, but function as time-borrowing flip-flops once the circuit ages. This unique flip-flop design allows utilization of the advantages of the time-borrowing technique while avoiding potential race conditions that can be created by employing such a technique. It is shown via simulations that by employing the proposed design methodology, the probability of timing failures in the aged circuits can be reduced by as much as 10X for various benchmark circuits.
Citations: 31
Construction of dual mode components for reconfiguration aware high-level synthesis
Pub Date : 2010-03-08 DOI: 10.5555/1870926.1871252
G. Economakos, S. Xydis, Ioannis Koutras, D. Soudris
High-level synthesis has recently started to gain industrial acceptance, due to the improved quality of results and the multi-objective optimizations offered. One optimization area lately addressed is reconfigurable computing, where parts of a DFG are merged and mapped into coarse grained reconfigurable components. This paper presents an alternative approach, the construction of dual mode components which are exchanged with regular components in the resulting RTL architecture. The dual mode components are constructed by exhaustive search for dual mode functional primitives inside the datapath of complicated RTL components. Such components, like multipliers and dividers, that would remain idle in certain control steps, are able to work full-time in two different modes, without any reconfiguration overhead applied to the critical path of the application. The results obtained with different DSP benchmarks show an average performance gain of 15%, without any practical datapath area increase, offering uniform and balanced resource utilization.
Citations: 7
Power efficient voltage islanding for Systems-on-chip from a floorplanning perspective
Pub Date : 2010-03-08 DOI: 10.1109/DATE.2010.5457124
P. Ghosh, Arunabha Sen
Power consumption can be significantly reduced in Systems-on-Chip (SoC) by scaling down the voltage levels of the Processing Elements (PEs). The power efficiency of this Voltage Islanding technique comes at the cost of energy and area overhead due to the level shifters between voltage islands. Moreover, from the physical design perspective it is not desirable to have an excessive number of voltage islands on the chip. Considering voltage islanding at an early phase of design, such as during floorplanning of the PEs, can address several of these issues. In this paper, we propose a new cost function for the floorplanning objective that differs from the traditional one. The new cost function not only includes the overall area requirement, but also incorporates the overall power consumption and the design constraint imposed on the maximum number of voltage islands. We propose a greedy heuristic based on this cost function for the floorplanning of the PEs with several voltage islands. Experiments on benchmark data study the effect of several parameters on the outcome of the heuristic. The results show that power consumption can be significantly reduced using our algorithm without significant area overhead. The area obtained from the heuristic is also compared with the optimal, and found to be within 4% of the optimal on average when area minimization is given priority.
Citations: 3
Optimal regulation of traffic flows in networks-on-chip
Pub Date : 2010-03-08 DOI: 10.1109/DATE.2010.5457070
Fahimeh Jafari, Zhonghai Lu, A. Jantsch, M. Moghaddam
We have proposed (σ, ρ)-based flow regulation to reduce delay and backlog bounds in SoC architectures, where σ bounds the traffic burstiness and ρ the traffic rate. The regulation is conducted per-flow for its peak rate and traffic burstiness. In this paper, we optimize these regulation parameters in networks on chips where many flows may have conflicting regulation requirements. We formulate an optimization problem for minimizing total buffers under performance constraints. We solve the problem with the interior point method. Our case study results exhibit 48% reduction of total buffers and 16% reduction of total latency for the proposed problem. The optimization solution has low run-time complexity, enabling quick exploration of large design space.
Citations: 16
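To make the (σ, ρ) terminology in the abstract above concrete, the sketch below checks whether a slotted traffic trace is (σ, ρ)-constrained and shapes a bursty trace with a token-bucket style regulator. It is an illustration only and says nothing about the paper's optimization of the parameters. The discrete-time convention (at most σ + ρ·n units in any window of n consecutive slots) and the names `conforms` and `regulate` are assumptions made here.

```python
def conforms(arrivals, sigma, rho):
    """True if every window of n consecutive slots carries at most
    sigma + rho * n units of traffic (assumed discrete-time convention)."""
    n = len(arrivals)
    for start in range(n):
        total = 0
        for end in range(start, n):
            total += arrivals[end]
            if total > sigma + rho * (end - start + 1):
                return False
    return True


def regulate(arrivals, sigma, rho):
    """Shape `arrivals` with a token bucket of depth sigma and rate rho.

    Returns the released traffic per slot and the peak backlog, i.e. the
    buffer space the regulator itself would need for this trace.
    """
    if rho <= 0:
        raise ValueError("rho must be positive so the backlog can drain")
    tokens, backlog, peak_backlog = sigma, 0, 0
    released = []
    slot = 0
    while slot < len(arrivals) or backlog > 0:
        if slot < len(arrivals):
            backlog += arrivals[slot]
        tokens += rho                       # refill at the sustained rate
        sent = min(backlog, tokens)         # release as much as tokens allow
        backlog -= sent
        tokens = min(tokens - sent, sigma)  # unused tokens are capped at sigma
        released.append(sent)
        peak_backlog = max(peak_backlog, backlog)
        slot += 1
    return released, peak_backlog


# Hypothetical bursty trace, in flits per slot.
trace = [5, 0, 0, 4, 0, 0]
shaped, peak = regulate(trace, sigma=2, rho=1)
```

In this toy trace the input violates the (2, 1) constraint, while the shaped output satisfies it at the price of buffering up to three flits inside the regulator. That trade-off between burstiness, rate, and buffer space is the kind of tension the abstract's buffer-minimization problem addresses.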
Extended Hamiltonian Pencil for passivity assessment and enforcement for S-parameter systems
Pub Date : 2010-03-08 DOI: 10.1109/DATE.2010.5456981
Zuochang Ye, L. M. Silveira, J. Phillips
An efficient algorithm based on the Extended Hamiltonian Pencil was proposed in [1] for systems with hybrid representation. Here we further extend the Extended Hamiltonian Pencil method to systems described with scattering representation, i.e. S-parameter systems. The derivation of the Extended Hamiltonian Pencil for S-parameter systems is presented. Some properties that allow passivity enforcement based on eigenvalue displacement are reported. Experimental results demonstrate the effectiveness of the proposed method.
Citations: 11
Scoped identifiers for efficient bit aligned logging
Pub Date : 2010-03-08 DOI: 10.1109/DATE.2010.5457040
Roy Shea, M. Srivastava, Young H. Cho
Detailed diagnostic data is a prerequisite for debugging problems and understanding runtime performance in distributed wireless embedded systems. Severe bandwidth limitations, tight timing constraints, and limited program text space hinder the application of standard diagnostic tools within this domain. This work introduces the Log Instrumentation Specification (LIS), which provides a high-level logging interface to developers and is able to create extremely compact diagnostic logs. LIS uses a token scoping technique to aggressively compact identifiers that are packed into bit-aligned log buffers. LIS is evaluated in the context of recording call traces within a network of wireless sensor nodes. Our evaluation shows that logs generated using LIS require less than 50% of the bandwidth utilized by alternative logging mechanisms. Through microbenchmarking of a complete LIS implementation for the TinyOS operating system, we demonstrate that LIS can comfortably fit onto low-end embedded systems. By significantly reducing log bandwidth, LIS enables extraction of a more complete picture of runtime behavior from distributed wireless embedded systems.
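The general idea of scoped identifiers packed into a bit-aligned buffer can be illustrated with a small sketch. This is not the LIS implementation: the `BitBuffer` class, the `SCOPES` table, and the helper names are hypothetical, and the sketch assumes a decoder that already knows which scope each token belongs to. It only shows why numbering tokens per scope needs fewer bits than numbering them globally.

```python
def bits_for(count):
    """Bits needed to distinguish `count` identifiers, for count >= 1."""
    return (count - 1).bit_length()


class BitBuffer:
    """Append-only buffer that packs values at bit granularity (hypothetical helper)."""

    def __init__(self):
        self.value = 0  # all bits written so far, held as one integer
        self.nbits = 0  # number of bits written

    def write(self, token, width):
        # Reject tokens that do not fit in the requested width.
        if token < 0 or token >> width != 0:
            raise ValueError("token does not fit in width")
        self.value = (self.value << width) | token
        self.nbits += width

    def to_bytes(self):
        # Pad with zero bits on the right up to a whole number of bytes.
        pad = (-self.nbits) % 8
        return (self.value << pad).to_bytes((self.nbits + pad) // 8, "big")


# Hypothetical scopes, each with its own locally numbered tokens.
SCOPES = {
    "send": ["enter", "retry", "exit"],
    "recv": ["enter", "exit"],
}


def log_scoped(buf, scope, token):
    # Identifier is the token's index within its own scope.
    names = SCOPES[scope]
    buf.write(names.index(token), bits_for(len(names)))


def log_global(buf, scope, token):
    # Identifier is the token's index across all scopes.
    flat = [(s, t) for s, names in SCOPES.items() for t in names]
    buf.write(flat.index((scope, token)), bits_for(len(flat)))


trace = [("send", "enter"), ("send", "retry"), ("send", "exit"),
         ("recv", "enter"), ("recv", "exit")]
scoped, flat = BitBuffer(), BitBuffer()
for scope, token in trace:
    log_scoped(scoped, scope, token)
    log_global(flat, scope, token)

print(scoped.nbits, flat.nbits)  # 8 15
```

In this toy trace, the scoped encoding uses 2 bits per `send` token and 1 bit per `recv` token, while the global encoding uses 3 bits for every token. A real system would also need some way to convey scope changes to the decoder, which this sketch leaves out.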
Citations: 12