2013 Design, Automation & Test in Europe Conference & Exhibition (DATE): Latest Papers

Electronic implants: Power delivery and management
Pub Date : 2013-03-18 DOI: 10.7873/DATE.2013.313
J. Olivo, S. Ghoreishizadeh, S. Carrara, G. Micheli
A power delivery system for implantable biosensors is presented. The system, embedded into a skin patch and located directly over the implantation area, is able to transfer up to 15 mW wirelessly through the body tissues by means of an inductive link. The inductive link is also used to achieve bidirectional data communication with the implanted device. Downlink communication (ASK) is performed at 100 kbps; uplink communication (LSK) is performed at 66.6 kbps. The received power is managed by an integrated system including a voltage rectifier, an amplitude demodulator and a load modulator. The power management system is presented and evaluated by means of simulations.
Pages: 1540-1545
Citations: 6
Space sensitive cache dumping for post-silicon validation
Pub Date : 2013-03-18 DOI: 10.7873/DATE.2013.113
Sandeep Chandran, S. Sarangi, P. Panda
The internal state of complex modern processors often needs to be dumped frequently during post-silicon validation. Since the last-level cache (considered to be L2 in this paper) holds most of the state, the volume of data dumped and the transfer time are dominated by the L2 cache. The limited bandwidth for transferring data off-chip, coupled with the large size of the L2 cache, stalls the processor for long durations while the cache contents are dumped off-chip. To alleviate this, we propose to transfer only those cache lines that were updated since the previous dump. Since maintaining a bit-vector with a separate bit to track the status of each individual cache line is expensive, we propose two methods: (i) a bit-vector in which each bit tracks multiple cache lines, and (ii) an Interval Table that stores only the starting and ending addresses of contiguous runs of updated cache lines. Both methods require significantly less space than a full bit-vector, and allow the designer to choose the amount of space to allocate for this design-for-debug (DFD) feature. The cost of reducing storage space is that some non-updated cache lines are dumped too; we attempt to minimize such overheads. Further, the size of the Interval Table is independent of the cache size, which makes it ideal for large caches. Through experimentation, we also determine the break-even point below which a t-lines/bit bit-vector is beneficial compared to an Interval Table.
Pages: 497-502
Citations: 0
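The trade-off between the two tracking schemes in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the set-of-line-indices input, and in particular the policy of merging the two runs separated by the smallest gap when the table is full are assumptions made for illustration only.

```python
from typing import List, Set, Tuple


def dump_with_coarse_bits(dirty: Set[int], num_lines: int, t: int) -> Tuple[int, int]:
    """Return (tracking bits needed, lines dumped) when one bit covers t lines."""
    num_bits = -(-num_lines // t)  # ceiling division
    marked_groups = {line // t for line in dirty}
    # Every line of a marked group is dumped, updated or not (the last group may be short).
    dumped = sum(min(t, num_lines - g * t) for g in marked_groups)
    return num_bits, dumped


def dump_with_interval_table(dirty: Set[int], max_entries: int) -> Tuple[List[Tuple[int, int]], int]:
    """Return (intervals, lines dumped) using at most max_entries (start, end) pairs.

    Assumes max_entries >= 1.
    """
    if not dirty:
        return [], 0
    lines = sorted(dirty)
    intervals = []
    start = prev = lines[0]
    for line in lines[1:]:
        if line != prev + 1:  # a gap closes the current run
            intervals.append([start, prev])
            start = line
        prev = line
    intervals.append([start, prev])
    # Hypothetical overflow policy: merge the two neighbouring runs with the smallest gap.
    while len(intervals) > max_entries:
        gaps = [intervals[i + 1][0] - intervals[i][1] for i in range(len(intervals) - 1)]
        i = gaps.index(min(gaps))
        intervals[i][1] = intervals[i + 1][1]
        del intervals[i + 1]
    dumped = sum(end - begin + 1 for begin, end in intervals)
    return [(begin, end) for begin, end in intervals], dumped


dirty_lines = {0, 1, 2, 3, 100, 101, 500}
print(dump_with_coarse_bits(dirty_lines, num_lines=1024, t=8))
print(dump_with_interval_table(dirty_lines, max_entries=3))
print(dump_with_interval_table(dirty_lines, max_entries=2))
```

In this toy case only 7 lines are actually updated. The coarse bit-vector needs storage proportional to the cache size, while the Interval Table's storage is fixed by `max_entries`; both over-dump once their budget is too small for the update pattern.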
Reversible logic synthesis of k-input, m-output lookup tables
Pub Date : 2013-03-18 DOI: 10.7873/DATE.2013.256
A. Shafaei, Mehdi Saeedi, Massoud Pedram
Improving circuit realization of known quantum algorithms by CAD techniques has benefits for quantum experimentalists. In this paper, we address the problem of synthesizing a given k-input, m-output lookup table (LUT) by a reversible circuit. This problem has interesting applications in the famous Shor's number-factoring algorithm and in quantum walk on sparse graphs. For LUT synthesis, our approach targets the number of control lines in multiple-control Toffoli gates to reduce synthesis cost. To achieve this, we propose a multi-level optimization technique for reversible circuits to benefit from shared cofactors. To reuse output qubits and/or zero-initialized ancillae, we un-compute intermediate cofactors. Our experiments reveal that the proposed LUT synthesis has a significant impact on reducing the size of modular exponentiation circuits for Shor's quantum factoring algorithm, oracle circuits in quantum walk on sparse graphs, and the well-known MCNC benchmarks.
Pages: 1235-1240
Citations: 14
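To make the problem statement above concrete, the following sketch builds a naive reversible circuit for a small LUT: one k-control Toffoli gate per input pattern and per output bit that must be 1, acting on zero-initialized output wires. This is only a baseline of the kind the paper aims to improve on, not its multi-level, cofactor-sharing method. The gate representation and the use of mixed-polarity controls are assumptions of this sketch.

```python
from typing import Dict, List, Tuple

# A gate is (controls, target); controls maps an input wire index to the bit value it requires.
Gate = Tuple[Dict[int, int], int]


def naive_lut_circuit(lut: List[int], k: int, m: int) -> List[Gate]:
    """Emit one k-control Toffoli per (input pattern, output bit equal to 1)."""
    gates: List[Gate] = []
    for x in range(2 ** k):
        controls = {i: (x >> i) & 1 for i in range(k)}
        for j in range(m):
            if (lut[x] >> j) & 1:
                gates.append((controls, k + j))  # output wires follow the k input wires
    return gates


def simulate(gates: List[Gate], x: int, k: int, m: int) -> int:
    """Run the circuit on input x with the m output wires initialized to zero."""
    wires = [(x >> i) & 1 for i in range(k)] + [0] * m
    for controls, target in gates:
        if all(wires[i] == v for i, v in controls.items()):
            wires[target] ^= 1  # the Toffoli flips its target when all controls match
    return sum(wires[k + j] << j for j in range(m))


lut = [0b01, 0b11, 0b00, 0b10]  # a 2-input, 2-output table
circuit = naive_lut_circuit(lut, k=2, m=2)
print(len(circuit), [simulate(circuit, x, 2, 2) for x in range(4)])
```

Every gate here uses all k inputs as controls, which is exactly the cost the abstract targets: reducing the number of control lines per multiple-control Toffoli gate.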
SCC thermal model identification via advanced bias-compensated least-squares
Pub Date : 2013-03-18 DOI: 10.7873/DATE.2013.060
R. Diversi, Andrea Bartolini, A. Tilli, Francesco Beneventi, L. Benini
Compact thermal models and modeling strategies are today a cornerstone of advanced power management, used to counteract the emerging thermal crisis in many-core systems-on-chip. System identification techniques allow models to be extracted directly from the thermal response of the target device. Unfortunately, standard Least Squares techniques cannot effectively cope with both the model approximation and the measurement noise typical of real systems. In this work, we present a novel distributed identification strategy capable of coping with real-life temperature sensor noise and effectively extracting a set of low-order predictive thermal models for the tiles of Intel's Single-chip-Cloud-Computer (SCC) many-core prototype.
Pages: 230-235
Citations: 17
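Why plain least squares struggles with sensor noise, and what a bias compensation step does, can be seen in a scalar toy problem. The sketch below is a generic textbook-style illustration, not the paper's distributed identification scheme: the single-gain model, the assumption that the noise variance `sigma ** 2` is known, and all names and parameter values are hypothetical.

```python
import random
from typing import Tuple


def estimate_gain(n: int = 20000, a: float = 0.9, sigma: float = 0.5,
                  seed: int = 1) -> Tuple[float, float]:
    """Return (ordinary LS estimate, bias-compensated estimate) of the gain a."""
    rng = random.Random(seed)
    sxy = sxx = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)              # true (unobserved) regressor
        y = a * x                            # noise-free response of the toy model
        x_meas = x + rng.gauss(0.0, sigma)   # regressor as seen through a noisy sensor
        sxy += x_meas * y
        sxx += x_meas * x_meas
    a_ls = sxy / sxx                         # ordinary least squares: biased toward zero
    a_bc = sxy / (sxx - n * sigma ** 2)      # remove the noise energy from the denominator
    return a_ls, a_bc


a_ls, a_bc = estimate_gain()
print(f"LS estimate: {a_ls:.3f}, bias-compensated: {a_bc:.3f}, true gain: 0.9")
```

Noise on the regressor inflates the denominator of the least-squares estimate, so the estimated gain shrinks toward zero no matter how much data is collected; subtracting the noise energy corrects this, provided the noise variance is known or can itself be estimated.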
Wireless interconnect for board and chip level
Pub Date : 2013-03-18 DOI: 10.7873/DATE.2013.201
G. Fettweis, N. Hassan, L. Landau, E. Fischer
Electronic systems of the future require a very high bandwidth communications infrastructure within the system. In this way, the massive amount of compute power that will be available can be interconnected to realize powerful, advanced electronic systems. Electronic interconnects between 3D chip-stacks, as well as intra-connects within 3D chip-stacks, will soon approach data rates of 100 Gbit/s. Hence, the question to be answered is how to efficiently design the communications infrastructure within electronic systems. This paper addresses approaches and results for building this infrastructure for future electronics.
Pages: 958-963
Citations: 31
Formal analysis of sporadic bursts in real-time systems
Pub Date : 2013-03-18 DOI: 10.7873/DATE.2013.163
Sophie Quinton, Mircea Negrean, R. Ernst
In this paper we propose a new method for analyzing response times in uni-processor real-time systems where task activation patterns may contain sporadic bursts. We use a burst model to calculate how often response times may exceed the worst-case response time bound obtained while ignoring bursts. This work is of particular interest for dealing with dual-cyclic frames in the analysis of CAN buses. Our approach can handle arbitrary activation patterns and both static-priority preemptive and non-preemptive scheduling policies. Experiments show the applicability and the benefits of the proposed method.
Pages: 767-772
Citations: 12
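The "worst-case response time bound obtained while ignoring bursts" mentioned above can be illustrated with the classic fixed-point iteration for static-priority preemptive scheduling of periodic tasks. The sketch below shows only that baseline bound, not the paper's burst model; the task representation and the assumption that each deadline equals the period are choices made for this example.

```python
from typing import List, Optional, Tuple


def response_time(tasks: List[Tuple[int, int]], i: int) -> Optional[int]:
    """Worst-case response time of tasks[i].

    tasks[j] = (C_j, T_j): execution time and period, sorted by decreasing priority.
    Returns None if the bound exceeds the period (deadline assumed equal to period).
    """
    c_i, t_i = tasks[i]
    r = c_i
    while r <= t_i:
        # Interference: each higher-priority task j preempts ceil(r / T_j) times.
        nxt = c_i + sum(-(-r // t_j) * c_j for c_j, t_j in tasks[:i])
        if nxt == r:
            return r  # fixed point reached
        r = nxt
    return None


task_set = [(1, 4), (2, 6), (3, 12)]
print([response_time(task_set, i) for i in range(len(task_set))])
```

A sporadic burst adds activations that this periodic bound does not account for, which is why the paper asks how often the actual response time may exceed it.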
Utilizing voltage-frequency islands in C-to-RTL synthesis for streaming applications
Pub Date : 2013-03-18 DOI: 10.7873/DATE.2013.207
Xinyu He, Shuangchen Li, Yongpan Liu, X. Hu, Huazhong Yang
Automatic C-to-RTL (C2RTL) synthesis can greatly benefit hardware design for streaming applications. However, stringent throughput/area constraints, and especially the demand for power optimization at the system level, are rather challenging for existing C2RTL synthesis tools. This paper considers a power-aware C2RTL framework using voltage-frequency islands (VFIs) to address these challenges. Given the throughput, area, and power constraints, an MILP-based approach is introduced to synthesize C code into an RTL design by simultaneously considering three design knobs (partition, parallelization, and VFI assignment) to obtain the globally optimal solution. A heuristic solution is also discussed to deal with the scalability challenge facing the MILP formulation. Experimental results based on four well-known multimedia applications demonstrate the effectiveness of both solutions.
Pages: 992-995
Citations: 0
Resource-constrained high-level datapath optimization in ASIP design
Pub Date : 2013-03-18 DOI: 10.7873/DATE.2013.054
Yuankai Chen, H. Zhou
In this work, we study the problem of optimizing the datapath under resource constraints in the high-level synthesis of an Application-Specific Instruction Processor (ASIP). We propose a two-level dynamic programming (DP) based heuristic algorithm. At the inner level of the proposed algorithm, the instructions are sorted in topological order, and then a DP algorithm is applied to optimize the topological order of the datapath. At the outer level, the space of the topological order of each instruction is explored to iteratively improve the solution. Compared with an optimal brute-force algorithm, the proposed algorithm achieves near-optimal solutions, with only 3% more performance overhead on average but a significant reduction in runtime. Compared with a greedy algorithm that replaces the DP inner level with a greedy heuristic, the proposed algorithm achieves a 48% reduction in performance overhead.
Pages: 198-201
Citations: 2
Utility-aware deferred load balancing in the cloud driven by dynamic pricing of electricity
Pub Date : 2013-03-18 DOI: 10.7873/DATE.2013.066
Muhammad Abdullah Adnan, Rajesh K. Gupta
Distributed computing resources in a cloud computing environment provide an opportunity to reduce energy use and cost by shifting loads in response to the dynamically varying availability of energy. This variation in electrical power availability is reflected in its dynamically changing price, which can be used to drive workload deferral against performance requirements. But such deferral may cause user dissatisfaction. In this paper, we quantify the impact of deferral on user satisfaction and exploit the flexibility in service level agreements (SLAs) to defer work in step with dynamic price variation. We differentiate among jobs based on their requirements for responsiveness and schedule them to save energy while meeting deadlines and preserving user satisfaction. By representing utility as decaying functions alongside workload deferral, we strike a balance between loss of user satisfaction and energy efficiency. We model delay as decaying functions, guarantee that no job violates its maximum deadline, and minimize the overall energy cost. Our simulations on MapReduce traces show that energy consumption can be reduced by ~15% with such utility-aware deferred load balancing. We also found that treating utility as a decaying function gives better cost reduction than load balancing with a fixed deadline.
Pages: 262-265
Citations: 10
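The balance described above, between cheaper electricity later and utility that decays with delay, can be sketched for a single job. This is a deliberately simplified illustration and not the paper's formulation: the exponential decay shape, the per-slot price list, and every name and parameter value are assumptions.

```python
import math
from typing import List, Tuple


def best_start_slot(prices: List[float], energy_kwh: float, max_delay: int,
                    value: float, decay: float) -> Tuple[int, float]:
    """Pick the delay d in [0, max_delay] that minimises energy cost plus lost utility.

    prices[d] is the (assumed known) electricity price in slot d; prices must be non-empty.
    """
    best_delay, best_total = 0, float("inf")
    for d in range(min(max_delay, len(prices) - 1) + 1):  # never past the hard deadline
        energy_cost = prices[d] * energy_kwh
        lost_utility = value * (1.0 - math.exp(-decay * d))  # utility decays with delay
        total = energy_cost + lost_utility
        if total < best_total:
            best_delay, best_total = d, total
    return best_delay, best_total


prices = [0.30, 0.25, 0.10, 0.28]
print(best_start_slot(prices, energy_kwh=10, max_delay=3, value=1.0, decay=0.2))
print(best_start_slot(prices, energy_kwh=10, max_delay=3, value=10.0, decay=0.2))
```

With a low-value job the cheapest slot wins even though it means waiting; raising `value` makes the same job run immediately, which is the differentiation by responsiveness that the abstract describes.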
Cache coherence enabled adaptive refresh for volatile STT-RAM
Pub Date : 2013-03-18 DOI: 10.7873/DATE.2013.258
Jianhua Li, Liang Shi, Qing'an Li, C. Xue, Yiran Chen, Yinlong Xu
Spin-Transfer Torque RAM (STT-RAM) has been studied extensively in recent years. Recent work proposed improving the write performance of STT-RAM by relaxing the retention time of the STT-RAM cell, the magnetic tunnel junction (MTJ). Unfortunately, the frequent refresh operations of volatile STT-RAM can dissipate significant extra energy. In addition, refresh operations can severely conflict with normal read/write operations and result in degraded cache performance. This paper proposes Cache Coherence Enabled Adaptive Refresh (CCear) to minimize refresh operations for volatile STT-RAM. Through novel modifications to the cache coherence protocol, CCear can effectively minimize the number of refresh operations on volatile STT-RAM. Full-system simulation results show that CCear approaches the performance of the ideal refresh policy with negligible overhead.
Pages: 1247-1250
Citations: 19