
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE): Latest Papers

Software-based Pauli tracking in fault-tolerant quantum circuits
Pub Date : 2014-03-24 DOI: 10.7873/DATE.2014.137
A. Paler, S. Devitt, K. Nemoto, I. Polian
The realisation of large-scale quantum computing is no longer simply a hardware question. The rapid development of quantum technology has resulted in dozens of control and programming problems that should be directed towards the classical computer science and engineering community. One such problem is known as Pauli tracking. Methods for implementing quantum algorithms that are compatible with crucial error correction technology utilise extensive quantum teleportation protocols. These protocols are intrinsically probabilistic and result in correction operators that occur as byproducts of teleportation. These byproduct operators do not need to be corrected in the quantum hardware itself, but are tracked through the circuit and output results reinterpreted. This tracking is routinely ignored in quantum information as it is assumed that tracking algorithms will eventually be developed. In this work we help fill this gap and present an algorithm for tracking byproduct operators through a quantum computation.
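The tracking idea in this abstract can be illustrated with a generic Pauli frame. The sketch below is not the algorithm from the paper; it is a minimal textbook-style example, and the class name `PauliFrame` and all of its methods are hypothetical. It assumes the circuit uses only the Clifford gates H, S and CNOT and ignores global phase.

```python
class PauliFrame:
    """Pending byproduct Paulis, stored as one X bit and one Z bit per qubit.

    Hypothetical illustration: global phases are ignored and only
    H, S and CNOT are supported.
    """

    def __init__(self, num_qubits):
        self.x = [0] * num_qubits
        self.z = [0] * num_qubits

    def add_byproduct(self, qubit, pauli):
        """Record a byproduct operator instead of applying it in hardware."""
        if pauli in ("X", "Y"):
            self.x[qubit] ^= 1
        if pauli in ("Z", "Y"):
            self.z[qubit] ^= 1

    def h(self, qubit):
        """Hadamard exchanges X and Z."""
        self.x[qubit], self.z[qubit] = self.z[qubit], self.x[qubit]

    def s(self, qubit):
        """Phase gate maps X to Y and leaves Z unchanged."""
        self.z[qubit] ^= self.x[qubit]

    def cnot(self, control, target):
        """X spreads from control to target; Z spreads from target to control."""
        self.x[target] ^= self.x[control]
        self.z[control] ^= self.z[target]

    def interpret_z_measurement(self, qubit, raw_bit):
        """A pending X (or Y) on the qubit flips the meaning of a Z-basis result."""
        return raw_bit ^ self.x[qubit]


frame = PauliFrame(2)
frame.add_byproduct(0, "X")   # teleportation left an X byproduct on qubit 0
frame.cnot(0, 1)              # the X now also acts on qubit 1
corrected = frame.interpret_z_measurement(1, 0)
```

Here the raw measurement bit on qubit 1 is reinterpreted in software, so no physical correction gate is ever applied.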
Citations: 19
Modeling steep slope devices: From circuits to architectures
Pub Date : 2014-03-24 DOI: 10.7873/DATE.2014.149
Karthik Swaminathan, M. Kim, Nandhini Chandramoorthy, B. Sedighi, Robert Perricone, J. Sampson, N. Vijaykrishnan
Steep slope devices, Heterojunction Tunnel FETs (TFETs) in particular, have been proposed as a viable solution to overcome the subthreshold slope limitation in existing CMOS technology and achieve ultra-low voltage operation with acceptable performance. However, state-of-the-art FinFET technologies continue to outperform steep slope devices in application domains demanding peak single-threaded performance. In this context, we examine different computing paradigms where TFET technologies can be used, not just as a 'drop-in' replacement, but as an additional parameter to augment the architectural design space. This greatly widens the scope of optimizations for performance and power. We investigate the tradeoffs between devices and architectures in general-purpose processors when performance, power and temperature are individually constrained. We also synthesize examples of domain-specific accelerators used in computer vision using in-house TFET standard cell libraries to demonstrate the energy benefits of designing TFET-based accelerators. We demonstrate that synthesizing these accelerators using TFETs reduces energy by over 6X in comparison to an equivalent iso-voltage CMOS-based design and by over 30% in comparison to an iso-performance CMOS design.
Citations: 19
Co-optimization of memory BIST grouping, test scheduling, and logic placement
Pub Date : 2014-03-24 DOI: 10.7873/DATE.2014.209
A. Kahng, Ilgweon Kang
Built-in self-test (BIST) is a well-known design technique in which part of a circuit is used to test the circuit itself. BIST plays an important role for embedded memories, which do not have pins or pads exposed toward the periphery of the chip for testing with automatic test equipment. With the rapidly increasing number of embedded memories in modern SOCs (up to hundreds of memories in each hard macro of the SOC), product designers incur substantial costs in test time (subject to possible power constraints) and BIST logic physical resources (area, routing, power). However, only limited previous work addresses the physical design optimization of BIST logic; notably, Chien et al. [7] optimize BIST design with respect to test time, routing length, and area. In our work, we propose a new three-step heuristic approach to minimize test time as well as test physical layout resources, subject to given upper bounds on power consumption. A key contribution is an integer linear programming (ILP) framework that determines optimal test time for a given cluster of memories using either one or two BIST controllers, subject to test power limits and with full comprehension of available serialization and parallelization. Our heuristic approach integrates (i) generation of a hypergraph over the memories, with test time-aware weighting of hyperedges, along with top-down, FM-style min-cut partitioning; (ii) solution of an ILP that comprehends parallel and serial testing to optimize test scheduling per BIST controller; and (iii) placement of BIST logic to minimize routing and buffering costs. When evaluated on hard macros from a recent industrial 28nm networking SOC, our heuristic solutions reduce test time estimates by up to 11.57% with strictly fewer BIST controllers per hard macro, compared to the industrial solutions.
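To make the scheduling subproblem concrete, the sketch below searches a toy model exhaustively instead of solving an ILP. It is not the formulation from the paper. The model is an assumption made for illustration: each memory has a test time and a test power, memories sharing a controller are tested serially, the two controllers run in parallel, and the power limit is checked conservatively as the sum of the largest power on each controller. The function name `best_two_controller_schedule` is hypothetical.

```python
from itertools import product


def best_two_controller_schedule(memories, power_limit):
    """Return (test_time, assignment) for the best feasible split, or None.

    memories: list of (test_time, test_power) tuples (assumed toy model).
    assignment[i] is 0 or 1, the controller that tests memory i.
    """
    best = None
    for assignment in product((0, 1), repeat=len(memories)):
        total_time = [0, 0]   # serial test time per controller
        peak_power = [0, 0]   # largest single-memory power per controller
        for (time, power), controller in zip(memories, assignment):
            total_time[controller] += time
            peak_power[controller] = max(peak_power[controller], power)
        # Conservative check: the two worst-case memories may overlap in time.
        if peak_power[0] + peak_power[1] > power_limit:
            continue
        test_time = max(total_time)  # controllers run in parallel
        if best is None or test_time < best[0]:
            best = (test_time, assignment)
    return best


memories = [(4, 3), (3, 2), (2, 2), (1, 1)]
relaxed = best_two_controller_schedule(memories, power_limit=5)
tight = best_two_controller_schedule(memories, power_limit=4)
```

With these made-up numbers, tightening the power limit from 5 to 4 forces most memories onto one controller and lengthens the test. Exhaustive search grows as 2^n, which is why a real flow would hand the same constraints to an ILP solver.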
Citations: 5
Hardware/software approach for code synchronization in low-power multi-core sensor nodes
Pub Date : 2014-03-24 DOI: 10.7873/DATE.2014.181
R. Braojos, A. Dogan, I. Beretta, G. Ansaloni, David Atienza Alonso
The latest embedded bio-signal analysis applications, targeting low-power Wireless Body Sensor Nodes (WBSNs), present conflicting requirements. On the one hand, bio-signal analysis applications are continuously increasing their demand for high computing capabilities. On the other hand, long-term signal processing in WBSNs must be provided within their highly constrained energy budget. In this context, parallel processing effectively increases the power efficiency of WBSNs, but only if the execution can be properly synchronized among computing elements. To address this challenge, in this work we propose a hardware/software approach to synchronize the execution of bio-signal processing applications in multi-core WBSNs. This new approach requires few hardware resources and very few adaptations in the source code. Moreover, it provides the necessary flexibility to execute applications with an arbitrarily large degree of complexity and parallelism, enabling considerable reductions in power consumption for all multi-core WBSN execution conditions. Experimental results show that a multi-core WBSN architecture using the illustrated approach can obtain energy savings of up to 40%, with respect to an equivalent single-core architecture, when performing advanced bio-signal analysis.
Citations: 22
Coverage evaluation of post-silicon validation tests with virtual prototypes
Pub Date : 2014-03-24 DOI: 10.7873/DATE.2014.331
Kai Cong, Li Lei, Zhenkun Yang, Fei Xie
High-quality tests for post-silicon validation should be ready before a silicon device becomes available in order to save time spent on preparing, debugging and fixing tests after the device is available. Test coverage is an important metric for evaluating the quality and readiness of post-silicon tests. We propose an online-capture offline-replay approach to coverage evaluation of post-silicon validation tests with virtual prototypes for estimating silicon device test coverage. We first capture necessary data from a concrete execution of the virtual prototype within a virtual platform under a given test, and then compute the test coverage by efficiently replaying this execution offline on the virtual prototype itself. Our approach provides early feedback on the quality of post-silicon validation tests before silicon is ready. To ensure fidelity of early coverage evaluation, our approach has been further extended to support coverage evaluation and conformance checking in the post-silicon stage. We have applied our approach to evaluate a suite of common tests on virtual prototypes of five network adapters. Our approach was able to reliably estimate that this suite achieves high functional coverage on all five silicon devices.
Citations: 10
A Linux-governor based Dynamic Reliability Manager for Android mobile devices
Pub Date : 2014-03-24 DOI: 10.7873/DATE.2014.117
Pietro Mercati, Andrea Bartolini, Francesco Paterna, T. Simunic, L. Benini
Reliability is a major concern in multiprocessors. Dynamic Reliability Management (DRM) aims at trading off processor performance with lifetime. State-of-the-art publications study only the theory, supported by simulation. This paper presents the first complete software implementation, running on real hardware, of a low-overhead, Android-compatible, workload-aware DRM Governor for mobile multiprocessors. We discuss the design challenges and the run-time overhead involved. We show the effectiveness of our governor in guaranteeing the predefined target lifetime and show that it achieves up to 100% lifetime improvement with respect to traditional governors, while providing comparable performance for critical applications.
Citations: 34
Minimally buffered single-cycle deflection router
Pub Date : 2014-03-24 DOI: 10.7873/DATE.2014.323
Gnaneswara Rao Jonna, John Jose, Rachana Radhakrishnan, M. Mutyam
With the drift from computation-centric designs to communication-centric designs in the Chip Multi Processor (CMP) era, the interconnect fabric is gaining more importance. A NoC that is efficient in terms of power, area and average flit latency has a huge impact on the overall performance of a CMP. In the current work, we propose MinBSD - a minimally buffered, single-cycle deflection router. It incorporates different operations (Injection, Ejection, Preemption, Re-injection) in a single module to handle the traffic effectively and ensures smooth flow of flits through the router pipeline. It performs overlapped execution of independent operations. These factors not only allow MinBSD to operate in a single cycle but also reduce the critical path latency, resulting in a faster interconnect network. Experimental results show that MinBSD reduces the average flit latency on real workloads, and reduces die area and power consumption, when compared to the existing state-of-the-art minimally buffered deflection routers.
Citations: 17
Area minimization synthesis for reconfigurable single-electron transistor arrays with fabrication constraints
Yi-Hang Chen, Jian-Yu Chen, Juinn-Dar Huang
As fabrication processes exploit even deeper submicron technology, power dissipation has become a crucial issue for most electronic circuit and system designs nowadays. In particular, leakage power is becoming a dominant source of power consumption. Recently, the reconfigurable single-electron transistor (SET) array has been proposed as an emerging circuit design style for continuing Moore's Law due to its ultra-low power consumption. Several automated synthesis approaches have been developed for the reconfigurable SET array in the past few years. Nevertheless, all of those existing methods consider fabrication constraints, which are mandatory, merely in late synthesis stages. In this paper, we propose a synthesis algorithm, featuring both variable reordering and product term reordering, for area minimization. In addition, our algorithm takes those mandatory fabrication constraints into account in early stages for better outcomes. Experimental results show that our new method can achieve an area reduction of up to 24% as compared to current state-of-the-art techniques.
Pub Date : 2014-03-24 DOI: 10.1145/2906360
Citations: 12
Connecting different worlds: Technology abstraction for reliability-aware design and test
Pub Date : 2014-03-24 DOI: 10.7873/DATE2014.265
Ulf Schlichtmann, V. Kleeberger, J. Abraham, A. Evans, C. Gimmler-Dumont, M. Glaß, A. Herkersdorf, S. Nassif, N. Wehn
The rapid shrinking of device geometries in the nanometer regime requires new technology-aware design methodologies. These must be able to evaluate the resilience of the circuit throughout all System on Chip (SoC) abstraction levels. To successfully guide design decisions at the system level, reliability models, which abstract technology information, are required to identify those parts of the system where additional protection in the form of hardware or software countermeasures is most effective. Interfaces such as the presented Resilience Articulation Point (RAP) or the Reliability Interchange Information Format (RIIF) are required to enable EDA-assisted analysis and propagation of reliability information. The models are discussed from different perspectives, such as design and test.
Citations: 7
Joint Virtual Probe: Joint exploration of multiple test items' spatial patterns for efficient silicon characterization and test prediction
Pub Date : 2014-03-24 DOI: 10.7873/DATE.2014.240
Shuang-Wang Zhang, Fan Lin, Chun-Kai Hsu, K. Cheng, Hong Wang
Virtual Probe (VP), proposed for characterizing spatial variations and reducing test time, can effectively reconstruct the spatial pattern of a test item for an entire wafer using measurements from only a small fraction of the dies on the wafer. However, VP calculates the spatial signature of each test item separately, one item at a time, resulting in very long runtimes for complex chips, which often require hundreds or even thousands of test items in production. In this paper, we propose a new method, named Joint Virtual Probe (JVP), which jointly derives the spatial patterns of multiple test items. By handling a large group of test items simultaneously, JVP significantly reduces the overall runtime. Prediction accuracy can also improve, because JVP implicitly uses inter-test-item correlations when predicting spatial patterns. Experimental results on two industrial products, with 277 and 985 parametric test items in their production test programs respectively, demonstrate that JVP achieves average speedups of ~170X and ~50X over VP in the pre-test analysis and test application phases respectively, as well as slightly higher prediction accuracy than VP.
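The abstract does not spell out the JVP algorithm, so the following is only a minimal sketch of the general idea of predicting several test items' wafer maps from a shared set of measured dies. It assumes each map is well approximated by a few low-frequency 2-D DCT components and fits them by ordinary least squares, which is a stand-in and not the authors' method. The names `dct_basis`, `design_row`, `solve`, `joint_reconstruct` and the parameter `k` are hypothetical. The sketch shows just one source of shared work: the design matrix depends only on die positions, so it is built and eliminated once for all test items. It does not model the inter-test-item correlations that the abstract credits for the accuracy gain.

```python
import math
import random


def dct_basis(n, k):
    """Return an n x k table: the first k orthonormal DCT-II basis values per position."""
    basis = []
    for x in range(n):
        row = []
        for f in range(k):
            scale = math.sqrt((1.0 if f == 0 else 2.0) / n)
            row.append(scale * math.cos(math.pi * (x + 0.5) * f / n))
        basis.append(row)
    return basis


def design_row(r, c, row_basis, col_basis):
    """Separable 2-D basis values for the die at grid position (r, c)."""
    return [u * v for u in row_basis[r] for v in col_basis[c]]


def solve(G, B):
    """Solve G X = B by Gauss-Jordan elimination with partial pivoting.

    B may have many columns (one per test item); all are eliminated together.
    """
    n = len(G)
    M = [list(G[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for i in range(n):
            if i != col and M[i][col] != 0.0:
                f = M[i][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[col])]
    return [row[n:] for row in M]


def joint_reconstruct(sampled, shape, k=3):
    """Predict full wafer maps for all test items from a subset of measured dies.

    sampled: {(r, c): [measured value per test item]}
    returns: {(r, c): [predicted value per test item]} for every die in the grid
    """
    rows, cols = shape
    rb, cb = dct_basis(rows, k), dct_basis(cols, k)
    dies = list(sampled)
    A = [design_row(r, c, rb, cb) for r, c in dies]
    p, m = k * k, len(sampled[dies[0]])
    # Normal equations: the Gram matrix G depends only on die positions,
    # so it is shared by all m test items (assumption of this sketch).
    G = [[sum(a[i] * a[j] for a in A) for j in range(p)] for i in range(p)]
    B = [[sum(a[i] * sampled[d][t] for a, d in zip(A, dies)) for t in range(m)]
         for i in range(p)]
    coef = solve(G, B)  # p x m coefficient table
    return {
        (r, c): [sum(w * coef[i][t]
                     for i, w in enumerate(design_row(r, c, rb, cb)))
                 for t in range(m)]
        for r in range(rows) for c in range(cols)
    }


if __name__ == "__main__":
    # Synthetic wafer: 4 test items that are exactly low-frequency by construction.
    random.seed(0)
    shape, k, items = (12, 12), 3, 4
    rb, cb = dct_basis(shape[0], k), dct_basis(shape[1], k)
    coef = [[random.gauss(0.0, 1.0) for _ in range(items)] for _ in range(k * k)]
    truth = {(r, c): [sum(w * coef[i][t]
                          for i, w in enumerate(design_row(r, c, rb, cb)))
                      for t in range(items)]
             for r in range(shape[0]) for c in range(shape[1])}
    measured = {d: truth[d] for d in random.sample(sorted(truth), 36)}
    predicted = joint_reconstruct(measured, shape, k)
    worst = max(abs(p - q) for d in truth for p, q in zip(predicted[d], truth[d]))
    print(f"measured 36 of {len(truth)} dies; worst absolute error {worst:.2e}")
```

In this toy setting the per-item alternative would rebuild and eliminate the same Gram matrix once per test item. Real wafer data is noisy and not exactly low-frequency, so a sparse or regularized fit would likely be needed in practice.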
Citations: 14