
2009 IEEE Computer Society Annual Symposium on VLSI: Latest Papers

All Digital Duty Cycle Correction Circuit in 90nm Based on Mutex
Pub Date : 2009-05-13 DOI: 10.1109/ISVLSI.2009.41
S. Ramasahayam, M. Srinivas
A duty cycle correction circuit (DCC) with fine resolution for high-frequency clocks is designed and tested at 1.2 V in a 90 nm CMOS process. SPICE simulations show that this duty cycle corrector can adjust the output duty cycle to 50±0.5% for a 500 MHz input clock with an input duty cycle ranging from 20% to 80%. The DCC does not introduce any delay in the forward path, which makes it suitable for multi-phase clock applications. The proposed implementation uses a high-frequency delay line and a MUTEX (Mutual Exclusion Element) based circuit to achieve high resolution.
Citations: 9
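To put the reported duty cycle accuracy in time units, here is a rough calculation. It assumes an ideal 500 MHz clock and reads the 50±0.5% figure as a bound on the high-time fraction of one period; the function name is illustrative and not from the paper.

```python
def high_time_window_ps(freq_hz, duty_pct, tol_pct):
    """Return (min, nominal, max) clock high time in picoseconds."""
    period_ps = 1e12 / freq_hz                 # one clock period in ps
    nominal = period_ps * duty_pct / 100.0     # target high time
    delta = period_ps * tol_pct / 100.0        # allowed deviation
    return nominal - delta, nominal, nominal + delta

lo, nom, hi = high_time_window_ps(500e6, 50.0, 0.5)
print(f"{lo:.0f} ps .. {hi:.0f} ps around {nom:.0f} ps")
```

Under these assumptions the corrected high time stays within about 10 ps of the 1 ns target.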
Dynamic Reconfiguration of Two-Level Caches in Soft Real-Time Embedded Systems
Pub Date : 2009-05-13 DOI: 10.1109/ISVLSI.2009.22
Weixun Wang, P. Mishra
Cache reconfiguration is a promising optimization technique for reducing memory hierarchy energy consumption with little or no impact on overall system performance. While cache reconfiguration is successful in desktop-based systems, it is not directly applicable in real-time systems due to timing constraints. Existing scheduling-aware cache reconfiguration techniques consider only a one-level cache. Dynamically tuning multi-level caches is a major challenge because the exploration space is prohibitively large. This paper efficiently integrates cache reconfiguration in soft real-time systems with a unified two-level cache hierarchy. We use a set of exploration heuristics during our static analysis, which effectively decreases the exploration time while keeping the generated profile results useful at runtime. Our experimental results demonstrate 32% to 49% energy savings with minor impact on performance.
Citations: 32
Inducing Thermal-Awareness in Multicore Systems Using Networks-on-Chip
Pub Date : 2009-05-13 DOI: 10.1109/ISVLSI.2009.25
David Atienza Alonso, E. Martinez
Technology scaling imposes an ever-increasing temperature stress on digital circuit design due to transistor density, especially in highly integrated systems such as Multi-Processor Systems-on-Chip (MPSoCs). Therefore, temperature-aware design is mandatory and should be performed at the early design stages. In this paper we present a novel hardware infrastructure for thermal control of MPSoC architectures. It exploits the NoC interconnects of the baseline system as an active component to communicate and coordinate between temperature sensors scattered around the chip, in order to globally monitor the actual temperature. A thermal management unit and clock frequency controllers then adjust the frequency and voltage of the processing elements according to the temperature requirements at run-time. We show experimental results of using the infrastructure to implement effective global temperature control policies for a real-life 4-core MPSoC, emulated on an FPGA-based emulation framework.
Citations: 13
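The control loop described in the thermal-awareness abstract (sensor readings gathered globally, then a management unit adjusting clock frequency) can be sketched as a simple threshold policy with hysteresis. This is only an illustration: the frequency levels, the two thresholds, and the function name are hypothetical, and the abstract does not state which policies the authors evaluated.

```python
# Hypothetical frequency levels (MHz) and thresholds (deg C); not from the paper.
FREQ_LEVELS_MHZ = [100, 250, 500]
T_HOT_C = 80.0
T_COOL_C = 65.0

def next_level(level, sensor_temps_c):
    """Pick the next frequency index from the hottest sensor reading."""
    hottest = max(sensor_temps_c)
    if hottest >= T_HOT_C and level > 0:
        return level - 1          # too hot: step the frequency down
    if hottest <= T_COOL_C and level < len(FREQ_LEVELS_MHZ) - 1:
        return level + 1          # cool enough: step the frequency up
    return level                  # inside the hysteresis band: hold

level = 2
for temps in ([70, 82, 75], [68, 81, 74], [60, 64, 62]):
    level = next_level(level, temps)
    print(FREQ_LEVELS_MHZ[level])
```

The gap between the two thresholds keeps the controller from oscillating when the temperature hovers near a single limit.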
High Speed Parallel Architecture for Cyclic Convolution Based on FNT
Pub Date : 2009-05-13 DOI: 10.1109/ISVLSI.2009.10
Jian Zhang, Shuguo Li
This paper presents a high-speed parallel architecture for cyclic convolution based on the Fermat Number Transform (FNT) in the diminished-1 number system. A code conversion method without addition (CCWA) and a butterfly operation method without addition (BOWA) are proposed to perform the FNT and its inverse (IFNT), except for their final stages, in the convolution. The pointwise multiplication in the convolution is performed by modulo 2^n+1 partial product multipliers (MPPM), whose output partial products are the inputs to the IFNT. Thus modulo 2^n+1 carry-propagation additions are avoided in the FNT and the IFNT, except in their final stages, and in the modulo 2^n+1 multiplier. The execution delay of the parallel architecture is clearly reduced because fewer modulo 2^n+1 carry-propagation additions are needed. Compared with the existing cyclic convolution architecture, the proposed one has better throughput and lower hardware complexity. Synthesis results using 130 nm CMOS technology demonstrate the superiority of the proposed architecture over the reported solution.
Citations: 6
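The arithmetic behind FNT-based cyclic convolution (transform, pointwise multiply, inverse transform) can be sketched in software. This is a plain-integer illustration only: it assumes the modulus 257 = 2^8 + 1, a length-8 transform, and the root 4, and it does not model the diminished-1 representation or the paper's CCWA, BOWA, and MPPM hardware. It needs Python 3.8 or later for `pow` with a negative exponent.

```python
M = 257      # Fermat number F_3 = 2**8 + 1 (assumed modulus for this sketch)
N = 8        # transform length
ALPHA = 4    # 4 has multiplicative order 8 modulo 257

def fnt(x, root):
    """Naive O(N^2) number-theoretic transform modulo M."""
    return [sum(x[i] * pow(root, i * k, M) for i in range(N)) % M
            for k in range(N)]

def cyclic_convolution(a, b):
    """Cyclic convolution via FNT, pointwise product, and inverse FNT."""
    fa, fb = fnt(a, ALPHA), fnt(b, ALPHA)
    prod = [(u * v) % M for u, v in zip(fa, fb)]
    inv_root = pow(ALPHA, -1, M)   # modular inverse of the root
    inv_n = pow(N, -1, M)          # modular inverse of the length
    return [(inv_n * y) % M for y in fnt(prod, inv_root)]

def direct(a, b):
    """Reference cyclic convolution computed straight from the definition."""
    return [sum(a[i] * b[(k - i) % N] for i in range(N)) % M
            for k in range(N)]

a = [1, 2, 3, 4, 0, 0, 0, 0]
b = [1, 1, 0, 0, 0, 0, 0, 0]
print(cyclic_convolution(a, b))
print(direct(a, b))
```

Because the root is a power of two, every twiddle multiplication reduces to a bit shift in hardware, which is the usual motivation for choosing Fermat moduli.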
Synchronization-Based Abstraction Refinement for Modular Verification of Asynchronous Designs
Pub Date : 2009-05-13 DOI: 10.1109/ISVLSI.2009.16
Hao Zheng, Haiqiong Yao, T. Yoneda
This paper presents a modular verification approach for asynchronous circuits that addresses state explosion, with a novel interface refinement method to reduce false counterexamples. This method borrows the idea of parallel composition: it iteratively refines each component in a design by examining its interface interactions and removing behavior that is not synchronized with its neighbors. The method is further enhanced by synchronizing multiple components simultaneously so that inter-dependencies among components are considered. Experiments on several large asynchronous circuits show that this method efficiently removes impossible behavior from each component, including behavior that violates correctness requirements.
Citations: 2
On-the-Fly Evaluation of FPGA-Based True Random Number Generator
Pub Date : 2009-05-13 DOI: 10.1109/ISVLSI.2009.33
R. Santoro, O. Sentieys, S. Roy
Many embedded security chips require a high-quality digital True Random Number Generator (TRNG). Recently, several new TRNGs with innovative architectures have been proposed in the literature. Moreover, some of them do not need the post-processing unit usually required in TRNG constructions. As a result, the TRNG data rate is enhanced and the produced random bits depend only on the noise source and its sampling. However, selecting a TRNG can be a delicate problem. In a hardware context (e.g. Field-Programmable Gate Array (FPGA) or Application-Specific Integrated Circuit (ASIC) implementation), the design area and power consumption are important criteria. To the best of our knowledge, no effective comparison of several TRNGs appears in the literature. This paper evaluates the randomness behavior, the area, and the power consumption of the latest TRNGs. These investigations are carried out under real conditions by implementing the TRNGs in FPGA circuits.
Citations: 32
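As an illustration of what evaluating "randomness behavior" can involve, the sketch below implements the frequency (monobit) test in the form given in NIST SP 800-22. The abstract does not say which statistical tests the authors ran, so the choice of test and the 0.01 threshold used here are assumptions.

```python
import math

def monobit_p_value(bits):
    """Frequency (monobit) test: p-value for the balance of ones and zeros."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)   # +1 per one, -1 per zero
    s_obs = abs(s) / math.sqrt(n)           # normalized imbalance
    return math.erfc(s_obs / math.sqrt(2))

balanced = [0, 1] * 50
stuck = [1] * 100
print(monobit_p_value(balanced))            # perfectly balanced sequence
print(monobit_p_value(stuck) < 0.01)        # stuck-at-one source fails
```

A balanced count is necessary but far from sufficient: the alternating sequence above passes this test while being entirely predictable, which is why test suites combine many statistics.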
Increasing the Sensitivity of On-Chip Digital Thermal Sensors with Pre-Filtering
Pub Date : 2009-05-13 DOI: 10.1109/ISVLSI.2009.31
Zhimin Chen, Raghunandan Nagesh, A. Reddy, P. Schaumont
Thermal monitoring has been broadly used to protect high-end integrated circuits from over-heating and to identify hot-spots in complex circuits. In this paper, we present a method to increase the sensitivity of an on-chip digital thermal sensor. In contrast to existing mechanisms that characterize the overall temperature profile on a die, our solution is able to detect the submerged thermal variation caused by specific predefined events (SPEs), provided that the SPE's dominant frequency does not overlap with those of other thermal events. This is made possible by pre-filtering the temperature value. A demonstrator is implemented in an ordinary FPGA, in which the SPE is a person's finger touching the FPGA package. We show that our design detects the finger-touch event correctly and reliably while ignoring larger variations that have other causes. Because the finger-touch event has no special characteristic other than its unique frequency, we conclude that our solution is also applicable to other SPEs, especially low-frequency ones. In general, our method is sensitive, reliable, and flexible.
Citations: 1
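The idea of pre-filtering a temperature reading so that only variation in a chosen frequency band remains can be illustrated with a crude band-pass filter built from two exponential moving averages. The filter structure and both smoothing constants are assumptions made for this sketch; the abstract does not specify the filter the authors used.

```python
def band_pass(samples, fast=0.5, slow=0.05):
    """Difference of two exponential moving averages (a crude band-pass)."""
    fast_avg = slow_avg = samples[0]
    out = []
    for x in samples:
        fast_avg += fast * (x - fast_avg)   # tracks quick changes
        slow_avg += slow * (x - slow_avg)   # tracks the slow baseline
        out.append(fast_avg - slow_avg)     # baseline removed
    return out

steady = band_pass([40.0] * 20)
touched = band_pass([40.0] * 10 + [41.0] * 10)
print(max(steady), max(touched))
```

A steady reading produces no output, while a one-degree step stands out clearly even though the absolute temperature barely changes.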
Mapping Data and Code into Scratchpads from Relocatable Binaries
Pub Date : 2009-05-13 DOI: 10.1109/ISVLSI.2009.28
Alexandre K. I. Mendonça, D. Volpato, José Luís Almada Güntzel, L. Santos
Scratchpad memories (SPMs) are promising for energy-efficient embedded systems. Most optimizing techniques for mapping data and code elements to SPMs assume the availability of source code. However, embedded software development has to cope with legacy code, third-party software, and IP-protected applications for which only the binaries are available. The few techniques that directly handle binaries operate on executable files and are limited to either code or data. This work proposes a new technique that addresses both data and code allocation into SPMs. Since it operates directly on binaries, the technique makes library elements eligible for SPM mapping. It consists of three main engines: a profiler, a mapper, and a patcher. The patcher was designed to operate on relocatable object binaries so as to overcome the inefficiency of bookkeeping SPM relocations on executable binaries. Compared to code-only SPM mapping, an average energy saving of 15% was obtained for a varied set of benchmark programs and memory configurations. Savings of around 47% were reached for the two programs with higher static data content. The average patching time was 0.23 s on a quad-core workstation.
Citations: 3
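Choosing which code and data elements to place in a fixed-size scratchpad is commonly modelled as a 0/1 knapsack problem, and the sketch below shows that formulation. It is not claimed to be the paper's mapper: the element names, sizes, and energy savings are invented, and the abstract does not describe the algorithm the mapper uses.

```python
def map_to_spm(elements, capacity):
    """0/1 knapsack over (name, size_bytes, energy_saving) tuples.

    Returns (total_saving, chosen_names) for an SPM of `capacity` bytes.
    """
    # best[c] holds the best (saving, names) found using at most c bytes.
    best = [(0, [])] * (capacity + 1)
    for name, size, saving in elements:
        # Walk capacities downwards so each element is used at most once.
        for c in range(capacity, size - 1, -1):
            cand = best[c - size][0] + saving
            if cand > best[c][0]:
                best[c] = (cand, best[c - size][1] + [name])
    return best[capacity]

# Hypothetical profile data: (element, size in bytes, estimated saving).
profile = [("func_a", 4, 10), ("table_b", 3, 7), ("buf_c", 2, 6)]
print(map_to_spm(profile, 5))
```

In this toy profile the two smaller elements together beat the single most profitable one, which is the kind of trade-off a mixed code-and-data mapper has to weigh.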
High-Speed Low-Current Duobinary Signaling Over Active Terminated Chip-to-Chip Interconnect
Pub Date : 2009-05-13 DOI: 10.1109/ISVLSI.2009.9
V. Pasupureddi, P. Mandal, Sunil Sachdev
In this work we propose a high-speed, low-current duobinary signaling scheme over an active terminated chip-to-chip interconnect. The active termination scheme eliminates the need for any dedicated passive terminator at both the transmitter and the receiver while avoiding signal reflection. Eliminating the passive terminator helps to reduce the transmitted signal level without affecting the signal detectability of the receiver, and also removes the thermal noise of the terminator. To implement bandwidth-efficient duobinary signaling, we present a current-mode high-speed precoder operating at 10 Gb/s. A low-current active terminated driver based on a modified Cherry-Hooper topology is proposed. At the receive end, we propose an active terminated current-mode receiver (Rx) with a regulated gate cascode (RGC) based transimpedance amplifier (TIA). Folded active inductor peaking is used to enhance the bandwidth of this TIA. We also propose a low-power broadband equalizer topology for channel equalization. The duobinary transmitter and receiver circuits are implemented in a 1.8 V, 0.18 μm digital CMOS technology with an f_T of 27 GHz. The designed high-speed duobinary Tx/Rx circuits operate at up to 8 Gb/s while transmitting data over a 29.5-inch FR4 PCB trace at a targeted bit-error rate (BER) of 10^−12. The power consumed in the transmitter and receiver circuits is 42.9 mW at 8 Gb/s.
Citations: 8
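The role of the precoder in duobinary signaling can be shown with the textbook bit-level scheme: a differential (XOR) precoder, a sum of adjacent precoded bits giving three signal levels, and a modulo-2 decision at the receiver. This sketch covers only that logic under an assumed initial precoder state of 0; it says nothing about the current-mode circuits in the paper.

```python
def precode(bits):
    """Differential precoder: p[k] = d[k] XOR p[k-1], with p[-1] = 0."""
    prev, out = 0, []
    for d in bits:
        prev ^= d
        out.append(prev)
    return out

def duobinary_encode(pre):
    """Three-level duobinary symbol: y[k] = p[k] + p[k-1]."""
    prev, out = 0, []
    for p in pre:
        out.append(p + prev)
        prev = p
    return out

def decode(levels):
    """With precoding, each data bit is simply the received level modulo 2."""
    return [y % 2 for y in levels]

data = [1, 0, 1, 1, 0]
levels = duobinary_encode(precode(data))
print(levels, decode(levels))
```

Precoding lets the receiver decide each bit from a single symbol, so one detection error does not propagate into the following bits.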
An 8-bit 1.8 V 500 MSPS CMOS Segmented Current Steering DAC
Pub Date : 2009-05-13 DOI: 10.1109/ISVLSI.2009.12
Santanu Sarkar, S. Banerjee
This paper presents the design of an 8-bit 1.8 V segmented current steering (CS) digital-to-analog converter (DAC) using a 0.18 μm double-poly, five-metal CMOS technology. The DAC is segmented as 6+2 to achieve optimum performance for minimum area. The simulation results show a maximum DNL of 0.30 LSB and an INL of 0.33 LSB. The midcode glitch is 0.27 pV·s. The simulated SNDR and SFDR of the segmented DAC are 52.13 dB and 44.83 dB respectively. The settling time of the segmented DAC is 6.02 ns. The simulated power consumption is 7.88 mW. The prototype will be used in telecommunication applications.
Citations: 28
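A 6+2 segmentation can be sketched as a code-to-switch mapping. The sketch assumes the common reading in which the 6 most significant bits drive 63 thermometer-coded unit sources, each worth 4 LSB, and the 2 least significant bits drive binary-weighted sources; the abstract does not state which bits are unary, so this split and the function names are assumptions.

```python
def segment_code(code):
    """Split an 8-bit code into 63 unary switches and 2 binary switches."""
    assert 0 <= code <= 255
    msb, lsb = code >> 2, code & 0b11
    unary = [1] * msb + [0] * (63 - msb)    # thermometer code for the 6 MSBs
    binary = [(lsb >> 1) & 1, lsb & 1]      # weights of 2 LSB and 1 LSB
    return unary, binary

def output_lsb(unary, binary):
    """Ideal output in LSB units for a given switch pattern."""
    return 4 * sum(unary) + 2 * binary[0] + binary[1]

unary, binary = segment_code(130)
print(sum(unary), binary, output_lsb(unary, binary))
```

With thermometer coding only one unit source switches per 4-LSB step, which is why heavier unary segmentation tends to reduce glitch energy and DNL at the cost of decoder area.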