Latest papers from the 2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools

System Level Hardening by Computing with Matrices
R. Ferreira, Álvaro Freitas Moreira, L. Carro
Continuous advances in transistor manufacturing have enabled technology scaling over the years, sustaining Moore's law. As transistor sizes rapidly shrink and voltage scales, the amount of charge in a node also decreases rapidly. A particle hitting the core will probably cause a transient fault that spans several clock cycles. In this scenario, embedded systems using state-of-the-art technologies will face the challenge of operating in an environment susceptible to multiple errors, but with restricted resources available to deploy fault tolerance, as these techniques severely increase power consumption. One possible solution to this problem is the adoption of software-based fault tolerance at the system level, aiming to ensure reliability with low energy dissipation. In this paper, we propose detecting and correcting errors on generic data structures at the system level by using matrices to encode any program and algorithm. With such encoding, it is possible to employ established techniques for detecting and correcting errors in matrices, with negligible overhead in power and energy. We evaluated this proposal using two case studies significant for the embedded system domain. Using the proposed approach, we observed in some cases an overhead of only 5% in performance and 8% in program size.
DOI: https://doi.org/10.1109/DSD.2010.8 (published 2010-09-01)
Citations: 3
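The abstract of "System Level Hardening by Computing with Matrices" refers to established techniques for detecting and correcting errors in matrices without naming one. A minimal sketch of one such well-known technique, checksum-encoded matrix multiplication, is given here as an assumption about the kind of method meant, not as the authors' implementation. All function names are hypothetical, and integer data is assumed so that checksums can be compared exactly (floating-point data would need a tolerance).

```python
def with_column_checksum(a):
    """Append a row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append to each row of b the sum of that row."""
    return [row + [sum(row)] for row in b]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def locate_single_error(c_full):
    """Return (row, col, delta) for one corrupted data element, or None.

    c_full is a product matrix whose last row and last column are checksums.
    """
    n_rows, n_cols = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n_rows)
                if sum(c_full[i][:n_cols]) != c_full[i][n_cols]]
    bad_cols = [j for j in range(n_cols)
                if sum(c_full[i][j] for i in range(n_rows)) != c_full[n_rows][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        i, j = bad_rows[0], bad_cols[0]
        # The row checksum tells us how far the element drifted.
        delta = sum(c_full[i][:n_cols]) - c_full[i][n_cols]
        return i, j, delta
    raise ValueError("error pattern is not a single data element")
```

Multiplying a column-checksummed matrix by a row-checksummed matrix yields a product that carries both checksums, so a single corrupted element shows up as exactly one inconsistent row and one inconsistent column, and subtracting `delta` restores it.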
Dynamic Control Flow Checking Technique for Reliable Microprocessors
M. Sugihara
Reliability issues such as soft errors and NBTI (negative bias temperature instability) have become a matter of concern as integrated circuits continue to shrink. It is becoming increasingly important to take reliability requirements into account, even for consumer products. This paper presents a dynamic control flow checking (DCFC) technique for highly reliable computer systems. The DCFC technique dynamically generates reference signatures as well as runtime signatures while a program executes. The dynamic generation of reference and runtime signatures saves program or data memory space that would otherwise be needed to store the signatures. Unlike conventional static control flow checking techniques, our DCFC technique stores signatures in a signature table. Our experiments showed that our DCFC technique protected 1.4-100.0% of executed instructions, depending on the size of the signature table.
DOI: https://doi.org/10.1109/DSD.2010.81 (published 2010-09-01)
Citations: 1
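The DCFC abstract describes reference signatures that are generated at run time and kept in a signature table of limited size. The sketch below illustrates that general idea under several assumptions that do not come from the paper: the table is keyed by the start address of a basic block, a signature is a CRC-32 over the instruction words executed in the block, the first execution of a block records the reference, and blocks that do not fit in the table stay unprotected. The class and method names are hypothetical.

```python
import zlib


class SignatureTable:
    """Toy model of a bounded table of run-time generated reference signatures."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.table = {}  # block start address -> reference signature

    @staticmethod
    def signature(instr_words):
        """CRC-32 over the 32-bit instruction words of one basic block."""
        data = b"".join(w.to_bytes(4, "little") for w in instr_words)
        return zlib.crc32(data)

    def check(self, block_addr, instr_words):
        """Return 'recorded', 'ok', 'mismatch', or 'unprotected'."""
        sig = self.signature(instr_words)
        if block_addr in self.table:
            return "ok" if self.table[block_addr] == sig else "mismatch"
        if len(self.table) < self.capacity:
            # First execution: the run-time signature becomes the reference.
            self.table[block_addr] = sig
            return "recorded"
        return "unprotected"
```

In this toy model a larger `capacity` lets more blocks be checked, which is one plausible reading of why the reported coverage depends on the size of the signature table.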
Computation Reduction Techniques for Vector Median Filtering and their Hardware Implementation
Ozgur Tasdizen, Ilker Hamzaoglu
Vector Median Filters (VMFs) are used in many image and video processing applications. Recently, they have been used for Frame Rate Up-Conversion (FRC). However, they are difficult to implement in real time because of their high computational complexity. Therefore, in this paper, we propose several techniques to reduce the computational complexity of VMFs by using a data reuse methodology and by exploiting the spatial correlations in the motion vector field. In addition, we designed and implemented efficient VMF hardware, including the computation reduction techniques that exploit the spatial correlations in the motion vector field, on a low-cost Xilinx XC3S400A-5 FPGA. The FPGA implementation can work at 145 MHz and can process more than 94 high-definition frames per second.
DOI: https://doi.org/10.1109/DSD.2010.102 (published 2010-09-01)
Citations: 6
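For readers unfamiliar with the operation the vector median filtering paper accelerates, a vector median picks, from a window of vectors, the one whose summed distance to all the others is smallest. The sketch below is a plain reference version of that definition, not the reduced-complexity hardware algorithm from the paper. The L1 norm and the function name are assumptions made for illustration.

```python
def vector_median(vectors):
    """Return the vector with the smallest summed L1 distance to the rest."""

    def l1(u, v):
        return sum(abs(a - b) for a, b in zip(u, v))

    # Every vector is compared with every other one: O(N^2) distances,
    # which is the cost the paper's computation reduction techniques target.
    return min(vectors, key=lambda v: sum(l1(v, w) for w in vectors))
```

For example, among the motion vectors `(1, 1)`, `(2, 1)`, `(1, 2)`, `(9, 9)` and `(2, 2)`, the outlier `(9, 9)` is rejected and `(2, 2)` is selected. Because the distance is symmetric, each pairwise distance could be computed once and reused, which hints at why data reuse helps.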
Software Programmable Data Allocation in Multi-bank Memory of SIMD Processors
Jian Wang, Joar Sohl, Olof Kraigher, Dake Liu
The host-SIMD style heterogeneous multi-processor architecture offers high computing performance and user-friendly programmability. It exploits both task-level and data-level parallelism through multiple on-chip SIMD coprocessors. For embedded DSP applications with predictable computing behavior, this architecture can be further optimized for performance, implementation cost and power consumption. The optimization can be done by improving the SIMD processing efficiency and reducing redundant memory accesses and data shuffle operations. This paper introduces one effective approach: a software-programmable multi-bank memory system for SIMD processors. Both the hardware architecture and the software programming model are described, with an implementation example of the BLAS syrk routine. The proposed memory system offers high SIMD data access flexibility by using lookup-table-based address generators and by applying data permutations on both the DMA controller interface and SIMD data access. The evaluation results show that the SIMD processor with this memory system can achieve high execution efficiency, with only 10% to 30% overhead. The proposed memory system also saves implementation cost on SIMD local registers: in our system, each SIMD core has only eight 128-bit vector registers.
DOI: https://doi.org/10.1109/DSD.2010.26 (published 2010-09-01)
Citations: 7
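The multi-bank memory abstract mentions lookup-table-based address generators and data permutations. A small sketch can show why the placement of data across banks matters for SIMD access. It assumes four banks, a 4x4 matrix and a simple skewed placement; none of these details come from the paper, and all names are hypothetical.

```python
N_BANKS = 4


def linear_map(row, col, width=4):
    """Row-major placement: consecutive addresses rotate through the banks."""
    addr = row * width + col
    return addr % N_BANKS, addr // N_BANKS  # (bank, offset inside bank)


def skewed_map(row, col):
    """Skewed placement for a matrix as wide as the number of banks."""
    return (row + col) % N_BANKS, row


def build_lut(coords, mapping):
    """Precompute one (bank, offset) pair per SIMD lane for a vector access."""
    return [mapping(r, c) for r, c in coords]


def is_conflict_free(lut):
    """True when every lane reads from a different bank."""
    banks = [bank for bank, _ in lut]
    return len(set(banks)) == len(banks)
```

With the row-major placement, reading a matrix column sends all four lanes to the same bank, so the access has to be serialized. With the skewed placement, both rows and columns touch four distinct banks, and a precomputed table of (bank, offset) pairs lets software choose such a layout per access pattern.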
Customizable Composition and Parameterization of Hardware Design Transformations
T. Todman, Qiang Liu, W. Luk, G. Constantinides
A promising approach to high-level design is to start with an obvious but possibly inefficient design and apply multiple transformations to meet design goals. Many hardware compilation tools support a fixed recipe for applying design transformations, but designers have few options to adapt the recipe without rewriting the tools themselves. In addition, complex transformations based on linear programming and geometric programming are often not included. This paper proposes a new approach that enables designers to customize the composition and parameterization of different types of design transformations in a unified framework, using a high-level language to control a transformation engine that automates the application of design transformations. Our approach is implemented by a tool based on the Python language and the ROSE compiler framework, which supports both syntax-directed transformations such as loop coalescing, and goal-directed transformations such as geometric programming. We illustrate how customizing the composition and parameterization of design transformations can lead to designs with different trade-offs in performance, resource usage, and energy efficiency. We evaluate our approach on benchmarks including matrix multiplication, Monte Carlo simulation of Asian options, edge detection, FIR filtering, and motion estimation.
DOI: https://doi.org/10.1109/DSD.2010.78 (published 2010-09-01)
Citations: 3
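The design transformations abstract describes recipes of transformations that a designer composes and parameterizes from Python. The toy sketch below conveys the flavor of that idea. It models a loop nest as a list of hypothetical `(name, trip_count)` pairs and does not use the ROSE framework or the authors' tool; every function name is an assumption made for illustration.

```python
from functools import reduce


def coalesce(nest):
    """Merge the two outermost loops into one loop over the product range."""
    (v1, n1), (v2, n2), *rest = nest
    return [(f"{v1}_{v2}", n1 * n2), *rest]


def unroll(factor):
    """Build a transformation that unrolls the innermost loop by `factor`."""
    def apply(nest):
        *outer, (v, n) = nest
        if n % factor:
            raise ValueError("trip count must be divisible by the unroll factor")
        return [*outer, (v, n // factor)]
    return apply


def compose(*steps):
    """Chain transformations left to right into a single recipe."""
    return lambda nest: reduce(lambda acc, step: step(acc), steps, nest)


def coalesced_indices(n, m):
    """Index pairs visited by a coalesced n-by-m loop, recovered with divmod."""
    return [divmod(k, m) for k in range(n * m)]
```

A recipe such as `compose(coalesce, unroll(4))` can then be swapped for `compose(unroll(2), coalesce)` without touching the transformations themselves, and `coalesced_indices` shows that the coalesced loop visits the same `(i, j)` pairs in the same order as the original nest.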
On CMOS Memory Design in Low Supply Voltage for Integrated Biosensor Applications
Allen Chen, Ryan Hoppal, Tom Chen
Storage arrays are widely used in integrated biosensor systems to store detected signals before and after they are processed. As integrated biosensor systems often require very low power consumption to extend battery life and to maintain low cost, power consumption for storage arrays in these systems should be kept low, whereas the speed requirement is usually modest, so state-of-the-art IC technology is not usually needed. This paper presents the results of our investigation into designing low-power memory structures for sub-1 V operation with high reliability for biosensor systems. Rather than using state-of-the-art 45 nm/32 nm technology, 0.18 µm CMOS technology is used for the design to keep the overall cost down while achieving read and write performance at a 200 MHz cycle rate. The results show that the use of body back bias in systems with low supply voltage can improve the memory's static noise margin (SNM) and write performance by as much as 25%.
DOI: https://doi.org/10.1109/DSD.2010.113 (published 2010-09-01)
Citations: 0
Architecture-Level Design Space Exploration of Super Scalar Microarchitecture for Network Applications
M. Salehi, H. Dorosti, S. M. Fakhraie
Increasing diversity in packet-processing applications and rapid increases in channel bandwidth lead to greater complexity in communication protocols. These factors result in larger computational loads for packet-processing engines, which makes high-performance microprocessor design an important solution. This paper presents an exhaustive simulation that explores the performance of instruction-level parallel superscalar processors executing packet-processing applications. Based on the simulation results, design space exploration is used to derive a performance-efficient, application-specific superscalar processor architecture based on the MIPS instruction set architecture. The SimpleScalar architecture toolset is used for the design space exploration, and network applications are investigated to guide it. The optimizations achieve up to 80% improvement in performance for representative packet-processing applications.
DOI: https://doi.org/10.1109/DSD.2010.94 (published 2010-09-01)
Citations: 2
Static Average Case Power Estimation Technique for Block Ciphers
Tingcong Ye, D. Vasudevan, Jiaoyan Chen, E. Popovici, M. Schellekens
In this paper, a new static average-case dynamic power estimation technique is introduced, based on the property of randomness preservation for digital circuits. The proposed technique is validated by estimating the average-case power of a block cipher, DES, with a low estimation error of 0.9481% and reduced simulation time, with a pattern reduction of (2^n x 2^n!)-(2^n x 2^n x 2) for an n-bit design. The same technique can be extended to any block cipher, including AES and IDEA-NXT.
DOI: https://doi.org/10.1109/DSD.2010.105 (published 2010-09-01)
Citations: 4
NoC Switch with Credit Based Guaranteed Service Support Qualified for GALS Systems
T. Kranich, Mladen Berekovic
In this paper, we present a scalable wormhole switch architecture with a credit-based guaranteed service implementation. By using credits for the service guarantee, the architecture is also able to deal with mesochronous GALS systems. We extended a regular wormhole switch architecture with a control unit for service configuration at run time and modified the arbitration policy. These changes result in a marginal area overhead of approximately 4% per switch. Thus, our new architecture provides a simple way to implement service guarantees without being limited to fully synchronous systems. We synthesized our design in a 65 nm technology and achieved a clock frequency of 1 GHz. Owing to the high clock frequency, we obtain a channel throughput of more than 4 GB/s, while the total design complexity is 30k gate equivalents.
DOI: https://doi.org/10.1109/DSD.2010.30 (published 2010-09-01)
Citations: 14
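The NoC switch abstract relies on credits but does not spell out the mechanism. As background only, the sketch below models generic credit-based flow control on a single link: the sender may transmit a flit only while it holds a credit, and the receiver returns a credit whenever it frees a buffer slot. This is a textbook illustration, not the paper's guaranteed-service scheme, and the class name and buffer size are assumptions.

```python
from collections import deque


class CreditLink:
    """One sender and one receiver coupled by a credit counter."""

    def __init__(self, buffer_slots):
        self.credits = buffer_slots  # sender-side view of free receiver slots
        self.rx_buffer = deque()     # receiver-side flit buffer

    def send(self, flit):
        """Transmit a flit if a credit is available; report success."""
        if self.credits == 0:
            return False  # sender stalls, nothing can overflow the receiver
        self.credits -= 1
        self.rx_buffer.append(flit)
        return True

    def consume(self):
        """Receiver forwards the oldest flit and returns one credit."""
        flit = self.rx_buffer.popleft()
        self.credits += 1
        return flit
```

Because the sender never transmits without a credit, correctness does not depend on both sides sharing a clock phase, which is one commonly cited reason credit schemes suit links between clock domains.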
Real-Time Testing of True Random Number Generators Through Dynamic Reconfiguration
Dan Hotoleanu, O. Creţ, A. Suciu, Tamas Györfi, L. Văcariu
This paper presents the hardware implementation of the widely known NIST Statistical Test Suite, a battery of statistical tests for pseudorandom number generators (PRNGs) and true random number generators (TRNGs), in a single Xilinx FPGA chip, using dynamic partial reconfiguration. The design also offers a basic framework for easy integration of additional randomness evaluation tests. Because both the TRNG and the test suite are integrated in a single FPGA chip, our solution offers new opportunities in random number generation and testing, greatly reducing the time between the generation and the validation of the generated sequences of random bits.
DOI: https://doi.org/10.1109/DSD.2010.56 (published 2010-09-01)
Citations: 17
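To give a sense of what one test in the NIST Statistical Test Suite computes, here is a software sketch of its simplest member, the frequency (monobit) test. It is only a reference model of the statistic; the paper implements the tests in FPGA hardware, and nothing here reflects that design. The function name is hypothetical, and the 0.01 significance level mentioned afterwards is the default recommended by NIST.

```python
import math


def monobit_p_value(bits):
    """P-value of the NIST frequency (monobit) test for a sequence of 0/1 bits."""
    n = len(bits)
    # Map 1 -> +1 and 0 -> -1, then sum: a balanced sequence sums close to zero.
    s = sum(1 if b else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))
```

A sequence passes when the P-value is at least 0.01. For the ten-bit sequence 1011010101 the function returns roughly 0.527, so that sequence passes, while a run of one hundred 1 bits yields a P-value far below the threshold.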