
Latest publications in IET Computers and Digital Techniques

Performance analysis of dynamic CMOS circuit based on node-discharger and twist-connected transistors
IF 1.2 · CAS Q4 (Computer Science) · JCR Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2020-02-10 · DOI: 10.1049/iet-cdt.2018.5045
Dhandapani Vaithiyanathan, Ravindra Kumar, Ashima Rai, Khushboo Sharma

The relentless growth of devices such as mobile phones, digital cameras, and other portable electronic gadgets has driven increasing research into low-power digital and analogue circuits. In this study, a low power-delay-product (PDP) dynamic complementary metal oxide semiconductor (CMOS) circuit design using small-swing domino logic with twist-connected transistors is proposed. An improvement in PDP is achieved by adding a node-discharger circuit to the conventional design. The conventional benchmark and modified circuits are implemented in 90 nm CMOS technology at supply voltages of 1.2, 1, and 0.9 V. Furthermore, a decrease in the voltage level for logic '1' and an increase in the voltage level for logic '0' are achieved while keeping the logic threshold at half the supply voltage. The output voltage swing is therefore reduced, and the unnecessary nodes of the pull-down network are discharged during the pre-charge phase, improving the overall PDP over the conventional design by 43.21% and 46.83% for the inverted two-input and three-input AND-gate dynamic benchmarks, respectively, at a 1 V supply.
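The figure of merit used throughout the abstract can be illustrated with simple arithmetic: PDP is average power times propagation delay, and the reported gains are relative PDP reductions. A minimal sketch, with made-up power and delay numbers (not measurements from the paper):

```python
# Power-delay product (PDP) comparison between a conventional and a
# modified dynamic CMOS gate. The power/delay values below are
# illustrative placeholders, not results from the paper.

def pdp(power_w: float, delay_s: float) -> float:
    """PDP = average power x propagation delay (joules)."""
    return power_w * delay_s

def improvement_pct(conventional: float, modified: float) -> float:
    """Relative PDP reduction of the modified design, in percent."""
    return 100.0 * (conventional - modified) / conventional

# Hypothetical 1 V operating point for a two-input dynamic AND gate.
pdp_conv = pdp(power_w=42e-6, delay_s=95e-12)
pdp_mod = pdp(power_w=28e-6, delay_s=81e-12)

print(f"PDP improvement: {improvement_pct(pdp_conv, pdp_mod):.2f}%")
```

The same two helper functions apply to the energy-delay-product comparisons that appear elsewhere in this issue.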

IET Computers and Digital Techniques, vol. 14, no. 3, pp. 107–113.
Citations: 3
Integer linear programming model for allocation and migration of data blocks in the STT-RAM-based hybrid caches
IF 1.2 · CAS Q4 (Computer Science) · JCR Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2020-02-10 · DOI: 10.1049/iet-cdt.2019.0070
Elyas Khajekarimi, Kamal Jamshidi, Abbas Vafaei

Spin-transfer torque random access memory (STT-RAM) has emerged as a prominent choice for larger on-chip caches owing to its high density, low static power consumption, and scalability. However, the technology suffers from long latency and high energy consumption during write operations. Hybrid caches alleviate these problems by combining a write-friendly memory technology, such as static random access memory, with STT-RAM. The proper allocation of data blocks has a significant effect on both performance and energy consumption in the hybrid cache. In this study, the allocation and migration problem of data blocks in the hybrid cache is examined and then modelled using integer linear programming (ILP) formulations. The authors propose an ILP model with three alternative objective functions: minimising access latency, minimising energy, and minimising the energy-delay product of the hybrid cache. Evaluations confirm that the proposed ILP model obtains better results in terms of energy consumption and performance than the existing hybrid cache architecture.
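The allocation decision the ILP formalises can be sketched with a toy brute-force model: assign each block to SRAM or STT-RAM under a capacity constraint and minimise the energy-delay product. The per-technology costs and workload mix below are invented for illustration and are not the authors' formulation:

```python
from itertools import product

# Toy version of the hybrid-cache allocation problem: assign each cache
# block to SRAM or STT-RAM. Cost tuples and traffic counts are made up;
# they only capture the qualitative point that STT-RAM writes are costly.
COST = {  # (read_latency, write_latency, read_energy, write_energy)
    "SRAM":    (1.0, 1.0, 1.0, 1.0),
    "STT-RAM": (1.0, 5.0, 1.0, 8.0),
}
SRAM_CAPACITY = 1  # only one block fits in the small SRAM partition


def edp(assignment, blocks):
    """Energy-delay product of one assignment; blocks = [(reads, writes), ...]."""
    delay = energy = 0.0
    for tech, (reads, writes) in zip(assignment, blocks):
        rl, wl, re, we = COST[tech]
        delay += reads * rl + writes * wl
        energy += reads * re + writes * we
    return delay * energy


def best_assignment(blocks):
    """Exhaustively minimise EDP subject to the SRAM capacity constraint."""
    candidates = (a for a in product(COST, repeat=len(blocks))
                  if a.count("SRAM") <= SRAM_CAPACITY)
    return min(candidates, key=lambda a: edp(a, blocks))


blocks = [(10, 90), (80, 5)]      # block 0 write-heavy, block 1 read-heavy
print(best_assignment(blocks))    # the write-heavy block should win the SRAM slot
```

A real ILP solver replaces the exhaustive search with binary decision variables and linear constraints, which is what makes the authors' model scale to realistic cache sizes.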

IET Computers and Digital Techniques, vol. 14, no. 3, pp. 97–106.
Citations: 2
Scheme for periodical concurrent fault detection in parallel CRC circuits
IF 1.2 · CAS Q4 (Computer Science) · JCR Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2020-01-21 · DOI: 10.1049/iet-cdt.2018.5183
Jie Li, Shanshan Liu, Pedro Reviriego, Liyi Xiao, Fabrizio Lombardi

As technology scales down, circuits become more prone to faults, and fault detection is necessary to ensure system reliability. However, fault-detection circuits are themselves vulnerable to stuck-at faults caused by, for example, manufacturing defects or ageing; such a fault can produce an incorrect output from the fault-detection scheme, so concurrent fault detection is needed. Cyclic redundancy checks (CRCs) are widely used to detect errors in many applications; for example, they are used in communication to detect errors in transmitted frames. In this study, an efficient method to implement concurrent fault detection for parallel CRC computation is proposed. The scheme relies on a serial CRC computation circuit that periodically checks the results obtained from the main module to detect faults. This introduces a lower circuit overhead than existing schemes. All CRC encoders and decoders that implement the CRC computation in parallel can employ the proposed scheme to detect faults.
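A software analogue of this cross-checking idea: compute each CRC with a fast byte-wise ("parallel") table implementation, and periodically re-verify a sample with a slow bit-serial loop that mirrors a serial circuit. CRC-8 with polynomial x^8+x^2+x+1 (0x07) is used purely as an example; it is not taken from the paper:

```python
# Concurrent checking of a parallel CRC unit by a serial one, in software.
# CRC-8, polynomial 0x07, init 0, no reflection -- chosen only as a small
# illustrative code, not the paper's configuration.
POLY = 0x07

def crc8_serial(data: bytes) -> int:
    """Bit-at-a-time CRC-8, mirroring a serial LFSR-style circuit."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ POLY) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc

# One-time table: entry i is the CRC of the single byte i.
TABLE = [crc8_serial(bytes([i])) for i in range(256)]

def crc8_parallel(data: bytes) -> int:
    """Byte-wise table-driven CRC-8 (consumes 8 bits per step)."""
    crc = 0
    for byte in data:
        crc = TABLE[crc ^ byte]
    return crc

def periodic_check(data: bytes) -> bool:
    """Concurrent-fault check: the slow serial unit re-verifies the fast one."""
    return crc8_parallel(data) == crc8_serial(data)
```

In hardware the serial checker is far smaller than duplicating the parallel datapath, which is the source of the overhead advantage the abstract claims.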

IET Computers and Digital Techniques, vol. 14, no. 2, pp. 80–85.
Citations: 1
Survey on memory management techniques in heterogeneous computing systems
IF 1.2 · CAS Q4 (Computer Science) · JCR Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2020-01-21 · DOI: 10.1049/iet-cdt.2019.0092
Anakhi Hazarika, Soumyajit Poddar, Hafizur Rahaman

A major issue faced by data scientists today is how to scale up their processing infrastructure to meet the challenge of big data and high-performance computing (HPC) workloads. In today's HPC domain, multiple graphics processing units (GPUs) must be connected alongside CPUs to accomplish large-scale parallel computing. Data movement between the processor and on-chip or off-chip memory creates a major bottleneck in overall system performance. The CPU/GPU processes all data held in a computer's memory, so both the speed of data movement to and from memory and the size of that memory affect overall computer speed. During memory accesses by any processing element, the memory management unit (MMU) controls the data flow of the computer's main memory and thereby affects system performance and power. Changes in dynamic random access memory (DRAM) architecture, the integration of memory-centric hardware accelerators into heterogeneous systems, and Processing-in-Memory (PIM) are the techniques, among all available shared-resource management techniques, adopted to maximise system throughput. This survey presents an analysis of various DRAM designs and their performance. The authors also examine the architecture, functionality, and performance of different hardware accelerators and PIM systems aimed at reducing memory access time, and discuss insights and potential directions for enhancing existing techniques. The need for fast, reconfigurable, self-adaptive memory management schemes in high-speed processing scenarios motivates this survey. An effective MMU handles memory protection, cache control, and bus arbitration for the processors.
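The data-movement bottleneck the survey describes is often quantified with a roofline-style estimate: a kernel is memory-bound when its arithmetic intensity (FLOPs per byte moved) times memory bandwidth falls below peak compute. A minimal sketch; the peak-compute and bandwidth figures are hypothetical placeholders, not numbers from the survey:

```python
# Roofline-style check of whether a kernel is memory-bound. The machine
# parameters are hypothetical; substitute the target system's datasheet
# values in practice.
PEAK_FLOPS = 10e12   # 10 TFLOP/s of peak compute
PEAK_BW = 500e9      # 500 GB/s of memory bandwidth

def attainable_flops(flops: float, bytes_moved: float) -> float:
    """Attainable performance = min(peak compute, intensity * bandwidth)."""
    intensity = flops / bytes_moved  # FLOPs per byte
    return min(PEAK_FLOPS, intensity * PEAK_BW)

def is_memory_bound(flops: float, bytes_moved: float) -> bool:
    """True when data movement, not compute, limits the kernel."""
    return attainable_flops(flops, bytes_moved) < PEAK_FLOPS

# Example: a large vector add does 1 FLOP per 12 bytes moved (two reads
# and one write of 4-byte floats) -- far below the machine balance point
# of PEAK_FLOPS / PEAK_BW = 20 FLOPs per byte.
print(is_memory_bound(flops=1.0, bytes_moved=12.0))
```

This is exactly why the survey's three levers (DRAM redesign, memory-centric accelerators, PIM) all attack bytes moved rather than FLOPs executed.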

IET Computers and Digital Techniques, vol. 14, no. 2, pp. 47–60.
Citations: 11
Efficient and flexible hardware structures of the 128 bit CLEFIA block cipher
IF 1.2 · CAS Q4 (Computer Science) · JCR Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2020-01-09 · DOI: 10.1049/iet-cdt.2019.0157
Bahram Rashidi

In this study, high-throughput and flexible hardware implementations of the CLEFIA lightweight block cipher are presented. A unified processing element is designed and shared to implement the generalised Feistel network, which computes the round keys and the encryption process at two separate times. The most complex blocks in the CLEFIA algorithm are the substitution boxes (S0 and S1). The S-box is implemented with area-optimised combinational logic circuits. In the proposed S-box structure, the number of logic gates and the critical path delay are reduced by simplifying the computation terms. The S-box computation consists of three steps: a field inversion over GF(2^8) and two affine transformations. The inversion is implemented over the composite field GF((2^4)^2) instead of directly over GF(2^8), which is an important factor in reducing area consumption. In addition, the authors propose a flexible structure that can perform various configurations of CLEFIA to support variable key sizes: 128, 192, and 256 bit. Implementation results of the proposed architectures in 180 nm complementary metal-oxide-semiconductor technology are reported for the different key sizes. The results show improvements in execution time, throughput, and throughput/area compared with other related works.
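The field inversion at the heart of such an S-box can be sketched in a few lines. For familiarity this sketch uses the AES reduction polynomial x^8+x^4+x^3+x+1, not CLEFIA's own polynomial, and computes the inverse by Fermat exponentiation (a^254) rather than through the composite-field GF((2^4)^2) decomposition the paper uses for area reduction:

```python
# GF(2^8) inversion, the core operation of an inversion-based S-box.
# ASSUMPTION: the reduction polynomial below is the AES one, used only
# as a well-known example -- it is NOT CLEFIA's polynomial, and a real
# hardware design would map the inversion into GF((2^4)^2) to save area.
REDUCTION = 0x11B  # x^8 + x^4 + x^3 + x + 1

def gf_mul(a: int, b: int) -> int:
    """Carry-less multiply of two field elements, reduced mod REDUCTION."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:        # degree reached 8: fold back with the polynomial
            a ^= REDUCTION
        b >>= 1
    return result

def gf_inv(a: int) -> int:
    """Multiplicative inverse via Fermat: a^(2^8 - 2) = a^254 (0 maps to 0)."""
    if a == 0:
        return 0
    result, base, exp = 1, a, 254
    while exp:               # square-and-multiply
        if exp & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        exp >>= 1
    return result
```

In the composite-field approach, one GF(2^8) inversion is rewritten as a handful of GF(2^4) multiplications plus a single GF(2^4) inversion (a 16-entry table), which is where the area saving comes from.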

IET Computers and Digital Techniques, vol. 14, no. 2, pp. 69–79.
Citations: 7
Mapping application-specific topology to mesh topology with reconfigurable switches
IF 1.2 · CAS Q4 (Computer Science) · JCR Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2020-01-01 · DOI: 10.1049/iet-cdt.2018.5202
Pinar Kullu, Yilmaz Ar, Suleyman Tosun, Suat Ozdemir
When designing a Network-on-Chip (NoC) architecture, designers must consider various criteria such as bandwidth, performance, energy consumption, cost, re-usability, and fault tolerance. In most design efforts, it is very difficult to meet all these interacting constraints and objectives at the same time. Some of these parameters can be optimised and met easily by regular NoC topologies, thanks to their re-usability and fault-tolerance capabilities. Other parameters, such as energy consumption, performance, and chip area, are better optimised by irregular NoC topologies. In this work, the authors present a novel two-step method that combines the advantages of regular and irregular NoC topologies. In the first step, the method generates an energy- and area-optimised irregular topology for the given application using a genetic algorithm. The generated topology uses the fewest routers and links to minimise area and energy; it therefore offers only one routing path between communicating nodes and is not fault tolerant. In the second step, the method maps the generated irregular topology onto a reconfigurable mesh topology to make it fault tolerant. Detailed simulation results show the superiority of the proposed method over existing work on several multimedia benchmarks.
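The fitness that such a genetic algorithm optimises is typically a communication cost: each flow's bandwidth weighted by the hop distance between its endpoints on the mesh. A minimal sketch of that cost model under XY routing; the core names and traffic values are invented for illustration:

```python
# Communication cost of mapping application cores onto a 2D mesh NoC:
# sum over all flows of (bandwidth x Manhattan hop distance), a common
# energy-aware mapping objective. Core names and bandwidths are made up.

def hops(a, b):
    """XY-routing hop count between mesh coordinates a=(x, y), b=(x, y)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def mapping_cost(placement, flows):
    """placement: core -> (x, y); flows: (src, dst, bandwidth) triples."""
    return sum(bw * hops(placement[s], placement[d]) for s, d, bw in flows)

flows = [("cpu", "mem", 100), ("cpu", "dsp", 30), ("dsp", "mem", 70)]
good = {"cpu": (0, 0), "mem": (0, 1), "dsp": (1, 1)}  # hot pair adjacent
bad = {"cpu": (0, 0), "mem": (1, 1), "dsp": (0, 1)}   # hot pair 2 hops apart

print(mapping_cost(good, flows), mapping_cost(bad, flows))
```

A genetic algorithm would evolve the `placement` dictionary, using `mapping_cost` (plus area terms) as the fitness to minimise.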
IET Computers and Digital Techniques, vol. 14, no. 1, pp. 9–16.
Citations: 1
Power-efficient reliable register file for aggressive-environment applications
IF 1.2 · CAS Q4 (Computer Science) · JCR Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2020-01-01 · DOI: 10.1049/iet-cdt.2018.5047
Ihsen Alouani, Hamzeh Ahangari, Ozcan Ozturk, Smail Niar
In a context of increasing demand for on-board data processing, ensuring reliability under a reduced power budget is a serious design challenge for embedded-system manufacturers. In particular, embedded processors in aggressive environments must be designed with error hardening as a primary goal, not an afterthought. As the Register File (RF) is a critical element of the processor pipeline, enhancing RF reliability is mandatory when designing fault-immune computing systems. This study proposes reliability-enhancement techniques for integer and floating-point RFs. Specifically, the authors propose the Adjacent Register Hardened RF, a new RF architecture that exploits adjacent byte-level narrow-width values to harden integer registers at runtime. Registers are paired by special switches referred to as joiners, and the non-utilised bits of each register are exploited to enhance the reliability of its counterpart register. Moreover, the authors suggest sacrificing the least significant bits of the mantissa to enhance the reliability of the critical floating-point bits, namely the exponent and sign bits. The results show that, with a low power budget compared to state-of-the-art techniques, the approach achieves better results under both normal and highly aggressive operating conditions.
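The narrow-width idea can be sketched in software: when a register value fits in its low half, the unused upper bits can hold a duplicate, and a later mismatch between the halves flags a bit flip. This toy model hardens a single 32-bit register; the paper's actual scheme pairs adjacent registers at byte granularity through joiner switches:

```python
# Toy model of narrow-width value hardening in a 32-bit register: if the
# value fits in the low 16 bits, replicate it into the unused upper 16
# bits; a mismatch between the halves later reveals a bit upset.
# ASSUMPTION: simplified single-register model, not the authors' paired
# adjacent-register design.

def harden(value: int) -> int:
    """Store a narrow (16-bit) value together with its copy in the upper half."""
    assert 0 <= value < (1 << 16), "value must fit in 16 bits"
    return (value << 16) | value

def check(register: int):
    """Return (ok, value): ok is False when the two halves disagree."""
    low, high = register & 0xFFFF, register >> 16
    return low == high, low

reg = harden(0x1234)
ok, _ = check(reg)                # intact register: halves agree
corrupted = reg ^ (1 << 5)        # inject a single-bit upset in the low half
bad_ok, _ = check(corrupted)      # the mismatch is detected
print(ok, bad_ok)
```

The same duplication principle, applied to the exponent and sign of a float at the expense of low mantissa bits, underlies the floating-point variant described above.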
IET Computers and Digital Techniques, vol. 14, no. 1, pp. 1–8.
Citations: 0
Multi-objective constraint and hybrid optimisation-based VM migration in a community cloud
IF 1.2 · CAS Q4 (Computer Science) · JCR Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Pub Date: 2020-01-01 · DOI: 10.1049/iet-cdt.2018.5243
Pradeepa Parthiban, Pushpalakshmi Raman
The growing demand in the community-cloud market for attracting and retaining both new and existing cloud users must be addressed actively in a competitive environment, and there is considerable scope for improving provider capabilities in the cloud so as to offer users attractive benefits. This study introduces an effective virtual machine (VM) migration strategy based on an optimisation algorithm, designed to facilitate users' selection of providers according to the budgetary requirements of running their own platforms. The constraints associated with provider selection include cost, revenue, and resource, which together form the selection criterion. The optimisation algorithm employed for VM migration, referred to as the Taylor series-based salp swarm algorithm (Taylor-SSA), integrates the Taylor series with SSA. The method is evaluated in three setups with varying numbers of providers and users. Analysis of the cost, revenue, and resource of the proposed method shows that it achieves minimal cost together with maximal resource gain and revenue.
IET Computers and Digital Techniques, vol. 14, no. 1, pp. 37–45.
Cited: 3
LFSR generation for high test coverage and low hardware overhead
IF 1.2 | CAS Tier 4, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2020-01-01 | DOI: 10.1049/iet-cdt.2019.0042
Leonel Hernández Martínez, Saqib Khursheed, Sudhakar Mannapuram Reddy
Safety-critical technology rests on optimised and effective testing techniques for every embedded system involved in the equipment. Pattern generators (PGs) such as linear feedback shift registers (LFSRs) are used for fault detection and are useful for reliability and online test. This study presents an analysis of the LFSR, using a known automatic test PG (ATPG) test set. Two techniques are developed to target difficult-to-detect faults, each with its own trade-off analysis. Both are based on the Berlekamp-Massey (BM) algorithm, with optimisations to reduce area overhead. The first technique (concatenated) combines all test sets, generating a single polynomial that covers the complete ATPG set (baseline-C); Algorithm 1 improves on this by reducing polynomial size through assignment of the X (don't-care) bits. The second technique uses non-concatenated test sets and provides a group of LFSRs derived with BM but without any optimisation (baseline-N); this algorithm is further optimised by selecting full mapping and independent polynomial expressions. Results are generated using 32 benchmarks and 65 nm technology. The concatenated technique reduces area overhead in 90.6% of cases, with a best case of 57% and a mean of 39%. In the remaining 9.4% of cases, the non-concatenated technique provides the best reduction of 37%, with a mean of 1.4%, whilst achieving 100% test mapping in both cases.
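Both baselines build on the Berlekamp-Massey algorithm, which computes the shortest LFSR (its length and connection polynomial) that reproduces a given binary sequence. A self-contained GF(2) sketch is given below; the function name and list-based representation are illustrative, and none of the paper's area-overhead optimisations are reproduced.

```python
def berlekamp_massey(bits):
    """Return (L, c) where L is the length of the shortest LFSR generating
    `bits` and c = [c0..cL] (c0 = 1) are the GF(2) connection-polynomial
    coefficients, i.e. s[i] = c1*s[i-1] XOR ... XOR cL*s[i-L]."""
    n = len(bits)
    c = [1] + [0] * n       # current connection polynomial
    b = [1] + [0] * n       # polynomial before the last length change
    L, m = 0, -1            # current LFSR length; index of last length change
    for i in range(n):
        # discrepancy: bit predicted by the current LFSR vs. the actual bit
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            t = c[:]
            shift = i - m
            for j in range(n - shift + 1):
                c[j + shift] ^= b[j]    # c(x) += x^shift * b(x) over GF(2)
            if 2 * L <= i:              # the register must grow
                L, m, b = i + 1 - L, i, t
    return L, c[:L + 1]
```

For the 7-bit sequence 1,0,0,1,1,1,0 this returns length 3 with connection polynomial 1 + x + x^3, i.e. the recurrence s[i] = s[i-1] XOR s[i-3].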
IET Computers and Digital Techniques, vol. 14, no. 1, pp. 27-36 (2020). DOI: 10.1049/iet-cdt.2019.0042
Cited: 5
LFSR-based generation of boundary-functional broadside tests
IF 1.2 | CAS Tier 4, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2019-09-25 | DOI: 10.1049/iet-cdt.2019.0058
Irith Pomeranz

This study considers the compression of a type of close-to-functional broadside tests called boundary-functional broadside tests when the on-chip decompression logic consists of a linear-feedback shift register (LFSR). Boundary-functional broadside tests maintain functional operation conditions on a set of lines (called a boundary) in a circuit. This limits the deviations from functional operation conditions by ensuring that they do not propagate across the boundary. Functional vectors for the boundary are obtained from functional broadside tests. Seeds for the LFSR are generated directly from functional boundary vectors without generating tests or test cubes. Considering the tests that the LFSR produces, the seed generation procedure attempts to obtain the lowest possible Hamming distance between their boundary vectors and functional boundary vectors. It considers multiple LFSRs with increasing lengths to achieve test data compression. The procedure is structured to explore the trade-off between the level of test data compression and the Hamming distances or the proximity to functional operation conditions.
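To make the seed-scoring idea concrete: a seed loaded into an LFSR with fixed feedback determines the bits the on-chip decompressor produces, and the procedure scores seeds by the Hamming distance between those bits and a functional boundary vector. The sketch below uses a Fibonacci-style LFSR and a brute-force seed search purely for illustration; the paper derives seeds directly from functional boundary vectors rather than searching, and the tap positions and function names here are assumptions.

```python
from itertools import product

def lfsr_expand(seed, taps, length):
    """Fibonacci LFSR: emit the stage-0 bit, then shift in the XOR of the
    tapped stages. `seed` lists the initial stage contents, stage 0 first."""
    state = list(seed)
    out = []
    for _ in range(length):
        out.append(state[0])
        fb = 0
        for t in taps:
            fb ^= state[t]
        state = state[1:] + [fb]
    return out

def hamming(a, b):
    """Number of positions where two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def best_seed(n, taps, target):
    """Exhaustively pick the n-bit seed whose expanded output is closest
    (in Hamming distance) to the desired functional boundary vector."""
    return min(product([0, 1], repeat=n),
               key=lambda s: hamming(lfsr_expand(s, taps, len(target)), s and target or target))
```

For short registers the exhaustive search is feasible: with taps [0, 2] and 3-bit seeds, the seed (1, 0, 0) reproduces the target 1,0,0,1,1,1,0 exactly (distance 0).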

IET Computers and Digital Techniques, vol. 14, no. 2, pp. 61-68 (2019). DOI: 10.1049/iet-cdt.2019.0058
Cited: 1