IET Computers and Digital Techniques: latest articles, page 10

Single bit-line 11T SRAM cell for low power and improved stability

IF 1.2; Zone 4 Computer Science; Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques

Pub Date : 2020-03-31 DOI: 10.1049/iet-cdt.2019.0234

Rohit Lorenzo, Roy Pailly

This study proposes a new 11T static random access memory (SRAM) cell that uses power-gating transistors and a transmission gate to achieve low leakage and a reliable write operation. The proposed cell has separate read and write paths, which improves both read and write ability. Furthermore, it solves the row half-select disturbance and uses a row-based virtual ground signal to eliminate unnecessary bit-line discharge in unselected rows, thus decreasing energy consumption. The cell also achieves low power owing to the stack effect. To show the effectiveness of the cell, its design metrics are compared with those of other published SRAM cells, namely conventional 6T, 10T, 9T, and power-gated 9T (PG9T). In standby mode, a leakage power reduction of 6.71 to 7.37% is observed for this cell at an operating voltage of 1.2 V, along with improvements of 29.21 to 58.68% in write power and 32.74 to 71.11% in read power over the other cells. The proposed cell exhibits higher write and read static noise margins, with improvements of 13.54 and 63.28%, respectively, compared with the conventional 6T SRAM cell. The cell provides a write delay improvement of 29.77 to 49.40% and a read delay improvement of 7 to 12% compared with the 9T, 10T, and PG9T cells.

Vol. 14, No. 3, pp. 114-121

Citations: 32

Scalable pseudo-exhaustive methodology for testing and diagnosis in flow-based microfluidic biochips

IF 1.2; Zone 4 Computer Science; Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques

Pub Date : 2020-03-31 DOI: 10.1049/iet-cdt.2018.5029

Gokulkrishnan Vadakkeveedu, Kamakoti Veezhinathan, Nitin Chandrachoodan, Seetal Potluri

Microfluidics is an emerging field of science that is expected to be used widely in many safety-critical applications, including healthcare, medical research and defence. Hence, technologies for fault testing and fault diagnosis of these chips are of extreme importance. In this study, the authors propose a scalable pseudo-exhaustive testing and diagnosis methodology for flow-based microfluidic biochips. The proposed approach employs a divide-and-conquer technique in which large architectures are split into smaller sub-architectures, each of which is tested and diagnosed independently.
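
The scalability argument behind a divide-and-conquer, pseudo-exhaustive approach can be illustrated with a small count of test patterns. The sketch below is not the authors' method; it assumes each valve has two states (open or closed) and that the sub-architectures can be exercised independently, and the valve counts are hypothetical.

```python
def exhaustive_patterns(num_valves: int) -> int:
    """Patterns needed to apply every open/closed combination to all valves at once."""
    return 2 ** num_valves


def pseudo_exhaustive_patterns(partition_sizes: list[int]) -> int:
    """Patterns needed when each sub-architecture is tested exhaustively on its own.

    Assumes the partitions are independent, so their pattern counts add
    instead of multiplying.
    """
    return sum(2 ** size for size in partition_sizes)


if __name__ == "__main__":
    # Hypothetical chip: 24 valves, split into three sub-architectures of 8 valves.
    print(exhaustive_patterns(24))
    print(pseudo_exhaustive_patterns([8, 8, 8]))
```

Under these assumptions the cost grows with the size of the largest sub-architecture, not with the size of the whole chip, which is what makes the partitioned approach scale.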

Citations: 1

Performance analysis of dynamic CMOS circuit based on node-discharger and twist-connected transistors

IF 1.2; Zone 4 Computer Science; Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques

Pub Date : 2020-02-10 DOI: 10.1049/iet-cdt.2018.5045

Dhandapani Vaithiyanathan, Ravindra Kumar, Ashima Rai, Khushboo Sharma

The incessant growth of devices such as mobile phones, digital cameras, and other portable electronic gadgets has led to a growing body of research dedicated to low-power digital and analogue circuits. In this study, a low power-delay-product (PDP) dynamic complementary metal oxide semiconductor (CMOS) circuit design using small-swing domino logic with twist-connected transistors is proposed. An improvement in PDP can be achieved by adding a node-discharger circuit to the conventional design. The conventional benchmark and modified circuits are implemented in 90 nm CMOS technology with different power supplies, i.e. 1.2, 1, and 0.9 V. Furthermore, the voltage level for logic ‘1’ is decreased and the voltage level for logic ‘0’ is increased, while the logic threshold is maintained at half of the supply voltage. As a result, the output voltage swing is reduced and the unnecessary nodes of the pull-down network are discharged in the pre-charge phase. Compared with the conventional design, this improves the overall PDP by 43.21 and 46.83% for two inverted two-input and three-input AND gate dynamic benchmarks, respectively, at a power supply of 1 V.

Vol. 14, No. 3, pp. 107-113

Citations: 3

Integer linear programming model for allocation and migration of data blocks in the STT-RAM-based hybrid caches

IF 1.2; Zone 4 Computer Science; Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques

Pub Date : 2020-02-10 DOI: 10.1049/iet-cdt.2019.0070

Elyas Khajekarimi, Kamal Jamshidi, Abbas Vafaei

Spin-transfer torque random access memory (STT-RAM) has emerged as an eminent choice for the larger on-chip caches due to high density, low static power consumption and scalability. However, this technology suffers from long latency and high energy consumption during a write operation. Hybrid caches alleviate these problems by incorporating a write-friendly memory technology such as static random access memory along with STT-RAM technology. The proper allocation of data blocks has a significant effect on both performance and energy consumption in the hybrid cache. In this study, the allocation and migration problem of data blocks in the hybrid cache is examined and then modelled using integer linear programming (ILP) formulations. The authors propose an ILP model with three different objective functions which include minimising access latency, minimising energy and minimising energy-delay product in the hybrid cache. Evaluations confirm that the proposed ILP model obtains better results in terms of energy consumption and performance compared to the existing hybrid cache architecture.
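
To make the allocation problem concrete, the sketch below places a handful of data blocks in either the SRAM or the STT-RAM part of a hybrid cache and picks the placement that minimises latency, energy, or energy-delay product. It is not the authors' ILP formulation: it replaces the solver with exhaustive enumeration over binary placement decisions, ignores migration, and uses hypothetical per-access costs and a hypothetical SRAM capacity limit.

```python
from itertools import product

# Hypothetical per-access costs as (latency in cycles, energy in nJ); not from the paper.
COST = {
    "SRAM": {"read": (2, 0.10), "write": (2, 0.10)},
    "STT": {"read": (2, 0.08), "write": (10, 0.60)},
}


def block_cost(tech, reads, writes):
    """Total (latency, energy) of serving one block's accesses from `tech`."""
    r_lat, r_en = COST[tech]["read"]
    w_lat, w_en = COST[tech]["write"]
    return reads * r_lat + writes * w_lat, reads * r_en + writes * w_en


def best_allocation(blocks, sram_capacity, objective):
    """Enumerate every placement and return (score, placement) with the lowest score.

    blocks        -- list of (reads, writes) access counts, one per block
    sram_capacity -- maximum number of blocks allowed in SRAM (the constraint)
    objective     -- "latency", "energy" or "edp"
    """
    best = None
    for placement in product(("SRAM", "STT"), repeat=len(blocks)):
        if placement.count("SRAM") > sram_capacity:
            continue  # violates the capacity constraint
        latency = 0
        energy = 0.0
        for tech, (reads, writes) in zip(placement, blocks):
            lat, en = block_cost(tech, reads, writes)
            latency += lat
            energy += en
        score = {"latency": latency, "energy": energy, "edp": latency * energy}[objective]
        if best is None or score < best[0]:
            best = (score, placement)
    return best


if __name__ == "__main__":
    blocks = [(100, 0), (10, 90), (50, 50)]  # read-only, write-heavy, mixed
    for objective in ("latency", "energy", "edp"):
        print(objective, best_allocation(blocks, sram_capacity=1, objective=objective))
```

With these made-up numbers the write-heavy block is the one that ends up in SRAM, which matches the intuition in the abstract that write-intensive data should avoid STT-RAM. Enumeration is only practical for a few blocks; an ILP solver is what makes larger instances tractable.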

Vol. 14, No. 3, pp. 97-106

Citations: 2

Scheme for periodical concurrent fault detection in parallel CRC circuits

IF 1.2; Zone 4 Computer Science; Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques

Pub Date : 2020-01-21 DOI: 10.1049/iet-cdt.2018.5183

Jie Li, Shanshan Liu, Pedro Reviriego, Liyi Xiao, Fabrizio Lombardi

As technology scales down, circuits become more prone to faults, and fault detection is necessary to ensure system reliability. However, fault-detection circuits are themselves vulnerable to stuck-at faults due to, for example, manufacturing defects or ageing. Because such a fault can cause an incorrect output in the fault-detection scheme, concurrent fault detection is needed. Cyclic redundancy checks (CRCs) are widely used to detect errors in many applications; for example, they are used in communication to detect errors in transmitted frames. In this study, an efficient method to implement concurrent fault detection for parallel CRC computation is proposed. The scheme uses a serial CRC computation circuit to periodically check the results obtained from the main module and so detect faults. This introduces a lower circuit overhead than existing schemes. All CRC encoders and decoders that implement the CRC computation in parallel can employ the proposed scheme to detect faults.
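
The idea of checking a fast parallel CRC against a slow serial reference can be sketched in software. The example below is an illustration under stated assumptions, not the circuit from the paper: the byte-at-a-time table lookup stands in for the parallel module, the bit-at-a-time loop stands in for the serial checker, the CRC-16 polynomial 0x1021 with initial value 0xFFFF is chosen only for illustration, and the `stuck_at_zero_mask` parameter is a hypothetical way to model a stuck-at-0 fault on the parallel output.

```python
POLY = 0x1021  # CRC-16 polynomial, chosen only for illustration


def crc16_serial(data: bytes, crc: int = 0xFFFF) -> int:
    """Reference CRC: consumes one input bit per step, MSB first."""
    for byte in data:
        for i in range(7, -1, -1):
            bit = (byte >> i) & 1
            msb = (crc >> 15) & 1
            crc = (crc << 1) & 0xFFFF
            if msb ^ bit:
                crc ^= POLY
    return crc


def _make_table():
    """Precompute the effect of eight serial steps for every byte value."""
    table = []
    for value in range(256):
        crc = value << 8
        for _ in range(8):
            crc = ((crc << 1) ^ POLY) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
        table.append(crc)
    return table


TABLE = _make_table()


def crc16_parallel(data: bytes, crc: int = 0xFFFF, stuck_at_zero_mask: int = 0) -> int:
    """Fast CRC: consumes eight bits per step; the mask forces output bits to 0."""
    for byte in data:
        crc = ((crc << 8) & 0xFFFF) ^ TABLE[((crc >> 8) ^ byte) & 0xFF]
    return crc & ~stuck_at_zero_mask & 0xFFFF


def periodic_check(frames, period, stuck_at_zero_mask=0):
    """Recompute every `period`-th frame serially; return indices that disagree."""
    mismatches = []
    for idx, frame in enumerate(frames):
        fast = crc16_parallel(frame, stuck_at_zero_mask=stuck_at_zero_mask)
        if idx % period == 0 and fast != crc16_serial(frame):
            mismatches.append(idx)
    return mismatches


if __name__ == "__main__":
    frames = [b"123456789"] * 4
    print(periodic_check(frames, period=2))                              # fault-free
    print(periodic_check(frames, period=2, stuck_at_zero_mask=0x0001))   # faulty output bit
```

Checking only every few frames is the trade-off the abstract points to: the serial reference is small but slow, so it cannot keep up with every frame, and a fault is caught at the next checked frame whose correct CRC exercises the faulty bit.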

Vol. 14, No. 2, pp. 80-85

Citations: 1

Survey on memory management techniques in heterogeneous computing systems

IF 1.2; Zone 4 Computer Science; Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques

Pub Date : 2020-01-21 DOI: 10.1049/iet-cdt.2019.0092

Anakhi Hazarika, Soumyajit Poddar, Hafizur Rahaman

A major issue faced by data scientists today is how to scale up their processing infrastructure to meet the challenge of big data and high-performance computing (HPC) workloads. In today's HPC domain, multiple graphics processing units (GPUs) must be connected alongside CPUs to accomplish large-scale parallel computing. Data movement between the processor and on-chip or off-chip memory creates a major bottleneck in overall system performance. The CPU/GPU processes all the data in a computer's memory; hence, the speed of data movement to and from memory and the size of the memory affect computer speed. During memory access by any processing element, the memory management unit (MMU) controls the data flow of the computer's main memory and impacts system performance and power. Among the available shared-resource management techniques, changes to dynamic random access memory (DRAM) architecture, integration of memory-centric hardware accelerators into the heterogeneous system, and processing-in-memory (PIM) are the ones adopted to maximise system throughput. This survey presents an analysis of various DRAM designs and their performance. The authors also focus on the architecture, functionality, and performance of different hardware accelerators and PIM systems designed to reduce memory access time. Some insights and potential directions for enhancing existing techniques are also discussed. The requirement for fast, reconfigurable, self-adaptive memory management schemes in high-speed processing scenarios motivates the authors to track the trend. An effective MMU handles memory protection, cache control and bus arbitration associated with the processors.

Vol. 14, No. 2, pp. 47-60

Citations: 11

Efficient and flexible hardware structures of the 128 bit CLEFIA block cipher

IF 1.2; Zone 4 Computer Science; Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques

Pub Date : 2020-01-09 DOI: 10.1049/iet-cdt.2019.0157

Bahram Rashidi

In this study, high-throughput and flexible hardware implementations of the CLEFIA lightweight block cipher are presented. A unified processing element is designed and shared to implement the generalised Feistel network, which computes the round keys and the encryption process at two separate times. The most complex blocks in the CLEFIA algorithm are the substitution boxes (S0 and S1). The S-box is implemented based on area-optimised combinational logic circuits. In the proposed S-box structure, the number of logic gates and the critical path delay are reduced by simplifying the computation terms. The S-box consists of three steps: a field inversion over GF(2^8) and two affine transformations over GF(2). The inversion operation is implemented over a composite field instead of over GF(2^8), which is an important factor in reducing area consumption. In addition, a flexible structure is proposed that can perform the various configurations of CLEFIA to support variable key sizes: 128, 192 and 256 bit. Implementation results of the proposed architectures in 180 nm complementary metal–oxide–semiconductor technology are reported for the different key sizes. The results show improvements in terms of execution time, throughput and throughput/area compared with other related works.
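
The three-step S-box structure (an affine map, a field inversion, a second affine map) can be sketched in software for an 8-bit S-box. This is a minimal illustration, not the paper's hardware design: it inverts directly in GF(2^8) by exponentiation instead of using a composite field, the reduction polynomial 0x11D is an assumption that should be checked against the CLEFIA specification, and the identity affine maps are placeholders, not CLEFIA's actual matrices or constants.

```python
POLY = 0x11D  # z^8 + z^4 + z^3 + z^2 + 1; assumed, verify against the CLEFIA specification


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^8) modulo POLY (shift-and-add)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return result


def gf_inv(x: int) -> int:
    """Inverse via x^254 (square-and-multiply); 0 maps to 0 by convention."""
    result, base, exp = 1, x, 254
    while exp:
        if exp & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        exp >>= 1
    return result


def affine(x: int, rows: list[int], const: int) -> int:
    """Affine map over GF(2): output bit i is parity(rows[i] & x), then XOR const."""
    out = 0
    for i, row in enumerate(rows):
        out |= (bin(row & x).count("1") & 1) << i
    return out ^ const


# Placeholder affine maps (identity matrix, zero constant); NOT the CLEFIA values.
IDENTITY = ([1 << i for i in range(8)], 0)


def three_step_sbox(x: int, f=IDENTITY, g=IDENTITY) -> int:
    """Apply affine map f, invert in GF(2^8), then apply affine map g."""
    return affine(gf_inv(affine(x, *f)), *g)


if __name__ == "__main__":
    print(hex(three_step_sbox(0x53)))
```

A composite-field implementation computes the same inverse by mapping each byte into a tower of smaller fields, where the inversion needs fewer gates; that mapping is omitted here because the abstract does not give its details.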

Vol. 14, No. 2, pp. 69-79

Citations: 7

Mapping application-specific topology to mesh topology with reconfigurable switches

IF 1.2; Zone 4 Computer Science; Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques

Pub Date : 2020-01-01 DOI: 10.1049/iet-cdt.2018.5202

Pinar Kullu, Yilmaz Ar, Suleyman Tosun, Suat Ozdemir

When designing a Network-on-Chip (NoC) architecture, designers must consider various criteria such as bandwidth, performance, energy consumption, cost, re-usability, and fault tolerance. In most design efforts, it is very difficult to meet all these interacting constraints and objectives at the same time. Some of these parameters can be optimised and met easily by regular NoC topologies owing to their re-usability and fault-tolerance capabilities. On the other hand, parameters such as energy consumption, performance, and chip area can be better optimised in irregular NoC topologies. In this work, the authors present a novel two-step method that combines the advantages of regular and irregular NoC topologies. In the first step, the method uses a genetic algorithm to generate an energy- and area-optimised irregular topology for the given application. The generated topology uses the fewest routers and links to minimise area and energy; thus, it offers only one routing path between communicating nodes and is therefore not fault tolerant. In the second step, the method maps the generated irregular topology to a reconfigurable mesh topology to make it fault tolerant. Detailed simulation results show the superiority of the proposed method over existing work on several multimedia benchmarks.

Vol. 14, No. 1, pp. 9-16

Citations: 1

Power-efficient reliable register file for aggressive-environment applications

IF 1.2; Zone 4 Computer Science; Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques

Pub Date : 2020-01-01 DOI: 10.1049/iet-cdt.2018.5047

Ihsen Alouani, Hamzeh Ahangari, Ozcan Ozturk, Smail Niar

In a context of increasing demand for on-board data processing, ensuring reliability under a reduced power budget is a serious design challenge for embedded system manufacturers. In particular, embedded processors in aggressive environments need to be designed with error hardening as a primary goal, not an afterthought. As the register file (RF) is a critical element within the processor pipeline, enhancing RF reliability is mandatory for designing fault-immune computing systems. This study proposes integer and floating-point RF reliability enhancement techniques. Specifically, the authors propose Adjacent Register Hardened RF, a new RF architecture that exploits adjacent byte-level narrow-width values to harden integer registers at runtime. Registers are paired by special switches referred to as joiners, and the unused bits of each register are exploited to enhance the reliability of its counterpart register. Moreover, they suggest sacrificing the least significant bits of the mantissa to enhance the reliability of the critical floating-point bits, namely the exponent and sign bits. The authors' results show that, with a low power budget compared with state-of-the-art techniques, they achieve better results under both normal and highly aggressive operating conditions.
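
The floating-point idea, trading low-order mantissa bits for protection of the sign and exponent, can be illustrated on a 32-bit IEEE 754 value. The sketch below is one possible encoding chosen for illustration, not the scheme from the paper: it overwrites the nine least significant mantissa bits with a copy of the sign and exponent bits, which allows a mismatch to be detected but not corrected. The function names are hypothetical.

```python
import struct

GUARD_BITS = 9                      # 1 sign bit + 8 exponent bits in float32
GUARD_MASK = (1 << GUARD_BITS) - 1  # the nine least significant mantissa bits


def float_to_bits(value: float) -> int:
    """Return the 32-bit IEEE 754 pattern of `value` (rounded to float32)."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit pattern as a float32 value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def harden(bits: int) -> int:
    """Replace the low mantissa bits with a copy of the sign and exponent bits."""
    critical = bits >> 23  # top nine bits: sign and exponent
    return (bits & ~GUARD_MASK & 0xFFFFFFFF) | critical


def check(stored: int):
    """Return (ok, value_bits); ok is False if the critical bits and their copy differ."""
    critical = stored >> 23
    copy = stored & GUARD_MASK
    return critical == copy, stored & ~GUARD_MASK & 0xFFFFFFFF


if __name__ == "__main__":
    stored = harden(float_to_bits(3.14159))
    ok, value_bits = check(stored)
    print(ok, bits_to_float(value_bits))   # small precision loss, no error flagged

    flipped = stored ^ (1 << 30)           # upset in the most significant exponent bit
    print(check(flipped)[0])               # the mismatch is detected
```

The design choice this illustrates is that an upset in the exponent or sign changes a value by orders of magnitude, whereas losing the lowest mantissa bits costs only a small relative error, so spending those bits on redundancy is cheap for error-tolerant workloads.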

IET Computers and Digital Techniques, vol. 14, no. 1, pp. 1-8, 2020. https://doi.org/10.1049/iet-cdt.2018.5047

Citations: 0

Multi-objective constraint and hybrid optimisation-based VM migration in a community cloud

IF 1.2 Zone 4 Computer Science Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques

Pub Date : 2020-01-01 DOI: 10.1049/iet-cdt.2018.5243

Pradeepa Parthiban, Pushpalakshmi Raman

The community cloud market faces growing pressure to attract new cloud users and retain existing ones in a competitive environment. There is considerable scope for improving provider capabilities in the cloud so that users are offered attractive benefits. The study introduces an effective virtual machine (VM) migration strategy that uses an optimisation algorithm to help users select providers according to their budgetary requirements for running their own platforms. The constraints associated with provider selection are cost, revenue, and resource, which are combined into a single elective factor. The optimisation algorithm employed for the VM migration is the Taylor series-based salp swarm algorithm (Taylor-SSA), which integrates the Taylor series with SSA. The method is evaluated using three setups that vary the number of providers and users. Analysis of the cost, revenue, and resource shows that the proposed method achieves minimal cost and maximal resource gain and revenue.
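
For readers unfamiliar with the salp swarm algorithm that Taylor-SSA builds on, a minimal sketch of the basic SSA loop may help. It is not the authors' Taylor-SSA: the abstract does not say how the Taylor series is integrated, so that part is omitted. The objective `toy_objective`, the `PRICE` and `PAYOFF` tables, and the reading of each decision variable as the load share placed on a provider are all hypothetical stand-ins for the paper's cost, revenue, and resource terms.

```python
import math
import random

# Hypothetical per-provider parameters (not taken from the paper).
PRICE = [0.9, 0.5, 0.7]    # cost coefficient per provider
PAYOFF = [2.0, 1.5, 1.8]   # revenue per unit of load


def toy_objective(x):
    """Scalar to minimise: cost minus revenue minus unused resource."""
    cost = sum(p * v * v for p, v in zip(PRICE, x))
    revenue = sum(r * v for r, v in zip(PAYOFF, x))
    resource = sum(1.0 - v for v in x)
    return cost - revenue - resource


def salp_swarm(objective, lb, ub, n_salps=20, n_iter=100, seed=0):
    """Basic SSA with one leader; returns (best position, best score, history)."""
    rng = random.Random(seed)
    dim = len(lb)
    salps = [[rng.uniform(lb[j], ub[j]) for j in range(dim)]
             for _ in range(n_salps)]
    scores = [objective(s) for s in salps]
    best = min(range(n_salps), key=scores.__getitem__)
    food, food_score = salps[best][:], scores[best]
    history = [food_score]
    for it in range(1, n_iter + 1):
        # c1 shrinks over time, shifting from exploration to exploitation.
        c1 = 2.0 * math.exp(-((4.0 * it / n_iter) ** 2))
        for i in range(n_salps):
            for j in range(dim):
                if i == 0:
                    # Leader moves around the food source (best so far).
                    step = c1 * ((ub[j] - lb[j]) * rng.random() + lb[j])
                    if rng.random() >= 0.5:
                        salps[i][j] = food[j] + step
                    else:
                        salps[i][j] = food[j] - step
                else:
                    # Followers move halfway towards the salp ahead of them.
                    salps[i][j] = 0.5 * (salps[i][j] + salps[i - 1][j])
                # Keep every coordinate inside its bounds.
                salps[i][j] = min(max(salps[i][j], lb[j]), ub[j])
        for s in salps:
            f = objective(s)
            if f < food_score:
                food, food_score = s[:], f
        history.append(food_score)
    return food, food_score, history
```

A call such as `salp_swarm(toy_objective, [0.0] * 3, [1.0] * 3)` returns the best allocation found, its score, and a best-so-far history that never increases. A faithful reproduction would replace the toy objective with the paper's own cost, revenue, and resource model.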

IET Computers and Digital Techniques, vol. 14, no. 1, pp. 37-45, 2020. https://doi.org/10.1049/iet-cdt.2018.5243

Citations: 3