IET Computers and Digital Techniques: latest articles, page 4

Fast approximation of the top-k items in data streams using FPGAs

IF 1.2 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques

Pub Date : 2023-02-19 DOI: 10.1049/cdt2.12053

Ali Ebrahim, Jalal Khalifat

Two methods are presented for finding the top-k items in data streams using Field Programmable Gate Arrays (FPGAs). These methods deploy two variants of a novel accelerator architecture capable of extracting an approximate list of the most frequently occurring items in a single pass over the input stream without the need for random access. The first variant of the accelerator implements the well-known Probabilistic sampling algorithm by mapping its main processing stages to a hardware architecture consisting of two custom systolic arrays. The proposed architecture retains all the properties of this algorithm, which works even if the stream size is unknown at run time. The architecture scales better than architectures based on other stream algorithms. In addition, experimental results on both synthetic and real datasets, obtained with the accelerator implemented on an Intel Arria 10 GX 1150 FPGA device, showed very good accuracy and significant throughput gains compared to existing software and hardware-accelerated solutions. The second variant of the accelerator is specifically tailored for applications requiring higher accuracy, provided that the size of the stream is known at run time. This variant takes advantage of the embedded memory resources in an FPGA to implement a sketch-based filter that precedes the main systolic array in the accelerator's pipeline. This filter enhances the accuracy of the accelerator by pre-processing the stream to remove many of the insignificant items, allowing the accelerator to process a significantly smaller filtered stream.
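
The idea of a sketch-based filter placed in front of the main counting stage can be illustrated in software. The following is a minimal sketch only, not the authors' hardware architecture: it assumes a Count-Min sketch as the filter and an exact counter as the downstream stage, and the names `CountMinSketch`, `filtered_top_k` and the `threshold` parameter are hypothetical.

```python
import hashlib
from collections import Counter


class CountMinSketch:
    """Fixed-size table of counters indexed by several hash functions."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One independent-looking hash per row, derived by salting with the row number.
        digest = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item):
        """Increment the item's counters and return its estimated count."""
        estimate = None
        for row in range(self.depth):
            i = self._index(row, item)
            self.rows[row][i] += 1
            value = self.rows[row][i]
            estimate = value if estimate is None else min(estimate, value)
        return estimate


def filtered_top_k(stream, k, threshold):
    """Single pass: forward an item to the exact counters only once its
    sketch estimate has reached `threshold` (a hypothetical tuning knob)."""
    sketch = CountMinSketch()
    survivors = Counter()
    for item in stream:
        if sketch.add(item) >= threshold:
            survivors[item] += 1
    return [item for item, _ in survivors.most_common(k)]


# Three frequent items followed by 100 items that each occur once.
stream = ["a"] * 50 + ["b"] * 30 + ["c"] * 20 + [f"noise{i}" for i in range(100)]
print(filtered_top_k(stream, k=3, threshold=5))
```

In this toy stream the one-off items rarely pass the filter, so the exact counters only have to track a much smaller set of candidates, which is the effect the abstract attributes to the filter stage.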

Volume 17, Issue 2, pp. 60-73

Citations: 1

Phone-nomenon 2.0: A compact thermal model for smartphones

Pub Date : 2023-01-08 DOI: 10.1049/cdt2.12052

Yu-Min Lee, Hong-Wen Chiou, Shinyu Shiau, Chi-Wen Pan, Shih-Hung Ting

This paper presents Phone-nomenon 2.0, a compact thermal model for predicting the thermal behavior of smartphones. First, the non-linearities of the internal and external heat transfer mechanisms of smartphones are studied, and a compact thermal model for these non-linearities is proposed. Then, an iterative simulation procedure is developed to handle these non-linearities, establishing the basic simulation framework, which is one option in Phone-nomenon 2.0 and is called Phone-nomenon.Iter. Finally, a linearisation approach and model order reduction techniques are applied to enhance and speed up the basic framework; these two options are named Phone-nomenon.Lin and Phone-nomenon.LinMOR. Compared with the commercial tool ANSYS Icepak, Phone-nomenon.Iter achieves a speedup of two orders of magnitude with a maximum error below 1.90% for steady-state simulations, and a speedup of three orders of magnitude with a temperature difference below 0.65°C for transient simulations. In addition, the speedup of Phone-nomenon.Lin over Phone-nomenon.Iter is at least 4.22× and 3.26× for steady-state and transient simulations, respectively. Moreover, the speedup of Phone-nomenon.LinMOR over Phone-nomenon.Lin is at least 2.57×.
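
To see why a non-linear heat transfer mechanism calls for an iterative procedure, consider a single thermal node cooled by natural convection whose coefficient depends on the temperature rise itself. The example below is an illustration under stated assumptions, not the Phone-nomenon model: the correlation h = coeff · ΔT^0.25, the function name `solve_steady_state` and all numeric values are hypothetical.

```python
def solve_steady_state(power_w, area_m2, t_ambient_c, coeff=2.0, tol=1e-9, max_iter=200):
    """Fixed-point iteration for the balance P = h(dT) * A * dT,
    where the coefficient h(dT) = coeff * dT**0.25 depends on the unknown dT."""
    delta_t = 10.0  # initial guess for the temperature rise in kelvin
    for iteration in range(1, max_iter + 1):
        h = coeff * delta_t ** 0.25            # re-evaluate the non-linear coefficient
        new_delta_t = power_w / (h * area_m2)  # solve the balance with h held fixed
        if abs(new_delta_t - delta_t) < tol:
            return t_ambient_c + new_delta_t, iteration
        delta_t = new_delta_t
    raise RuntimeError("iteration did not converge")


temperature_c, iterations = solve_steady_state(power_w=3.0, area_m2=0.02, t_ambient_c=25.0)
print(f"steady-state temperature: {temperature_c:.2f} degC after {iterations} iterations")
```

Each pass freezes the coefficient at the current temperature estimate, solves the resulting linear balance, and repeats until the estimate stops changing. A linearised variant would avoid this loop at the cost of an approximation, which is the kind of trade-off the abstract reports between the iterative and linearised options.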

Volume 17, Issue 2, pp. 43-59

Citations: 0

Machine learning guided thermal management of Open Computing Language applications on CPU-GPU based embedded platforms

Pub Date : 2022-12-28 DOI: 10.1049/cdt2.12050

Rakesh Kumar, Bibhas Ghoshal

As embedded devices start supporting heterogeneous processing cores (Central Processing Unit [CPU]–Graphical Processing Unit [GPU] based cores), performance-aware task allocation becomes a major issue. Running Open Computing Language (OpenCL) applications on both CPU and GPU cores improves performance and resolves the problem. However, it has an adverse effect on the overall power consumption and the operating temperature of the system. Operating both kinds of cores within a small form factor at high frequency raises power consumption, which in turn increases processor temperature. The elevated temperature brings about major thermal issues. In this paper, we investigate the role of the CPU during the execution of GPU-specific applications and argue against running it at high frequency. In addition, a machine-learning-guided mechanism to predict the optimal operating frequency of CPU cores during execution of OpenCL GPU kernels is presented. Our experiments with OpenCL applications on the state-of-the-art ODROID XU4 embedded platform show that operating the CPU cores of the experimental board at the frequency proposed by our machine-learning-based predictive method brings about a 12.5°C reduction in processor temperature at a 1.06% degradation in performance compared to the baseline frequency (default performance frequency governor of the embedded platform).

Volume 17, Issue 1, pp. 20-28

Citations: 2

Event-based high throughput computing: A series of case studies on a massively parallel softcore machine

Pub Date : 2022-12-19 DOI: 10.1049/cdt2.12051

Mark Vousden, Jordan Morris, Graeme McLachlan Bragg, Jonathan Beaumont, Ashur Rafiev, Wayne Luk, David Thomas, Andrew Brown

This paper introduces an event-based computing paradigm, where workers only perform computation in response to external stimuli (events). This approach is best employed on hardware with many thousands of smaller compute cores with a fast, low-latency interconnect, as opposed to traditional computers with fewer and faster cores. Event-based computing is timely because it provides an alternative to traditional big computing, which suffers from immense infrastructural and power costs. This paper presents four case study applications in which an event-based computing approach finds solutions orders of magnitude more quickly than the equivalent traditional big compute approach, including problems in computational chemistry and condensed matter physics.
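
The event-based style can be sketched with a small single-threaded simulation in which each graph vertex plays the role of a worker that runs only when an event reaches it. This is an illustrative assumption, not one of the paper's case studies: the shortest-path problem, the function name `event_driven_shortest_paths` and the in-memory queue standing in for the hardware interconnect are all hypothetical.

```python
from collections import deque


def event_driven_shortest_paths(edges, source):
    """Each vertex acts as a worker that computes only when it receives an event."""
    neighbours = {}
    for u, v, weight in edges:
        neighbours.setdefault(u, []).append((v, weight))
        neighbours.setdefault(v, []).append((u, weight))

    best = {vertex: float("inf") for vertex in neighbours}
    events = deque([(source, 0)])  # (destination vertex, proposed distance)
    handled = 0
    while events:
        vertex, distance = events.popleft()
        handled += 1
        # Event handler: update local state and notify neighbours
        # only if the incoming event improves on what the worker already knows.
        if distance < best[vertex]:
            best[vertex] = distance
            for nxt, weight in neighbours[vertex]:
                events.append((nxt, distance + weight))
    return best, handled


edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 5), ("c", "d", 1)]
distances, handled = event_driven_shortest_paths(edges, "a")
print(distances, "events handled:", handled)
```

There is no global loop over all vertices and no central scheduler deciding who works next. Computation stops on its own once no worker has anything new to report, which is what makes the style a natural fit for many small cores exchanging short messages.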

Citations: 2

Voltage over-scaling CNT-based 8-bit multiplier by high-efficient GDI-based counters

Pub Date : 2022-11-28 DOI: 10.1049/cdt2.12049

Ayoub Sadeghi, Nabiollah Shiri, Mahmood Rafiee, Abdolreza Darabi, Ebrahim Abiri

A new low-power and high-speed multiplier is presented based on the voltage over-scaling (VOS) technique and new 5:3 and 7:3 counter cells. The VOS reduces power consumption in digital circuits, but different voltage levels of the VOS increase the delay in different stages of a multiplier. Hence, the proposed counters are implemented by the gate-diffusion input (GDI) technique to solve the speed limitation of the VOS-based circuits. The proposed GDI-based 5:3 and 7:3 counters save power and reduce the area by 2× and 2.5×, respectively. To prevent the threshold voltage (V_th) drop in the suggested GDI-based circuits, carbon nanotube field-effect transistor (CNTFET) technology is used. In the counters, the chirality vector and tubes of the CNTFETs are properly adjusted to attain full-swing outputs with high driving capability. Also, their robustness against heat distribution over different time intervals, a major issue in CNTFET technology, is investigated, and their very low sensitivity is confirmed. The low complexity, high stability and efficient performance of the presented counter cells make the proposed VOS-CNTFET-GDI-based multiplier an alternative to previous designs.
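
For readers unfamiliar with the notation, an n:3 counter takes n one-bit inputs and outputs the number of ones among them as a 3-bit binary value. In a multiplier, such cells compress a column of partial-product bits. The behavioural model below shows only this input/output relation. It says nothing about the GDI or CNTFET circuit implementation, and the function name `counter` is hypothetical.

```python
from itertools import product


def counter(bits):
    """Behavioural model of an n:3 counter: 3-bit binary count of the ones in `bits`."""
    if len(bits) > 7:
        raise ValueError("a 3-bit output can count at most 7 inputs")
    total = sum(bits)
    return (total >> 2) & 1, (total >> 1) & 1, total & 1  # (MSB, middle bit, LSB)


# Exhaustive check for the 5:3 and 7:3 cases.
for n in (5, 7):
    for bits in product((0, 1), repeat=n):
        msb, mid, lsb = counter(bits)
        assert 4 * msb + 2 * mid + lsb == sum(bits)

print(counter((1, 1, 0, 1, 1, 0, 1)))  # five ones -> (1, 0, 1)
```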

Volume 17, Issue 1, pp. 1-19

Citations: 3

A four-stage yield optimization technique for analog integrated circuits using optimal computing budget allocation and evolutionary algorithms

Pub Date : 2022-10-09 DOI: 10.1049/cdt2.12048

Abbas Yaseri, Mohammad Hossein Maghami, Mehdi Radmehr

A high yield estimation is necessary for designing analogue integrated circuits. In the Monte-Carlo (MC) method, many transistor-level simulations must be performed to obtain the desired result. Therefore, MC simulations need to be combined with other methods to reach a high yield at high speed. In this paper, a four-stage yield optimisation approach is presented, which employs computational intelligence to accelerate yield estimation without losing accuracy. First, the designs that meet the desired characteristics are identified using critical analysis (CA). The aim of utilising CA is to avoid repeating unnecessary MC simulations for non-critical solutions. Then, in the second and third stages, the shuffled frog-leaping algorithm and the Non-dominated Sorting Genetic Algorithm-III are applied to improve the performance. Finally, MC simulations are performed to obtain the final result. The yield obtained from the simulation results for a two-stage class-AB Operational Transconductance Amplifier (OTA) in 180 nm Complementary Metal-Oxide-Semiconductor (CMOS) technology is 99.85%. The proposed method requires less computational effort and achieves higher accuracy than the MC-based approaches. Another advantage of using CA is that the initial population of the multi-objective optimisation algorithms is no longer random. Simulation results confirm the efficiency of the proposed technique.
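
The cost of plain MC yield estimation is easiest to see in a toy example. The code below is a minimal sketch under loud assumptions: `toy_amplifier` is a hypothetical stand-in for a transistor-level simulation (a Gaussian gain in dB), the 55 dB specification is invented for illustration, and none of the numbers relate to the paper's OTA results.

```python
import math
import random


def estimate_yield(simulate, meets_spec, n_samples, seed=0):
    """Monte Carlo yield: fraction of sampled circuits that meet the specification."""
    rng = random.Random(seed)
    passed = sum(1 for _ in range(n_samples) if meets_spec(simulate(rng)))
    yield_estimate = passed / n_samples
    # Standard error of a binomial proportion; it shrinks only as 1/sqrt(n_samples).
    std_error = math.sqrt(yield_estimate * (1 - yield_estimate) / n_samples)
    return yield_estimate, std_error


def toy_amplifier(rng):
    """Hypothetical stand-in for one transistor-level simulation: gain in dB."""
    return rng.gauss(60.0, 2.0)


def gain_ok(gain_db):
    """Hypothetical specification: at least 55 dB of gain."""
    return gain_db >= 55.0


estimate, error = estimate_yield(toy_amplifier, gain_ok, n_samples=20000)
print(f"yield = {estimate:.4f} +/- {error:.4f}")
```

Because the standard error falls only with the square root of the sample count, halving the uncertainty costs four times as many simulations. This is why skipping MC runs for candidates that cannot be critical, as the abstract describes for CA, can save a large share of the budget.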

Volume 16, Issue 5-6, pp. 183-195

Citations: 2

Illegal Trojan design and detection in asynchronous NULL Convention Logic and Sleep Convention Logic circuits

Pub Date : 2022-09-16 DOI: 10.1049/cdt2.12047

Kushal K. Ponugoti, Sudarshan K. Srinivasan, Scott C. Smith, Nimish Mathure

With cyber warfare, the detection of hardware Trojans, malicious digital circuit components that can leak data and degrade performance, is an urgent issue. Quasi-Delay Insensitive asynchronous digital circuits, such as NULL Convention Logic (NCL) and Sleep Convention Logic, also known as Multi-Threshold NULL Convention Logic (MTNCL), have inherent security properties and resilience to large temperature fluctuations, which make them very attractive for extreme-environment applications such as space exploration, automotive systems and the power industry. This paper shows how the dual-rail encoding used in NCL and MTNCL can be exploited to design Trojans that would not be detected using existing methods. Generic threat models for Trojans are given. Formal verification methods capable of accurately detecting Trojans at the Register-Transfer Level are also provided. The detection methods were tested by embedding Trojans in NCL and MTNCL Rivest-Shamir-Adleman (RSA) decryption circuits. The methods were applied to 25 NCL and 25 MTNCL RSA benchmarks of various data path widths and achieved a 100% detection rate.
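
As background, dual-rail encoding represents each logical signal on two wires. Three of the four wire combinations are meaningful (NULL, DATA0, DATA1), and the fourth, with both rails asserted, is not a valid state. The sketch below models only this encoding. It does not reproduce the paper's threat models or verification methods, and the helper names `encode`, `decode` and `illegal_positions` are hypothetical. That the title's "illegal" refers to this fourth state is an assumption here.

```python
NULL, DATA0, DATA1, ILLEGAL = "NULL", "DATA0", "DATA1", "ILLEGAL"

# (rail1, rail0) -> logical state of one dual-rail signal
_STATES = {(0, 0): NULL, (0, 1): DATA0, (1, 0): DATA1, (1, 1): ILLEGAL}


def encode(bit):
    """Return (rail1, rail0) for a Boolean value; None encodes the NULL spacer."""
    if bit is None:
        return (0, 0)
    return (1, 0) if bit else (0, 1)


def decode(rail1, rail0):
    """Map a pair of rail values back to its logical state."""
    return _STATES[(rail1, rail0)]


def illegal_positions(word):
    """Indices of signals in a dual-rail word whose rails are both asserted."""
    return [i for i, rails in enumerate(word) if decode(*rails) == ILLEGAL]


word = [encode(1), encode(0), encode(None), (1, 1)]
print([decode(*rails) for rails in word])
print(illegal_positions(word))
```

A checker that only compares decoded DATA0/DATA1 values would have no defined expectation for the fourth combination, which suggests why an encoding-level property of this kind could matter for detection.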

Volume 16, Issue 5-6, pp. 172-182

Citations: 1

TLP: Towards three-level loop parallelisation

Pub Date : 2022-08-09 DOI: 10.1049/cdt2.12046

Shabnam Mahjoub, Mehdi Golsorkhtabaramiri, Seyed Sadegh Salehi Amiri

Because computer systems are designed in multi-core and/or multi-processor form, parallelisation makes it possible to use the full capacity of the processors to run an application in the least time. This is the responsibility of parallel compilers, which perform parallelisation in several steps by distributing iterations between different processors and executing them simultaneously to achieve lower runtime. The present paper focuses on the uniformisation of three-level perfect nested loops as an important step in parallelisation, motivated by the fact that, in recent years, many algorithms have worked on volumetric data, that is, three-dimensional spaces. It proposes a method called Towards Three-Level Loop Parallelisation (TLP) that uses a combination of a Frog Leaping Algorithm and Fuzzy to achieve optimal results. Compared with existing methods, the TLP algorithm yields a wide variety of optimal results at desired times, with minimum cone size resulting from the vectors. Besides, the maximum number of input dependence vectors is decomposed by this algorithm. These results can accelerate the process of generating parallel codes and facilitate their development for High-Performance Computing purposes.

Volume 16, Issue 5-6, pp. 159-171

Citations: 1

Guest Editorial: Special issue on battery-free computing

Pub Date : 2022-06-09 DOI: 10.1049/cdt2.12043

Geoff V. Merrett, Bernd-Christian Renner, Brandon Lucia

In order to realise the vision and scale of the Internet of Things (IoT), we cannot rely on mains electricity or batteries to power devices due to environmental, maintenance, cost and physical volume implications. Considerable research has been undertaken in energy harvesting, allowing systems to extract electrical energy from their surrounding environments. However, such energy is typically highly dynamic, both spatially and temporally. In recent years, there has been an increase in research around how computing can be effectively performed from energy harvesting supplies, moving beyond the concepts of battery-powered and energy-neutral systems, thus enabling battery-free computing.

Challenges in battery-free computing are broad and wide-ranging, cutting across the spectrum of electronics and computer science—for example, circuits, algorithms, computer architecture, communication and networking, middleware, applications, deployments, and modelling and simulation tools.

This special issue explores the challenges, issues and opportunities in the research, design, and engineering of energy-harvesting, energy-neutral and intermittent sensing systems. These are enabling technologies for future applications in smart energy, transportation, environmental monitoring and smart cities. Innovative solutions are needed to enable either uninterrupted or intermittent operation.

This special issue contains two papers on different aspects of battery-free computing, as described below.

Hanschke et al.'s article on ‘EmRep: Energy Management Relying on State-of-Charge Extrema Prediction’ considers energy management in energy-neutral systems, particularly those with small energy storage elements (e.g. a supercapacitor). They observe that existing energy-neutral management approaches have a tendency to operate inefficiently when exposed to extremes in the harvesting environment, for example, wasting harvested power in times of abundant energy due to saturation of the energy storage device. To resolve this, the authors present an approach to predict extremes in device state-of-charge (SoC) when such conditions are occurring and hence switch to a less conservative and more immediate policy for device activity (and hence, consumption). This decouples energy management of high-intake from low-intake harvest periods and ensures that the saturation of energy storage is reduced by design. The approach is thoroughly experimentally evaluated in combination with a variety of different prediction algorithms, time resolutions, and energy storage sizes. Promising results indicate the potential for a doubling in effective utility in systems with only small energy storage elements.
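The idea of predicting state-of-charge extrema and reacting before the store saturates can be illustrated with a small sketch. This is not the EmRep algorithm: the function names, the fixed-length harvest forecast, the per-slot energy units and the rule for spreading the surplus are all assumptions made for illustration.

```python
from typing import Sequence


def soc_extrema(soc: float, forecast: Sequence[float], load: float):
    """Return (min_level, max_level, steps_to_max) of the unclamped
    state-of-charge trajectory if `load` is consumed in every slot."""
    level = soc
    lo, hi, steps_to_hi = soc, soc, 0
    for step, harvested in enumerate(forecast, start=1):
        level += harvested - load
        lo = min(lo, level)
        if level > hi:
            hi, steps_to_hi = level, step
    return lo, hi, steps_to_hi


def choose_load(soc: float, forecast: Sequence[float], capacity: float,
                min_load: float, max_load: float) -> float:
    """Pick the per-slot consumption for the coming slots (hypothetical policy)."""
    # Conservative baseline: consume the average forecast intake (energy-neutral).
    neutral = sum(forecast) / len(forecast)
    load = min(max(neutral, min_load), max_load)

    # Predict the SoC extrema under the baseline plan.
    _, hi, steps_to_hi = soc_extrema(soc, forecast, load)

    if hi > capacity:
        # Predicted saturation: spend the overshoot before the peak is reached
        # instead of letting the storage element clip it.
        load += (hi - capacity) / max(steps_to_hi, 1)

    return min(max(load, min_load), max_load)


# A nearly full store (8 of 10 units) facing two high-intake slots.
print(choose_load(soc=8, forecast=[4, 4, 0, 0], capacity=10,
                  min_load=0, max_load=5))  # prints 3.0
```

With the baseline load of 2 units per slot the predicted trajectory peaks at 12 units, which is 2 above capacity, so the sketch raises consumption to 3 units per slot and the store then tops out at exactly 10 units.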

The second paper in the special issue, authored by Stricker et al., continues the theme of energy prediction by considering the impact of harvesting source prediction errors on the system scheduler and hence the system's performance. Their article, ‘Robustness of Predictive Energy Harvesting Systems: Analysis and Adaptive Prediction Scaling’, defines a new robustness metric to describe the impact of prediction errors and demonstrates the concept using datasets from both indoor and outdoor harvesting scenarios. The authors then propose an adaptive prediction scaling method that learns from the local environment and system behaviour, demonstrating performance improvements of up to 13.8 times in real-world settings.

We hope that this special issue will inspire researchers in industry and academia to pursue further research in this challenging area.


Citations: 0

ASATM: Automated security assistant of threat models in intelligent transportation systems

IF 1.2 Zone 4 Computer Science Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques

Pub Date : 2022-05-30 DOI: 10.1049/cdt2.12045

Mohammad Ali Ramazanzadeh, Behnam Barzegar, Homayun Motameni

The evolution of technology has led to the appearance of smart cities. An essential element in such cities is smart mobility that covers the subjects related to Intelligent Transportation Systems (ITS). The problem is that the ITS vulnerabilities may considerably harm the life quality and safety status of human beings living in smart cities. In fact, software and hardware systems are more exposed to security risks and threats. To reduce threats and secure software design, threat modelling has been proposed as a preventive solution in the software design phase. On the other hand, threat modelling is always criticised for being time consuming, complex, difficult, and error prone. The approach proposed in this study, that is, Automated Security Assistant of Threat Models (ASATM), is an automated solution that is capable of achieving a high level of security assurance. By defining concepts and conceptual modelling as well as implementing automated security assistant algorithms, ASATM introduces a new approach to identifying threats, extracting security requirements, and designing secure software. The proposed approach demonstrates a quantitative classification of security at three levels (insecure, secure, and threat), twelve sub-levels (nominal scale and colour scale), and a five-layer depth (human understandability and conditional probability). In this study, to evaluate the effectiveness of our approach, an example with various security parameters and scenarios was tested and the results confirmed the superiority of the proposed approach over the latest threat modelling approaches in terms of method, learning, and model understanding.



Citations: 2