Microprocessors and Microsystems: Latest Articles

A real-time SVM-based hardware accelerator for hyperspectral images classification in FPGA
IF 2.6 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2023-12-27 | DOI: 10.1016/j.micpro.2023.104998
Lucas Amilton Martins , Felipe Viel , Laio Oriel Seman , Eduardo Augusto Bezerra , Cesar Albenes Zeferino

Hyperspectral imaging can be conceptualized as a three-dimensional dataset of spectral information related to a particular landscape. Generally speaking, these are aerial photographs captured by Earth observation satellites. A useful analogy for a hyperspectral image is one of a cube formed with the image acquired along the X and Y axes and a third dimension of spectral bands of varying wavelengths. Given the wealth of data contained within these images, they have been employed in both civilian and military applications such as terrain recognition, urban development supervision, recognition of rare minerals, and various other objectives. The increased utilization of these images has garnered the interest of researchers striving to create solutions that may enable faster processing of the images via parallel processing. In this context, FPGA technology is an option capable of facilitating the implementation of such a system for observation satellites. This research is situated within this framework and aims to develop an FPGA-synthesized hardware accelerator to facilitate real-time hyperspectral image categorization. By taking this approach, hardware-specific solutions can be implemented for embedded applications that process hyperspectral images and can also be integrated with further image processing steps. The proposed accelerator was constructed based on an advanced algorithmic model, resulting in outcomes consistent with those generated by the software-based solution. The experimental results demonstrate that the engineered accelerator can attain a pixel classification time equal to or less than the pixel acquisition time, thus conforming to the real-time processing criteria concerning classification time. Further, the manufactured accelerator exhibits scalability that can classify distinct datasets with varying classes concurrently while maintaining a uniform logic resource utilization.
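The abstract's two central ideas, per-pixel SVM classification and the real-time criterion (classification time no greater than acquisition time), can be illustrated with a minimal sketch. The abstract does not state the kernel or the multiclass strategy, so a linear, one-vs-rest SVM is assumed here; the function names, weights and timings are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def classify_pixel(
    spectrum: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> int:
    """Return the index of the class with the largest linear score w.x + b.

    Assumption: linear kernel, one-vs-rest (one weight vector per class).
    """
    scores = [
        sum(w * x for w, x in zip(w_row, spectrum)) + b
        for w_row, b in zip(weights, biases)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def meets_real_time(classification_time_s: float, acquisition_time_s: float) -> bool:
    """Real-time criterion from the abstract: classify no slower than acquire."""
    return classification_time_s <= acquisition_time_s


# Hypothetical 3-band pixel and 3 classes (illustrative values only).
pixel = [0.2, 0.8, 0.5]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [0.0, 0.0, 0.0]
label = classify_pixel(pixel, W, b)  # class 1 has the largest score (0.8)
```

Each class score is an independent dot product over the spectral bands, which is the kind of computation that can be evaluated in parallel in hardware.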

Citations: 0
Abeto: An automated benchmarking tool to manage heterogeneous IP core databases
IF 2.6 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2023-12-27 | DOI: 10.1016/j.micpro.2023.104987
Antonio J. Sánchez , Yubal Barrios , Lucana Santos , Roberto Sarmiento

System-level design uses building blocks, known as soft IP cores, to build complex designs. Using these IP cores reduces design and verification time and saves costs. However, third-party IP cores tend to present difficulties because their organization, distribution and management are not standardized, which results in heterogeneous databases. Most of the time, system developers need to write additional code, not included in the distribution, to enable the integration, verification and validation of the IP core. This implies acquiring deep knowledge of each IP core, often with a steep learning curve.

This work presents Abeto, a new software tool for managing IP core databases. It makes it easy to integrate and use a heterogeneous group of IP cores, described in VHDL, through a unified set of instructions or commands. To do so, Abeto requires from every IP core some side information about its packaging and how to operate it. Currently, Abeto supports a set of well-known EDA tools and has been successfully applied to the European Space Agency portfolio of IP cores for benchmarking purposes. To demonstrate its performance, mapping results for these IP cores on the novel NanoXplore BRAVE FPGA family are provided.

Open access PDF: https://www.sciencedirect.com/science/article/pii/S0141933123002326/pdfft?md5=e94fcf9e16ab0ae8c28dbba99e96660d&pid=1-s2.0-S0141933123002326-main.pdf
Citations: 0
Deep neural networks accelerators with focus on tensor processors
IF 2.6 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2023-12-27 | DOI: 10.1016/j.micpro.2023.105005
Hamidreza Bolhasani , Mohammad Marandinejad

The massive amount of data and the problem of processing them is one of the main challenges of the digital age, and the development of artificial intelligence and machine learning can be useful in solving this problem. Using deep neural networks to improve the efficiency of these two areas is a good solution. So far, several architectures have been introduced for data processing with the benefit of deep neural networks, whose accuracy, efficiency, and computing power are different from each other. This article tries to review these architectures, their features, and their functions in a systematic way. According to the current research style, 24 articles (conference and research articles related to this topic) have been evaluated in the period of 2014–2022. In fact, the significant aspects of the selected articles are compared and at the end, the upcoming challenges and topics for future research are presented. The results show that the main parameters for proposing a new tensor processor include increasing speed and accuracy and reducing data processing time, reducing on-chip storage space, reducing DRAM access, reducing energy consumption, and achieving high efficiency.

Citations: 0
A novel low hardware configurable ring oscillator (CRO) PUF for lightweight security applications
IF 2.6 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2023-12-23 | DOI: 10.1016/j.micpro.2023.104989
Husam Kareem, Dmitriy Dunaev

A physical unclonable function (PUF) is a promising hardware security primitive that can generate a secret key unique to each chip by exploiting non-reproducible manufacturing variations between identical implementations. Although there are several types of PUF designs and structures, the ring oscillator (RO) PUF is one of the most prominent due to its straightforward implementation and remarkable performance. However, the traditional RO-PUF does not support a large set of input/output combinations, or challenge-response pairs (CRPs) as they are called in the PUF literature. Consequently, the RO-PUF is more vulnerable to adversarial attacks that can reveal the PUF's CRPs using a machine learning approach. Enlarging the RO-PUF's CRP set requires a large increase in circuit size, leading to unacceptable area overhead for lightweight applications. The primary technique for enlarging the RO-PUF's CRP set without increasing the required hardware is to develop a configurable ring oscillator (CRO) PUF. In this paper, we propose a configurable logic unit (CLU) that can be used to build a low-hardware CRO-PUF. The proposed CLU consists of a 1-XOR gate and a 1-XNOR gate. Building a CRO-PUF using the proposed CLU dramatically increases the size of the CRP set while minimizing the required hardware. The proposed CRO-PUF achieves excellent evaluation results, with measured uniqueness of 50.1 %, uniformity of 49.45 %, and reliability of 98.33 %. These values are close to the ideal targets of 50 % for uniqueness and uniformity, and 100 % for reliability.
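The three figures of merit quoted above can be made concrete with a small sketch. It assumes the definitions commonly used in the PUF literature (uniformity as the share of 1 bits in a response, uniqueness as the mean pairwise inter-chip Hamming distance, reliability as 100 % minus the mean intra-chip Hamming distance); the abstract does not spell out its exact formulas, and the function names and bit strings below are hypothetical.

```python
from itertools import combinations
from typing import Sequence

Bits = Sequence[int]


def hamming(a: Bits, b: Bits) -> int:
    """Number of bit positions in which two equal-length responses differ."""
    return sum(x != y for x, y in zip(a, b))


def uniformity(response: Bits) -> float:
    """Percentage of 1 bits in one response (ideal: 50 %)."""
    return 100.0 * sum(response) / len(response)


def uniqueness(responses: Sequence[Bits]) -> float:
    """Mean pairwise inter-chip Hamming distance in percent (ideal: 50 %)."""
    pairs = list(combinations(responses, 2))
    n_bits = len(responses[0])
    return 100.0 * sum(hamming(a, b) for a, b in pairs) / (len(pairs) * n_bits)


def reliability(reference: Bits, remeasurements: Sequence[Bits]) -> float:
    """100 % minus the mean intra-chip Hamming distance (ideal: 100 %)."""
    n_bits = len(reference)
    intra = sum(hamming(reference, r) for r in remeasurements)
    return 100.0 * (1.0 - intra / (len(remeasurements) * n_bits))


# Illustrative 4-bit responses (not measured data).
chip_a = [1, 0, 1, 0]
chip_b = [0, 1, 1, 0]
print(uniformity(chip_a))                              # 50.0
print(uniqueness([chip_a, chip_b]))                    # 50.0
print(reliability(chip_a, [chip_a, [1, 0, 1, 1]]))     # 87.5
```

Under these definitions, the reported 50.1 % uniqueness and 49.45 % uniformity sit close to the 50 % ideal, and the 98.33 % reliability corresponds to an average of roughly 1.67 % of response bits flipping between measurements.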

Open access PDF: https://www.sciencedirect.com/science/article/pii/S014193312300234X/pdfft?md5=701fd3d90bd65a2ed764a13232d9a6bb&pid=1-s2.0-S014193312300234X-main.pdf
Citations: 0
Experimental evaluation of RISC-V micro-architecture against fault injection attack
IF 2.6 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2023-12-20 | DOI: 10.1016/j.micpro.2023.104991
Maryam Esmaeilian, Hakem Beitollahi

Today, the use of embedded processors is increasing dramatically, and they appear everywhere from daily life to security applications. Physical access to hardware has made hardware security a major concern. Hardware attacks compromise hardware security by physically accessing target devices. Among the available hardware attack techniques, Fault Injection Attacks (FIAs), such as clock glitching, are among the most harmful types of non-invasive attacks and can disrupt the operation of an embedded system. It is therefore essential to evaluate embedded software programs and check their vulnerability to fault injection attacks before using them in critical applications. However, it is often difficult for software developers to assess vulnerabilities. In this paper, an easy-to-use platform is presented to facilitate evaluating the vulnerability of programs running on embedded processors to clock glitching attacks. Our experimental results show the vulnerability window of a RISC-V micro-architecture for different high-level C functions. The results of this research can help developers of embedded systems used in security applications to evaluate their systems against clock glitching attacks quickly and at minimal cost.

Citations: 0
Improved DWT and IDWT architectures for image compression
IF 2.6 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2023-12-19 | DOI: 10.1016/j.micpro.2023.104990
Ritesh Sur Chowdhury, Jhilam Jana, Sayan Tripathi, Jaydeb Bhaumik

Image processing has developed rapidly in recent years, and compression is one of its important applications. Several wavelet-transform-based image compression techniques have already been introduced. In this paper, improved image compression and decompression techniques based on the Discrete Wavelet Transform (DWT) and Inverse Discrete Wavelet Transform (IDWT) are proposed by incorporating a scaling factor. The DWT and IDWT algorithms are implemented using a folded architecture. To reduce hardware resource usage, a multiplier is reused recursively. Image compression and decompression schemes based on the proposed DWT and IDWT architectures are tested using four different image databases. The proposed technique provides better results in terms of bits per pixel, compression ratio, mean square error, peak signal-to-noise ratio, normalized correlation coefficient and structural similarity index. FPGA-based synthesis has been performed using the Xilinx Vivado synthesis tool, reporting slice LUTs, slice registers, clock frequency, delay and power. The synthesis results show that the proposed DWT and IDWT architectures are suitable for image compression and decompression applications.
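As a rough illustration of the DWT/IDWT pair and two of the quality metrics named above, the sketch below uses a one-level 1D Haar transform in average/difference form. This is an assumption for illustration only: the abstract does not say which wavelet, scaling factor or architecture the authors use, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def haar_dwt(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One-level Haar DWT: pairwise averages (approximation) and half-differences (detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx: Sequence[float], detail: Sequence[float]) -> List[float]:
    """Inverse of haar_dwt: each (a, d) pair expands back to (a + d, a - d)."""
    out: List[float] = []
    for a, d in zip(approx, detail):
        out.extend((a + d, a - d))
    return out


def mse(original: Sequence[float], reconstructed: Sequence[float]) -> float:
    """Mean square error between two equal-length sample sequences."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def psnr(original: Sequence[float], reconstructed: Sequence[float], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for a perfect reconstruction."""
    err = mse(original, reconstructed)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)


row = [4.0, 6.0, 10.0, 12.0]           # illustrative pixel row
approx, detail = haar_dwt(row)          # ([5.0, 11.0], [-1.0, -1.0])
restored = haar_idwt(approx, detail)    # [4.0, 6.0, 10.0, 12.0]
```

Compression comes from quantizing or discarding small detail coefficients before the inverse transform; MSE and PSNR then measure how much the reconstruction deviates from the original.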

Citations: 0
A novel partition strategy for efficient implementation of 3D Cellular Genetic Algorithms
IF 2.6 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2023-12-09 | DOI: 10.1016/j.micpro.2023.104986
Martín Letras , Alicia Morales-Reyes , René Cumplido , María-Guadalupe Martínez-Peñaloza , Claudia Feregrino-Uribe

Solving optimization problems while fulfilling real-time constraints requires high algorithmic and processing performance. Cellular Genetic Algorithms (cGAs) have been competitive on difficult single-objective combinatorial and continuous-domain problems. Moreover, it has been demonstrated that structural properties of cGAs, such as population topology dimension, local neighborhood configuration and ad-hoc selection mechanisms, not only allow further algorithmic improvement but can also be combined at the hardware level for acceleration. This article presents a novel partition strategy that exploits 3D cGA population dynamics on a 2D processing array, using Field Programmable Gate Arrays (FPGAs) as the target processing platform. The proposed architecture fits as an optimization module within an embedded system where real-time constraints must be fulfilled. Therefore, it is important to find an optimal trade-off between hardware resource usage and search time. Overall results demonstrate that the proposed architecture can run at up to 90 MHz when tackling continuous benchmark functions. Moreover, speed-ups of up to three and two orders of magnitude are achieved in comparison to a single CPU and a parallel GPU, respectively.
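To make "population topology dimension" and "local neighborhood configuration" concrete, the sketch below lists the six face-adjacent neighbors of an individual in a 3D cGA population. It assumes a toroidal (wrap-around) lattice and a von Neumann neighborhood, which are common choices in cGAs but are not stated in the abstract; the function name is hypothetical, and the paper's partition strategy for mapping the 3D population onto a 2D array is not reproduced here.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def von_neumann_neighbors_3d(x: int, y: int, z: int, dims: Coord) -> List[Coord]:
    """Six face-adjacent neighbors of (x, y, z) on a toroidal 3D grid.

    Assumption: the lattice wraps around in every dimension, so border
    individuals have as many neighbors as interior ones.
    """
    nx, ny, nz = dims
    return [
        ((x - 1) % nx, y, z), ((x + 1) % nx, y, z),
        (x, (y - 1) % ny, z), (x, (y + 1) % ny, z),
        (x, y, (z - 1) % nz), (x, y, (z + 1) % nz),
    ]


# Corner individual of a 4x4x4 population: neighbors wrap to the opposite faces.
corner = von_neumann_neighbors_3d(0, 0, 0, (4, 4, 4))
```

In a cGA, each individual recombines only with members of such a neighborhood. A 3D lattice gives every individual six direct neighbors instead of the four available in a 2D grid.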

Citations: 0
Retraction notice to “FPGA implementation of PMSG based AC conversion using soft switching twin–mode PWM/FPGA control for high power IM application” [Microprocessors and Microsystems 75 (2020) 103044]
IF 2.6 | Zone 4, Computer Science | Q2 Computer Science | Pub Date: 2023-11-30 | DOI: 10.1016/j.micpro.2023.104977
C. Kadhiravan, J. Baskaran
Open access PDF: https://www.sciencedirect.com/science/article/pii/S0141933123002223/pdfft?md5=17a61357c1aeda88aa50056adda92d00&pid=1-s2.0-S0141933123002223-main.pdf
Citations: 0
Wiretap resisting and privacy preserving data exchange with physical layer security and blockchain based authentication in Internet of Vehicles
IF 2.6 Zone 4 Computer Science Q2 Computer Science Pub Date: 2023-11-23 DOI: 10.1016/j.micpro.2023.104965
Qiao Liu, Qi Han, Guangze Luo, Jin Cao, Hui Li, Yong Wang

With advances in automotive technology, vehicles have come to affect many aspects of everyday life and work. With continuing innovation in sensor, computer, wireless communication, and GPS technology, the Internet of Vehicles (IoV) has been widely regarded as a core technology for solving a series of problems. However, as a complex network with multiple elements, including people, vehicles, and base stations, the IoV is confronted with security threats. In this paper, secure data exchange is considered for two authenticated On Board Units (OBUs) with the help of a Road Side Unit (RSU). Blockchain-based authentication and physical layer security are applied to the IoV for wiretap-resistant and privacy-preserving data exchange. For wiretap resistance, the two synchronized signals transmitted from the OBUs act as artificial noise at the eavesdropper. In addition, for privacy preservation, a summed codeword is formed at the RSU, whose components cannot be recovered individually. Finally, simulations demonstrate that the proposed protocol can achieve transmission efficiency as well as information security.

Citations: 0
On the interactions between ILP and TLP with hardware transactional memory
IF 2.6 Zone 4 Computer Science Q2 Computer Science Pub Date: 2023-11-19 DOI: 10.1016/j.micpro.2023.104975
Víctor Nicolás-Conesa, Rubén Titos-Gil, Ricardo Fernández-Pascual, Alberto Ros, Manuel E. Acacio

Hardware implementations of Transactional Memory (HTM) are designed to facilitate efficient thread synchronization in parallel programs, encouraging the use of larger critical sections. By employing optimistic concurrency control to execute transactions speculatively, HTM systems promise to deliver the performance benefits typically associated with fine-grained locks. In doing so, HTM systems must deal with transaction aborts. While under certain conditions aborts may be caused by the inherent limitations of the hardware structures employed to implement TM (e.g., caches), conflicting concurrent accesses to shared memory locations are generally the prevailing cause for squashing the work done by a transaction.

In this study, we present what is, to the best of our knowledge, the first characterization of how the aggressiveness of processor cores, particularly their ability to exploit instruction-level parallelism (ILP), interacts with the support for optimistic thread-level speculation offered by HTM systems. We have observed that adjusting the size of the structures that facilitate out-of-order and speculative execution changes the number of aborts in the execution of transactional workloads in best-effort HTM implementations. Our findings indicate that in scenarios with high contention, a smaller number of powerful cores is more suitable, whereas in low-contention scenarios, a larger number of less aggressive cores is preferable. In addition, HTM systems that employ lazy detection, and those that employ eager detection with requester-stalls resolution, benefit from using simpler cores. In conclusion, abort ratios can be reduced by carefully choosing both processor aggressiveness and HTM design aspects for each application, depending on its contention.

Citations: 0