
IET Computers and Digital Techniques: Latest Articles

Research on Adding Global Registration Model in Video Coding With Local Affine Motion Model
IF 0.8; Zone 4, Computer Science; Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2025-12-12; DOI: 10.1049/cdt2/6692669
Zhe Zheng, Wei Ma, Jinghua Liu, Jinghui Lu, Song Qiu, Rui Liu, Wenpeng Cui

With the widespread application of local affine (LA) motion models in various video coding standards, this study explores the implementation methods and performance changes of introducing a global registration model in an encoder that already includes an LA motion model. First, a coding scheme combining global and local registration is achieved by incorporating global registration computation and optimizing the reference frame selection and macroblock mode selection strategies. Second, experiments compare the performance impact of introducing a global warp motion model with that of introducing a global translational (GT) registration model. The results indicate that introducing a global warp motion model leads to functional redundancy and mutual interference, with higher computational complexity and limited overall benefits. By contrast, introducing a GT registration model complements the LA model and enhances coding performance for translation scenarios, while maintaining lower computational complexity and greater practicality.

Citations: 0
A Systematic Literature Review on the Applications, Models, Limitations, and Future Directions of Generative Adversarial Networks
IF 0.8; Zone 4, Computer Science; Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2025-11-18; DOI: 10.1049/cdt2/5384331
Sunawar khan, Tehseen Mazhar, Tariq Shahzad, Muhammad Amir Khan, Wasim Ahmad, Afsha Bibi, Habib Hamam

Generative adversarial networks (GANs), a subset of deep learning, have demonstrated breakthrough performance in domains such as computer vision (CV) and natural language processing (NLP), particularly in surveillance, autonomous driving, and automated programming assistance. Based on game theory principles, GANs utilize a generator–discriminator architecture to produce high-quality synthetic data. This study conducts a systematic literature review (SLR) to comprehensively assess the development, applications, limitations, and security-related advancements of GANs. It examines foundational models and key architectural variants, providing a critical evaluation of their roles in NLP and CV. This research explores the integration of GANs into the domain of security, highlighting their applications in information security, cybersecurity, and artificial intelligence (AI)-driven defense mechanisms. The study also discusses prominent evaluation metrics such as inception score (IS), Fréchet inception distance (FID), structural similarity index measure (SSIM), and peak signal-to-noise ratio (PSNR) to assess GAN performance. Key strengths of GANs, including their ability to generate high-resolution data and support domain adaptation, are emphasized as driving factors for their continued evolution and adoption.
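
Of the evaluation metrics named in this abstract, PSNR is the simplest to state concretely. The following is a minimal sketch using the standard definition, PSNR = 10 · log10(MAX² / MSE); the function name `psnr`, the flat pixel-sequence input, and the default 8-bit peak of 255 are assumptions made for illustration and are not taken from the reviewed paper.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences.

    `max_value` is the largest possible pixel value (255 for 8-bit images,
    an assumption of this sketch).
    """
    if len(reference) != len(generated):
        raise ValueError("inputs must have the same length")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    print(psnr([10, 20, 30, 40], [11, 19, 30, 41]))
```

Higher values mean the generated data is closer to the reference; unlike IS and FID, PSNR needs a paired reference for every generated sample.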

Citations: 0
Watermarking of Transient Fault-Detectable IP Designs Using Multivariate HLS Scheduling Based Multimodal Security
IF 0.8; Zone 4, Computer Science; Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2025-11-12; DOI: 10.1049/cdt2/5926846
Anirban Sengupta, Vishal Chourasia, Nabendu Bhui, Aditya Anshul

Securing reusable hardware intellectual property (IP) cores used in system-on-chip (SoC) designs is crucial because the global design supply chain may introduce different points of security vulnerability. One of the major threats is an untrustworthy entity in the SoC design house attempting piracy or falsely claiming ownership of the IP design. Further, owing to the importance of handling transient faults in hardware IP designs, the design of fault-detectable IP has become a standard practice in the community. However, these fault-detectable IP designs are similarly prone to hardware threats such as IP piracy and false claims of IP ownership. Therefore, a robust countermeasure for fault-detectable IP designs against such threats is essential. This paper presents a detective countermeasure using a novel hardware watermarking methodology for transient fault-detectable IP designs. The proposed IP watermarking methodology introduces a novel multimodal security framework based on multivariate encoded high-level synthesis (HLS) scheduling. The proposed approach is capable of embedding a robust, unique, and nonreplicable watermark in the HLS register allocation phase of fault-detectable IP design. The proposed watermarking technique is more robust than prior watermarking approaches in terms of reduced probability of coincidence (PC; up to ~10^−8), stronger tamper tolerance (TT; up to ~10^130), and lower watermark decoding probability, at 0% design cost overhead.

Citations: 0
A Temperature Noise Correction Method for CMOS Spatial Camera Using LSTM With Attention Mechanism
IF 0.8; Zone 4, Computer Science; Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2025-06-06; DOI: 10.1049/cdt2/6670185
Long Cheng, Xueying Wang, Jing Xu

This study presents an innovative temperature-induced random noise correction method for complementary metal oxide semiconductor (CMOS) spatial cameras using an attention mechanism-enhanced long short-term memory (LSTM) model. The model, specifically designed to address pixel drift and random noise issues in CMOS space cameras due to temperature variations, incorporates a multilayer LSTM network with an attention mechanism. This study comprehensively examines the temperature-induced variations in noise characteristics of CMOS cameras across diverse thermal conditions, encompassing in-depth analyses of both dark-field and light-field scenarios. Through detailed pixel-level analysis, the study quantifies the influence of temperature on pixel values and critical performance parameters such as internal nonuniformity within the camera. The experimental results show that under the dark field condition, the fitting variance between the predicted value and the measured value ranges from 0.29585 to 5.798307. After correction in light field conditions, the average variance of images decreases to 0.29, the mean signal-to-noise ratio (SNR) increases to 80, and the photo response nonuniformity (PRNU) mean drops to 0.0161%. Compared to precorrection levels, these key metrics show significant improvements, with an average 83.57-fold reduction, 1.89-fold increase, and 84.98-fold decrease, respectively. These results confirm the effectiveness of the deep learning method in correcting temperature-induced noise, highlighting the potential for practical engineering applications.

Citations: 0
GRSNet: An Ultra-Lightweight Neural Network for 3D Point Cloud Classification and Segmentation
IF 0.8; Zone 4, Computer Science; Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2025-05-12; DOI: 10.1049/cdt2/7934018
Zourong Long, Gen Tan, You Wu, Hong Yang, Chao Ding

The processing of point cloud data has become a significant area of research in the modern field of perception. Classification and segmentation are critical tasks in autonomous driving, environmental perception, and digital twins. Algorithms that directly extract features from raw point cloud data have simple architectures, but they are constrained by computational demands and limited efficiency. This makes effective deployment on resource-limited devices challenging. This article introduces GRSNet, an ultra-lightweight algorithm. The principal innovation is a new sampling method named golden ratio sampling (GRS), which generates sampling point indices directly using the golden ratio to subsequently locate the corresponding sampling points. This method efficiently extracts representative points from point cloud data and integrates them into deep networks. Leveraging GRS, this study combines the concepts from GhostNet and self-attention mechanisms to develop a feature extraction module dubbed the SA_Ghost Block, forming the core of GRSNet. Comparative experiments with leading algorithms on established point cloud open-source datasets demonstrate that GRSNet achieves superior performance, maintaining only 0.7 M parameters.
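
The abstract says GRS generates sampling point indices directly from the golden ratio but does not give the formula. The sketch below is one plausible reading, not the authors' method: it uses the well-known golden-ratio additive recurrence, taking the fractional part of i/φ and scaling it to the number of points. The function name `golden_ratio_indices` and both parameters are hypothetical.

```python
import math

# Conjugate of the golden ratio, 1/phi = (sqrt(5) - 1) / 2, roughly 0.618.
PHI_CONJUGATE = (math.sqrt(5.0) - 1.0) / 2.0


def golden_ratio_indices(num_points, num_samples):
    """Return `num_samples` indices into a cloud of `num_points` points.

    Assumed scheme: index_i = floor(frac(i / phi) * num_points).
    The indices depend only on the two sizes, not on the point coordinates,
    so no distance computation is needed. Duplicate indices are possible
    when num_samples is close to num_points.
    """
    indices = []
    for i in range(num_samples):
        fraction = (i * PHI_CONJUGATE) % 1.0  # value in [0, 1)
        indices.append(int(fraction * num_points))
    return indices


if __name__ == "__main__":
    print(golden_ratio_indices(num_points=1000, num_samples=8))
```

The appeal of an index-only scheme like this is cost: it runs in time linear in the number of samples, whereas coordinate-based samplers such as farthest point sampling must compare distances between points.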

Citations: 0
Application of Lightweight Target Detection Algorithm Based on YOLOv8 for Police Intelligent Moving Targets
IF 0.8; Zone 4, Computer Science; Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2025-05-10; DOI: 10.1049/cdt2/9984821
Yanjie Zhang, Xiaojun Liu, Yuehan Shi, Zecong Ding, Xiaoming Zhang

This study presents an intelligent moving target that replicates mob attacks and other realistic events in police training to match actual combat needs. The police intelligent moving target must run target detection algorithms on its hardware platform, but the traditional you only look once (YOLO)v8 algorithm has a large framework, which slows recognition because the hardware platform lacks computing power. In this study, the GhostNet network architecture replaces YOLOv8's backbone network for real-time target identification, improving recognition speed. For bounding box regression in target detection, the scale invariant intersection over union (SIoU) loss function is used to increase prediction box overlap and identification accuracy. Finally, BiFormer uses dynamic sparse attention for more flexible computational allocation and content perception. Compared with the original approach, the method's real-time detection speed is 4.81 frames per second (FPS) faster, mAP@0.5 is 5.38% higher, mean average precision (mAP)@0.5:0.95 is 4.19% higher, and the parameter volume is 5.81 M smaller. The approach developed in this work has several applications in real-time target identification and lightweight deployment.
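
The bounding box regression loss mentioned above builds on intersection over union (IoU). As a hedged illustration, the sketch below computes plain IoU only; the additional penalty terms that distinguish SIoU are not reproduced here, and the `(x1, y1, x2, y2)` corner format and the function name `iou` are assumptions of this example rather than details from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0.0, 0.0, 2.0, 2.0)
    ground_truth = (1.0, 1.0, 3.0, 3.0)
    print(iou(predicted, ground_truth))
```

A loss of the form 1 − IoU gives no gradient signal once the boxes stop overlapping, which is the usual motivation for IoU variants that add extra cost terms.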

Citations: 0
Energy-Efficient Branch Predictor via Instruction Block Type Prediction in Decoupled Frontend
IF 0.8; Zone 4, Computer Science; Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2025-04-30; DOI: 10.1049/cdt2/3359419
Zilin Li, Jizeng Wei, Shuangsheng Li, Yaogong Yang

The branch predictor is widely used to enhance processor performance, but it also constitutes one of the major energy-consuming components in processors. We found that approximately 32% of instruction blocks in a decoupled frontend do not contain branch instructions, while 30.8% of instruction blocks contain only conditional branches. However, because the type of instructions within a block cannot be determined during prediction, branch prediction must be executed every cycle. In this work, we propose the next block type (NBT) and no branch sequence table (NST) for predicting instruction block types. These mechanisms occupy minimal space and are straightforward to implement. For a four-way out-of-order processor, the NBT and NST reduce the branch predictor’s energy consumption by 52.36% and processor’s energy consumption by 4.1% without sacrificing the processor’s instructions per cycle (IPC) and branch prediction accuracy.
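
The idea of skipping predictor lookups for blocks known to hold no branches can be illustrated with a toy model. The abstract does not describe how the NBT or NST are organized, so everything below is a hypothetical simplification: a direct-mapped table indexed by the block's own address, 32-byte blocks, a conservative default entry, and no modelling of the recovery needed when a block is wrongly predicted as branch-free.

```python
from enum import Enum


class BlockType(Enum):
    NO_BRANCH = 0   # block contains no branch instructions
    COND_ONLY = 1   # block contains only conditional branches
    OTHER = 2       # anything else


class NextBlockTypeTable:
    """Toy direct-mapped table remembering the last observed type of each block."""

    def __init__(self, entries=64):
        self.entries = entries
        # Default to OTHER so an unknown block always gets a full prediction.
        self.table = [BlockType.OTHER] * entries

    def _index(self, block_addr):
        return (block_addr >> 5) % self.entries  # assumes 32-byte blocks

    def predict(self, block_addr):
        return self.table[self._index(block_addr)]

    def update(self, block_addr, actual_type):
        self.table[self._index(block_addr)] = actual_type


def predictor_lookups(trace, nbt):
    """Count how many blocks in `trace` still need a branch predictor access."""
    lookups = 0
    for block_addr, actual_type in trace:
        if nbt.predict(block_addr) is not BlockType.NO_BRANCH:
            lookups += 1  # the main predictor is accessed for this block
        nbt.update(block_addr, actual_type)
    return lookups


if __name__ == "__main__":
    loop = [(0x00, BlockType.NO_BRANCH), (0x20, BlockType.COND_ONLY)] * 3
    print(predictor_lookups(loop, NextBlockTypeTable()), "lookups for", len(loop), "blocks")
```

In this toy trace the table learns after the first iteration that the block at 0x00 has no branches, so later iterations access the main predictor only for the block at 0x20.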

Citations: 0
A Reconfigurable Coarse-to-Fine Approach for the Execution of CNN Inference Models in Low-Power Edge Devices
IF 0.8; Zone 4, Computer Science; Q4, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2024-12-19; DOI: 10.1049/cdt2/6214436
Auangkun Rangsikunpum, Sam Amiri, Luciano Ost

Convolutional neural networks (CNNs) have evolved into essential components for a wide range of embedded applications due to their outstanding efficiency and performance. To efficiently deploy CNN inference models on resource-constrained edge devices, field programmable gate arrays (FPGAs) have become a viable processing solution because of their unique hardware characteristics, enabling flexibility, parallel computation and low-power consumption. In this regard, this work proposes an FPGA-based dynamic reconfigurable coarse-to-fine (C2F) inference of CNN models, aiming to increase power efficiency and flexibility. The proposed C2F approach first coarsely classifies related input images into superclasses and then selects the appropriate fine model(s) to recognise and classify the input images according to their bespoke categories. Furthermore, the proposed architecture can be reprogrammed to the original model using partial reconfiguration (PR) in case the typical classification is required. To efficiently utilise different fine models on low-cost FPGAs with area minimisation, ZyCAP-based PR is adopted. Results show that our approach significantly improves the classification process when object identification of only one coarse category of interest is needed. This approach can reduce energy consumption and inference time by up to 27.2% and 13.2%, respectively, which can greatly benefit resource-constrained applications.

Citations: 0
E-Commerce Logistics Software Package Tracking and Route Planning and Optimization System of Embedded Technology Based on the Intelligent Era
IF 0.8 Zone 4 Computer Science Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-10-11 DOI: 10.1049/2024/6687853
Dan Zhang, Zhiyang Jia

In the Internet era, the e-commerce industry has grown rapidly, and cross-border e-commerce (CBEC) has emerged and is now in a stage of sustainable development. CBEC depends on strong logistics support, and the two are inseparable. As CBEC continues to expand, the existing e-commerce logistics (ECL) model is increasingly unable to meet the diverse needs of users, so new logistics models need to be explored. To address this, this paper analyzes the CBEC logistics model, applies embedded technology to ECL, and builds a logistics tracking system. It also uses the ant colony algorithm to study the route planning problem for logistics package distribution experimentally. In the experiments, the average delivery time was 25.95 hr for the proposed algorithm versus 32.53 hr for the traditional algorithm; the average distribution freight cost was 163.3 yuan versus 257.7 yuan; and the average distribution cost was 131.53 yuan versus 211.68 yuan. In summary, the algorithm could effectively optimize the distribution routes of logistics packages and improve the efficiency of package transportation.
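As a rough illustration of how an ant colony algorithm searches for a short delivery route, the sketch below runs a basic ant colony optimization over a small symmetric distance matrix. It is a textbook-style variant under stated assumptions, not the paper's system: the function names, the parameters (`n_ants`, `n_iters`, `alpha`, `beta`, `evaporation`) and the closed-tour formulation starting at node 0 are all hypothetical, and distances between distinct nodes are assumed to be positive.

```python
import random


def route_length(route, dist):
    """Total length of a closed tour that returns to its starting node."""
    n = len(route)
    return sum(dist[route[i]][route[(i + 1) % n]] for i in range(n))


def ant_colony_route(dist, n_ants=10, n_iters=50, alpha=1.0, beta=2.0,
                     evaporation=0.5, seed=0):
    """Search for a short closed tour over all nodes, starting at node 0."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    n = len(dist)
    pheromone = [[1.0] * n for _ in range(n)]
    best_route, best_len = None, float("inf")

    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            route = [0]
            unvisited = set(range(1, n))
            while unvisited:
                cur = route[-1]
                candidates = sorted(unvisited)
                # Attractiveness of an edge: pheromone^alpha * (1 / distance)^beta.
                weights = [
                    pheromone[cur][j] ** alpha * (1.0 / dist[cur][j]) ** beta
                    for j in candidates
                ]
                nxt = rng.choices(candidates, weights=weights, k=1)[0]
                route.append(nxt)
                unvisited.remove(nxt)
            length = route_length(route, dist)
            tours.append((route, length))
            if length < best_len:
                best_route, best_len = route, length

        # Evaporate old pheromone, then deposit more on edges of shorter tours.
        for i in range(n):
            for j in range(n):
                pheromone[i][j] *= 1.0 - evaporation
        for route, length in tours:
            for i in range(n):
                a, b = route[i], route[(i + 1) % n]
                pheromone[a][b] += 1.0 / length
                pheromone[b][a] += 1.0 / length

    return best_route, best_len


# Four stops on a unit square (illustrative data only).
square = [
    [0.0, 1.0, 1.414, 1.0],
    [1.0, 0.0, 1.0, 1.414],
    [1.414, 1.0, 0.0, 1.0],
    [1.0, 1.414, 1.0, 0.0],
]
best_route, best_len = ant_colony_route(square)
```

A real logistics planner would add constraints such as vehicle capacity, time windows and freight cost; this sketch covers only the pheromone-guided route construction.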

Citations: 0
A Configurable Accelerator for CNN-Based Remote Sensing Object Detection on FPGAs
IF 0.8 Zone 4 Computer Science Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2024-06-20 DOI: 10.1049/2024/4415342
Yingzhao Shao, Jincheng Shang, Yunsong Li, Yueli Ding, Mingming Zhang, Ke Ren, Yang Liu

Convolutional neural networks (CNNs) have been widely used in satellite remote sensing. However, satellites in orbit have limited resources and power budgets and cannot meet the storage and computing requirements of current million-scale artificial intelligence models. This paper proposes a new generation of highly flexible, intelligent CNN hardware accelerator for satellite remote sensing, intended to make the computing carrier more lightweight and efficient. A data quantization scheme for INT16 or INT8, based on the idea of dynamic fixed-point numbers, is designed and applied to different scenarios. The operation mode of the systolic array is divided into channel blocks, and the calculation method is optimized to increase the utilization of on-chip computing resources and improve calculation efficiency. An RTL-level CNN field programmable gate array (FPGA) accelerator whose data flow is scheduled by microinstruction sequences is then designed. The hardware framework is built on the Xilinx VC709. The results show that, under INT16 or INT8 precision, the system achieves high throughput in most convolutional layers of the network, with an average performance of 153.14 giga operations per second (GOPS) or 301.52 GOPS, respectively, which is close to the system's peak performance and takes full advantage of the platform's parallel computing capabilities.
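The idea behind dynamic fixed-point quantization can be sketched as follows: each tensor gets its own number of fractional bits, chosen so that its largest magnitude still fits in the INT8 or INT16 range. This is a generic illustration and not the paper's scheme; the helper names `choose_frac_bits`, `quantize` and `dequantize` are hypothetical, and the per-tensor rule that picks the fractional length from the maximum absolute value is an assumption.

```python
import math


def choose_frac_bits(values, total_bits):
    """Pick the fractional bit count so the largest magnitude fits in total_bits."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return total_bits - 1
    # Bits needed for the integer part; the sign bit is counted separately.
    int_bits = max(0, math.floor(math.log2(max_abs)) + 1)
    return total_bits - 1 - int_bits


def quantize(values, total_bits):
    """Quantize floats to signed integers with a shared, per-tensor scale."""
    frac_bits = choose_frac_bits(values, total_bits)
    scale = 2.0 ** frac_bits
    qmin = -(1 << (total_bits - 1))
    qmax = (1 << (total_bits - 1)) - 1
    # Round to the nearest integer and clamp to the representable range.
    q = [min(qmax, max(qmin, round(v * scale))) for v in values]
    return q, frac_bits


def dequantize(q, frac_bits):
    """Map quantized integers back to floats using the same scale."""
    return [v / 2.0 ** frac_bits for v in q]


weights = [0.5, -1.25, 3.0]
q8, frac8 = quantize(weights, 8)     # INT8: 5 fractional bits for this tensor
q16, frac16 = quantize(weights, 16)  # INT16: 13 fractional bits for this tensor
```

Because the scale is a power of two, rescaling between layers reduces to bit shifts in hardware, which is one common reason this style of quantization suits FPGA accelerators.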

Citations: 0