Zhe Zheng, Wei Ma, Jinghua Liu, Jinghui Lu, Song Qiu, Rui Liu, Wenpeng Cui
With the widespread application of local affine (LA) motion models in various video coding standards, this study explores the implementation methods and performance impact of introducing a global registration model into an encoder that already includes an LA motion model. First, a coding scheme combining global and local registration is realized by incorporating global registration computation and optimizing the reference frame selection and macroblock mode selection strategies. Second, experiments compare the performance impact of introducing a global warp motion model against that of a global translational (GT) registration model. The results indicate that the global warp motion model leads to functional redundancy and mutual interference with the LA model, with higher computational complexity and limited overall benefit. By contrast, a GT registration model complements the LA model and enhances coding performance for translational scenes, while maintaining lower computational complexity and greater practicality.
{"title":"Research on Adding Global Registration Model in Video Coding With Local Affine Motion Model","authors":"Zhe Zheng, Wei Ma, Jinghua Liu, Jinghui Lu, Song Qiu, Rui Liu, Wenpeng Cui","doi":"10.1049/cdt2/6692669","DOIUrl":"https://doi.org/10.1049/cdt2/6692669","url":null,"abstract":"<p>With the widespread application of local affine (LA) motion models in various video coding standards, this study explores the implementation methods and performance changes of introducing a global registration model in an encoder that already includes a LA motion model. First, a coding scheme combining global and local registration is achieved by incorporating global registration computation, optimizing reference frame selection strategies, and macroblock mode selection strategies. Second, through experiments, the impact of introducing a global warp motion model and a global translational (GT) registration model on performance is further compared. The results indicate that the introduction of a global warp motion model leads to functional redundancy and mutual interference, with higher computational complexity and limited overall benefits. On the other hand, introducing a GT registration model can complement and enhance the coding performance for translation scenarios, working in synergy with the LA model, while maintaining lower computational complexity and greater practicality.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"2025 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2025-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2/6692669","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145750846","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sunawar Khan, Tehseen Mazhar, Tariq Shahzad, Muhammad Amir Khan, Wasim Ahmad, Afsha Bibi, Habib Hamam
Generative adversarial networks (GANs), a subset of deep learning, have demonstrated breakthrough performance in domains such as computer vision (CV) and natural language processing (NLP), particularly in surveillance, autonomous driving, and automated programming assistance. Based on game theory principles, GANs utilize a generator–discriminator architecture to produce high-quality synthetic data. This study conducts a systematic literature review (SLR) to comprehensively assess the development, applications, limitations, and security-related advancements of GANs. It examines foundational models and key architectural variants, providing a critical evaluation of their roles in NLP and CV. It also explores the integration of GANs into the security domain, highlighting their applications in information security, cybersecurity, and artificial intelligence (AI)-driven defense mechanisms. The study further discusses prominent evaluation metrics such as inception score (IS), Fréchet inception distance (FID), structural similarity index measure (SSIM), and peak signal-to-noise ratio (PSNR) for assessing GAN performance. Key strengths of GANs, including their ability to generate high-resolution data and support domain adaptation, are emphasized as driving factors for their continued evolution and adoption.
{"title":"A Systematic Literature Review on the Applications, Models, Limitations, and Future Directions of Generative Adversarial Networks","authors":"Sunawar khan, Tehseen Mazhar, Tariq Shahzad, Muhammad Amir Khan, Wasim Ahmad, Afsha Bibi, Habib Hamam","doi":"10.1049/cdt2/5384331","DOIUrl":"10.1049/cdt2/5384331","url":null,"abstract":"<p>Generative adversarial networks (GANs), a subset of deep learning, have demonstrated breakthrough performance in domains such as computer vision (CV) and natural language processing (NLP), particularly in surveillance, autonomous driving, and automated programing assistance. Based on game theory principles, GANs utilize a generator–discriminator architecture to produce high-quality synthetic data. This study conducts a systematic literature review (SLR) to comprehensively assess the development, applications, limitations, and security-related advancements of GANs. It examines foundational models and key architectural variants, providing a critical evaluation of their roles in NLP and CV. This research explores the integration of GANs into the domain of security, highlighting their applications in information security, cybersecurity, and artificial intelligence (AI)-driven defense mechanisms. The study also discusses prominent evaluation metrics such as inception score (IS), Fréchet inception distance (FID), structural similarity index measure (SSIM), and peak signal-to-noise ratio (PSNR) to assess GAN performance. Key strengths of GANs, including their ability to generate high-resolution data and support domain adaptation, are emphasized as driving factors for their continued evolution and adoption.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"2025 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2025-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2/5384331","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145572159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Securing reusable hardware intellectual property (IP) cores used in system-on-chip (SoC) designs is crucial, because the global design supply chain may introduce different points of security vulnerability. One major threat is an untrustworthy entity in the SoC design house attempting piracy or falsely claiming ownership of the IP design. Further, owing to the importance of handling transient faults in hardware IP designs, designing fault-detectable IPs has become standard practice in the community. However, fault-detectable IP designs are just as prone to hardware threats such as IP piracy and false claims of IP ownership, so a robust countermeasure against such threats is essential. This paper presents a detective countermeasure based on a novel hardware watermarking methodology for transient fault-detectable IP designs. The proposed methodology introduces a multivariate encoded high-level synthesis (HLS) scheduling-based multimodal security framework, capable of embedding a robust, unique, and nonreplicable watermark in the HLS register allocation phase of a fault-detectable IP design. The proposed watermarking technique is more robust than prior watermarking approaches in terms of reduced probability of coincidence (PC; up to ~10^−8), stronger tamper tolerance (TT; up to ~10^130), and lower watermark decoding probability, at 0% design cost overhead.
{"title":"Watermarking of Transient Fault-Detectable IP Designs Using Multivariate HLS Scheduling Based Multimodal Security","authors":"Anirban Sengupta, Vishal Chourasia, Nabendu Bhui, Aditya Anshul","doi":"10.1049/cdt2/5926846","DOIUrl":"https://doi.org/10.1049/cdt2/5926846","url":null,"abstract":"<p>Securing reusable hardware intellectual property (IP) cores used in system-on-chip (SoC) designs is crucial, due to global design supply chain that may introduce different points of security vulnerability. One of the major threats includes an untrustworthy entity in the SoC design house attempting piracy or falsely claiming ownership of the IP design. Further, owing to the importance of handling transient fault in hardware IP designs, design of fault-detectable IP designs has become a standard practice in the community. However, these fault-detectable IP designs are also similarly prone to hardware threats such as IP piracy and false claim of IP ownership. Therefore, robust sturdy countermeasure for fault-detectable IP designs against such threats is essential. This paper presents a detective countermeasure using proposed novel hardware watermarking methodology for transient fault-detectable IP designs. The proposed IP watermarking methodology introduces a novel multivariate encoded high-level synthesis (HLS) scheduling based multimodal security framework. The proposed approach is capable of embedding a robust, unique, and nonreplicable watermark in the HLS register allocation phase of fault-detectable IP design. The proposed watermarking technique is more robust than the prior watermarking approaches in terms of reduced probability of coincidence (PC; upto ~10<sup>−8</sup>), stronger tamper tolerance (TT; upto ~10<sup>130</sup>), and lower watermark decoding probability at 0% design cost overhead.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"2025 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2025-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2/5926846","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145521816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study presents a temperature-induced random noise correction method for complementary metal oxide semiconductor (CMOS) spatial cameras using an attention-mechanism-enhanced long short-term memory (LSTM) model. The model, specifically designed to address the pixel drift and random noise that temperature variations cause in CMOS spatial cameras, incorporates a multilayer LSTM network with an attention mechanism. The study comprehensively examines the temperature-induced variation in the noise characteristics of CMOS cameras across diverse thermal conditions, with in-depth analyses of both dark-field and light-field scenarios. Through detailed pixel-level analysis, it quantifies the influence of temperature on pixel values and on critical performance parameters such as internal nonuniformity within the camera. The experimental results show that, under dark-field conditions, the fitting variance between predicted and measured values ranges from 0.29585 to 5.798307. After correction under light-field conditions, the average image variance decreases to 0.29, the mean signal-to-noise ratio (SNR) increases to 80, and the mean photo response nonuniformity (PRNU) drops to 0.0161%. Compared to precorrection levels, these key metrics improve markedly: an average 83.57-fold reduction, 1.89-fold increase, and 84.98-fold decrease, respectively. These results confirm the effectiveness of the deep learning method in correcting temperature-induced noise and highlight its potential for practical engineering applications.
{"title":"A Temperature Noise Correction Method for CMOS Spatial Camera Using LSTM With Attention Mechanism","authors":"Long Cheng, Xueying Wang, Jing Xu","doi":"10.1049/cdt2/6670185","DOIUrl":"10.1049/cdt2/6670185","url":null,"abstract":"<p>This study presents an innovative temperature-induced random noise correction method for complementary metal oxide semiconductor (CMOS) spatial cameras using an attention mechanism-enhanced long short-term memory (LSTM) model. The model, specifically designed to address pixel drift and random noise issues in CMOS space cameras due to temperature variations, incorporates a multilayer LSTM network with an attention mechanism. This study comprehensively examines the temperature-induced variations in noise characteristics of CMOS cameras across diverse thermal conditions, encompassing in-depth analyses of both dark-field and light-field scenarios. Through detailed pixel-level analysis, the study quantifies the influence of temperature on pixel values and critical performance parameters such as internal nonuniformity within the camera. The experimental results show that under the dark field condition, the fitting variance between the predicted value and the measured value ranges from 0.29585 to 5.798307. After correction in light field conditions, the average variance of images decreases to 0.29, the mean signal-to-noise ratio (SNR) increases to 80, and the photo response nonuniformity (PRNU) mean drops to 0.0161%. Compared to precorrection levels, these key metrics show significant improvements, with an average 83.57-fold reduction, 1.89-fold increase, and 84.98-fold decrease, respectively. These results confirm the effectiveness of the deep learning method in correcting temperature-induced noise, highlighting the potential for practical engineering applications.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"2025 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2/6670185","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144220342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zourong Long, Gen Tan, You Wu, Hong Yang, Chao Ding
The processing of point cloud data has become a significant area of research in the modern field of perception. Classification and segmentation are critical tasks in autonomous driving, environmental perception, and digital twins. Algorithms that directly extract features from raw point cloud data have simple architectures but are constrained by computational demands and limited efficiency, which makes effective deployment on resource-limited devices challenging. This article introduces GRSNet, an ultra-lightweight algorithm. The principal innovation is a new sampling method named golden ratio sampling (GRS), which generates sampling point indices directly using the golden ratio and then locates the corresponding sampling points. This method efficiently extracts representative points from point cloud data and integrates them into deep networks. Leveraging GRS, this study combines concepts from GhostNet and self-attention mechanisms to develop a feature extraction module dubbed the SA_Ghost Block, which forms the core of GRSNet. Comparative experiments against leading algorithms on established open-source point cloud datasets demonstrate that GRSNet achieves superior performance with only 0.7 M parameters.
{"title":"GRSNet: An Ultra-Lightweight Neural Network for 3D Point Cloud Classification and Segmentation","authors":"Zourong Long, Gen Tan, You Wu, Hong Yang, Chao Ding","doi":"10.1049/cdt2/7934018","DOIUrl":"10.1049/cdt2/7934018","url":null,"abstract":"<p>The processing of point cloud data has become a significant area of research in the modern field of perception. Classification and segmentation are critical tasks in autonomous driving, environmental perception, and digital twins. Algorithms that directly extract features from raw point cloud data have simple architectures, but they are constrained by computational demands and limited efficiency. This makes effective deployment on resource-limited devices challenging. This article introduces GRSNet, an ultra-lightweight algorithm. The principal innovation is a new sampling method named golden ratio sampling (GRS), which generates sampling point indices directly using the golden ratio to subsequently locate the corresponding sampling points. This method efficiently extracts representative points from point cloud data and integrates them into deep networks. Leveraging GRS, this study combines the concepts from GhostNet and self-attention mechanisms to develop a feature extraction module dubbed the SA_Ghost Block, forming the core of GRSNet. Comparative experiments with leading algorithms on established point cloud open-source datasets demonstrate that GRSNet achieves superior performance, maintaining only 0.7 M parameters.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"2025 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2025-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2/7934018","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143939359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study presents an intelligent moving target that replicates mob attacks and other realistic events in police training, to match actual combat needs. The police intelligent moving target must run target detection algorithms on its hardware platform, but the traditional you only look once (YOLO)v8 algorithm has a large framework, which slows recognition given the platform's limited computing power. In this study, the GhostNet architecture replaces YOLOv8's backbone network for real-time target identification, improving recognition speed. For the bounding box regression problem in target detection, the scale invariant intersection over union (SIoU) loss function is used to increase prediction box overlap and identification accuracy. Finally, BiFormer applies dynamic sparse attention for more flexible computational allocation and content awareness. Compared with the original approach, the method's real-time detection speed is 4.81 frames per second (FPS) higher, mean average precision (mAP)@0.5 improves by 5.38%, mAP@0.5:0.95 improves by 4.19%, and the parameter volume is 5.81 M smaller. The approach developed in this work has broad application in real-time target identification and lightweight deployment.
{"title":"Application of Lightweight Target Detection Algorithm Based on YOLOv8 for Police Intelligent Moving Targets","authors":"Yanjie Zhang, Xiaojun Liu, Yuehan Shi, Zecong Ding, Xiaoming Zhang","doi":"10.1049/cdt2/9984821","DOIUrl":"10.1049/cdt2/9984821","url":null,"abstract":"<p>This study presents an intelligent moving target to replicate mob attacks and other realistic events in police training to match actual fighting needs. The police intelligent moving target must deploy target detection algorithms on the hardware platform, but the traditional you only look once (YOLO)v8 algorithm has a large framework, which will slow recognition due to the hardware platform’s lack of arithmetic power. In this study, GhostNet network architecture replaces YOLOv8<sup>′</sup>s backbone network for real-time target identification, improving recognition speed. The bounding box regression issue in target detection uses the scale invariant intersection over union (SIoU) loss function to increase prediction box overlapping and identification accuracy. Finally, BiFormer uses dynamic sparse attention for more flexible computational allocation and content perception. The method’s real-time detection speed is 4.81 frames per second (FPS) faster, [email protected] is 5.38% faster, mean average precision (mAP)@0.5:0.95 is 4.19% faster, and parameter volume is 5.81 M less than the original approach. The approach developed in this work has several applications in real-time target identification and lightweight deployment.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"2025 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2025-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2/9984821","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143930499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zilin Li, Jizeng Wei, Shuangsheng Li, Yaogong Yang
The branch predictor is widely used to enhance processor performance, but it is also one of the major energy-consuming components in a processor. We found that approximately 32% of instruction blocks in a decoupled frontend contain no branch instructions, while 30.8% contain only conditional branches. However, because the types of the instructions within a block cannot be determined at prediction time, branch prediction must be executed every cycle. In this work, we propose the next block type (NBT) predictor and the no branch sequence table (NST) for predicting instruction block types. These mechanisms occupy minimal space and are straightforward to implement. For a four-way out-of-order processor, the NBT and NST reduce the branch predictor's energy consumption by 52.36% and the processor's energy consumption by 4.1% without sacrificing instructions per cycle (IPC) or branch prediction accuracy.
{"title":"Energy-Efficient Branch Predictor via Instruction Block Type Prediction in Decoupled Frontend","authors":"Zilin Li, Jizeng Wei, Shuangsheng Li, Yaogong Yang","doi":"10.1049/cdt2/3359419","DOIUrl":"10.1049/cdt2/3359419","url":null,"abstract":"<p>The branch predictor is widely used to enhance processor performance, but it also constitutes one of the major energy-consuming components in processors. We found that approximately 32% of instruction blocks in a decoupled frontend do not contain branch instructions, while 30.8% of instruction blocks contain only conditional branches. However, because the type of instructions within a block cannot be determined during prediction, branch prediction must be executed every cycle. In this work, we propose the next block type (NBT) and no branch sequence table (NST) for predicting instruction block types. These mechanisms occupy minimal space and are straightforward to implement. For a four-way out-of-order processor, the NBT and NST reduce the branch predictor’s energy consumption by 52.36% and processor’s energy consumption by 4.1% without sacrificing the processor’s instructions per cycle (IPC) and branch prediction accuracy.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"2025 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2/3359419","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143889091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Convolutional neural networks (CNNs) have evolved into essential components for a wide range of embedded applications due to their outstanding efficiency and performance. To efficiently deploy CNN inference models on resource-constrained edge devices, field programmable gate arrays (FPGAs) have become a viable processing solution because of their unique hardware characteristics, enabling flexibility, parallel computation and low-power consumption. In this regard, this work proposes an FPGA-based dynamically reconfigurable coarse-to-fine (C2F) inference of CNN models, aiming to increase power efficiency and flexibility. The proposed C2F approach first coarsely classifies related input images into superclasses and then selects the appropriate fine model(s) to recognise and classify the input images according to their bespoke categories. Furthermore, the proposed architecture can be reprogrammed back to the original model using partial reconfiguration (PR) when conventional full classification is required. To efficiently utilise different fine models on low-cost FPGAs with area minimisation, ZyCAP-based PR is adopted. Results show that our approach significantly improves the classification process when object identification of only one coarse category of interest is needed, reducing energy consumption and inference time by up to 27.2% and 13.2%, respectively, which can greatly benefit resource-constrained applications.
{"title":"A Reconfigurable Coarse-to-Fine Approach for the Execution of CNN Inference Models in Low-Power Edge Devices","authors":"Auangkun Rangsikunpum, Sam Amiri, Luciano Ost","doi":"10.1049/cdt2/6214436","DOIUrl":"10.1049/cdt2/6214436","url":null,"abstract":"<p>Convolutional neural networks (CNNs) have evolved into essential components for a wide range of embedded applications due to their outstanding efficiency and performance. To efficiently deploy CNN inference models on resource-constrained edge devices, field programmable gate arrays (FPGAs) have become a viable processing solution because of their unique hardware characteristics, enabling flexibility, parallel computation and low-power consumption. In this regard, this work proposes an FPGA-based dynamic reconfigurable coarse-to-fine (C2F) inference of CNN models, aiming to increase power efficiency and flexibility. The proposed C2F approach first coarsely classifies related input images into superclasses and then selects the appropriate fine model(s) to recognise and classify the input images according to their bespoke categories. Furthermore, the proposed architecture can be reprogrammed to the original model using partial reconfiguration (PR) in case the typical classification is required. To efficiently utilise different fine models on low-cost FPGAs with area minimisation, ZyCAP-based PR is adopted. Results show that our approach significantly improves the classification process when object identification of only one coarse category of interest is needed. This approach can reduce energy consumption and inference time by up to 27.2% and 13.2%, respectively, which can greatly benefit resource-constrained applications.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"2024 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2/6214436","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142861745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the Internet era, the e-commerce industry has risen and continues to expand, and cross-border e-commerce (CBEC) has emerged and entered a stage of sustained development. The rapid development of CBEC depends on strong logistics support; the two are inseparable, and the scale of CBEC keeps growing. The existing e-commerce logistics (ECL) model is gradually failing to meet the increasingly diverse needs of users, so new logistics models need to be actively explored. To address this, this paper analyzes the CBEC logistics model, applies embedded technology to ECL, and builds a logistics tracking system. Combined with the ant colony algorithm, it carries out experimental research on the logistics package distribution route planning problem. In the experiments, the average delivery time was 25.95 hr with the proposed algorithm versus 32.53 hr with the traditional algorithm; the average distribution freight cost was 163.3 yuan versus 257.7 yuan; and the average distribution cost was 131.53 yuan versus 211.68 yuan. In summary, the algorithm effectively optimizes the distribution routes of logistics packages and improves the efficiency of package transportation.
{"title":"E-Commerce Logistics Software Package Tracking and Route Planning and Optimization System of Embedded Technology Based on the Intelligent Era","authors":"Dan Zhang, Zhiyang Jia","doi":"10.1049/2024/6687853","DOIUrl":"10.1049/2024/6687853","url":null,"abstract":"<p>In the Internet era, the e-commerce industry has risen, its development scale continues to expand, cross-border e-commerce (CBEC) has also been born, and it is now in the stage of sustainable development. The rapid development of CBEC also needs the strong support of logistics, the two are inseparable, and today, the development scale of CBEC is constantly expanding. The existing e-commerce logistics (ECL) model is also gradually unable to meet the increasingly diverse needs of users, and new logistics models need to be actively explored. To change this situation, this paper carried out a specific analysis of CBEC logistics model, and applied embedded technology to ECL, which also built a logistics tracking system. At the same time, combined with the ant colony algorithm, the paper carried out experimental research on the logistics package distribution route planning problem. From the experimental results, in terms of average delivery time, the algorithm’s result was 25.95 hr, while the traditional algorithm was 32.53 hr; in terms of average distribution freight cost, the algorithm’s result was 163.3 yuan, while the traditional algorithm was 257.7 yuan; in terms of average distribution cost, this algorithm’s result was 131.53 yuan, while the traditional algorithm was 211.68 yuan. To sum up, this algorithm could effectively optimize the distribution route of logistics packages and improve the efficiency of package transportation.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"2024 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/6687853","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142429696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yingzhao Shao, Jincheng Shang, Yunsong Li, Yueli Ding, Mingming Zhang, Ke Ren, Yang Liu
Convolutional neural networks (CNNs) have been widely used in satellite remote sensing. However, satellites in orbit, with limited resources and power budgets, cannot meet the storage and computing requirements of current million-parameter-scale artificial intelligence models. This paper proposes a new generation of highly flexible, intelligent CNN hardware accelerator for satellite remote sensing, making its computing carrier more lightweight and efficient. A data quantization scheme for INT16 or INT8, based on the idea of dynamic fixed-point numbers, is designed and applied to different scenarios. The systolic array's operation is divided into channel blocks, and the calculation method is optimized to increase the utilization of on-chip computing resources and enhance calculation efficiency. An RTL-level CNN field programmable gate array accelerator with microinstruction-sequence-scheduled data flow is then designed. The hardware framework is built on the Xilinx VC709. The results show that, under INT16 or INT8 precision, the system achieves remarkable throughput in most convolutional layers of the network, with an average performance of 153.14 or 301.52 giga operations per second (GOPS), respectively, close to the system's peak performance and taking full advantage of the platform's parallel computing capabilities.
{"title":"A Configurable Accelerator for CNN-Based Remote Sensing Object Detection on FPGAs","authors":"Yingzhao Shao, Jincheng Shang, Yunsong Li, Yueli Ding, Mingming Zhang, Ke Ren, Yang Liu","doi":"10.1049/2024/4415342","DOIUrl":"10.1049/2024/4415342","url":null,"abstract":"<p>Convolutional neural networks (CNNs) have been widely used in satellite remote sensing. However, satellites in orbit with limited resources and power consumption cannot meet the storage and computing power requirements of current million-scale artificial intelligence models. This paper proposes a new generation of high flexibility and intelligent CNNs hardware accelerator for satellite remote sensing in order to make its computing carrier more lightweight and efficient. A data quantization scheme for INT16 or INT8 is designed based on the idea of dynamic fixed point numbers and is applied to different scenarios. The operation mode of the systolic array is divided into channel blocks, and the calculation method is optimized to increase the utilization of on-chip computing resources and enhance the calculation efficiency. An RTL-level CNNs field programable gate arrays accelerator with microinstruction sequence scheduling data flow is then designed. The hardware framework is built upon the Xilinx VC709. The results show that, under INT16 or INT8 precision, the system achieves remarkable throughput in most convolutional layers of the network, with an average performance of 153.14 giga operations per second (GOPS) or 301.52 GOPS, which is close to the system’s peak performance, taking full advantage of the platform’s parallel computing capabilities.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"2024 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/4415342","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141435679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}