
IET Computers and Digital Techniques: Latest Articles

Hybrid multi-level hardware Trojan detection platform for gate-level netlists based on XGBoost
IF 1.2; Zone 4 (Computer Science); Q4 (Computer Science, Hardware & Architecture); Pub Date: 2022-02-16; DOI: 10.1049/cdt2.12040
Ying Zhang, Sen Li, Xin Chen, Jiaqi Yao, Zhiming Mao, Jizhong Yang, Yifeng Hua

To address the problem of malicious third-party vendors implanting hardware Trojans (HTs) at the circuit design stage, this paper proposes a hybrid-mode gate-level HT detection platform based on the XGBoost algorithm. The platform consists of two components: multi-level HT localization and modular, circuit-structure-based HT detection. In multi-level HT localization, each wire of the circuit is treated as a node, and the static characteristics of the nodes are analysed and combined with dynamic detection to locate the HT. In modular HT structure detection, network-structure features of the circuit are extracted to identify HTs accurately and rapidly. The hybrid-mode platform can therefore serve different detection requirements, such as HT localization or rapid and accurate HT detection. Experimental results on the Trust-Hub benchmarks show that multi-level localization achieves 94.0% localization accuracy and that modular HT structure detection achieves 100% detection accuracy. Feature extraction for modular HT structure detection is about four times as fast as for multi-level HT localization. The two methods can thus be applied separately or together for specific HT detection tasks, which indicates that the proposed hybrid-mode gate-level HT detection scheme is practical and effective.

IET Computers and Digital Techniques, vol. 16, no. 2-3, pp. 54-70. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12040
Citations: 4
Enhanced overloaded code division multiple access for network on chip
IF 1.2; Zone 4 (Computer Science); Q4 (Computer Science, Hardware & Architecture); Pub Date: 2021-12-07; DOI: 10.1049/cdt2.12039
Behnam Vakili, Morteza Gholipour

Code-division multiple access (CDMA) is commonly used as the network infrastructure in multi-core chips. Its advantages include simultaneous connection of all network components and constant delay. One drawback is that the number of transmitters is limited to the number of encoding bits. In this study, the authors combine Walsh codes with their inverses and simultaneously apply time-division multiple access (TDMA) to increase the transmission capacity of the protocol to more than four times that of the standard mode. In the proposed design, the circuit area does not increase significantly, yet the throughput of the CDMA network increases fourfold. The proposed method also makes further capacity increases possible.
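
The limitation mentioned in this abstract (one transmitter per encoding bit) can be illustrated with a minimal sketch of standard Walsh-code CDMA. This is not the authors' overloaded scheme: the function names, the 4-chip code length, and the example bits are assumptions chosen for illustration only.

```python
# Illustrative sketch of plain Walsh-code CDMA (not the paper's overloaded design).
# All names and the 4-chip code length are assumptions.

def walsh_codes(n):
    """Build n orthogonal Walsh codes of +1/-1 chips (n must be a power of two)."""
    codes = [[1]]
    while len(codes) < n:
        codes = ([row + row for row in codes] +
                 [row + [-chip for chip in row] for row in codes])
    return codes

def encode(bits, codes):
    """Spread one bit per sender (0 -> -1, 1 -> +1) and sum the chips on the shared link."""
    channel = [0] * len(codes[0])
    for bit, code in zip(bits, codes):
        symbol = 1 if bit else -1
        for i, chip in enumerate(code):
            channel[i] += symbol * chip
    return channel

def decode(channel, code):
    """Correlate the summed chips with one sender's code to recover that sender's bit."""
    correlation = sum(s * c for s, c in zip(channel, code))
    return 1 if correlation > 0 else 0

codes = walsh_codes(4)            # 4 chips per bit -> at most 4 simultaneous senders
sent = [1, 0, 0, 1]               # one bit from each of the 4 senders
channel = encode(sent, codes)     # all senders transmit at the same time
received = [decode(channel, code) for code in codes]
print(received)
```

Because the four codes are mutually orthogonal, each receiver recovers its own sender's bit from the summed signal; adding a fifth sender would require a longer code, which is the capacity limit the abstract refers to.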

IET Computers and Digital Techniques, vol. 16, no. 2-3, pp. 45-53. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12039
Citations: 1
Online multi-object tracking based on time and frequency domain features
IF 1.2; Zone 4 (Computer Science); Q4 (Computer Science, Hardware & Architecture); Pub Date: 2021-12-01; DOI: 10.1049/cdt2.12037
Mahbubeh Nazarloo, Meisam Yadollahzadeh-Tabari, Homayun Motameni

Multi-object tracking (MOT) is an interesting field in computer vision research, with applications in video motion analysis, smart interfaces, and visual surveillance. It is challenging because the number of objects varies and the objects interact with one another. This work presents a new method for online MOT based on time- and frequency-domain features. The features are obtained from the wavelet transform and the fractal dimension. A modified cuckoo optimization algorithm, which offers fast convergence and the ability to find global optima, is used for feature selection. The features are then passed to learning vector quantization (LVQ), a supervised artificial neural network (ANN), which classifies the dataset. To evaluate the technique, simulations are performed on the ETH Mobile Platform and VS-PETS 2009 datasets. The results show that the technique is more accurate than earlier studies: the mostly tracked values for the two datasets are 74.3% and 97.2%, which are at least 4.2% and 2.5% better than the other methods, respectively.
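
The classifier named in this abstract, learning vector quantization, can be sketched with the textbook LVQ1 update rule. This is only a sketch under stated assumptions: the two-dimensional feature vectors are toy values rather than wavelet or fractal features, and the function names, learning rate, and epoch count are hypothetical, not taken from the paper.

```python
# Minimal LVQ1 sketch (textbook rule; toy data, hypothetical names and parameters).

def nearest(prototypes, x):
    """Return the index of the prototype closest to x (squared Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda i: sum((p - v) ** 2 for p, v in zip(prototypes[i][0], x)))

def train_lvq1(prototypes, samples, rate=0.1, epochs=20):
    """Pull the winning prototype toward a same-class sample, push it away otherwise."""
    for _ in range(epochs):
        for x, label in samples:
            i = nearest(prototypes, x)
            vector, proto_label = prototypes[i]
            sign = 1.0 if proto_label == label else -1.0
            moved = [p + sign * rate * (v - p) for p, v in zip(vector, x)]
            prototypes[i] = (moved, proto_label)
    return prototypes

def predict(prototypes, x):
    """Classify x with the label of its nearest prototype."""
    return prototypes[nearest(prototypes, x)][1]

# Toy feature vectors for two classes (assumed values for illustration).
samples = [([0.0, 0.1], 0), ([0.1, 0.0], 0), ([0.9, 1.0], 1), ([1.0, 0.9], 1)]
prototypes = [([0.4, 0.4], 0), ([0.6, 0.6], 1)]
train_lvq1(prototypes, samples)
print(predict(prototypes, [0.05, 0.05]), predict(prototypes, [0.95, 0.95]))
```

In a tracking pipeline the inputs would be the selected feature vectors per detection; here the point is only the prototype update that makes LVQ a supervised method.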

IET Computers and Digital Techniques, vol. 16, no. 1, pp. 19-28. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12037
Citations: 0
Sparse convolutional neural network acceleration with lossless input feature map compression for resource-constrained systems
IF 1.2; Zone 4 (Computer Science); Q4 (Computer Science, Hardware & Architecture); Pub Date: 2021-11-29; DOI: 10.1049/cdt2.12038
Jisu Kwon, Joonho Kong, Arslan Munir

Many recent research efforts have exploited data sparsity for the acceleration of convolutional neural network (CNN) inferences. However, the effects of data transfer between main memory and the CNN accelerator have been largely overlooked. In this work, the authors propose a CNN acceleration technique that leverages hardware/software co-design and exploits the sparsity in input feature maps (IFMs). On the software side, the authors' technique employs a novel lossless compression scheme for IFMs, which are sent to the hardware accelerator via direct memory access. On the hardware side, the authors' technique uses a CNN inference accelerator that performs convolutional layer operations with their compressed data format. With several design optimization techniques, the authors have implemented their technique in a field-programmable gate array (FPGA) system-on-chip platform and evaluated their technique for six different convolutional layers in SqueezeNet. Results reveal that the authors' technique improves the performance by 1.1×–22.6× while reducing energy consumption by 47.7%–97.4% as compared to the CPU-based execution. Furthermore, results indicate that the IFM size and transfer latency are reduced by 34.0%–85.2% and 4.4%–75.7%, respectively, compared to the case without data compression. In addition, the authors' hardware accelerator shows better performance per hardware resource with less than or comparable power consumption to the state-of-the-art FPGA-based designs.
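
The abstract does not describe the authors' compressed data format, so the following is only a generic illustration of how sparsity in an input feature map can be exploited losslessly: a zero bitmap plus the packed nonzero values. The function names, the 8-bit value width, and the sample data are assumptions.

```python
# Generic zero-bitmap compression sketch (NOT the paper's format; names are hypothetical).

def compress(values):
    """Return (bitmap, nonzeros): one flag per element plus the nonzero values in order."""
    bitmap = [1 if v != 0 else 0 for v in values]
    nonzeros = [v for v in values if v != 0]
    return bitmap, nonzeros

def decompress(bitmap, nonzeros):
    """Rebuild the original sequence by re-inserting zeros where the flag is 0."""
    remaining = iter(nonzeros)
    return [next(remaining) if flag else 0 for flag in bitmap]

def compressed_bits(bitmap, nonzeros, value_bits=8):
    """Size in bits: one bit per flag plus value_bits per stored nonzero value."""
    return len(bitmap) + value_bits * len(nonzeros)

# A flattened 4x4 feature map with 3 nonzero activations (assumed example data).
ifm = [0, 0, 3, 0, 0, 0, 7, 0, 0, 0, 0, 1, 0, 0, 0, 0]
bitmap, nonzeros = compress(ifm)
restored = decompress(bitmap, nonzeros)
raw_bits = 8 * len(ifm)
packed_bits = compressed_bits(bitmap, nonzeros)
print(restored == ifm, raw_bits, packed_bits)
```

With 3 of 16 values nonzero, the packed form needs 40 bits instead of 128, and less data to move over direct memory access is the kind of saving the transfer-latency figures in the abstract refer to.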

IET Computers and Digital Techniques, vol. 16, no. 1, pp. 29-43. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12038
Citations: 5
An embedded intelligence engine for driver drowsiness detection
IF 1.2; Zone 4 (Computer Science); Q4 (Computer Science, Hardware & Architecture); Pub Date: 2021-11-25; DOI: 10.1049/cdt2.12036
Shirisha Vadlamudi, Ali Ahmadinia

Motor vehicle crashes involving drowsy driving are numerous worldwide. Many studies have found that 10%–30% of crashes are due to drowsy driving. Fatigue has costly effects on safety, health, and quality of life. Driver drowsiness can be detected using various methods, for example, algorithms based on behavioural gestures, physiological signals, and vital signs. A few methods are vehicle based, detecting drowsiness from steering wheel movement and lane change patterns; a pattern is derived from slow drifting and fast corrective steering movement. This work develops a prototype that detects the drowsiness of an automobile driver using artificial intelligence techniques, specifically open-source tools such as TensorFlow Lite on a Raspberry Pi development board. The TensorFlow model is trained on images captured from video, with object detection performed by a cascade classifier. To improve accuracy, an Inception v3 architecture is used to pre-train the model on the image dataset. The final model is created and trained using long short-term memory (LSTM); it is then converted to a TensorFlow Lite model, which runs on the Raspberry Pi board to detect driver drowsiness. The results are comparable with desktop-based results in the literature.

IET Computers and Digital Techniques, vol. 16, no. 1, pp. 10-18. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12036
Citations: 2
Who is wearing me? TinyDL-based user recognition in constrained personal devices
IF 1.2; Zone 4 (Computer Science); Q4 (Computer Science, Hardware & Architecture); Pub Date: 2021-10-21; DOI: 10.1049/cdt2.12035
Ramon Sanchez-Iborra, Antonio Skarmeta

Deep learning (DL) techniques have been extensively studied to improve their precision and scalability in a vast range of applications. Recently, a new milestone has been reached driven by the emergence of the TinyDL paradigm, which enables adaptation of complex DL models generated by well-known libraries to the restrictions of constrained microcontroller-based devices. In this work, a comprehensive discussion is provided regarding this novel ecosystem, by identifying the benefits that it will bring to the wearable industry and analysing different TinyDL initiatives promoted by tech giants. The specific use case of automatic user recognition from data captured by a wearable device is also presented. The whole development process by which different DL configurations have been embedded in a real microcontroller unit is described. The attained results in terms of accuracy and resource usage confirm the validity of the proposal, which allows precise predictions in a highly constrained platform with limited input information. Therefore, this work provides insights into the viability of the integration of TinyDL models within wearables, which may be valuable for researchers, practitioners, and makers related to this industry.

IET Computers and Digital Techniques, vol. 16, no. 1, pp. 1-9. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12035
Citations: 1
Accelerating the SM3 hash algorithm with CPU-FPGA Co-Designed architecture
IF 1.2; Zone 4 (Computer Science); Q4 (Computer Science, Hardware & Architecture); Pub Date: 2021-09-16; DOI: 10.1049/cdt2.12034
Xiaoying Huang, Zhichuan Guo, Mangu Song, Xuewen Zeng

The SM3 hash algorithm, developed by the Chinese Government, is used in various fields of information security and is widely used in commercial security products. However, software implementations are not fast enough for high-speed applications. This study proposes a CPU-FPGA co-designed architecture that offloads the SM3 function to a field-programmable gate array (FPGA) to achieve high throughput. The architecture can execute the SM3 hash algorithm on 16 or more concurrent streams, so multiple data streams can be processed in parallel. The design is implemented on a Xilinx XCKU115-flva1517-2-e device and a Dell commercial server, and its throughput reaches up to 35.5 Gbps when 16 individual SM3 modules run in parallel. The proposed architecture delivers excellent performance in the CPU-FPGA-coupled environment.
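
For readers unfamiliar with SM3, the following is a software-only reference sketch of the hash, followed by 16 independent streams hashed through a thread pool. It is not the authors' FPGA design and gives no hardware speed-up (Python threads here only illustrate that the streams are independent); the helper names and the stream contents are assumptions.

```python
# Software reference sketch of SM3 with 16 independent streams.
# Not the paper's CPU-FPGA architecture; helper names and stream data are hypothetical.
import struct
from concurrent.futures import ThreadPoolExecutor

IV = [0x7380166F, 0x4914B2B9, 0x172442D7, 0xDA8A0600,
      0xA96F30BC, 0x163138AA, 0xE38DEE4D, 0xB0FB0E4E]
MASK = 0xFFFFFFFF

def rotl(x, n):
    """Rotate a 32-bit word left by n bits (n is taken modulo 32)."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK

def p0(x):
    return x ^ rotl(x, 9) ^ rotl(x, 17)

def p1(x):
    return x ^ rotl(x, 15) ^ rotl(x, 23)

def compress_block(v, block):
    """Apply the SM3 compression function to one 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for j in range(16, 68):  # message expansion to 68 words
        w.append(p1(w[j - 16] ^ w[j - 9] ^ rotl(w[j - 3], 15))
                 ^ rotl(w[j - 13], 7) ^ w[j - 6])
    a, b, c, d, e, f, g, h = v
    for j in range(64):
        t = 0x79CC4519 if j < 16 else 0x7A879D8A
        ss1 = rotl((rotl(a, 12) + e + rotl(t, j)) & MASK, 7)
        ss2 = ss1 ^ rotl(a, 12)
        if j < 16:
            ff, gg = a ^ b ^ c, e ^ f ^ g
        else:
            ff, gg = (a & b) | (a & c) | (b & c), (e & f) | (~e & g)
        tt1 = (ff + d + ss2 + (w[j] ^ w[j + 4])) & MASK
        tt2 = (gg + h + ss1 + w[j]) & MASK
        a, b, c, d = tt1, a, rotl(b, 9), c
        e, f, g, h = p0(tt2), e, rotl(f, 19), g
    return [x ^ y for x, y in zip(v, [a, b, c, d, e, f, g, h])]

def sm3(message):
    """Return the SM3 digest of a bytes object as a hex string."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", len(message) * 8)
    v = IV
    for i in range(0, len(padded), 64):
        v = compress_block(v, padded[i:i + 64])
    return "".join(f"{word:08x}" for word in v)

# 16 independent streams, mirroring the 16 concurrent streams in the abstract.
streams = [f"stream-{i}".encode() for i in range(16)]
with ThreadPoolExecutor(max_workers=16) as pool:
    digests = list(pool.map(sm3, streams))
print(sm3(b"abc"))
```

Each stream's digest depends only on its own data, which is why the work can be spread across parallel hardware modules; within a single stream, blocks must still be processed in order.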

IET Computers and Digital Techniques, vol. 15, no. 6, pp. 427-436. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12034
Citations: 2
EmRep: Energy management relying on state-of-charge extrema prediction
IF 1.2; Zone 4 (Computer Science); Q4 (Computer Science, Hardware & Architecture); Pub Date: 2021-08-17; DOI: 10.1049/cdt2.12033
Lars Hanschke, Christian Renner

The persistent rise of energy-harvesting wireless sensor networks places increasing demands on the efficiency and configurability of energy management. New applications often profit from, or even require, user-defined time-varying utilities; for example, the health assessment of bridges is only possible at rush hour. However, monitoring times do not necessarily overlap with energy harvest periods. This misalignment is often corrected by over-provisioning the energy storage. Small-footprint, cheap energy storage is favourable but fills up quickly and wastes surplus energy. Hence, EmRep is presented, which decouples the energy management of high-intake harvest periods from that of low-intake periods. Based on state-of-charge extrema prediction, the authors enhance energy management and reduce saturation of the energy storage by design. Considering multiple user-defined utility profiles, the benefits of EmRep are showcased in combination with a variety of prediction algorithms, time resolutions, and energy storage sizes. EmRep is tailored to platforms with small energy storage, where it doubles effective utility; it also increases performance by 10% with large-sized storage.
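
To make the terms in this abstract concrete, the sketch below simulates a state-of-charge trajectory over a predicted harvest horizon and reports its extrema and the energy lost to saturation. It is not the EmRep algorithm: the function name, the constant load, and all numbers are assumptions for illustration.

```python
# Toy state-of-charge simulation (NOT EmRep itself; names and numbers are hypothetical).

def soc_trajectory(start, harvest, load, capacity):
    """Simulate the state of charge per time slot, clipping at empty and at full."""
    soc, trace, wasted = start, [], 0.0
    for harvested in harvest:
        soc += harvested - load
        if soc > capacity:           # storage saturates: surplus energy is lost
            wasted += soc - capacity
            soc = capacity
        soc = max(soc, 0.0)          # storage cannot go below empty
        trace.append(soc)
    return trace, wasted

# Predicted harvest per slot: a short high-intake period between low-intake periods.
harvest = [0.0, 0.0, 4.0, 4.0, 0.0, 0.0]
trace, wasted = soc_trajectory(start=2.0, harvest=harvest, load=1.0, capacity=5.0)
soc_min, soc_max = min(trace), max(trace)
print(trace, soc_min, soc_max, wasted)
```

In this toy run the predicted maximum reaches the capacity and one unit of energy is wasted, while the predicted minimum reaches empty; an energy manager that knows both extrema in advance could spend more before the peak and less before the trough.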

IET Computers and Digital Techniques, vol. 16, no. 4, pp. 91-105. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12033
Citations: 1
Introducing KeyRing self-timed microarchitecture and timing-driven design flow
IF 1.2; Zone 4 (Computer Science); Q4 (Computer Science, Hardware & Architecture); Pub Date: 2021-06-05; DOI: 10.1049/cdt2.12032
Mickael Fiorentino, Claude Thibeault, Yvon Savaria

A self-timed microarchitecture called KeyRing is presented, and a method for implementing KeyRing circuits compatible with a timing-driven electronic design automation (EDA) flow is discussed. The KeyRing microarchitecture is derived from the AnARM, a low-power self-timed ARM processor based on ad hoc design principles. First, the unorthodox design style and circuit structures are revisited. A theoretical model that can support the design of generic circuits and the elaboration of EDA methods is then presented. Also addressed are the compatibility issues between KeyRing circuits and timing-driven EDA flows. The proposed method leverages relative timing constraints to translate the timing relations in a KeyRing circuit into a set of timing constraints that enable timing-driven synthesis and static timing analysis. Finally, two 32-bit RISC-V processors are presented; called KeyV and based on KeyRing microarchitectures, they are synthesized in a 65 nm technology using the proposed EDA flow. Postsynthesis results demonstrate the effectiveness of the design methodology and allow comparisons with a synchronous alternative called SynV. Performance and power consumption evaluations show that KeyV has a power efficiency that lies between SynV with clock-gating and SynV without clock-gating.

IET Computers and Digital Techniques, vol. 15, no. 6, pp. 409-426. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12032
Citations: 1
Evaluation of the Soft Error Assessment Consistency of a JIT-based Virtual Platform Simulator
IF 1.2 Zone 4 Computer Science Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2021-05-26 DOI: 10.1049/cdt2.12030
IET Computers and Digital Techniques, vol. 15, no. 5, p. 393. DOI: https://doi.org/10.1049/cdt2.12030. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12030
Citations: 0