
ACM Transactions on Reconfigurable Technology and Systems: Latest Publications

Design, Calibration, and Evaluation of Real-Time Waveform Matching on an FPGA-based Digitizer at 10 GS/s
IF 2.3 · CAS Tier 4 (Computer Science) · Q1 Computer Science · Pub Date: 2023-12-05 · DOI: 10.1145/3635719
Jens Trautmann, Paul Krüger, Andreas Becher, Stefan Wildermann, Jürgen Teich

Digitizing side-channel signals at high sampling rates produces huge amounts of data, while side-channel analysis techniques only need the specific trace segments containing Cryptographic Operations (COs). For detecting these segments, waveform-matching techniques have been established that compare the signal with a template of the CO’s characteristic pattern. Real-time waveform matching requires both the high parallelism achievable through hardware design and the reconfigurability provided by FPGAs to adapt the matching hardware to a specific CO pattern. However, currently proposed designs process the samples from analog-to-digital converters sequentially and can only handle low sampling rates due to the limited clock speed of FPGAs.

In this paper, we present a parallel waveform-matching architecture capable of performing high-speed waveform matching on a high-end FPGA-based digitizer. We also present a workflow for calibrating the waveform-matching system to the specific pattern of the CO under the restrictions imposed by the FPGA hardware. Our implementation enables waveform matching at 10 GS/s, offering a speedup of 50x compared to the fastest state-of-the-art implementation known to us. We demonstrate how to apply the technique to attack the widespread XTS-AES algorithm, using waveform matching to recover the encrypted tweak even in the presence of so-called systemic noise.
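To make the matching operation concrete, here is a minimal software sketch of the underlying idea: slide a template of the CO's characteristic pattern across the sampled trace and flag offsets where a distance metric falls below a threshold. The sum-of-absolute-differences metric, function names, and toy signal are illustrative assumptions, not the authors' hardware design, which parallelizes such comparisons across many sample offsets per clock cycle.

```python
# Illustrative sketch of sliding-window waveform matching; NOT the paper's
# architecture. The SAD metric and threshold are assumptions for the demo.
import numpy as np

def match_waveform(trace: np.ndarray, template: np.ndarray, threshold: float):
    """Return sample offsets where the trace locally resembles the template."""
    n, m = len(trace), len(template)
    hits = []
    for i in range(n - m + 1):
        # Sum of absolute differences between the window and the template;
        # a hardware design evaluates many such offsets in parallel.
        sad = np.sum(np.abs(trace[i:i + m] - template))
        if sad < threshold:
            hits.append(i)
    return hits

# Toy usage: a noisy trace with one embedded occurrence of the pattern.
rng = np.random.default_rng(0)
template = np.sin(np.linspace(0, 4 * np.pi, 64))
trace = rng.normal(0, 0.2, 1024)
trace[300:364] += template
print(match_waveform(trace, template, threshold=12.0))
```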

Citations: 0
Introduction to the Special Section on FCCM 2022
IF 2.3 · CAS Tier 4 (Computer Science) · Q1 Computer Science · Pub Date: 2023-12-05 · DOI: 10.1145/3632092
Jing Li, Martin Herbordt

No abstract available.

Citations: 0
A hardware design framework for computer vision models based on reconfigurable devices
IF 2.3 · CAS Tier 4 (Computer Science) · Q1 Computer Science · Pub Date: 2023-12-05 · DOI: 10.1145/3635157
Zimeng Fan, Wei Hu, Fang Liu, Dian Xu, Hong Guo, Yanxiang He, Min Peng

In computer vision, algorithms and the computing platforms that execute them must be developed jointly. Models and algorithms are constantly evolving, while hardware designs must adapt to new or updated algorithms. Reconfigurable devices are recognized as important platforms for computer vision applications because of their reconfigurability. There are two typical design approaches: customized design and overlay design. However, existing work is unable to achieve both efficient performance and the scalability to adapt to a wide range of models. To address both considerations, we propose a design framework based on reconfigurable devices that provides unified support for computer vision models. It offers software-programmable modules while leaving unit design space for problem-specific algorithms. Based on the proposed framework, we design a model mapping method and a hardware architecture with two processor arrays to enable dynamic and static reconfiguration, thereby relieving redesign pressure. In addition, resource consumption and efficiency can be balanced by adjusting a hyperparameter; a first-order model of this trade-off is sketched below. In experiments on CNN, vision Transformer, and vision MLP models, our work's throughput is improved by 18.8x–33.6x and 1.4x–2.0x compared to CPU and GPU, respectively. Compared to others on the same platform, accelerators based on our framework can better balance resource consumption and efficiency.
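As a hedged illustration of how a parallelism hyperparameter trades resource consumption against throughput, the sketch below enumerates a few processor-array sizes under a fixed LUT budget. The cost numbers, names, and candidate sizes are invented for illustration and are not taken from the paper.

```python
# Hypothetical first-order model of the resource/throughput trade-off
# controlled by a parallelism hyperparameter; all constants are made up.
def explore(total_macs: int, luts_per_pe: int, lut_budget: int, freq_mhz: float):
    """List feasible processor-array sizes with estimated throughput."""
    options = []
    for pes in (8, 16, 32, 64, 128, 256):
        luts = pes * luts_per_pe            # resource cost grows with the array
        if luts > lut_budget:               # prune configurations that do not fit
            continue
        cycles = -(-total_macs // pes)      # ceiling division: cycles per inference
        inf_per_s = freq_mhz * 1e6 / cycles
        options.append((pes, luts, inf_per_s))
    return options

for pes, luts, ips in explore(total_macs=2_000_000, luts_per_pe=900,
                              lut_budget=120_000, freq_mhz=200):
    print(f"{pes:4d} PEs  {luts:7d} LUTs  {ips:8.1f} inferences/s")
```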

Citations: 0
High Throughput FPGA-Based Object Detection via Algorithm-Hardware Co-Design
IF 2.3 · CAS Tier 4 (Computer Science) · Q1 Computer Science · Pub Date: 2023-12-04 · DOI: 10.1145/3634919
Anupreetham Anupreetham, Mohamed Ibrahim, Mathew Hall, Andrew Boutros, Ajay Kuzhively, Abinash Mohanty, Eriko Nurvitadhi, Vaughn Betz, Yu Cao, Jae-sun Seo

Object detection and classification is a key task in many computer vision applications such as smart surveillance and autonomous vehicles. Recent advances in deep learning have significantly improved the quality of results achieved by these systems, making them more accurate and reliable in complex environments. Modern object detection systems make use of lightweight convolutional neural networks (CNNs) for feature extraction, coupled with single-shot multi-box detectors (SSDs) that generate bounding boxes around the identified objects along with their classification confidence scores. Subsequently, a non-maximum suppression (NMS) module removes any redundant detection boxes from the final output. Typical NMS algorithms must wait for all box predictions to be generated by the SSD-based feature extractor before processing them. This sequential dependency between box predictions and NMS results in a significant latency overhead and degrades the overall system throughput, even if a high-performance CNN accelerator is used for the SSD feature extraction component. In this paper, we present a novel pipelined NMS algorithm that eliminates this sequential dependency and associated NMS latency overhead. We then use our novel NMS algorithm to implement an end-to-end fully pipelined FPGA system for low-latency SSD-MobileNet-V1 object detection. Our system, implemented on an Intel Stratix 10 FPGA, runs at 400 MHz and achieves a throughput of 2,167 frames per second with an end-to-end batch-1 latency of 2.13 ms. Our system achieves 5.3× higher throughput and 5× lower latency compared to the best prior FPGA-based solution with comparable accuracy.
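For reference, below is the standard greedy NMS formulation whose sequential dependency the paper's pipelined algorithm eliminates: all candidate boxes must be available before overlap suppression can begin. This is generic textbook code, not the authors' pipelined NMS.

```python
# Standard (sequential) greedy NMS: keep the best box, drop overlaps, repeat.
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Process boxes in descending score order, suppressing heavy overlaps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # -> [0, 2]: the near-duplicate box 1 is suppressed
```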

Citations: 0
Covert-channels in FPGA-enabled SmartSSDs
IF 2.3 · CAS Tier 4 (Computer Science) · Q1 Computer Science · Pub Date: 2023-12-04 · DOI: 10.1145/3635312
Theodoros Trochatos, Anthony Etim, Jakub Szefer

Cloud computing providers today offer access to a variety of devices, which users can rent and access remotely in a shared setting. Among these devices are SmartSSDs, which are solid-state disks (SSDs) augmented with an FPGA, enabling users to instantiate custom circuits within the FPGA, including potentially malicious circuits for power and temperature measurement. Normally, cloud users have no remote access to power and temperature data, but with SmartSSDs they could abuse the FPGA component to instantiate circuits that learn this information. Additionally, custom power-waster circuits can be instantiated within the FPGA. This paper shows for the first time that, by leveraging ring oscillator sensors and power wasters, numerous covert channels in FPGA-enabled SmartSSDs can be used to transmit information. This work presents two channels in a single-tenant setting (the SmartSSD is used by one user at a time) and two channels in a multi-tenant setting (the FPGA and SSD inside the SmartSSD are shared by different users). The presented covert channels reach close to 100% accuracy. Meanwhile, the bandwidth of the channels can easily be scaled by cloud users renting more SmartSSDs, as the bandwidth of the covert channels is proportional to the number of SmartSSDs used.
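The general principle behind such channels can be illustrated with a toy on-off-keying simulation: a sender toggles a load and a receiver thresholds noisy sensor readings to recover bits. This is purely illustrative; the paper's actual channels use ring-oscillator sensors and power wasters inside the SmartSSD's FPGA, not the simulated values below.

```python
# Toy on-off-keying covert channel: illustrative simulation only.
import random

def transmit(bits, samples_per_bit=8, noise=0.1):
    """Simulate sensor readings: higher mean while the power waster is on."""
    readings = []
    for b in bits:
        level = 1.0 if b else 0.0
        readings += [level + random.gauss(0, noise) for _ in range(samples_per_bit)]
    return readings

def receive(readings, samples_per_bit=8, threshold=0.5):
    """Average each bit period and compare against the decision threshold."""
    bits = []
    for i in range(0, len(readings), samples_per_bit):
        chunk = readings[i:i + samples_per_bit]
        bits.append(int(sum(chunk) / len(chunk) > threshold))
    return bits

sent = [1, 0, 1, 1, 0, 0, 1, 0]
print(receive(transmit(sent)) == sent)  # True with high probability
```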

Citations: 0
Across Time and Space: Senju’s Approach for Scaling Iterative Stencil Loop Accelerators on Single and Multiple FPGAs
IF 2.3 · CAS Tier 4 (Computer Science) · Q1 Computer Science · Pub Date: 2023-11-29 · DOI: 10.1145/3634920
Emanuele Del Sozzo, Davide Conficconi, Kentaro Sano

Stencil-based applications play an essential role in high-performance systems, as they occur in numerous computational areas such as partial differential equation solving. In this context, Iterative Stencil Loops (ISLs) represent a prominent and well-known algorithmic class within the stencil domain. Specifically, ISL-based calculations iteratively apply the same stencil to a multi-dimensional point grid multiple times or until convergence. However, due to their iterative and intensive nature, ISLs are highly performance-hungry and demand specialized solutions. Here, Field Programmable Gate Arrays (FPGAs) represent a valid architectural choice, as they enable the design of custom, parallel, and scalable ISL accelerators. Besides, the regular structure of ISLs makes them an ideal candidate for automatic optimization and generation flows. For these reasons, this paper introduces Senju, an automation framework for the design of highly parallel ISL accelerators targeting single- and multi-FPGA systems. Given an input description, Senju automates the entire design process and provides accurate performance estimations. The experimental evaluation shows remarkable and scalable results, outperforming single- and multi-FPGA literature approaches under different metrics. Finally, we present a new analysis of temporal and spatial parallelism trade-offs in a real-case scenario and discuss our performance through a single-FPGA and a novel specialized multi-FPGA formulation of the Roofline Model.
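A minimal software form of an Iterative Stencil Loop is shown below: a 2-D Jacobi update applied repeatedly to a point grid, which is the algorithmic pattern such accelerators parallelize across time (iterations) and space (grid partitions). This is a textbook example, not code generated by Senju.

```python
# Textbook Iterative Stencil Loop: repeated 5-point Jacobi updates on a grid.
import numpy as np

def jacobi(grid: np.ndarray, iterations: int) -> np.ndarray:
    for _ in range(iterations):              # the "time" dimension
        nxt = grid.copy()
        # 5-point stencil over the interior points (the "space" dimension)
        nxt[1:-1, 1:-1] = 0.25 * (grid[:-2, 1:-1] + grid[2:, 1:-1] +
                                  grid[1:-1, :-2] + grid[1:-1, 2:])
        grid = nxt
    return grid

g = np.zeros((8, 8))
g[0, :] = 100.0           # fixed boundary condition on one edge
print(jacobi(g, iterations=50).round(1))
```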

Citations: 0
On the Malicious Potential of Xilinx’ Internal Configuration Access Port (ICAP)
IF 2.3 · CAS Tier 4 (Computer Science) · Q1 Computer Science · Pub Date: 2023-11-17 · DOI: 10.1145/3633204
Nils Albartus, Maik Ender, Jan-Niklas Möller, Marc Fyrbiak, Christof Paar, Russell Tessier

FPGAs have become increasingly popular in computing platforms. With recent advances in bitstream format reverse engineering, the scientific community has widely explored static FPGA security threats. For example, it is now possible to convert a bitstream to a netlist, revealing design information, and to apply modifications to the static bitstream based on this knowledge. However, a systematic study of how an understanding of the bitstream format affects the security of the dynamic configuration process, particularly for Xilinx’s Internal Configuration Access Port (ICAP), is lacking. This paper fills this gap by comprehensively analyzing the security implications of ICAP interfaces, which primarily support dynamic partial reconfiguration. We delve into the Xilinx bitstream file format, identify misconceptions in the official documentation, and propose novel configuration (attack) primitives based on dynamic reconfiguration, i.e., create/read/update/delete circuits in the FPGA, without requiring pre-definition during the design phase. Our primitives are consolidated in a novel Stealthy Reconfigurable Adaptive Trojan (STRAT) framework to conceal Trojans and evade state-of-the-art netlist reverse engineering methods. As FPGAs become integral to modern cloud computing, this research presents crucial insights into potential security risks, including the possibility of a malicious tenant or provider altering or spying on another tenant’s configuration undetected.

Citations: 0
HyBNN: Quantifying and Optimizing Hardware Efficiency of Binary Neural Networks
CAS Tier 4 (Computer Science) · Q1 Computer Science · Pub Date: 2023-11-07 · DOI: 10.1145/3631610
Geng Yang, Jie Lei, Zhenman Fang, Yunsong Li, Jiaqing Zhang, Weiying Xie
Binary neural network (BNN), where both the weight and the activation values are represented with one bit, provides an attractive alternative for deploying highly efficient deep learning inference on resource-constrained edge devices. However, our investigation reveals that, to achieve satisfactory accuracy gains, state-of-the-art (SOTA) BNNs, such as FracBNN and ReActNet, usually have to incorporate various auxiliary floating-point components and increase the model size, which in turn degrades the hardware performance efficiency. In this paper, we aim to quantify such hardware inefficiency in SOTA BNNs and further mitigate it with negligible accuracy loss. First, we observe that the auxiliary floating-point (AFP) components consume an average of 93% DSPs, 46% LUTs, and 62% FFs of the entire BNN accelerator resource utilization. To mitigate such overhead, we propose a novel algorithm-hardware co-design, called FuseBNN, to fuse those AFP operators without hurting the accuracy. On average, FuseBNN reduces AFP resource utilization to 59% DSPs, 13% LUTs, and 16% FFs. Second, SOTA BNNs often use the compact MobileNetV1 as the backbone network but have to replace the lightweight 3×3 depth-wise convolution (DWC) with the 3×3 standard convolution (SC, e.g., in ReActNet and our ReActNet-adapted BaseBNN) or even more complex fractional 3×3 SC (e.g., in FracBNN) to bridge the accuracy gap. As a result, the model parameter size is significantly increased and becomes 2.25× larger than that of the 4-bit direct quantization with the original DWC (4-Bit-Net); the number of multiply-accumulate operations is also significantly increased, so that the overall LUT resource usage of BaseBNN is almost the same as that of 4-Bit-Net. To address this issue, we propose HyBNN, where we binarize depth-wise separable convolution (DSC) blocks for the first time to decrease the model size and incorporate 4-bit DSC blocks to compensate for the accuracy loss. For the ship detection task in synthetic aperture radar imagery on the AMD-Xilinx ZCU102 FPGA, HyBNN achieves a detection accuracy of 94.8% and a detection speed of 615 frames per second (FPS), which is 6.8× faster than FuseBNN+ (94.9% accuracy) and 2.7× faster than 4-Bit-Net (95.9% accuracy). For image classification on the CIFAR-10 dataset on the AMD-Xilinx Ultra96-V2 FPGA, HyBNN achieves 1.5× speedup and 0.7% better accuracy over SOTA FracBNN.
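The core primitive that makes BNNs hardware-friendly is worth spelling out: with weights and activations constrained to {-1, +1} and packed as bits, a dot product reduces to XNOR followed by a popcount. The sketch below is a generic illustration of that technique, not the FuseBNN/HyBNN implementation.

```python
# Generic XNOR-popcount binary dot product, the basic BNN building block.
def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two n-element {-1, +1} vectors packed as bit masks,
    where bit = 1 encodes +1 and bit = 0 encodes -1 (LSB-first)."""
    matches = bin(~(a_bits ^ w_bits) & ((1 << n) - 1)).count("1")  # XNOR + popcount
    return 2 * matches - n   # each match contributes +1, each mismatch -1

# Example: a = (+1, -1, +1, +1), w = (+1, +1, -1, +1)
# dot = (+1)(+1) + (-1)(+1) + (+1)(-1) + (+1)(+1) = 0
a = 0b1101  # LSB-first packing of a
w = 0b1011  # LSB-first packing of w
print(binary_dot(a, w, n=4))  # -> 0
```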
Citations: 0
Eciton: Very Low-Power Recurrent Neural Network Accelerator for Real-Time Inference at the Edge
CAS Tier 4 (Computer Science) · Q1 Computer Science · Pub Date: 2023-11-01 · DOI: 10.1145/3629979
Jeffrey Chen, Sang-Woo Jun, Sehwan Hong, Warrick He, Jinyeong Moon
This paper presents Eciton, a very low-power recurrent neural network accelerator for time-series data within low-power edge sensor nodes, achieving real-time inference with a power consumption of 17 mW under load. Eciton reduces memory and chip resource requirements via 8-bit quantization and hard sigmoid activation, allowing the accelerator as well as the recurrent neural network model parameters to fit in a low-cost, low-power Lattice iCE40 UP5K FPGA. We evaluate Eciton on multiple established time-series classification applications, including predictive maintenance of mechanical systems, sound classification, and intrusion detection for IoT nodes. Binary and multi-class classification edge models are explored, demonstrating that Eciton can adapt to a variety of deployment environments and remote use cases. Eciton demonstrates real-time processing at a very low power consumption with minimal loss of accuracy on multiple inference scenarios with differing characteristics, while achieving competitive power efficiency against the state-of-the-art of similar scale. We show that the addition of this accelerator actually reduces the power budget of the sensor node by reducing power-hungry wireless transmission. The resulting power budget of the sensor node is small enough for it to be powered by a power harvester, potentially allowing it to run indefinitely without a battery or periodic maintenance.
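The two cost-saving choices named in the abstract can be sketched in a few lines: a piecewise-linear "hard" sigmoid that replaces the transcendental one, and symmetric 8-bit quantization. The slope/offset below is one common hard-sigmoid definition (clip(0.2x + 0.5, 0, 1)); the paper may use a different variant, and the scale is an assumed example.

```python
# Sketch of hard sigmoid activation and 8-bit quantization (one common
# definition of each; not necessarily Eciton's exact parameters).
def hard_sigmoid(x: float) -> float:
    """Piecewise-linear sigmoid: no exponential, cheap in LUT-based logic."""
    return max(0.0, min(1.0, 0.2 * x + 0.5))

def quantize_8bit(x: float, scale: float) -> int:
    """Map a real value to a signed 8-bit integer with a fixed scale."""
    return max(-128, min(127, round(x / scale)))

for v in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(v, hard_sigmoid(v), quantize_8bit(hard_sigmoid(v), scale=1 / 127))
```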
Citations: 0
Designing Deep Learning Models on FPGA with Multiple Heterogeneous Engines
CAS Tier 4 (Computer Science) · Q1 Computer Science · Pub Date: 2023-10-10 · DOI: 10.1145/3615870
Miguel Reis, Mário Véstias, Horácio Neto
Deep learning models are becoming more complex and heterogeneous, with new layer types introduced to improve their accuracy. This brings a considerable challenge to the designers of accelerators for deep neural networks. There have been several architectures and design flows for mapping deep learning models onto hardware, but they are limited to particular models and/or layer types. Also, the architectures generated by these tools generally target high-performance devices that are not appropriate for embedded computing. This paper proposes a multi-engine architecture and a design flow to implement deep learning models on FPGA. The hardware design uses high-level synthesis to allow design space exploration. The architecture is scalable and therefore applicable to FPGAs of any density. The architecture and design flow were applied to the development of a hardware/software system for image classification with ResNet50, object detection with YOLOv3-Tiny, and image segmentation with Deeplabv3+. The system was tested on a low-density Zynq UltraScale+ ZU3EG FPGA to show its scalability. The results show that the proposed multi-engine architecture generates efficient accelerators: a ResNet50 accelerator with 4-bit quantization achieves 67 FPS, the YOLOv3-Tiny object detector achieves a throughput of 36 FPS, and the image segmentation application achieves 1.4 FPS.
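To convey the multi-engine idea in software terms, the sketch below dispatches each layer of a model to an engine specialized for its layer type. The engine names, layer list, and dispatch table are invented for illustration; the paper's flow generates its engines in hardware via high-level synthesis.

```python
# Hypothetical software analogue of a multi-engine accelerator: each layer
# type is served by its own specialized engine. All names are illustrative.
def conv_engine(layer):   return f"conv engine ran {layer['name']}"
def dwconv_engine(layer): return f"dwconv engine ran {layer['name']}"
def pool_engine(layer):   return f"pool engine ran {layer['name']}"

ENGINES = {"conv": conv_engine, "dwconv": dwconv_engine, "pool": pool_engine}

model = [
    {"name": "conv1", "type": "conv"},
    {"name": "dw2",   "type": "dwconv"},
    {"name": "pool3", "type": "pool"},
]

for layer in model:
    # Dispatch each layer to the heterogeneous engine that implements it.
    print(ENGINES[layer["type"]](layer))
```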
Citations: 0