
ACM Transactions on Reconfigurable Technology and Systems: Latest Articles

Topgun: An ECC Accelerator for Private Set Intersection
IF 2.3 | Zone 4, Computer Science | Q1 Computer Science | Pub Date: 2023-07-13 | DOI: https://dl.acm.org/doi/10.1145/3603114
Guiming Wu, Qianwen He, Jiali Jiang, Zhenxiang Zhang, Yuan Zhao, Yinchao Zou, Jie Zhang, Changzheng Wei, Ying Yan, Hui Zhang

Elliptic Curve Cryptography (ECC), one of the most widely used asymmetric cryptographic algorithms, has been deployed in the Transport Layer Security (TLS) protocol, blockchains, secure multiparty computation, and other settings. As one of the most secure ECC curves, Curve25519 is employed by several secure protocols, such as TLS 1.3 and the Diffie-Hellman Private Set Intersection (DH-PSI) protocol. High-performance implementations of ECC are required, especially for the DH-PSI protocol used in privacy-preserving platforms.

Point multiplication, the chief cryptographic primitive in ECC, is computationally expensive. To improve the performance of the DH-PSI protocol, we propose Topgun, a novel high-performance hardware architecture for point multiplication over Curve25519. The proposed architecture features a pipelined Finite-field Arithmetic Unit and a simple, highly efficient instruction set architecture. Compared to the best existing work on a Xilinx Zynq 7000 series FPGA, our implementation with one Processing Element (PE) achieves a 3.14× speedup on the same device. To the best of our knowledge, our implementation appears to be the fastest among the state-of-the-art works. We have also implemented our architecture, consisting of 4 Compute Groups with 16 PEs each, on an Intel Agilex AGF027 FPGA. The measured performance of 4.48 Mops/s is achieved at a power cost of 86 W, a record-setting performance for point multiplication over Curve25519 on FPGAs.
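For readers unfamiliar with the primitive being accelerated, the following is a minimal software sketch of Curve25519 point multiplication using the Montgomery ladder as specified in RFC 7748. It is not the Topgun hardware design or its instruction set, and it is not constant-time; the function names `clamp` and `x25519` are chosen here for illustration. The final lines check the commutativity property that a Diffie-Hellman style protocol such as DH-PSI relies on.

```python
# Illustrative reference sketch of X25519 (RFC 7748), NOT the Topgun design.
# Plain Python integers are used, so this is neither fast nor constant-time.

P = 2**255 - 19      # field prime of Curve25519
A24 = 121665         # (486662 - 2) / 4, the ladder constant from RFC 7748


def clamp(k: bytes) -> int:
    """Decode a 32-byte scalar and apply the RFC 7748 clamping."""
    b = bytearray(k)
    b[0] &= 248
    b[31] &= 127
    b[31] |= 64
    return int.from_bytes(b, "little")


def x25519(k: bytes, u: bytes) -> bytes:
    """Multiply the point with u-coordinate `u` by scalar `k`."""
    k_int = clamp(k)
    ub = bytearray(u)
    ub[31] &= 127                      # mask the unused top bit
    x1 = int.from_bytes(ub, "little") % P
    x2, z2, x3, z3 = 1, 0, x1, 1
    swap = 0
    for t in reversed(range(255)):
        k_t = (k_int >> t) & 1
        swap ^= k_t
        if swap:                       # conditional swap (branching: not constant-time)
            x2, x3 = x3, x2
            z2, z3 = z3, z2
        swap = k_t
        a = (x2 + z2) % P
        aa = a * a % P
        b = (x2 - z2) % P
        bb = b * b % P
        e = (aa - bb) % P
        c = (x3 + z3) % P
        d = (x3 - z3) % P
        da = d * a % P
        cb = c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3 = x3, x2
        z2, z3 = z3, z2
    # Field inversion via Fermat's little theorem.
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")


# Two parties with (hypothetical, fixed) secret scalars reach the same value.
BASE = (9).to_bytes(32, "little")
alice, bob = bytes(range(32)), bytes(range(32, 64))
shared_a = x25519(alice, x25519(bob, BASE))
shared_b = x25519(bob, x25519(alice, BASE))
print(shared_a == shared_b)
```

Each ladder step is a fixed sequence of field additions, subtractions, and multiplications, which is the kind of workload a pipelined finite-field arithmetic unit is suited to.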

Citations: 0

NAPOLY: A Non-deterministic Automata Processor OverLaY
IF 2.3 | Zone 4, Computer Science | Q1 Computer Science | Pub Date: 2023-06-22 | DOI: https://dl.acm.org/doi/10.1145/3593586
Rasha Karakchi, Jason D. Bakos

Deterministic and Non-deterministic Finite Automata (DFA and NFA) comprise the core of many big data applications. Recent efforts to develop Domain-Specific Architectures (DSAs) for DFA/NFA have taken divergent approaches, but achieving consistent throughput for arbitrarily large pattern sets, state activation rates, and pattern match rates remains a challenge. In this article, we present NAPOLY (Non-Deterministic Automata Processor OverLaY), an FPGA overlay and associated compiler. A common limitation of prior efforts is a cap on the NFA size for which the advertised throughput can be achieved. NAPOLY is optimized for fast re-programming to permit practical time-division multiplexing of the hardware and to permit high asymptotic throughput for NFAs of unlimited size, unlimited state activation rate, and high pattern reporting rate. NAPOLY also allows for offline generation of configurations that trade off state capacity against transition capacity. In this article, we (1) evaluate NAPOLY using benchmarks packaged in the ANMLZoo benchmark suite, (2) evaluate the use of a SAT solver for allocating physical resources, and (3) compare NAPOLY's performance against existing solutions. NAPOLY performs most favorably on larger benchmarks, benchmarks with higher state activation frequency, and benchmarks with higher reporting frequency. NAPOLY outperforms the fastest of the CPU and GPU implementations in 10 out of 12 benchmarks.
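The terms "state activation rate" and "pattern reporting rate" can be made concrete with a small software model of NFA execution. The sketch below is a generic set-based NFA simulator, not NAPOLY's overlay or compiler; the function name `run_nfa` and the transition-table layout are assumptions made for illustration.

```python
# Generic NFA simulation sketch (illustrative only, not the NAPOLY design).

def run_nfa(transitions, start_states, accept_states, text):
    """Simulate an NFA over `text`.

    transitions: dict mapping (state, symbol) -> set of next states
    Returns (reports, activations): the input offsets at which an accepting
    state is active, and the number of active states after each symbol.
    """
    start = set(start_states)
    active = set(start)
    reports, activations = [], []
    for i, ch in enumerate(text):
        nxt = set()
        for s in active:
            nxt |= transitions.get((s, ch), set())
        # Start states stay enabled so a match can begin at any offset.
        active = nxt | start
        activations.append(len(active))
        if active & set(accept_states):
            reports.append(i)
    return reports, activations


# Pattern "ab": state 0 --a--> state 1 --b--> state 2 (accepting).
table = {(0, "a"): {1}, (1, "b"): {2}}
reports, activations = run_nfa(table, {0}, {2}, "xabab")
print(reports)  # [2, 4]
```

In this model, the per-symbol work grows with the number of active states, and every entry in `reports` is an output event; these are the two quantities that the abstract identifies as stressing prior designs.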

Citations: 0

Fixed-point FPGA Implementation of the FFT Accumulation Method for Real-time Cyclostationary Analysis
IF 2.3 | Zone 4, Computer Science | Q1 Computer Science | Pub Date: 2023-06-22 | DOI: https://dl.acm.org/doi/10.1145/3567429
Carol Jingyi Li, Xiangwei Li, Binglei Lou, Craig T. Jin, David Boland, Philip H. W. Leong

The spectral correlation density (SCD) is an important tool in cyclostationary signal detection and classification. Even with efficient techniques based on the fast Fourier transform (FFT), real-time implementations are challenging because of the high computational complexity. A key dimension for computational optimization lies in minimizing the wordlength employed. In this article, we analyze the relationship between wordlength and signal-to-quantization-noise ratio (SQNR) in fixed-point implementations of the SCD function. A canonical SCD estimation algorithm, the FFT accumulation method (FAM) using fixed-point arithmetic, is studied. We derive closed-form expressions for SQNR and compare them at wordlengths ranging from 14 to 26 bits. The differences between the calculated SQNR and bit-exact simulations are less than 1 dB. Furthermore, an HLS-based FPGA design is implemented on a Xilinx Zynq UltraScale+ XCZU28DR-2FFVG1517E RFSoC. Using less than 25% of the logic fabric on the device, it consumes 7.7 W total on-chip power and has a power efficiency of 12.4 GOPS/W, which is an order of magnitude improvement over an Nvidia Tesla K40 graphics processing unit (GPU) implementation. In terms of throughput, it achieves 50 MS/sec, a 1.6× speedup over a recent optimized FPGA implementation.
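The link between wordlength and SQNR can be illustrated with the textbook case of a single uniform quantizer applied to a near-full-scale sine wave, for which the ideal SQNR is approximately 6.02·N + 1.76 dB at N bits. The sketch below compares that rule of thumb with a direct measurement. It is not the closed-form FAM expression derived in the paper, which accounts for the full processing chain; the function names and the test-signal frequency are assumptions chosen here.

```python
# Illustrative SQNR-vs-wordlength sketch for one quantizer (not the FAM analysis).
import math


def quantize(x: float, wordlength: int) -> float:
    """Round x to a signed fixed-point grid with `wordlength` bits in [-1, 1)."""
    step = 2.0 ** (1 - wordlength)
    q = round(x / step) * step
    return max(-1.0, min(1.0 - step, q))


def measured_sqnr_db(wordlength: int, n: int = 20000, freq: float = 0.1234567) -> float:
    """Empirical SQNR of a quantized, near-full-scale sine wave."""
    step = 2.0 ** (1 - wordlength)
    amp = 1.0 - step                       # stay just inside the representable range
    sig = noise = 0.0
    for i in range(n):
        x = amp * math.sin(2.0 * math.pi * freq * i)
        e = quantize(x, wordlength) - x
        sig += x * x
        noise += e * e
    return 10.0 * math.log10(sig / noise)


def ideal_sqnr_db(wordlength: int) -> float:
    """Textbook approximation for a full-scale sine and uniform rounding noise."""
    return 6.02 * wordlength + 1.76


for bits in (14, 20, 26):
    print(bits, round(measured_sqnr_db(bits), 2), round(ideal_sqnr_db(bits), 2))
```

Each additional bit buys roughly 6 dB of SQNR, which is why the choice of wordlength is a direct trade between accuracy and hardware cost.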

Citations: 0

fSEAD: A Composable FPGA-based Streaming Ensemble Anomaly Detection Library
IF 2.3 | Zone 4, Computer Science | Q1 Computer Science | Pub Date: 2023-06-21 | DOI: https://dl.acm.org/doi/10.1145/3568992
Binglei Lou, David Boland, Philip Leong

Machine learning ensembles combine multiple base models to produce a more accurate output. They can be applied to a range of machine learning problems, including anomaly detection. In this article, we investigate how to maximize the composability and scalability of an FPGA-based streaming ensemble anomaly detector (fSEAD). To achieve this, we propose a flexible computing architecture consisting of multiple partially reconfigurable regions, pblocks, which each implement anomaly detectors. Our proof-of-concept design supports three state-of-the-art anomaly detection algorithms: Loda, RS-Hash, and xStream. Each algorithm is scalable, meaning multiple instances can be placed within a pblock to improve performance. Moreover, fSEAD is implemented using High-level synthesis (HLS), meaning further custom anomaly detectors can be supported. Pblocks are interconnected via an AXI-switch, enabling them to be composed in an arbitrary fashion before combining and merging results at runtime to create an ensemble that maximizes the use of FPGA resources and accuracy. Through utilizing reconfigurable Dynamic Function eXchange (DFX), the detector can be modified at runtime to adapt to changing environmental conditions. We compare fSEAD to an equivalent central processing unit (CPU) implementation using four standard datasets, with speedups ranging from 3× to 8×.
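As a rough software analogue of a streaming ensemble, the sketch below combines several histogram-based detectors, each scoring a sample along its own random projection, and averages their scores. It is loosely inspired by projection-and-histogram detectors such as Loda but is not the fSEAD implementation or any of its three algorithms as published; the class name `HistogramDetector`, the fixed histogram range, and the averaging rule are assumptions made for illustration.

```python
# Illustrative streaming ensemble sketch (not the fSEAD library).
import math
import random


class HistogramDetector:
    """One base detector: a random 1-D projection scored by a histogram."""

    def __init__(self, dim, bins, lo, hi, rng):
        self.w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        self.bins, self.lo, self.hi = bins, lo, hi
        self.counts = [1] * bins           # start at 1 to avoid log(0)
        self.total = bins

    def _bin(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x))
        idx = int((z - self.lo) / (self.hi - self.lo) * self.bins)
        return min(max(idx, 0), self.bins - 1)   # clip to the edge bins

    def score(self, x):
        """Negative log of the estimated density: higher means more anomalous."""
        return -math.log(self.counts[self._bin(x)] / self.total)

    def update(self, x):
        """Fold one streamed sample into the histogram."""
        self.counts[self._bin(x)] += 1
        self.total += 1


def ensemble_score(detectors, x):
    """Combine base detectors by averaging their scores."""
    return sum(d.score(x) for d in detectors) / len(detectors)


rng = random.Random(0)
detectors = [HistogramDetector(4, 20, -10.0, 10.0, rng) for _ in range(10)]
for _ in range(500):                       # stream of "normal" samples
    sample = [rng.gauss(0.0, 1.0) for _ in range(4)]
    for d in detectors:
        d.update(sample)

print(ensemble_score(detectors, [0.0] * 4))   # typical point
print(ensemble_score(detectors, [8.0] * 4))   # far-away point
```

Because the base detectors are independent of one another, they can be evaluated in parallel and their outputs merged at the end, which mirrors the structure of separate reconfigurable regions feeding a combining stage.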

Citations: 0

NeuroHSMD: Neuromorphic Hybrid Spiking Motion Detector
IF 2.3 | Zone 4, Computer Science | Q1 Computer Science | Pub Date: 2023-06-21 | DOI: https://dl.acm.org/doi/10.1145/3588318
Pedro Machado, João Filipe Ferreira, Andreas Oikonomou, T. M. McGinnity

Vertebrate retinas are highly efficient at processing trivial visual tasks such as detecting moving objects, which still represent complex challenges for modern computers. In vertebrates, the detection of object motion is performed by specialised retinal cells named Object Motion Sensitive Ganglion Cells (OMS-GC). OMS-GC process continuous visual signals and generate spike patterns that are post-processed by the Visual Cortex. Our previous Hybrid Sensitive Motion Detector (HSMD) algorithm was the first hybrid algorithm to enhance Background subtraction (BS) algorithms with a customised 3-layer Spiking Neural Network (SNN) that generates OMS-GC spiking-like responses. In this work, we present a Neuromorphic Hybrid Sensitive Motion Detector (NeuroHSMD) algorithm that accelerates our HSMD algorithm using Field-Programmable Gate Arrays (FPGAs). The NeuroHSMD was compared against the HSMD algorithm, using the same 2012 Change Detection (CDnet2012) and 2014 Change Detection (CDnet2014) benchmark datasets. When tested against the CDnet2012 and CDnet2014 datasets, NeuroHSMD performs object motion detection at 720 × 480 at 28.06 Frames Per Second (fps) and 720 × 480 at 28.71 fps, respectively, with no degradation of quality. Moreover, the NeuroHSMD proposed in this article was implemented entirely in Open Computing Language (OpenCL) and is therefore easily replicated on other devices such as Graphical Processing Units (GPUs) and clusters of Central Processing Units (CPUs).

Citations: 0

AutoScaleDSE: A Scalable Design Space Exploration Engine for High-Level Synthesis
IF 2.3 | Zone 4, Computer Science | Q1 Computer Science | Pub Date: 2023-06-21 | DOI: https://dl.acm.org/doi/10.1145/3572959
Hyegang Jun, Hanchen Ye, Hyunmin Jeong, Deming Chen

High-Level Synthesis (HLS) has enabled users to rapidly develop designs targeted for FPGAs from the behavioral description of the design. However, to synthesize an optimal design capable of taking better advantage of the target FPGA, a considerable amount of effort is needed to transform the initial behavioral description into a form that can capture the desired level of parallelism. Thus, a design space exploration (DSE) engine capable of optimizing large complex designs is needed to achieve this goal. We present a new DSE engine capable of considering code transformation, compiler directives (pragmas), and the compatibility of these optimizations. To accomplish this, we initially express the structure of the input code as a graph to guide the exploration process. To appropriately transform the code, we take advantage of ScaleHLS based on the multi-level compiler infrastructure (MLIR). Finally, we identify problems that limit the scalability of existing DSEs, which we name the “design space merging problem.” We address this issue by employing a Random Forest classifier that can successfully decrease the number of invalid design points without invoking the HLS compiler as a validation tool. We evaluated our DSE engine against the ScaleHLS DSE, outperforming it by a maximum of 59×. We additionally demonstrate the scalability of our design by applying our DSE to large-scale HLS designs, achieving a maximum speedup of 12× for the benchmarks in the MachSuite and Rodinia set.

Citations: 0
Artifact Evaluation for ACM TRETS Papers Submitted from the FPT Journal Track
IF 2.3 Zone 4 (Computer Science) Q1 Computer Science Pub Date: 2023-06-21 DOI: https://dl.acm.org/doi/10.1145/3596513
Miriam Leeser

Authors of papers that were accepted to ACM TRETS via the FPT 2022 journal track had the option of participating in Artifact Evaluation (AE). Four papers from this track volunteered to participate in the AE process. All of these papers have been awarded badges from ACM as described below.

Citations: 0
ZyPR: End-to-end Build Tool and Runtime Manager for Partial Reconfiguration of FPGA SoCs at the Edge
IF 2.3 Zone 4 (Computer Science) Q1 Computer Science Pub Date: 2023-06-21 DOI: https://dl.acm.org/doi/10.1145/3585521
Alex R. Bucknall, Suhaib A. Fahmy

Partial reconfiguration (PR) is a key enabler to the design and development of adaptive systems on modern Field Programmable Gate Array (FPGA) Systems-on-Chip (SoCs), allowing hardware to be adapted dynamically at runtime. Vendor-supported PR infrastructure is performance-limited and blocking, drivers entail complex memory management, and software/hardware design requires bespoke knowledge of the underlying hardware. This article presents ZyPR: a complete end-to-end framework that provides high-performance reconfiguration of hardware from within a software abstraction in the Linux userspace, automating the process of building PR applications with support for the Xilinx Zynq and Zynq UltraScale+ architectures, aimed at enabling non-expert application designers to leverage PR for edge applications. We compare ZyPR against traditional vendor tooling for PR management as well as recent open source tools that support PR under Linux. The framework provides a high-performance runtime along with low overhead for its provided abstractions. We introduce improvements to our previous work, increasing the provisioning throughput for PR bitstreams on the Zynq Ultrascale+ by 2× and 5.4× compared to Xilinx’s FPGA Manager.

Citations: 0
FPGA Implementation of Compact Hardware Accelerators for Ring-Binary-LWE-based Post-quantum Cryptography
IF 2.3 Zone 4 (Computer Science) Q1 Computer Science Pub Date: 2023-06-21 DOI: https://dl.acm.org/doi/10.1145/3569457
Pengzhou He, Tianyou Bao, Jiafeng Xie, Moeness Amin

Post-quantum cryptography (PQC) has recently drawn substantial attention from various communities owing to the proven vulnerability of existing public-key cryptosystems against the attacks launched from well-established quantum computers. The Ring-Binary-Learning-with-Errors (RBLWE), a variant of Ring-LWE, has been proposed to build PQC for lightweight applications. As more Field-Programmable Gate Array (FPGA) devices are being deployed in lightweight applications like Internet-of-Things (IoT) devices, it would be interesting if the RBLWE-based PQC can be implemented on the FPGA with ultra-low complexity and flexible processing. However, thus far, limited information is available for such implementations. In this article, we propose novel RBLWE-based PQC accelerators on the FPGA with ultra-low implementation complexity and flexible timing. We first present the process of deriving the key operation of the RBLWE-based scheme into the proposed algorithmic operation. The corresponding hardware accelerator is then efficiently mapped from the proposed algorithm with the help of algorithm-to-architecture implementation techniques and extended to obtain higher-throughput designs. The final complexity analysis and implementation results (on a variety of FPGAs) show that the proposed accelerators have significantly smaller area-time complexities than the state-of-the-art designs. Overall, the proposed accelerators feature low implementation complexity and flexible processing, making them desirable for emerging FPGA-based lightweight applications.

Citations: 0