2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI): Latest Papers


Design of Quantum Circuits for Cryptanalysis and Image Processing Applications

2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00072

Edgard Muñoz-Coreas, H. Thapliyal

Quantum circuits for arithmetic functions over Galois fields, such as squaring, are required to implement quantum cryptanalysis algorithms. Quantum circuits for integer arithmetic, such as multiplication, are required to implement scientific computing algorithms and quantum image processing algorithms on quantum computers. Reliable quantum circuits require error-correcting codes and fault-tolerant gates. Quantum circuits with many qubits are challenging to implement, making designs with low qubit cost desirable. In this work, we present quantum arithmetic circuits for applications in quantum cryptanalysis and quantum image processing. We propose an algorithm for synthesizing Galois field (GF(2^m)) squaring circuits optimized for gate cost, qubit cost, and depth for quantum cryptanalysis applications. In addition, these squaring circuits are incorporated into a proposed quantum circuit for inversion in GF(2^m). This work also presents a quantum integer conditional addition circuit and a quantum integer multiplication circuit optimized for T-count and qubit cost. The quantum conditional addition circuit and quantum multiplier are incorporated into proposed quantum circuits for bilinear interpolation, optimized for T-count, that can be used in quantum image processing applications.
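To make the role of squaring in GF(2^m) inversion concrete, here is a minimal classical sketch in Python. It is not the authors' quantum circuit or synthesis algorithm. It only illustrates two standard facts: squaring in GF(2^m) is a linear map over GF(2) (which is why it can be realized with CNOT-style gates), and an inverse can be computed as a^(2^m - 2), a product of repeated squarings. The function names and the example polynomial x^3 + x + 1 are assumptions chosen for illustration.

```python
def gf2m_reduce(value: int, m: int, poly: int) -> int:
    """Reduce a polynomial of degree <= 2m-2 modulo poly (poly includes the x^m term)."""
    for deg in range(2 * m - 2, m - 1, -1):
        if (value >> deg) & 1:
            value ^= poly << (deg - m)
    return value


def gf2m_square(a: int, m: int, poly: int) -> int:
    """Square a in GF(2^m): bit i of a moves to bit 2i, then reduce."""
    spread = 0
    for i in range(m):
        if (a >> i) & 1:
            spread |= 1 << (2 * i)
    return gf2m_reduce(spread, m, poly)


def gf2m_mul(a: int, b: int, m: int, poly: int) -> int:
    """Multiply a and b in GF(2^m) with carry-less shift-and-XOR, then reduce."""
    product = 0
    for i in range(m):
        if (b >> i) & 1:
            product ^= a << i
    return gf2m_reduce(product, m, poly)


def gf2m_inverse(a: int, m: int, poly: int) -> int:
    """Invert nonzero a as a^(2^m - 2) = a^2 * a^4 * ... * a^(2^(m-1))."""
    result = 1
    power = a
    for _ in range(m - 1):
        power = gf2m_square(power, m, poly)
        result = gf2m_mul(result, power, m, poly)
    return result


# Illustrative field: GF(2^3) with the irreducible polynomial x^3 + x + 1.
M, POLY = 3, 0b1011
print(bin(gf2m_square(0b100, M, POLY)))  # (x^2)^2 = x^2 + x
```

The inversion loop uses m - 1 squarings, so the cost of the squaring step directly affects the cost of inversion.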


Pages : 360-365

Citations: 2

Formal Hardware Verification of InfoSec Primitives

2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00034

M. Basiri, S. Shukla

Information Security (InfoSec) plays a major role in modern real-time applications. This paper proposes efficient equivalence-check-based formal hardware verification schemes for various InfoSec primitives such as the 128-bit Advanced Encryption Standard (AES), a Bose-Chaudhuri-Hocquenghem (BCH) encoder, and an m-bit GF(p) exponentiator (where p = log2m). The verification of 128-bit AES is done on an Artix-7 FPGA using Xilinx Vivado. The verification of the BCH encoder and GF(p) exponentiator is done in 45nm CMOS technology using Cadence. The synthesis results show that the proposed hardware-software co-design based 128-bit AES formal hardware verification does not compromise resource utilization compared with various existing designs. Similarly, the proposed formal hardware verification of the BCH encoder with generator polynomial length 64 and the 16-bit GF(p) exponentiator does not compromise delay compared with various existing techniques.


Citations: 0

Computationally Efficient Learning of Quality Controlled Word Embeddings for Natural Language Processing

2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00033

M. Alawad, G. Tourassi

Deep learning (DL) has been used for many natural language processing (NLP) tasks due to its superior performance compared to traditional machine learning approaches. In DL models for NLP, words are represented using word embeddings, which capture both semantic and syntactic information in text. However, 90-95% of the DL trainable parameters are associated with the word embeddings, resulting in a large storage or memory footprint. Therefore, reducing the number of word embedding parameters is critical, especially as vocabulary size increases. In this work, we propose a novel approximate word embeddings approach for convolutional neural networks (CNNs) used for text classification tasks. The proposed approach significantly reduces the number of trainable model parameters without noticeably sacrificing accuracy. Compared to other techniques, our proposed word embeddings technique does not require modifications to the DL model architecture. We evaluate the performance of the proposed word embeddings on three classification tasks using two datasets, composed of Yelp and Amazon reviews. The results show that the proposed method can reduce the number of word embedding parameters by 98% and 99% for the Yelp and Amazon datasets respectively, with no drop in accuracy.


Pages : 134-139

Citations: 2

Adaptive Transceiver for Wireless NoC to Enhance Multicast/Unicast Communication Scenarios

2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00111

Joel Ortiz Sosa, O. Sentieys, C. Roland

Wireless Network-on-Chip (WiNoC) is a viable solution to overcome critical bottlenecks in the on-chip communication backbone. However, standard WiNoC approaches are vulnerable to multi-path interference introduced by on-chip physical structures. To overcome this parasitic phenomenon, this paper presents an adaptive digital transceiver, which enhances communication reliability under different wireless channel configurations. Based on a semi-realistic wireless channel model, we investigate the impact of using some channel correction techniques. Experimental results show that our approach significantly improves Bit Error Rate (BER) under different wireless channel configurations. Moreover, our adaptive transceiver allows wireless communication links to be established in conditions where this would not be possible for standard transceiver architectures. The proposed architecture, designed using a 28-nm FDSOI technology, consumes only 3.27 mW for a data rate of 10 Gbit/s and has a very small area footprint.


Citations: 6

CSrram: Area-Efficient Low-Power Ex-Situ Training Framework for Memristive Neuromorphic Circuits Based on Clustered Sparsity

2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00090

A. Fayyazi, Souvik Kundu, Shahin Nazarian, P. Beerel, Massoud Pedram

Artificial Neural Networks (ANNs) play a key role in many machine learning (ML) applications but pose arduous challenges in terms of storage and computation of network parameters. Memristive crossbar arrays (MCAs) are capable of both computation and storage, making them promising for in-memory-computing-enabled neural network accelerators. At the same time, the presence of a significant number of zero weights in ANNs has motivated research into a variety of parameter reduction techniques. However, for crossbar-based architectures, the study of efficient methods to take advantage of network sparsity is still at an early stage. This paper presents CSrram, an efficient ex-situ training framework for hybrid CMOS-memristive neuromorphic circuits. CSrram includes a pre-defined block diagonal clustered (BDC) sparsity algorithm to significantly reduce area and power consumption. The proposed framework is verified on a wide range of datasets including MNIST handwritten recognition, fashion MNIST, breast cancer prediction (BCW), IRIS, and mobile health monitoring. Compared to state-of-the-art fully connected memristive neuromorphic circuits, our CSrram, with only 25% density of weights in the first junction, provides a power and area efficiency of 1.5x and 2.6x (averaged over five datasets), respectively, without any significant test accuracy loss.
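As a rough illustration of what a pre-defined block diagonal sparsity pattern looks like, the following Python sketch builds a 0/1 mask in which only weights inside diagonal blocks are kept. This is an assumption-laden sketch, not the BDC algorithm from the paper: the function names, the equal-sized blocks, and the example layer sizes are hypothetical. With n_blocks equal blocks the density is 1/n_blocks, so four blocks give the 25% density mentioned in the abstract.

```python
def block_diagonal_mask(n_in: int, n_out: int, n_blocks: int) -> list[list[int]]:
    """Return an n_in x n_out 0/1 mask that keeps only the diagonal blocks."""
    if n_in % n_blocks or n_out % n_blocks:
        raise ValueError("layer sizes must be divisible by n_blocks")
    rows_per_block = n_in // n_blocks
    cols_per_block = n_out // n_blocks
    # A weight (r, c) is kept only if its row block and column block coincide.
    return [
        [1 if r // rows_per_block == c // cols_per_block else 0 for c in range(n_out)]
        for r in range(n_in)
    ]


def density(mask: list[list[int]]) -> float:
    """Fraction of weights that the mask keeps."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


# Hypothetical layer: 8 inputs, 4 outputs, 4 clusters.
mask = block_diagonal_mask(8, 4, 4)
for row in mask:
    print(row)
print(density(mask))
```

Because each block touches a disjoint set of inputs and outputs, a masked layer of this shape could in principle be mapped to several small crossbars rather than one large one, which is the intuition behind the area savings.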


Pages : 465-470

Citations: 6

Mitigating Reverse Engineering Attacks on Deep Neural Networks

2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00122

Yuntao Liu, D. Dachman-Soled, Ankur Srivastava

With the structure of deep neural networks (DNN) being of increasing commercial value, DNN reverse engineering attacks have become a great security concern. It has been shown that the memory access pattern of a processor running DNNs can be exploited to decipher their detailed structure. In this work, we propose a defensive memory access mechanism which utilizes oblivious shuffle, address space layout randomization, and dummy memory accesses to counter such attacks. Experiments show that our defense exponentially increases the attack complexity with asymptotically lower memory access overhead compared to generic memory obfuscation techniques such as ORAM and is scalable to larger DNNs.


Citations: 16

Tackling the Drawbacks of a Lagrangian Relaxation Based Discrete Gate Sizing Algorithm

2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00059

Henrique Placido, R. Reis

The Lagrangian relaxation (LR) based gate sizer proposed in [1] has the best leakage power results published so far for the ISPD 2012 Gate Sizing Contest benchmarks. However, it requires many LR iterations and does not rely on any technique to perform cell option candidate filtering in the LR subproblem solver. Therefore, this paper presents some extensions to address these drawbacks. In order to reduce the number of LR iterations, we propose some enhancements to the original LR multiplier formula. We also use a scaling factor to properly scale timing cost and leakage power in the LR local cost. Moreover, we apply a cell option candidate filtering strategy to reduce the runtime of each LR iteration. Finally, we improve the post-processing timing recovery and power recovery. Our work achieved leakage power results very close to the original algorithm, taking 4.28x fewer LR iterations, on average, and 9.11x fewer cell swaps during LR, on average.


Citations: 0

FAST: A Frequency-Aware Skewed Merkle Tree for FPGA-Secured Embedded Systems

2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00066

Yu Zou, Mingjie Lin

Protecting external memory is important when an attacker can gain physical access to the external memory bus. Compared to general-purpose systems, embedded systems are more vulnerable to physical attacks because of their portability. One such attack is the replay attack, in which an attacker records data sent over a memory bus and replays it to pretend to be an authorized user. Traditionally, replay attacks are prevented using a full, balanced Merkle Tree. Because this approach targets average-case performance on general-purpose systems, traversal and verification of the Merkle Tree add a large latency overhead to each memory access. In contrast to general-purpose systems, embedded systems are normally application-specific, and their program behaviors and memory access patterns are deterministic. We also observe that, for a given program, not all memory locations are accessed equally often. Based on these two observations, we propose FAST, a Frequency-Aware Skewed Merkle Tree for application-specific embedded systems. After profiling a program in a simulation environment without any replay attack protection, we obtain a memory access frequency distribution. We then design an automatic and systematic approach to generate an application-specific optimal skewed Merkle Tree accordingly. We propose an efficient hardware architecture to accelerate FAST on FPGA, and in experiments on five real-world benchmarks, our skewed Merkle Tree implementation outperforms a baseline that uses a full, balanced Merkle Tree by up to 3 times.
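The idea of skewing a Merkle Tree by access frequency can be sketched in Python as follows. The abstract does not say how the optimal tree is generated, so this sketch assumes a Huffman-style construction (repeatedly merging the two least frequently accessed subtrees); that choice, the function names, and the example frequencies are all assumptions for illustration only. Frequently accessed blocks end up closer to the root, so verifying them needs fewer hash levels.

```python
import hashlib
import heapq
from itertools import count


def sha(data: bytes) -> bytes:
    """SHA-256 digest used for both leaves and internal nodes."""
    return hashlib.sha256(data).digest()


def build_skewed_tree(blocks: list[bytes], freqs: list[int]) -> tuple[bytes, list[int]]:
    """Build a frequency-skewed hash tree (Huffman-style, an assumed construction).

    Returns the root hash and, for each block, its depth, i.e. the number of
    hash levels that must be recomputed to verify that block against the root.
    """
    tie = count()  # unique tie-breaker so the heap never compares hashes
    heap = [(f, next(tie), sha(b), [i]) for i, (b, f) in enumerate(zip(blocks, freqs))]
    heapq.heapify(heap)
    depths = [0] * len(blocks)
    while len(heap) > 1:
        f1, _, h1, leaves1 = heapq.heappop(heap)
        f2, _, h2, leaves2 = heapq.heappop(heap)
        for i in leaves1 + leaves2:
            depths[i] += 1  # every leaf under the new parent moves one level down
        heapq.heappush(heap, (f1 + f2, next(tie), sha(h1 + h2), leaves1 + leaves2))
    return heap[0][2], depths


# Hypothetical profile: block 0 receives 70% of all accesses.
blocks = [b"block0", b"block1", b"block2", b"block3"]
freqs = [70, 15, 10, 5]
root, depths = build_skewed_tree(blocks, freqs)
avg_levels = sum(f * d for f, d in zip(freqs, depths)) / sum(freqs)
print(depths, avg_levels)
```

For this made-up profile, the hot block sits one level below the root while the rarely used blocks sit three levels down, so the access-weighted path length is shorter than the two levels a balanced four-leaf tree would need for every access. With uniform frequencies the same construction falls back to a balanced tree.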


Pages : 326-331

Citations: 18

A Reconfigurable Layered-Based Bio-Inspired Smart Image Sensor

2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00039

Pankaj Bhowmik, Md Jubaer Hossain Pantho, S. Saha, C. Bobda

This paper presents a hardware architecture to extract features from an image using the concepts of bio-inspired computing, and a method of converting sequential image processing to parallel computational processing units that can execute on the sensor. These computational units are arranged on vertically integrated hierarchical planes and equipped with a region-based Attention Module, which separates the Regions of Interest (ROIs) from the image. In each layer, the computational units work in parallel and introduce massive parallelism at the pixel level. At the same time, the design saves dynamic power by dynamically enabling and disabling the computational units while maintaining high performance and high throughput. Moreover, the units are made reconfigurable to support a wide range of machine vision applications by forming a basic structure that is common to all operations and reconfigurable parts for a specific application. Our simulation results show that the design achieves 4.852X power savings on ROIs while processing at 465 Kfps with an 800 MHz clock frequency.


Citations: 0

Title Page i

2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Pub Date : 2019-07-01 DOI: 10.1109/isvlsi.2019.00001

Citations: 0
