2019 IEEE International Workshop on Signal Processing Systems (SiPS): Latest Papers


DynExit: A Dynamic Early-Exit Strategy for Deep Residual Networks

2019 IEEE International Workshop on Signal Processing Systems (SiPS)

Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020551

Meiqi Wang, Jianqiao Mo, Jun Lin, Zhongfeng Wang, L. Du

Early exit is a technique that terminates a pre-specified computation at an early stage, depending on the input sample, and has been introduced to reduce the energy consumption of Deep Neural Networks (DNNs). Previous early-exit approaches suffered from the burden of manually tuning early-exit loss weights to find a good trade-off between complexity reduction and system accuracy. In this work, we first propose DynExit, a dynamic loss-weight modification strategy for ResNets, which adaptively modifies the ratio of different exit branches and searches for a proper operating point for both accuracy and cost. Then, an efficient hardware unit for early-exit branches is developed, which can be easily integrated into existing DNN hardware architectures to reduce average computing latency and energy cost. Experimental results show that the proposed DynExit strategy can reduce FLOPs by up to 43.6% compared to state-of-the-art approaches. It also achieves a 1.2% accuracy improvement over the existing end-to-end fixed loss-weight training scheme at a comparable computation reduction ratio. The proposed hardware architecture for DynExit is evaluated on the Xilinx Zynq-7000 ZC706 development board. Synthesis results demonstrate that the architecture can achieve high speed with low hardware complexity. To the best of our knowledge, this is the first hardware implementation of early-exit techniques for DNNs in the open literature.
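
As an illustration of the early-exit idea described in this abstract, the following minimal Python sketch stops at the first exit branch whose softmax confidence reaches a threshold, and combines per-exit losses with a weighted sum. The confidence criterion, the threshold value, and the function names are assumptions made for this example; the abstract does not specify DynExit's exit rule or its weight-update schedule.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_predict(branch_logits, threshold=0.9):
    """Return (exit_index, predicted_class) for the first confident branch.

    branch_logits lists one logit vector per exit, ordered from the
    shallowest exit to the final classifier. A real network would compute
    these lazily so that deeper blocks are skipped once an exit fires.
    The confidence threshold is a hypothetical exit criterion.
    """
    last = len(branch_logits) - 1
    for i, logits in enumerate(branch_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold or i == last:
            return i, probs.index(confidence)

def weighted_exit_loss(branch_losses, weights):
    """Joint training loss: weighted sum of the per-exit losses."""
    return sum(w * l for w, l in zip(weights, branch_losses))

easy = [[4.0, 0.0, 0.0], [5.0, 0.0, 0.0]]   # first exit is already confident
hard = [[0.2, 0.1, 0.0], [0.1, 3.0, 0.0]]   # falls through to the last exit
print(early_exit_predict(easy), early_exit_predict(hard))
```

With fixed weights, `weighted_exit_loss` corresponds to the fixed loss-weight training scheme that the abstract compares against; a dynamic strategy would change `weights` during training.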


Citations: 25

Theoretical Analysis of Configurable RO PUFs and Strategies to Enhance Security

2019 IEEE International Workshop on Signal Processing Systems (SiPS)

Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020320

Jiang Li, Hao Gao, Yijun Cui, Chenghua Wang, Yale Wang, Chongyan Gu, Weiqiang Liu

Compared to the traditional ring oscillator physical unclonable function (RO PUF), the configurable RO PUF (CRO PUF) greatly increases the number of challenge-response pairs (CRPs) and improves hardware utilization. However, in the conventional CRO PUF design, when a path is selected by the challenge to generate a response, circuit characteristics of the CRO PUF, such as the delay information of the configurable units and the transmission model, can also be leaked. Once an adversary monitors and masters this information, they can use it to attack the CRO PUF circuit, for example through modeling attacks. This paper establishes a theoretical model of the CRO PUF and analyzes its unpredictability and security. Based on this model, a new mechanism for generating proper challenges is proposed. In the proposed mechanism, challenges are generated and used in a specific way that delays the feature leakage of the CRO PUF, thereby improving its security.
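
To make the challenge-to-response mapping concrete, here is a toy Python model of a configurable RO pair. It assumes each stage offers two selectable delay units, that the same challenge configures both oscillators, and that the response bit compares their total loop delays. These modeling choices and all names are assumptions for illustration, not the theoretical model from the paper.

```python
import random

def make_cro(n_stages, rng):
    """One configurable RO: each stage offers two delay units (ns).
    The Gaussian spread stands in for manufacturing variation (assumed)."""
    return [(rng.gauss(1.0, 0.05), rng.gauss(1.0, 0.05)) for _ in range(n_stages)]

def loop_delay(cro, challenge):
    """Total delay of the path selected by the challenge bits."""
    return sum(stage[bit] for stage, bit in zip(cro, challenge))

def response(cro_a, cro_b, challenge):
    """1 if RO A oscillates faster (smaller delay) than RO B, else 0."""
    return 1 if loop_delay(cro_a, challenge) < loop_delay(cro_b, challenge) else 0

# Hand-built example so the arithmetic is easy to follow.
cro_a = [(1.0, 1.2), (1.0, 0.9)]
cro_b = [(1.1, 1.0), (1.0, 1.0)]
print(response(cro_a, cro_b, [0, 0]), response(cro_a, cro_b, [1, 0]))
```

In this additive model, responses to challenges that differ in a single bit constrain the delay differences of individual units, which is the kind of leakage a modeling attack can exploit.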


Citations: 2

Data Driven Low-Complexity DOA Estimation for Ultra-Short Range Automotive Radar

2019 IEEE International Workshop on Signal Processing Systems (SiPS)

Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020602

Yixin Song, Yang Li, Cheng Zhang, Yongming Huang

In recent applications of millimeter-wave automotive radars, short-range detection and estimation performance has become an important design metric. Because the signals arriving at the array have spherical rather than planar wavefronts, direct use of conventional spectrum or direction of arrival (DOA) estimators generally results in large performance degradation. In this paper, a naive look-up-table-based solution is first introduced. To address its large storage requirement, we further transform the DOA estimation problem into a DOA classification problem and use the support vector machine (SVM) framework to propose a data-driven, low-complexity DOA estimator. Simulations validate the effectiveness of the proposed SVM solution, especially for small sample sets and high storage limits.
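
The wavefront mismatch mentioned in this abstract can be quantified with a short geometric sketch. The Python code below compares the exact source-to-element distance with its plane-wave approximation for a uniform linear array; the 77 GHz carrier, the 8-element half-wavelength array, and the function names are assumptions chosen for the example.

```python
import math

def exact_path(r, theta, x):
    """Distance from a source at range r and angle theta (radians, measured
    from broadside) to an element at position x on the array axis."""
    return math.sqrt(r * r + x * x - 2.0 * r * x * math.sin(theta))

def planar_path(r, theta, x):
    """Far-field (plane-wave) approximation of the same distance."""
    return r - x * math.sin(theta)

def max_phase_error(r, theta, positions, wavelength):
    """Largest phase mismatch (radians) between the two models."""
    k = 2.0 * math.pi / wavelength
    return max(abs(k * (exact_path(r, theta, x) - planar_path(r, theta, x)))
               for x in positions)

wavelength = 3.0e8 / 77.0e9                              # about 3.9 mm (assumed carrier)
positions = [i * wavelength / 2.0 for i in range(8)]     # assumed 8-element array
near = max_phase_error(0.2, 0.0, positions, wavelength)  # target at 0.2 m
far = max_phase_error(20.0, 0.0, positions, wavelength)  # target at 20 m
print(near, far)
```

The phase error shrinks roughly in proportion to 1/r, so estimators built on the plane-wave model degrade mainly at very short range.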


Citations: 0

Efficiently Learning a Robust Self-Driving Model with Neuron Coverage Aware Adaptive Filter Reuse

2019 IEEE International Workshop on Signal Processing Systems (SiPS)

Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020572

Chunpeng Wu, Ang Li, Bing Li, Yiran Chen

Human drivers learn driving skills from both regular (non-accidental) and accidental driving experiences, while most current self-driving research focuses on regular driving only. We argue that learning from accidental driving data is necessary for robustly modeling driving behavior. A main challenge, however, is how accident data can be effectively used together with regular data to learn vehicle motion, since manually labeling accident data without expertise is very difficult. Our proposed solution for robust vehicle motion learning is to integrate layer-level discriminability and neuron coverage (neuron-level robustness) regularizers into an unsupervised generative network for video prediction. Layer-level discriminability increases the divergence between the feature distributions of regular data and accident data at network layers. Neuron coverage regularizers enlarge the interval span of neuron activations spanned by training samples, to reduce the probability that a sample falls into untested interval regions. To accelerate training, we propose adaptive filter reuse based on neuron coverage. Our filter reuse strategies reduce structural network parameters, encourage memory reuse, and guarantee the effectiveness of robust vehicle motion learning. Experimental results show that our model improves inference accuracy by 1.1% compared to FCMLSTM and cuts training time by 10.2% relative to the traditional method with negligible accuracy loss.
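
The notion of neuron coverage can be illustrated with a small Python sketch. It computes a threshold-based coverage ratio and the per-neuron activation interval seen across samples. The threshold definition and the function names are assumptions for this example; the abstract does not give the exact regularizer the authors use.

```python
def neuron_coverage(activations, threshold=0.0):
    """Fraction of neurons driven above `threshold` by at least one sample.

    activations is a list of per-sample activation lists of equal length.
    """
    n_neurons = len(activations[0])
    covered = [False] * n_neurons
    for sample in activations:
        for j, a in enumerate(sample):
            if a > threshold:
                covered[j] = True
    return sum(covered) / n_neurons

def activation_spans(activations):
    """Per-neuron (min, max) activation interval observed across samples."""
    columns = list(zip(*activations))
    return [(min(c), max(c)) for c in columns]

acts = [[0.5, 0.0, -1.0],
        [0.0, 0.0, 2.0]]
print(neuron_coverage(acts))    # two of the three neurons are covered
print(activation_spans(acts))
```

A wider span per neuron means fewer activation values at inference time fall outside what training has exercised, which is the intuition behind the coverage regularizer described in the abstract.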


Citations: 0

A Data Structure-Based Approximate Belief Propagation Decoder for Polar Codes

2019 IEEE International Workshop on Signal Processing Systems (SiPS)

Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020391

Menghui Xu, Weikang Qian, Zaichen Zhang, X. You, Chuan Zhang

Polar codes, the first codes that can provably achieve the capacity of binary-input discrete memoryless channels (B-DMCs), have received great attention. The belief propagation (BP) decoding algorithm, a parallel decoding approach for polar codes, suffers from high hardware complexity. In this paper, we propose a data structure-based approximate BP decoder for polar codes. Simulation results show that, by reforming the data structure of the received channel messages and introducing approximate computing schemes, the decoder achieves a significant hardware reduction compared to its conventional counterpart. The hardware architecture and corresponding implementation results are also given.
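
As one example of the kind of approximation used to reduce BP decoder hardware, the Python sketch below compares the exact log-likelihood-ratio check-node operation with the widely used min-sum approximation. The abstract does not say which approximate computing scheme the authors adopt, so min-sum is shown here only as a familiar stand-in; the function names and the optional `scale` factor are assumptions.

```python
import math

def boxplus_exact(a, b):
    """Exact LLR combination: 2 * atanh(tanh(a/2) * tanh(b/2))."""
    prod = math.tanh(a / 2.0) * math.tanh(b / 2.0)
    # Clamp so that saturated tanh values do not hit atanh's domain edge.
    limit = 1.0 - 1e-12
    prod = max(-limit, min(limit, prod))
    return 2.0 * math.atanh(prod)

def boxplus_minsum(a, b, scale=1.0):
    """Min-sum approximation: sign(a) * sign(b) * min(|a|, |b|)."""
    sign = (1.0 if a >= 0 else -1.0) * (1.0 if b >= 0 else -1.0)
    return scale * sign * min(abs(a), abs(b))

print(boxplus_exact(2.0, -3.0), boxplus_minsum(2.0, -3.0))
```

Min-sum replaces the transcendental functions with a comparison and a sign operation, at the cost of overestimating the magnitude; a `scale` below 1 is a common correction.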


Citations: 0

Nonlinear Functions in Learned Iterative Shrinkage-Thresholding Algorithm for Sparse Signal Recovery

2019 IEEE International Workshop on Signal Processing Systems (SiPS)

Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020469

E. C. Marques, N. Maciel, L. Naviner, Hao Cai, Jun Yang

Compressive sensing requires fewer measurements than the Nyquist rate to recover sparse signals, which saves processing and energy. The efficiency of this technique strongly depends on the quality of the sparse recovery algorithm used. This work focuses on a learned iterative shrinkage-thresholding algorithm, in which iterations correspond to layers of a neural network. We analyze the performance of this algorithm for different shrinkage functions. A decrease of up to 9 dB in the normalized mean squared error (NMSE) is achieved by choosing an appropriate shrinkage function. Moreover, the estimation performance can come close to the theoretical performance bound, showing that deep learning is a promising tool for sparse signal estimation. This work can be applied in several areas such as image processing, the Internet of Things (IoT), cognitive radio networks, and sparse channel estimation for wireless communications.
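
The role of the shrinkage function can be seen in a minimal Python sketch of one learned-ISTA layer, which computes x_next = shrink(W y + S x). Soft thresholding is shown as the baseline shrinkage; the `shrink` argument is where an alternative nonlinearity would be substituted. The matrices W and S would be learned in practice; the values and helper names here are placeholders for illustration.

```python
def soft_threshold(x, lam):
    """Soft thresholding: sign(x) * max(|x| - lam, 0)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def lista_layer(x, y, W, S, lam, shrink=soft_threshold):
    """One layer of learned ISTA: x_next = shrink(W y + S x, lam)."""
    wy = matvec(W, y)
    sx = matvec(S, x)
    return [shrink(a + b, lam) for a, b in zip(wy, sx)]

W = [[1.0, 0.0], [0.0, 1.0]]    # placeholder for a learned matrix
S = [[0.0, 0.0], [0.0, 0.0]]    # placeholder for a learned matrix
print(lista_layer([0.0, 0.0], [3.0, -0.5], W, S, lam=1.0))
```

Small entries are zeroed and large ones are shrunk toward zero, which is what promotes sparsity in the recovered signal.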


Citations: 1

Side Channel Attack Resistant AES Design Based on Finite Field Construction Variation

2019 IEEE International Workshop on Signal Processing Systems (SiPS)

Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020535

P. Shvartsman, Xinmiao Zhang

The Advanced Encryption Standard (AES) is the current standard for symmetric-key ciphers and is algorithmically secure. However, side-channel attacks that target power consumption can reveal the secret key in AES implementations. Masking data with random variables is one of the main methods used to thwart power analysis attacks. Data can be masked with multiple random variables to prevent higher-order attacks, at the cost of a large increase in area. A novel masking scheme for AES that is resistant to second-order attacks is proposed. Instead of using an additional mask, variation in finite field construction is exploited to increase resistance to second-order attacks. As a result, the area requirement is reduced. For an example AES encryptor, the proposed design is 12% smaller than the previous best design, with a very small drop in achievable security level.
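
For readers unfamiliar with masking, the following Python sketch shows first-order Boolean masking of a single byte through an XOR step such as AES AddRoundKey. It illustrates only the basic idea that the abstract builds on; it is not the proposed finite-field-variation scheme, and the function names are assumptions for this example.

```python
import secrets

def apply_mask(value, m):
    """Hide a byte behind a random Boolean mask."""
    return value ^ m

def masked_add_round_key(masked_state, key_byte):
    """AddRoundKey is an XOR, so it can act directly on the masked byte."""
    return masked_state ^ key_byte

def demo(plain_byte, key_byte):
    m = secrets.randbelow(256)              # fresh random mask per run
    masked = apply_mask(plain_byte, m)      # the unmasked byte is never processed
    out_masked = masked_add_round_key(masked, key_byte)
    return out_masked ^ m                   # remove the mask at the end

print(hex(demo(0x3A, 0x5C)))
```

Linear steps like this one commute with the mask. The nonlinear S-box does not, which is why masked AES designs need extra logic and area there.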


Citations: 3

PRESS/HOLD/RELEASE Ultrasonic Gestures and Low Complexity Recognition Based on TCN

2019 IEEE International Workshop on Signal Processing Systems (SiPS)

Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020579

Emad A. Ibrahim, Min Li, J. P. D. Gyvez

Targeting ultrasound-based gesture recognition, this paper proposes a new universal PRESS/HOLD/RELEASE approach that leverages the diversity of gestures performed on smart devices such as mobile phones and IoT nodes. The new set of gestures is generated by interleaving PRESS/HOLD/RELEASE patterns, abbreviated as P/H/R, with gestures such as sweeps between a number of microphones. A P/H/R pattern is formed by a hand as it approaches the top of a microphone to generate a virtual Press. The hand then settles for an undefined period of time to generate a virtual Hold and finally departs to generate a virtual Release. The same hand can sweep to a second microphone and perform another P/H/R. Interleaving the P/H/R patterns expands the number of gestures that can be performed. Assuming an on-board speaker transmitting ultrasonic signals, detection is performed on the Doppler shift readings generated by a hand as it approaches and departs from the top of a microphone. The Doppler shift readings are presented as a sequence of down-mixed ultrasonic spectrogram frames. We train a Temporal Convolutional Network (TCN) to classify the P/H/R patterns under different environmental noises. Our experimental results show that P/H/R patterns at the top of a microphone can be recognized with 96.6% accuracy under different noise conditions. A group of P/H/R-based gestures has been tested on a commercial off-the-shelf (COTS) Samsung Galaxy S7 Edge. Different P/H/R interleaved gestures (such as sweeps and long taps) are designed using two microphones and a single speaker while using as few as $\sim 5\mathrm{K}$ parameters and as few as $\sim 0.15$ million operations (MOPs) per inference. The P/H/R interleaved gestures are intuitive and hence easy for end users to learn. This paves the way for deployment on mass-produced smartphones and smart speakers.
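
The Doppler readings behind Press and Release can be estimated with the standard reflection formula, shift = 2 * v * f0 / c, valid when the hand speed v is much smaller than the speed of sound c. The Python sketch below evaluates it for an assumed 20 kHz tone and 0.5 m/s hand speed. The rule-based labeler is only a toy stand-in for the trained TCN; its dead-band value and all names are assumptions.

```python
SPEED_OF_SOUND = 343.0   # m/s in air at roughly 20 degrees Celsius

def doppler_shift(f0_hz, hand_speed_mps, c=SPEED_OF_SOUND):
    """Approximate shift (Hz) of a tone reflected by a moving hand.

    Positive speed means the hand approaches a co-located speaker and
    microphone; the approximation assumes speed much smaller than c.
    """
    return 2.0 * hand_speed_mps * f0_hz / c

def label_motion(shift_hz, dead_band_hz=5.0):
    """Toy rule (not the paper's classifier): the sign of the shift picks the phase."""
    if shift_hz > dead_band_hz:
        return "PRESS"
    if shift_hz < -dead_band_hz:
        return "RELEASE"
    return "HOLD"

shift = doppler_shift(20000.0, 0.5)   # about 58.3 Hz
print(shift, label_motion(shift), label_motion(-shift), label_motion(0.0))
```

Shifts of a few tens of hertz around the carrier are small relative to the tone frequency, which is consistent with working on down-mixed spectrogram frames.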


Citations: 4

Neural Network-based Vehicle Image Classification for IoT Devices

2019 IEEE International Workshop on Signal Processing Systems (SiPS)

Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020464

Saman Payvar, Mir Khan, Rafael Stahl, Daniel Mueller-Gritschneder, J. Boutellier

Convolutional Neural Networks (CNNs) have delivered previously unforeseen results in automatic image analysis and interpretation, an area with numerous applications in both consumer electronics and industry. However, the signal processing related to CNNs is computationally very demanding, which has prevented their use on the smallest embedded computing platforms, to which many Internet of Things (IoT) devices belong. Fortunately, in recent years researchers have developed many approaches for optimizing the performance and shrinking the memory footprint of CNNs. This paper presents a neural-network-based image classifier that has been trained to classify vehicle images into four different classes. The neural network is optimized by a technique called binarization, and the resulting binarized network is deployed on an IoT-class processor core for execution. Binarization reduces the memory footprint of the CNN by around 95% and increases performance by more than $6\times$. Furthermore, we show that by utilizing a custom 'popcount' instruction of the processor, the performance of the binarized vehicle classifier can be increased further by more than $2\times$, making the CNN-based image classifier suitable for the smallest embedded processors.
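
The reason a popcount instruction helps binarized networks is that a dot product of +1/-1 vectors reduces to an XNOR followed by a bit count. The Python sketch below shows this identity; the bit-packing convention and function names are assumptions for the example, and a real deployment would use the processor's popcount instruction rather than `bin(...).count`.

```python
def pack_bits(values):
    """Pack a +1/-1 vector into an int: bit i is 1 when values[i] is +1."""
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word

def binary_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n."""
    mask = (1 << n) - 1
    xnor = ~(a_word ^ b_word) & mask     # 1 wherever the signs agree
    matches = bin(xnor).count("1")       # software popcount
    return 2 * matches - n               # agreements minus disagreements

a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
print(binary_dot(pack_bits(a), pack_bits(b), len(a)))
print(sum(x * y for x, y in zip(a, b)))  # reference result, same value
```

Packing one weight per bit instead of one 32-bit float per weight is also consistent with the memory reduction of around 95% reported in the abstract.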


Citations: 7

Feature Selection Framework for XGBoost Based on Electrodermal Activity in Stress Detection

2019 IEEE International Workshop on Signal Processing Systems (SiPS)

Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020321

Cheng-Ping Hsieh, Yi-Ta Chen, Win-Ken Beh, A. Wu

Since stress has a strong influence on human health, it is necessary to detect stress automatically in daily life. In this paper, we aim to improve the performance and identify the dominant features in stress detection based on Electrodermal Activity (EDA). Compared to the methods in the Wearable Stress and Affect Dataset (WESAD), we propose several enhancements to obtain higher F1-scores, including less-overlapped signal segmentation, more signal processing features, and the extreme gradient boosting classification algorithm (XGBoost). Furthermore, we select dominant features according to their importance in the classifier and their correlation with other features while maintaining high performance. Experimental results show that with 9 dominant features in XGBoost, we achieve F1-scores of 92.38% (+17.87%) and 89.92% (+14.58%) compared to WESAD on chest- and wrist-based EDA signals, respectively. The selected features suggest that the magnitude of the low-frequency and the complexity of the high-frequency EDA signal carry the most significant information for stress detection.
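
Selecting features by importance and mutual correlation can be sketched as a greedy filter. The Python code below visits features in order of decreasing importance and keeps one only if it is not strongly correlated with any feature already kept. The greedy rule, the 0.9 correlation cutoff, and the toy data are assumptions for this example; in practice the importance scores would come from a trained XGBoost classifier.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def select_features(importance, columns, max_corr=0.9, k=9):
    """Keep up to k features, most important first, skipping any feature
    whose absolute correlation with a kept feature reaches max_corr."""
    kept = []
    for name in sorted(importance, key=importance.get, reverse=True):
        if all(abs(pearson(columns[name], columns[o])) < max_corr for o in kept):
            kept.append(name)
        if len(kept) == k:
            break
    return kept

columns = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.1], "c": [4, 1, 3, 2]}
importance = {"a": 0.5, "b": 0.3, "c": 0.2}     # hypothetical scores
print(select_features(importance, columns))     # "b" is dropped as redundant with "a"
```

Dropping redundant features keeps the retained set small without discarding the information the classifier relies on most.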


Citations: 27
