2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC): Latest Papers

A Distance Estimation Method to Railway Crossing Using Warning Signs

2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)

Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00034

Kaisei Shimura, Yoichi Tomioka, Qiang Zhao

Mobility scooters have come into use to extend the range of mobility of the elderly. However, accidents involving mobility scooters have become a serious problem. For example, if a mobility scooter stops inside a railway crossing because its battery is exhausted, it may be struck by a train. Measuring the distance to a railway crossing while driving helps the rider avoid entering the crossing without enough battery charge. In this paper, we propose a method for estimating the distance to a railway crossing based on the railway crossing warning signs in video from a camera mounted on the front of the mobility scooter. In experiments, we evaluate the proposed method using images taken at various positions relative to the railway crossing and show that it achieves higher accuracy than distance estimation using a depth sensor.
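
The abstract does not say how a detected warning sign is converted into a distance. As a rough illustration only, the sketch below assumes a plain pinhole-camera model in which the physical height of the sign and the camera focal length (in pixels) are known; the function name, parameters, and numbers are hypothetical and are not taken from the paper.

```python
def estimate_distance_m(focal_length_px, sign_height_m, sign_height_px):
    """Pinhole-camera range estimate: distance = f * H / h.

    focal_length_px: camera focal length in pixels (assumed known from calibration)
    sign_height_m:   real-world height of the warning sign in metres (assumed known)
    sign_height_px:  height of the detected sign in the image, in pixels
    """
    if sign_height_px <= 0:
        raise ValueError("sign_height_px must be positive")
    return focal_length_px * sign_height_m / sign_height_px


# Hypothetical numbers: an 800 px focal length, a 0.5 m tall sign seen as 40 px.
print(estimate_distance_m(800.0, 0.5, 40.0))  # 10.0
```

Under this model the estimate is inversely proportional to the apparent sign size, so small detection errors matter most when the sign is far away and only a few pixels tall.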

Citations: 0

Surface Type Classification for Autonomous Robots Using Temporal, Statistical and Spectral Feature Extraction and Selection

2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)

Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00029

Md. Al Mehedi Hasan, Fuad Al Abir, Jungpil Shin

Real-time surface recognition has become a crucial component in ensuring that intelligent autonomous robots can walk safely in complex indoor human living environments. Numerous recent studies have addressed the problem. Still, there is room for improvement in classification accuracy and inference time. In this paper, we extract features from accelerometer and gyroscope data in the temporal, statistical and spectral domains and classify them using a tree-based ensemble classification algorithm. We achieve a mean accuracy of 80.81% with a standard deviation of 1.0% in 10-fold cross-validation when classifying 9 different surfaces, along with an average AUC score of 97.25%. Our method attains state-of-the-art accuracy while keeping inference time minimal, which is essential for real-time recognition on autonomous robots.
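
To make the three feature domains concrete, here is a minimal sketch that computes one statistical pair (mean, standard deviation), one temporal feature (zero-crossing count) and one spectral feature (dominant frequency) from a single sensor channel. The specific features, the function name and the naive DFT are illustrative assumptions; the abstract does not list the features actually used. Feature vectors like this would then be passed to a tree-based ensemble classifier, which is not shown.

```python
import cmath
import math
import statistics


def extract_features(signal, sample_rate_hz):
    """Return a small dict of statistical, temporal and spectral features."""
    n = len(signal)
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)

    # Temporal: number of sign changes of the mean-removed signal.
    centered = [x - mean for x in signal]
    zero_crossings = sum(1 for a, b in zip(centered, centered[1:]) if a * b < 0)

    # Spectral: strongest non-DC bin of a naive O(n^2) DFT.
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)

    return {
        "mean": mean,
        "std": std,
        "zero_crossings": zero_crossings,
        "dominant_freq_hz": best_bin * sample_rate_hz / n,
    }


# Synthetic 5 Hz sine sampled at 64 Hz for one second.
samples = [math.sin(2 * math.pi * 5 * i / 64) for i in range(64)]
print(extract_features(samples, 64.0)["dominant_freq_hz"])  # 5.0
```

A real pipeline would use an FFT rather than the quadratic DFT above; the naive form is used only to keep the sketch free of third-party dependencies.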

Citations: 0

A Multi-scale Binarized Neural Network Application Based on All Programmable System on Chip

2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)

Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00030

Maoyang Xiang, T. Teo

Binary neural networks (BNNs) are particularly well-suited for low-power embedded devices with limited computational capabilities. Because their weight parameters are binary, they significantly reduce memory footprint and arithmetic logic unit operations. Nevertheless, the disadvantages of BNNs include low accuracy and a sharp optimization space. Several recent studies of BNNs have shown improved accuracy in various tests by using more operations and more complicated topologies. This approach, however, is incompatible with embedded BNN applications since it requires complicated data type translation. Hence, in this research we propose a novel approach for BNN applications on embedded systems with a multi-scale neural network topology, from two optimization perspectives: hardware structure and BNN topology. The topology preserves more low-level information during the feed-forward process with few operations. Our network topology achieves 91.3% accuracy on the CIFAR-10 dataset, one of the highest recorded for a BNN, and can process 537 tiny pictures per second when deployed on an All Programmable System on Chip (APSoC) device with 4.4 W power consumption.
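
The reduction in arithmetic operations mentioned above usually comes from replacing multiply-accumulate with XNOR and popcount on bit-packed values. The following is a generic sketch of that standard BNN trick, not the authors' hardware design; the helper names are made up for illustration.

```python
def binarize(values):
    """Map real values to {-1, +1} using the sign (0 is treated as +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a {-1, +1} vector into an integer: bit i is 1 when signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    XNOR marks the positions where the signs agree; each agreement
    contributes +1 and each disagreement -1, so dot = 2 * matches - n.
    """
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * matches - n


activations = binarize([0.3, -1.2, 0.0, 2.5])   # [1, -1, 1, 1]
weights = binarize([-0.7, -0.1, 0.9, 1.1])      # [-1, -1, 1, 1]
print(xnor_popcount_dot(pack_bits(activations), pack_bits(weights), 4))  # 2
```

The result matches the ordinary dot product of the two sign vectors, which is why a binarized layer can be evaluated with bitwise logic instead of multipliers.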

Citations: 0

A Low Cost and Portable Mini Motor Car System with a BNN Accelerator on FPGA

2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)

Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00020

Fumio Hamanaka, Takuto Kanamori, Kenji Kise

Deep neural networks (DNNs) are one of the key technologies for realizing autonomous driving. However, since a DNN needs a lot of computation, it is challenging for an edge device with limited computation resources to support one. The binarized neural network (BNN) has been proposed to reduce latency and parameter size and is well suited to hardware implementation. Since DNN technology is still evolving and better algorithms emerge over time, implementing a DNN on an FPGA is preferable to an ASIC. In this paper, we propose a low-cost and portable mini motor car system with a BNN accelerator on an FPGA. We compare a road tracking demonstration with that of a similar motor car using a Raspberry Pi and show the effectiveness of the FPGA in a DNN implementation. The proposed system is implemented on the Nexys A7, one of the most popular FPGA development boards, which uses an Artix-7 FPGA.

Citations: 0

Energy saving in a multi-context coarse grained reconfigurable array with non-volatile flip-flops

2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)

Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00047

Aika Kamei, Takuya Kojima, H. Amano, Daiki Yokoyama, Hisato Miyauchi, K. Usami, Keizo Hiraga, Kenta Suzuki, K. Bessho

In this study, a second-generation coarse-grained reconfigurable array with non-volatile flip-flops (NVFFs), known as the non-volatile cool mega array with multi-context (NVCMA/MC), is proposed. As in the previous NVCMA, verify-and-retriable NVFFs (VR-NVFFs) are used for its configuration memory, constant memory, data memory, and instruction memory. In addition to power gating functions, the microcontroller is provided with dedicated instructions for controlling the store, verify, and restore operations of the NVFFs. Based on experience with the NVCMA, four hardware contexts are introduced to hold the configuration data for four tasks without a memory-leakage penalty. The array size is expanded, and pipeline registers are introduced to ease the trade-off between performance and power consumption. This study mainly focuses on the energy-saving effect of the VR-NVFFs and the multi-context facility of the NVCMA/MC, including the measurement of the break-even point. The evaluation of a real chip implemented with 40 nm MTJ/MOS hybrid process technology demonstrates that the store energy is reduced by 65% with the two-step store control of the VR-NVFFs. Moreover, applications that run intermittently at intervals as short as approximately 3 μs can benefit from the multi-context power gating.
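
The break-even point mentioned above can be illustrated with a simple first-order model: power gating pays off only when the idle interval is long enough that the leakage energy saved exceeds the energy spent storing and restoring state. The model and all numbers below are assumptions for illustration and are not measurements from the paper.

```python
def break_even_time_s(store_energy_j, restore_energy_j, leakage_power_w):
    """Idle time at which power gating starts to save energy.

    First-order model: gating is worthwhile when
        leakage_power * idle_time > store_energy + restore_energy.
    """
    if leakage_power_w <= 0:
        raise ValueError("leakage_power_w must be positive")
    return (store_energy_j + restore_energy_j) / leakage_power_w


def gating_saves_energy(idle_time_s, store_energy_j, restore_energy_j, leakage_power_w):
    """True when an idle interval is longer than the break-even time."""
    return idle_time_s > break_even_time_s(store_energy_j, restore_energy_j, leakage_power_w)


# Hypothetical values: 4 nJ store, 1 nJ restore, 0.5 mW leakage.
t_be = break_even_time_s(4e-9, 1e-9, 0.5e-3)
print(f"{t_be * 1e6:.1f} us")
```

In this model, lowering the store energy directly shortens the break-even time, which is consistent with the abstract's emphasis on reducing store energy.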

Citations: 4

Acceleration of Gravitation Field Analysis for Asteroids by GPU Computation

2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)

Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00010

Fumiya Kono, N. Nakasato, N. Hirata, K. Matsumoto

Research based on space-probe exploration of asteroids is being actively pursued to shed light on the origin of the solar system and of life. One method toward this goal is to analyze the structure of solar system bodies by numerical simulation. GFandSlope is a code that calculates the gravitation field, slope, and attraction for given model data of small solar system bodies. With the existing sequential code, analyzing high-resolution models under different initial conditions inevitably takes a long time. This work achieves computation several thousand times faster than the previous code through a GPU implementation, which will also boost research in the field of space science. This paper presents an evaluation of our GPU codes for fast gravitation field analysis and discusses the numerical precision of floating-point operations on the GPU for practical application.
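
The abstract does not describe the algorithm inside GFandSlope. As a stand-in that shows why this kind of analysis parallelizes well, the sketch below sums point-mass contributions to the gravitational acceleration at one field point; the point-mass discretization and the function name are assumptions, not the paper's method.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def gravity_at(point, masses):
    """Acceleration (m/s^2) at `point` from a list of ((x, y, z), mass_kg) elements."""
    ax = ay = az = 0.0
    for (x, y, z), m in masses:
        dx, dy, dz = x - point[0], y - point[1], z - point[2]
        r2 = dx * dx + dy * dy + dz * dz
        if r2 == 0.0:
            continue  # skip an element located exactly at the field point
        inv_r3 = 1.0 / (r2 * math.sqrt(r2))
        ax += G * m * dx * inv_r3
        ay += G * m * dy * inv_r3
        az += G * m * dz * inv_r3
    return (ax, ay, az)


# One 1e12 kg element at the origin, evaluated 1 km away along +x.
print(gravity_at((1000.0, 0.0, 0.0), [((0.0, 0.0, 0.0), 1e12)]))
```

Each field point is evaluated independently of the others, so on a GPU one thread per field point is a natural mapping. The accumulation order and the choice of single or double precision affect the result, which relates to the precision discussion mentioned in the abstract.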

Citations: 0

Evaluation of Recursive Feature Elimination and LASSO Regularization-based optimized feature selection approaches for cervical cancer prediction

2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)

Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00056

Mohamed Hamada, Jesse Jeremiah Tanimu, Mohammed Hassan, H. Kakudi, Patience Robert

Cervical cancer is one of the leading causes of premature mortality among women worldwide, and more than 85% of these deaths occur in developing countries. There are several risk factors associated with cervical cancer. The aim of this research is to develop a model for predicting a patient's cervical cancer outcome, given risk patterns from individual medical records and preliminary screening. This work presents a machine learning method using the Decision Tree (DT) algorithm to analyze the risk factors of cervical cancer. The Recursive Feature Elimination (RFE) and least absolute shrinkage and selection operator (LASSO) feature selection techniques were explored to determine the most important attributes for cervical cancer prediction. A comparative analysis of the two feature selection techniques was performed to show the importance of feature selection in cervical cancer prediction. Based on the results of the analysis, we conclude that the proposed model achieved its highest accuracies of 98% and 96% when using DT with the RFE and LASSO feature selection techniques, respectively.

Citations: 5

RVCoreP-32IC: An optimized RISC-V soft processor supporting the compressed instructions

2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)

Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00014

Takuto Kanamori, Kenji Kise

The compressed instruction extension in RISC-V reduces program size. However, it requires complicated logic in the instruction fetch unit and has an impact on performance. In this paper, we propose an instruction fetch unit that supports the compressed instructions while achieving high performance. Furthermore, we propose a RISC-V soft processor using this unit. We implement the proposed processor in Verilog HDL and verify its behavior using Verilog simulation and a Xilinx Artix-7 FPGA board. We compare benchmark results and hardware resource usage with related work. The evaluation results show that the proposed processor achieves a 42.5% performance improvement compared with VexRiscv, a high-performance, open-source RV32IC processor.
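
One source of the fetch-unit complexity is that 16-bit and 32-bit instructions are mixed in the same stream, so a 32-bit instruction may start on a halfword boundary that is not word-aligned. In the RISC-V encoding, an instruction whose two lowest bits are not 0b11 is a 16-bit compressed instruction. The sketch below, a software illustration rather than the paper's hardware, splits a little-endian stream of halfwords into instructions using that rule; the function name is made up.

```python
def split_instructions(halfwords):
    """Split 16-bit halfwords into (length_in_bits, encoding) tuples.

    Handles only the 16-bit and 32-bit lengths used by RV32IC.
    """
    out = []
    i = 0
    while i < len(halfwords):
        low = halfwords[i]
        if low & 0b11 != 0b11:
            # Compressed instruction: fits in a single halfword.
            out.append((16, low))
            i += 1
        else:
            # Standard instruction: needs the following halfword as the upper bits.
            if i + 1 >= len(halfwords):
                raise ValueError("truncated 32-bit instruction")
            out.append((32, (halfwords[i + 1] << 16) | low))
            i += 2
    return out


# c.nop (0x0001), then addi x0, x0, 0 (0x00000013) starting at a
# halfword-aligned but not word-aligned address, then another c.nop.
stream = [0x0001, 0x0013, 0x0000, 0x0001]
print(split_instructions(stream))  # [(16, 1), (32, 19), (16, 1)]
```

In hardware, the same decision has to be made within the fetch stage, and a 32-bit instruction that straddles two fetched words needs data from both, which is part of what makes a compressed-capable fetch unit more involved.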

Citations: 1

Execution Right Delegation Scheduling Algorithm for Multiprocessor

2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)

Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00015

Takaharu Suzuki, Kiyofumi Tanaka

In scheduling algorithms based on the Rate Monotonic (RM) method, which is widely used in the development of real-time systems, tasks with shorter periods have higher priorities. In contrast, tasks with longer periods are likely to suffer from increased response times and jitter because of their lower priorities. We previously proposed the Execution Right Delegation (ERD) method for RM-based uniprocessor systems, in which a high-priority server for a privileged (or important) task is introduced to shorten the response times of that task. In this paper, we propose an extended ERD method for multiprocessor systems. Our system model is based on partitioned systems in which only the privileged task can migrate. The evaluation confirms that the response times of the privileged task are reduced compared with partitioned Fixed-Task-Priority (FTP) and global FTP scheduling.
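
For readers unfamiliar with RM, the sketch below shows only the baseline rule described in the first sentence (shorter period means higher priority) together with the classical Liu and Layland utilization bound for a uniprocessor. It does not implement the ERD method; the task set and names are hypothetical.

```python
def rm_priority_order(tasks):
    """Return task names from highest to lowest RM priority.

    tasks: list of (name, wcet, period) tuples; shorter period = higher priority.
    """
    return [name for name, _, _ in sorted(tasks, key=lambda t: t[2])]


def utilization(tasks):
    """Total processor utilization: sum of wcet / period."""
    return sum(wcet / period for _, wcet, period in tasks)


def liu_layland_bound(n):
    """Sufficient RM schedulability bound for n periodic tasks on one processor."""
    return n * (2 ** (1 / n) - 1)


tasks = [("logger", 2, 20), ("control", 1, 4), ("sensor", 2, 10)]
print(rm_priority_order(tasks))                 # ['control', 'sensor', 'logger']
print(utilization(tasks) <= liu_layland_bound(len(tasks)))  # True
```

Here "logger" has the longest period and therefore the lowest priority, which is exactly the kind of task whose response time and jitter suffer under plain RM.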

Citations: 0

Configuring an Embedded Neuromorphic Coprocessor Using a RISC-V Chip for Enabling Edge Computing Applications

2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)

Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00055

Evelina Forno, Andrea Spitale, E. Macii, Gianvito Urgese

Neuromorphic hardware shows promising potential for use in edge computing applications, as it can provide real-time, low-power processing of complex data directly at the edge using a computational paradigm based on Spiking Neural Networks (SNNs). However, such systems cannot be deployed as edge devices by themselves, as they require an external host for configuration and data input management. In this paper, we present a chip-level integrated system that performs on-edge configuration of a neuromorphic platform. The proposed solution makes use of two existing open-source platforms: the low-power RISC-V processor Rocket Chip and the digital SNN processor ODIN. We built the two systems into a single SoC using the Chipyard framework and connected them by designing a communication interface using ODIN's SPI and AER input/output ports. We validated the system by RTL simulation of a synfire chain running on ODIN, in which Rocket Chip sets up the network configuration, triggers the first spike, and then collects the simulation results. The synthesized design uses a modest amount of resources on a PYNQ-Z2 board: 16% of LUT slices, 11% of Block RAMs and 8 pins, leaving plenty of room to integrate other peripherals or systems. The present work represents a first step towards seamless integration of neuromorphic technologies with state-of-the-art processors, improving the ease of use of neuromorphic devices and leading the way to widespread use of SNN coprocessors in edge computing applications.
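
A synfire chain is a feed-forward sequence of neurons in which a spike in one stage triggers the next. The toy simulation below uses a simple leaky integrate-and-fire update to show the expected behaviour after a single external trigger. It is a conceptual sketch with made-up parameters; it does not reproduce ODIN's neuron model, its SPI/AER interface, or the authors' test setup.

```python
def simulate_synfire_chain(n_neurons, weight, threshold, leak, steps):
    """Simulate a chain where neuron i excites neuron i + 1 one step later.

    Returns a list of (time_step, neuron_index) spike events.
    """
    potentials = [0.0] * n_neurons
    incoming = [0.0] * n_neurons
    incoming[0] = weight  # external trigger delivered to the first neuron
    spikes = []
    for t in range(steps):
        next_incoming = [0.0] * n_neurons
        for i in range(n_neurons):
            # Leaky integration of the input that arrived this step.
            potentials[i] = potentials[i] * leak + incoming[i]
            if potentials[i] >= threshold:
                spikes.append((t, i))
                potentials[i] = 0.0  # reset after firing
                if i + 1 < n_neurons:
                    next_incoming[i + 1] += weight
        incoming = next_incoming
    return spikes


print(simulate_synfire_chain(4, 1.0, 1.0, 0.9, 6))
# [(0, 0), (1, 1), (2, 2), (3, 3)]
```

A host processor validating such a network would check exactly this kind of ordered spike sequence: one spike per stage, each one step after the previous.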

Citations: 2
