Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00034
Kaisei Shimura, Yoichi Tomioka, Qiang Zhao
Mobility scooters are increasingly used to extend the range of mobility of the elderly. However, accidents involving mobility scooters have become a serious problem. For example, a mobility scooter that stops inside a railway crossing because its battery is exhausted is in great danger of being hit by a train. Measuring the distance to a railway crossing while driving helps the rider avoid entering the crossing without enough battery charge. In this paper, we propose a method for predicting the distance to a railway crossing based on the railway crossing warning signs in video from a camera installed on the front of the mobility scooter. In experiments, we evaluate the proposed method using images taken at various positions relative to the railway crossing and show that it achieves higher accuracy than distance estimation using a depth sensor.
Title: A Distance Estimation Method to Railway Crossing Using Warning Signs. Published in: 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC).
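The abstract does not say how the distance is computed from the warning sign. As a rough illustration only, one common baseline for estimating distance from an object of known size is the pinhole-camera relation, distance = f * H / h. The function name, the focal length, and the sign height below are assumptions made for this sketch and are not taken from the paper.

```python
def distance_from_sign(focal_length_px: float, sign_height_m: float, sign_height_px: float) -> float:
    """Pinhole-camera estimate of the distance to an object of known height.

    focal_length_px: camera focal length in pixels (assumed to come from calibration).
    sign_height_m:   real-world height of the warning sign in metres (hypothetical value).
    sign_height_px:  height of the detected sign in the image, in pixels.
    """
    if sign_height_px <= 0:
        raise ValueError("the sign must have a positive height in the image")
    return focal_length_px * sign_height_m / sign_height_px


# Hypothetical numbers: the estimate halves when the sign appears twice as tall.
far = distance_from_sign(1000.0, 0.9, 45.0)
near = distance_from_sign(1000.0, 0.9, 90.0)
```

A relation like this is sensitive to errors in the detected bounding box when the sign is small in the frame, which may be one reason to prefer a learned predictor in practice.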
Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00029
Md. Al Mehedi Hasan, Fuad Al Abir, Jungpil Shin
Real-time surface recognition has become a crucial component in ensuring that intelligent autonomous robots walk safely in complex indoor human living environments. Numerous recent studies have addressed the problem, but there is still room for improvement in classification accuracy and inference time. In this paper, we extract features from accelerometer and gyroscope data in the temporal, statistical, and spectral domains and classify them using a tree-based ensemble classification algorithm. We achieve 80.81% mean accuracy in classifying 9 different surfaces, with 1.0% standard deviation in 10-fold cross-validation, and a 97.25% average AUC score. Our method attains state-of-the-art accuracy with minimal inference time, which is essential for real-time recognition in autonomous robots.
Title: Surface Type Classification for Autonomous Robots Using Temporal, Statistical and Spectral Feature Extraction and Selection. Published in: 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC).
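The abstract names three feature domains but does not list the individual features. The sketch below computes one or two plausible features per domain for a single sensor-axis window, using only the Python standard library. The specific features (zero crossings, standard deviation, dominant frequency) and the function name are assumptions chosen for illustration; the classifier stage is not shown.

```python
import cmath
import math
import statistics


def window_features(samples, sample_rate_hz):
    """Return a few temporal, statistical, and spectral features of one window."""
    n = len(samples)
    if n < 4:
        raise ValueError("window too short")
    mean = statistics.fmean(samples)
    centered = [x - mean for x in samples]

    # Temporal: how often the mean-removed signal changes sign.
    zero_crossings = sum(1 for a, b in zip(centered, centered[1:]) if a * b < 0)

    # Statistical: spread of the window.
    std = statistics.pstdev(samples)

    # Spectral: naive DFT magnitudes for bins 1..n/2 (adequate for short windows).
    magnitudes = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(centered))
        magnitudes.append(abs(coeff))
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__) + 1
    dominant_hz = peak_bin * sample_rate_hz / n

    return {
        "mean": mean,
        "std": std,
        "zero_crossings": zero_crossings,
        "dominant_hz": dominant_hz,
    }


# Synthetic 5 Hz vibration sampled at 100 Hz for one second (hypothetical data).
signal = [math.sin(2 * math.pi * 5 * i / 100) for i in range(100)]
features = window_features(signal, 100.0)
```

In a full pipeline, a dictionary like this would be computed per axis and per window, concatenated into one feature vector, and passed to the tree-based ensemble.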
Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00030
Maoyang Xiang, T. Teo
Binary neural networks (BNNs) are particularly well-suited for low-power embedded devices with limited computational capabilities. Because their weight parameters are binary, they significantly reduce memory footprint and arithmetic logic unit operations. Nevertheless, BNNs suffer from low accuracy and a sharp optimization space. Several recent studies of BNNs have improved accuracy in various tests by using more operations and more complicated topologies. This approach, however, is incompatible with embedded BNN applications since it requires complicated data type translation. Hence, we propose a novel approach to BNN applications on embedded systems based on a multi-scale neural network topology, designed from two optimization perspectives, hardware structure and BNN topology, which preserves more low-level information during the feed-forward process with few operations. Our network topology achieves 91.3% accuracy on the CIFAR-10 dataset, one of the highest recorded by a BNN, and can process 537 tiny pictures per second when deployed on an All Programmable System on Chip (APSoC) device with 4.4 W power consumption.
Title: A Multi-scale Binarized Neural Network Application Based on All Programmable System on Chip. Published in: 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC).
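To make the claim about reduced arithmetic concrete, the sketch below shows the standard XNOR-popcount trick: when both operands are constrained to -1 or +1, a dot product reduces to a bitwise XNOR followed by a population count. This is a generic illustration of BNN arithmetic, assuming binary activations as well as binary weights; it is not the multi-scale topology or the hardware design of the paper, and the function names are hypothetical.

```python
def binarize(values):
    """Pack sign bits into an int: bit i is 1 when values[i] >= 0 (+1), else 0 (-1)."""
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two {-1, +1} vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: 1 where the signs match
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n              # matches minus mismatches


weights = [0.3, -1.2, 0.8, -0.1]       # hypothetical real-valued weights
activations = [1.0, 0.5, -2.0, -0.7]   # hypothetical real-valued activations
result = binary_dot(binarize(weights), binarize(activations), len(weights))
```

Each packed word replaces a row of multiply-accumulate operations with two logic operations, which is where the savings in memory footprint and ALU work come from.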
Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00020
Fumio Hamanaka, Takuto Kanamori, Kenji Kise
A deep neural network (DNN) is one of the key technologies for realizing autonomous driving. However, since a DNN requires a large amount of computation, it is challenging for an edge device with limited computation resources to support one. Binarized neural networks (BNNs) have been proposed to reduce latency and parameter size and are suited for hardware implementation. Since DNN technology is still growing and better algorithms appear over time, implementing a DNN on an FPGA is preferable to using an ASIC. In this paper, we propose a low-cost and portable mini motor car system with a BNN accelerator on an FPGA. We compare its road tracking demonstration with that of a similar motor car using a Raspberry Pi and show the effectiveness of the FPGA for DNN implementation. The proposed system is implemented on the Nexys A7, one of the most popular FPGA development boards, which uses an Artix-7 FPGA.
Title: A Low Cost and Portable Mini Motor Car System with a BNN Accelerator on FPGA. Published in: 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC).
Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00047
Aika Kamei, Takuya Kojima, H. Amano, Daiki Yokoyama, Hisato Miyauchi, K. Usami, Keizo Hiraga, Kenta Suzuki, K. Bessho
In this study, a second-generation coarse-grained reconfigurable array with non-volatile flip-flops (NVFFs), known as the non-volatile cool mega array with multi-context (NVCMA/MC), is proposed. As in the previous NVCMA, verify-and-retriable NVFFs (VR-NVFFs) are provided for its configuration memory, constant memory, data memory, and instruction memory. Dedicated instructions for controlling the store, verify, and restore operations of the NVFFs are provided to the microcontroller in addition to power gating functions. Based on experience with the NVCMA, four hardware contexts are introduced to maintain the configuration data for four tasks without sacrificing memory leakage. The array size is expanded, and pipeline registers are introduced to reduce the trade-off between performance and power consumption. This study mainly focuses on the energy-saving effect of the VR-NVFFs and the multi-context facility of the NVCMA/MC, including the measurement of the break-even point. The evaluation of a real chip implemented with 40 nm MTJ/MOS hybrid process technology demonstrates that the store energy is reduced by 65% with the two-step store control of the VR-NVFFs. Moreover, applications that run intermittently at intervals as short as approximately 3 μs can benefit from the multi-context power gating.
Title: Energy saving in a multi-context coarse grained reconfigurable array with non-volatile flip-flops. Published in: 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC).
Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00010
Fumiya Kono, N. Nakasato, N. Hirata, K. Matsumoto
Research based on space probe exploration of asteroids has been actively pursued to approach the origin of the solar system and of life. One method toward this goal is analyzing the structure of solar system bodies by numerical simulation. GFandSlope is a code that calculates the gravitation field, slope, and attraction of given model data for small solar system bodies. With the existing sequential code, analyzing high-resolution models under different initial conditions inevitably takes a long time. Our GPU implementation computes several thousand times faster than the previous code, which will also boost research in the field of space science. This paper presents an evaluation of our GPU codes for fast gravitation field analysis and discusses the numerical precision of floating-point operations on the GPU for practical application.
Title: Acceleration of Gravitation Field Analysis for Asteroids by GPU Computation. Published in: 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC).
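The abstract does not describe how GFandSlope evaluates the field. As a simplified stand-in, the sketch below treats the body as a set of point masses and sums their contributions at each evaluation point. The point-mass approximation, the function names, and the example values are assumptions made for illustration; the sequential Python loop only shows the structure that a GPU version would parallelize.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def gravity_at(point, masses):
    """Acceleration (m/s^2) at `point` from point masses [(mass_kg, (x, y, z)), ...].

    The evaluation point must not coincide with any mass position.
    """
    gx = gy = gz = 0.0
    for mass, (x, y, z) in masses:
        dx, dy, dz = x - point[0], y - point[1], z - point[2]
        r2 = dx * dx + dy * dy + dz * dz
        inv_r3 = 1.0 / (r2 * math.sqrt(r2))
        gx += G * mass * dx * inv_r3
        gy += G * mass * dy * inv_r3
        gz += G * mass * dz * inv_r3
    return (gx, gy, gz)


def gravity_field(points, masses):
    # Every evaluation point is independent of the others, so this loop
    # maps naturally onto one GPU thread per point.
    return [gravity_at(p, masses) for p in points]


# Hypothetical body: a single 1e12 kg mass at the origin, sampled 1 km away.
field = gravity_field([(1000.0, 0.0, 0.0)], [(1e12, (0.0, 0.0, 0.0))])
```

The inner sum accumulates many terms of very different magnitude, which is the kind of computation where single versus double precision on the GPU can matter.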
Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00056
Mohamed Hamada, Jesse Jeremiah Tanimu, Mohammed Hassan, H. Kakudi, Patience Robert
Cervical cancer is one of the leading causes of premature mortality among women worldwide, and more than 85% of these deaths occur in developing countries. There are several risk factors associated with cervical cancer. In this research, the aim is to develop a model for predicting a patient's cervical cancer outcome, given risk patterns from individual medical records and preliminary screening. This work presents a machine learning method using the Decision Tree (DT) algorithm to analyze the risk factors of cervical cancer. Recursive Feature Elimination (RFE) and least absolute shrinkage and selection operator (LASSO) feature selection techniques were explored to determine the most important attributes for cervical cancer prediction. A comparative analysis of the two feature selection techniques was performed to show the importance of feature selection in cervical cancer prediction. Based on the results of the analysis, we conclude that the proposed model achieved its highest accuracies of 98% and 96% when using DT with the RFE and LASSO feature selection techniques, respectively.
Title: Evaluation of Recursive Feature Elimination and LASSO Regularization-based optimized feature selection approaches for cervical cancer prediction. Published in: 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC).
Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00014
Takuto Kanamori, Kenji Kise
The compressed instruction extension in RISC-V reduces program size. However, it requires complicated logic in the instruction fetch unit and affects performance. In this paper, we propose an instruction fetch unit that supports compressed instructions while achieving high performance. Furthermore, we propose a RISC-V soft processor using this unit. We implement the proposed processor in Verilog HDL and verify its behavior using Verilog simulation and a Xilinx Artix-7 FPGA board. We compare benchmark results and hardware usage with related work. The evaluation results show that the proposed processor achieves a 42.5% performance improvement over VexRiscv, a high-performance open-source RV32IC processor.
Title: RVCoreP-32IC: An optimized RISC-V soft processor supporting the compressed instructions. Published in: 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC).
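The fetch-unit complexity mentioned in the abstract comes from mixing 16-bit and 32-bit instructions in one stream. In the RISC-V encoding, an instruction whose two lowest bits are not `11` is a 16-bit compressed instruction; otherwise it is 32 bits wide in RV32IC. The sketch below is a software model of that length decoding, assuming a little-endian byte stream with no instructions longer than 32 bits. It is not the proposed fetch unit, and the function name is hypothetical.

```python
def split_instructions(code: bytes):
    """Return (byte_offset, length_in_bytes, encoding) for each instruction."""
    out = []
    pc = 0
    while pc + 2 <= len(code):
        low = int.from_bytes(code[pc:pc + 2], "little")
        if low & 0b11 != 0b11:
            # Lowest two bits are not 11: a 16-bit compressed instruction.
            out.append((pc, 2, low))
            pc += 2
        else:
            # Lowest two bits are 11: a 32-bit instruction.
            if pc + 4 > len(code):
                raise ValueError("truncated 32-bit instruction")
            out.append((pc, 4, int.from_bytes(code[pc:pc + 4], "little")))
            pc += 4
    return out


# c.nop (0x0001), addi x0, x0, 0 (0x00000013), c.nop (0x0001)
stream = bytes.fromhex("0100" "13000000" "0100")
decoded = split_instructions(stream)
```

In this stream the 32-bit instruction starts at byte offset 2, so it straddles a 4-byte fetch boundary. Handling that case without stalling is the kind of problem a hardware fetch unit for compressed instructions has to solve.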
Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00015
Takaharu Suzuki, Kiyofumi Tanaka
In scheduling algorithms based on the Rate Monotonic (RM) method, which is widely used in the development of real-time systems, tasks with shorter periods have higher priorities. In contrast, tasks with longer periods are likely to suffer from increased response times and jitter due to their lower priorities. We previously proposed the Execution Right Delegation (ERD) method for uniprocessor systems based on RM, in which a high-priority server for a privileged (or important) task is introduced to shorten the task's response times. In this paper, we propose an extended ERD method for multiprocessor systems. Our system model is based on partitioned systems in which only the privileged task can migrate. The evaluation confirms that response times of the privileged task are reduced compared with partitioned Fixed-Task-Priority (FTP) and global FTP scheduling.
Title: Execution Right Delegation Scheduling Algorithm for Multiprocessor. Published in: 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC).
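The effect described in the abstract, where long-period tasks see long response times under RM, can be reproduced with the classic response-time analysis for fixed-priority preemptive scheduling on one processor. The sketch below assigns priorities by period and iterates R = C + sum(ceil(R / T_j) * C_j) over higher-priority tasks until it converges. It illustrates plain RM only, not the ERD method; the task set and function name are hypothetical, and deadlines are assumed equal to periods.

```python
def rm_response_times(tasks):
    """tasks: list of (wcet, period) in integer time units, deadline == period.

    Returns (wcet, period, response_time) tuples in Rate Monotonic priority
    order; response_time is None if the task can miss its deadline.
    """
    ordered = sorted(tasks, key=lambda t: t[1])  # shorter period = higher priority
    results = []
    for i, (wcet, period) in enumerate(ordered):
        higher = ordered[:i]
        r = wcet
        while True:
            # Each higher-priority task releases ceil(r / T_j) jobs within r.
            nxt = wcet + sum(-(-r // t_j) * c_j for c_j, t_j in higher)
            if nxt == r or nxt > period:
                break
            r = nxt
        results.append((wcet, period, nxt if nxt <= period else None))
    return results


# Hypothetical task set: (worst-case execution time, period)
analysis = rm_response_times([(3, 12), (1, 4), (2, 6)])
```

For this task set the longest-period task needs 3 units of execution but has a worst-case response time of 10, most of it spent waiting for higher-priority work. That waiting time is what a high-priority server for a privileged task is meant to cut.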
Pub Date : 2021-12-01 DOI: 10.1109/MCSoC51149.2021.00055
Evelina Forno, Andrea Spitale, E. Macii, Gianvito Urgese
Neuromorphic hardware shows promising potential for use in edge computing applications, as it can provide real-time and low-power processing of complex data directly on the edge using a computational paradigm based on Spiking Neural Networks (SNNs). However, such systems cannot be deployed as edge devices by themselves, as they require an external host for configuration and data input management. In this paper, we present a chip-level integrated system performing on-edge configuration of a neuromorphic platform. The proposed solution makes use of two existing open-source platforms: the low-power RISC-V processor Rocket Chip and the digital SNN processor ODIN. We built the two systems into a single SoC using the Chipyard framework and connected them by designing a communication interface using ODIN's SPI and AER input/output ports. We validated the system by RTL simulation of a synfire chain running on ODIN, where Rocket Chip sets up the configuration of the network, triggers the first spike, then collects the simulation results. The synthesized design uses a modest amount of resources on a PYNQ-Z2 board: 16% of LUT slices, 11% of Block RAMs, and 8 pins, leaving plenty of room to integrate other peripherals or systems. The present work represents a first step towards seamless integration of neuromorphic technologies with state-of-the-art processors, improving the ease of use of neuromorphic devices and paving the way for widespread use of SNN coprocessors in edge computing applications.
Title: Configuring an Embedded Neuromorphic Coprocessor Using a RISC-V Chip for Enabling Edge Computing Applications. Published in: 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC).