2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)

Title: Welcome from the MCSoC 2021 Chairs
Published: 2021-12-01
DOI: 10.1109/mcsoc51149.2021.00005

Title: FPGA based Adaptive Hardware Acceleration for Multiple Deep Learning Tasks
Authors: Yufan Lu, X. Zhai, S. Saha, Shoaib Ehsan, K. McDonald-Maier
Published: 2021-12-01
DOI: 10.1109/MCSoC51149.2021.00038
Abstract: Machine learning, and in particular deep learning (DL), has seen strong success in a wide variety of applications, e.g., object detection, image classification, and self-driving. However, limits on hardware resources and power consumption make it challenging to deploy DL algorithms on resource-constrained mobile and embedded systems, especially systems that run multiple DL algorithms for a variety of tasks. This paper proposes an adaptive hardware resource management system, implemented on field-programmable gate arrays (FPGAs), that dynamically manages the on-chip hardware resources (e.g., LUTs, BRAMs, and DSPs) to adapt to the tasks at hand. Using dynamic function exchange (DFX) technology, the system dynamically allocates hardware resources to deploy deep learning units (DPUs), balancing the requirements, performance, and power consumption of the deep learning applications. The prototype is implemented on Xilinx Zynq UltraScale+ series chips. Experimental results indicate that the proposed scheme significantly improves the computing efficiency of resource-constrained systems under various scenarios. Compared to the baseline, the proposed strategy consumes 38% and 82% of the baseline's power in low- and high-workload cases, respectively, and typically saves approximately 75.8% of energy.
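
The allocation idea in this abstract can be sketched as a greedy planner: given the remaining on-chip LUT/BRAM/DSP budget, assign each pending task the largest DPU variant that still fits. The variant names and resource figures below are illustrative placeholders, not numbers from the paper:

```python
# Toy sketch of adaptive DPU allocation under a fixed FPGA resource budget.
# Variant names and resource costs are made up for illustration.

DPU_VARIANTS = {
    # name: (LUTs, BRAMs, DSPs, relative throughput)
    "B512":  (27000, 73, 78, 1.0),
    "B1152": (47000, 123, 164, 2.2),
    "B4096": (105000, 255, 710, 6.3),
}

def pick_dpus(tasks, luts, brams, dsps):
    """Greedily give each task the highest-throughput DPU variant that
    still fits in the remaining LUT/BRAM/DSP budget."""
    plan = []
    for task in tasks:
        for name, (l, b, d, _) in sorted(
                DPU_VARIANTS.items(), key=lambda kv: -kv[1][3]):
            if l <= luts and b <= brams and d <= dsps:
                plan.append((task, name))
                luts, brams, dsps = luts - l, brams - b, dsps - d
                break
        else:
            plan.append((task, None))  # no variant fits: task must wait
    return plan
```

A run with two tasks and a mid-sized budget gives the first task the big variant and falls back to the small one for the second.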

Title: RELAX: a REconfigurabLe Approximate Network-on-Chip
Authors: Richard Fenster, S. L. Beux
Published: 2021-12-01
DOI: 10.1109/MCSoC51149.2021.00063
Abstract: The high error resilience of numerous applications, such as neural networks and signal processing, has opened new optimization opportunities in manycore systems. In particular, approximate computing enables reducing the bit width of data, which relaxes the design constraints on computing resources and memory. However, on-chip interconnects can hardly take advantage of the reduced data size, since they must also transmit full-size data. Consequently, existing approximate networks-on-chip (NoCs) either add physical layers dedicated to approximate data or significantly increase the energy needed to transfer non-approximate data. To address this challenge, we propose RELAX, a reconfigurable network-on-chip that operates either in an accurate-data-only mode or in a mixed mode. The mixed mode allows concurrent accurate and approximate data transactions over the same physical layer, enabling efficient transmission of approximate data while reducing resource overhead. Synthesis and simulation results show that RELAX improves the communication latency of approximate data by up to 44.2% compared to an accurate-data-only baseline 2D-mesh NoC.
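
A toy model of the mixed-mode idea, under our own assumption (not the paper's parameters) of 32-bit flits where approximate values are truncated to their 16 high-order bits so two can share one flit:

```python
# Illustrative flit-packing arithmetic for a mixed accurate/approximate NoC.
FLIT_BITS = 32
APPROX_BITS = 16

def truncate(value, bits=FLIT_BITS, keep=APPROX_BITS):
    """Drop the low-order bits of an approximate value."""
    return value & ~((1 << (bits - keep)) - 1)

def flits_needed(n_accurate, n_approximate):
    """Accurate values occupy a full flit each; truncated approximate
    values are packed two per flit on the same physical layer."""
    return n_accurate + (n_approximate + 1) // 2
```

For example, five approximate values ride in three flits instead of five, while accurate traffic is unchanged.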

Title: Boosting CPU Performance using Pipelined Branch and Jump Folding Hardware with Turbo Module
Authors: Mong Tee Sim
Published: 2021-12-01
DOI: 10.1109/MCSoC51149.2021.00060
Abstract: The new generation of embedded applications demands both high performance and energy efficiency. This paper presents a new hardware design that supports architecture-level thread isolation, logic to fold branch and jump instructions, and a Turbo module, thereby reducing the overall number of instructions flowing through the CPU without causing pipeline stalls. By pipelining the branch- and jump-folding logic across multiple threads of execution, the hardware can operate continuously at peak CPU speed, while power consumption drops because fewer microcontrollers are required in the system. We show that this technique accelerates system performance, raising instructions per cycle (IPC) to up to 1.36, and with the Turbo module up to 1.823, without any extra programming effort by developers. We used Dhrystone, CoreMark, and ten selected benchmarks to validate the performance and functionality of our system.
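
As a back-of-the-envelope check (our simplifying model, not the paper's analysis): if folding removes a fraction f of instructions from the pipeline of a stall-free single-issue core, apparent IPC becomes 1/(1-f), so the reported IPC of 1.36 corresponds to folding roughly 26% of the instruction stream, and 1.823 to roughly 45%:

```python
def apparent_ipc(folded_fraction):
    """Apparent IPC of a stall-free single-issue pipeline when a fraction
    of instructions is folded away (consumes no issue slot)."""
    return 1.0 / (1.0 - folded_fraction)
```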

Title: FPGA-Based Implementation of the Stereo Matching Algorithm Using High-Level Synthesis
Authors: Iman Firmansyah, Y. Yamaguchi
Published: 2021-12-01
DOI: 10.1109/MCSoC51149.2021.00009
Abstract: Stereo vision has a wide range of applications in automotive systems, object detection, robot navigation, agricultural mapping, and more. Stereo matching is an algorithm that identifies corresponding pixels across two or more images. This study implements stereo matching with the Sum of Absolute Differences (SAD) algorithm to extract object depth, or disparity, from stereo images. Our key objective is a stereo matching implementation for a small field-programmable gate array (FPGA) that requires relatively few resources while maintaining processing speed and disparity-map quality. To meet this requirement, we compute the matching with small window buffers, and we reduce occluded pixels by introducing a secondary consistency check. In experiments on the Zynq UltraScale+ ZCU102 FPGA board with the SDSoC compiler, computing the stereo matching with a 4×4 window buffer took 0.038 s for 486×720-pixel images and 0.051 s at 375×1242-pixel resolution. The design needed 1% each of the BRAMs and flip-flops and 7% of the LUTs. Applying the secondary consistency check in post-processing reduced pixel errors by 18%.
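
The SAD matching step itself is easy to prototype in software. Below is a brute-force NumPy analogue of the windowed design (our own window size and disparity range, not a model of the hardware buffers): for every window, slide along the scanline and keep the disparity with the lowest sum of absolute differences.

```python
import numpy as np

def box_sum(img, k):
    """Sum over every k-by-k window, computed via a 2-D integral image."""
    s = np.cumsum(np.cumsum(img, axis=0), axis=1)
    s = np.pad(s, ((1, 0), (1, 0)))
    return s[k:, k:] - s[:-k, k:] - s[k:, :-k] + s[:-k, :-k]

def sad_disparity(left, right, window=4, max_disp=16):
    """Per-window disparity: argmin over candidate shifts of the
    windowed sum of absolute differences."""
    h, w = left.shape
    costs = []
    for d in range(max_disp):
        shifted = np.full((h, w), 1e6)        # out-of-view pixels: huge cost
        shifted[:, d:] = right[:, :w - d] if d else right
        costs.append(box_sum(np.abs(left - shifted), window))
    return np.argmin(np.stack(costs, axis=-1), axis=-1)
```

On a synthetic pair where the right image is the left image shifted by a known disparity, the interior of the recovered map equals that disparity.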

Title: A Computation-Aware TPL Utilization Procedure for Parallelizing the FastICA Algorithm on a Multi-Core CPU
Authors: Lan-Da Van, Tao Wang, Sing-Jia Tzeng, T. Jung
Published: 2021-12-01
DOI: 10.1109/MCSoC51149.2021.00033
Abstract: Independent Component Analysis (ICA) is a widely used machine learning technique for separating mixed signals into statistically independent components. This study proposes a computation-aware (CA) Task Parallel Library (TPL) utilization procedure to parallelize the Fast Independent Component Analysis (FastICA) algorithm on a multi-core CPU. The CA method separates complex from simple computations by profiling their execution times on the multi-core CPU; TPL is then used for the complex computations but not the simple ones. Compared with the program without TPL, the proposed CA procedure reduces the execution time of decomposing 8- and 32-channel artificially mixed signals by 34.88% and 43.01%, respectively; compared with the fully parallelized TPL program, the reductions are 10.04% and 0.93%. With CA TPL, decomposing 12-channel electroencephalogram (EEG) signals takes 48.27% less time than without it, and 15.12% less than with the fully parallelized TPL program.
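
The CA decision itself (profile once, then parallelize only the expensive computations) can be sketched with Python's standard thread pool standing in for .NET's TPL, which is what the paper actually targets. Task names, the threshold, and the tuple format are our own illustration:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def run_computation_aware(tasks, threshold_s=1e-3):
    """Profile each (name, fn, arg) task once, then dispatch only the
    expensive ones to the pool; cheap tasks run inline, avoiding
    scheduling overhead. Assumes the fns are side-effect free."""
    costs = {}
    for name, fn, arg in tasks:          # profiling pass (the "CA" step)
        t0 = time.perf_counter()
        fn(arg)
        costs[name] = time.perf_counter() - t0
    results = {}
    with ThreadPoolExecutor() as pool:
        futures = {}
        for name, fn, arg in tasks:
            if costs[name] >= threshold_s:
                futures[name] = pool.submit(fn, arg)   # complex: parallelize
            else:
                results[name] = fn(arg)                # simple: run inline
        for name, fut in futures.items():
            results[name] = fut.result()
    return results
```

Whichever path a task takes, the results are identical; only the dispatch overhead differs.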

Title: LUSH: Lightweight Framework for User-level Scheduling in Heterogeneous Multicores
Authors: Vasco Xu, Liam White McShane, D. Mossé
Published: 2021-12-01
DOI: 10.1109/MCSoC51149.2021.00065
Abstract: As heterogeneous multicore systems become standard in computing devices, there is an increasing need for intelligent, adaptive resource-allocation schemes that balance performance and energy consumption. To support this growing need, researchers have explored a plethora of techniques to guide OS scheduling policies, including machine learning, statistical regression, and custom heuristics. Such techniques have been enabled by the abundance of low-level performance counters, and have proven effective at characterizing applications and predicting power and performance. However, most works require and develop custom infrastructures. In this paper we present LUSH, a Lightweight Framework for User-level Scheduling in Heterogeneous Multicores that lets users develop their own customized scheduling policies without requiring root privileges. LUSH contributes the following to the state of the art: (1) a mechanism for monitoring application runtime behavior using performance counters; (2) a mechanism for exporting kernel data to user level at a user-defined period; and (3) a parameterized, flexible interface for developing, deploying, and evaluating novel OS scheduling policies. The framework serves as a foundation for exploring advanced, intelligent resource-management techniques in heterogeneous systems.
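
What a user-level policy plugged into such a framework might look like is sketched below. This is a hypothetical placement rule of our own devising, not LUSH's API: it smooths each task's measured IPC with an exponentially weighted moving average and steers compute-bound tasks to big cores.

```python
class EwmaIpcPolicy:
    """Hypothetical user-level policy: smooth per-task IPC samples with an
    EWMA and place high-IPC (compute-bound) tasks on big cores."""

    def __init__(self, alpha=0.3, threshold=1.0):
        self.alpha, self.threshold = alpha, threshold
        self.ipc = {}  # task -> smoothed IPC

    def observe(self, task, instructions, cycles):
        """Fold one performance-counter sample into the task's EWMA."""
        sample = instructions / cycles
        prev = self.ipc.get(task, sample)
        self.ipc[task] = self.alpha * sample + (1 - self.alpha) * prev

    def place(self, task):
        """Core type for the next scheduling period."""
        return "big" if self.ipc.get(task, 0.0) >= self.threshold else "little"
```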

Title: A Network Simulator for the Estimation of Bandwidth Load and Latency Created by Heterogeneous Spiking Neural Networks on Neuromorphic Computing Communication Networks
Authors: R. Kleijnen, M. Robens, M. Schiek, S. Waasen
Published: 2021-12-01
DOI: 10.1109/MCSoC51149.2021.00054
Abstract: Observing long-term learning effects caused by neuron activity in the human brain in vivo, over weeks, months, or years, is impractical. Over the last decade, the field of neuromorphic computing hardware has grown significantly, e.g., SpiNNaker, BrainScaleS, and Neurogrid. These many-core simulation platforms offer a practical alternative for studying neuron behaviour in the brain at an accelerated rate and a high level of detail. However, they still fall far short of human-brain scale, in particular because the massive volume of spike communication becomes a bottleneck. In this paper, we introduce a network simulator developed specifically for detailed analysis of the bandwidth load and latency of different network topologies and communication protocols in neuromorphic computing communication networks. Unique to this simulator, compared with state-of-the-art network models and simulators, is its ability to simulate the impact of heterogeneous neural connectivity given by different models, as well as to evaluate neuron mapping algorithms. We cross-check the simulator by comparing a run with a homogeneous neural network against the bandwidth loads reported in comparable works, while also showing the increased level of detail our simulator reaches. Finally, we show the impact heterogeneous connectivity can have on bandwidth, and how different neuron mapping algorithms can enhance this effect.
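
The core bookkeeping of such a simulator — turning a neuron-to-core mapping and per-neuron spike rates into per-link traffic — can be sketched for a 2-D mesh with dimension-ordered routing. This is our toy abstraction; the paper's simulator models topologies and protocols in far more detail:

```python
def xy_route(src, dst):
    """Links traversed by dimension-ordered (X-then-Y) routing on a 2-D mesh."""
    (sx, sy), (dx, dy) = src, dst
    links, x, y = [], sx, sy
    while x != dx:
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links

def link_loads(mapping, rates, connections):
    """Accumulate spike traffic (spikes/s) on every mesh link.
    mapping: neuron -> (x, y) core;  rates: neuron -> spikes/s;
    connections: iterable of (pre, post) synapses."""
    load = {}
    for pre, post in connections:
        for link in xy_route(mapping[pre], mapping[post]):
            load[link] = load.get(link, 0.0) + rates[pre]
    return load
```

Swapping in a different `mapping` immediately shows how a neuron placement redistributes traffic, which is exactly the kind of question a mapping-algorithm evaluation asks.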

Title: EEG-based Positive-Negative Emotion Classification Using Machine Learning Techniques
Authors: Yuta Kasuga, Jungpil Shin, Md. Al Mehedi Hasan, Y. Okuyama, Yoichi Tomioka
Published: 2021-12-01
DOI: 10.1109/MCSoC51149.2021.00027
Abstract: The aim of this study is to find useful electrodes for EEG-based positive-negative emotion classification. We collected EEG signals from 30 people aged 19-38 using 14 electrodes, with two movies used to elicit positive and negative emotions. We first extracted the power spectrum from the EEG data, normalized it, and extracted frequency-domain statistical parameters from it. When these features were applied to Random Forests (RF), accuracies of 85.4%, 83.8%, and 83.4% were obtained for the P8, P7, and FC6 electrodes, respectively, indicating that P8, P7, and FC6 are the most useful electrodes for positive-negative emotion classification.
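
The feature-extraction step described above (power spectrum, normalization, frequency-domain statistics) can be sketched in a few lines of NumPy. The band edges here are conventional EEG bands (theta, alpha, beta, gamma), our choice rather than the paper's exact parameters:

```python
import numpy as np

def band_powers(signal, fs, bands=((4, 8), (8, 13), (13, 30), (30, 45))):
    """Frequency-domain features for one channel: mean log band power per
    EEG band, after z-score normalization of the raw signal."""
    x = (signal - signal.mean()) / (signal.std() + 1e-12)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    return [float(np.log(psd[(freqs >= lo) & (freqs < hi)].mean() + 1e-12))
            for lo, hi in bands]
```

Feeding such per-electrode feature vectors to a Random Forest, as the paper does, then reduces electrode selection to comparing per-electrode classification accuracies.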

Title: An Intelligent Plant Dissease Detection System for Smart Hydroponic Using Convolutional Neural Network
Authors: Aminu Musa, Mohamed Hamada, F. Aliyu, Mohammed Hassan
Published: 2021-12-01
DOI: 10.1109/MCSoC51149.2021.00058
Abstract: Recently, researchers have proposed automating hydroponic systems to improve efficiency and minimize manpower requirements, thereby increasing profit and farm produce. However, a fully automated hydroponic system should be able to identify conditions such as plant diseases, lack of nutrients, and inadequate water supply; failure to detect these issues can lead to crop damage and loss of capital. This paper presents an Internet of Things-based machine learning system for plant disease detection using a deep convolutional neural network (DCNN). The model was trained on a data set of 54,309 images covering 38 classes of plant disease, retrieved from the PlantVillage database. The system achieved an accuracy of 98.0% and an AUC precision score of 88.0%.