Pub Date: 2019-10-01, DOI: 10.1109/MCSoC.2019.00036
Design of Knowledge Templates and Multi-View Symbols for Experiential Learning
Takayuki Hoshino, R. Yoshioka
A design of a knowledge description method based on Knowledge Templates and Multi-view Symbols for experiential learning is proposed. The proposed method is a unique approach for acquiring empirical data as knowledge and applying it to computing. For the domain of experiential learning, a knowledge model is designed based on the concept of Knowledge Templates, and a corresponding representation language is designed based on Multi-view Symbols. This allows both domain-specific knowledge and subjective knowledge to be described and acquired more easily and reliably than with natural languages. The designs also demonstrate how these concepts can be applied to a specific knowledge domain. In addition, the design is evaluated through simulated visualizations of knowledge and use-case-based analysis.
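As a rough illustration of the Knowledge Template idea, the following Python sketch models a template whose slots are filled from empirical observations and whose symbols carry multiple views; the field names and structure are assumptions for illustration only, not the schema defined in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical sketch of a Knowledge Template record: all field names and the
# slot/view structure are illustrative assumptions, not the paper's actual schema.
@dataclass
class MultiViewSymbol:
    name: str
    views: Dict[str, str]            # view name -> picture/description reference

@dataclass
class KnowledgeTemplate:
    domain: str                      # e.g. "experiential learning"
    slots: Dict[str, str]            # slot name -> expected value type
    symbols: List[MultiViewSymbol] = field(default_factory=list)

    def instantiate(self, observations: Dict[str, str]) -> Dict[str, str]:
        """Fill template slots from observed (empirical) data; unknown slots stay empty."""
        return {slot: observations.get(slot, "") for slot in self.slots}

# Example: describing a small experiential-learning episode with a tiny template.
template = KnowledgeTemplate(
    domain="experiential learning",
    slots={"activity": "text", "observation": "text", "feeling": "text"},
)
record = template.instantiate({"activity": "baking bread", "observation": "dough rose slowly"})
print(record)
```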
{"title":"Design of Knowledge Templates and Multi-View Symbols for Experiential Learning","authors":"Takayuki Hoshino, R. Yoshioka","doi":"10.1109/MCSoC.2019.00036","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00036","url":null,"abstract":"A design of a knowledge description method based on Knowledge Templates and Multi-view Symbols for experiential learning is proposed. The proposed method is a unique approach for acquiring empirical data as knowledge and applying them to computing. For the domain of experiential learning, a knowledge model is designed based on the concept of Knowledge Templates, and a corresponding representation language is designed based on Multi-view Symbols. This allows description of both domain specific knowledge and subjective knowledge to be acquired more easily and reliably compared to using natural languages. These designs also demonstrate the application of these concepts to a specific knowledge domain. In addition, the design is evaluated by simulated visualizations of knowledge and use-case based analysis.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132912511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-10-01, DOI: 10.1109/MCSoC.2019.00035
Convolutional Neural Network for Classification of Source Codes
Hiroki Ohashi, Y. Watanobe
A method to classify source code based on convolutional neural networks is presented. The goal of the neural networks is to predict the type of algorithm used in the corresponding source code, so that the result can be used for various kinds of assistance and assessment in programming education. In the proposed method, source code is converted into a sequence that represents the structure of the code without any keywords, such as variable names or function names. In the present paper, the models and implementation of the proposed method are presented. An experiment considering several algorithm types is also conducted. For evaluation of the proposed method, source code accumulated in an online judge system is used. The results of the experiment demonstrate that the proposed method can predict the algorithm used in the given source code with a high degree of accuracy.
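To make the pipeline concrete, the sketch below shows a small 1D convolutional classifier over a keyword-free token-type sequence, in the spirit of the described method; the vocabulary size, sequence length, number of algorithm classes, and layer sizes are assumptions, not the paper's actual model.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: VOCAB_SIZE, SEQ_LEN, NUM_CLASSES and the layer sizes
# are assumptions, not the architecture reported in the paper.
VOCAB_SIZE = 64      # number of distinct structural token types (identifiers removed)
NUM_CLASSES = 8      # number of algorithm types to predict
SEQ_LEN = 512

class CodeCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, 32)
        self.conv = nn.Sequential(
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
        )
        self.fc = nn.Linear(64, NUM_CLASSES)

    def forward(self, token_ids):                  # token_ids: (batch, SEQ_LEN)
        x = self.embed(token_ids).transpose(1, 2)  # -> (batch, 32, SEQ_LEN)
        x = self.conv(x).squeeze(-1)               # -> (batch, 64)
        return self.fc(x)                          # logits over algorithm classes

model = CodeCNN()
dummy = torch.randint(0, VOCAB_SIZE, (4, SEQ_LEN))  # a batch of tokenized programs
print(model(dummy).shape)                            # torch.Size([4, 8])
```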
{"title":"Convolutional Neural Network for Classification of Source Codes","authors":"Hiroki Ohashi, Y. Watanobe","doi":"10.1109/MCSoC.2019.00035","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00035","url":null,"abstract":"A method to classify source code based on convolutional neural networks is presented. The goal of the neural networks is to predict the type of algorithm that is used in the corresponding source code so that the result obtained can be used for different kinds of assistance and assessment for programming education. In the proposed method, source code is converted into a sequence that represents the structure of the code without any keywords, such as variable names or function names. In present paper, models and implementation of the proposed method are presented. An experiment considering several algorithm types is also conducted. For evaluation of the proposed method, source code accumulated in an online judge system is used. The results of the experiment demonstrate that the proposed method can predict the algorithm used in the given source code to a high degree of accuracy.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115469528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-10-01, DOI: 10.1109/MCSoC.2019.00030
Low-Cost Congestion Detection Mechanism for Networks-on-Chip
Zhengqian Han, M. Meyer, Xin Jiang, Takahiro Watanabe
Congestion detection has become a hot issue in Networks-on-Chip (NoC). Congestion-aware routing algorithms are designed to avoid congestion in the network; however, most of them achieve only small performance benefits while drastically increasing cost as the mesh size grows. Taking a different approach, we utilize the network itself to detect congestion. In this paper, a congestion detection mechanism is proposed that is capable of locating congestion in the network within several cycles. The mechanism is then applied to routing algorithm selection and task scheduling. The most suitable routing algorithm is selected according to the detection result. If the congestion status changes, the mechanism also detects the change and judges whether the routing algorithm needs to be changed. Moreover, the detection result reveals the effect of task scheduling and whether it needs to be changed. Experimental results show that with the proposed detection mechanism, a suitable routing algorithm can be selected successfully according to the congestion status and a better task-scheduling choice can be made. Consequently, the performance of the NoC with the proposed congestion detection mechanism increases.
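As a software-level illustration of threshold-style congestion detection followed by routing-algorithm selection (the paper's actual in-network mechanism is not detailed in the abstract), a minimal Python sketch might look as follows; the occupancy sampling, threshold, and algorithm names are assumptions.

```python
import numpy as np

# Toy model of congestion detection on a 2D mesh NoC. Buffer-occupancy sampling
# and the threshold value below are assumptions made only for illustration.
MESH = 4                      # 4x4 mesh of routers
THRESHOLD = 0.75              # occupancy ratio above which a router counts as congested

def detect_congestion(buffer_occupancy: np.ndarray) -> list:
    """Return (x, y) coordinates of routers whose occupancy exceeds the threshold."""
    hot = np.argwhere(buffer_occupancy > THRESHOLD)
    return [tuple(xy) for xy in hot]

def pick_routing_algorithm(congested: list) -> str:
    """Hypothetical selection rule: switch to an adaptive algorithm once congestion appears."""
    return "adaptive_odd_even" if congested else "deterministic_xy"

occupancy = np.random.rand(MESH, MESH)       # sampled per-router buffer occupancy
spots = detect_congestion(occupancy)
print("congested routers:", spots)
print("selected routing algorithm:", pick_routing_algorithm(spots))
```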
{"title":"Low-Cost Congestion Detection Mechanism for Networks-on-Chip","authors":"Zhengqian Han, M. Meyer, Xin Jiang, Takahiro Watanabe","doi":"10.1109/MCSoC.2019.00030","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00030","url":null,"abstract":"Congestion detection has become a hot issue in Networks-on-Chip. Congestion-aware routing algorithms are designed to avoid congestion in the network; however, most of them achieve small performance benefits while drastically increasing the cost if the mesh size becomes larger. Using a different approach, we utilize the network itself to detect the congestion. In this paper, a congestion detecting mechanism is proposed, which is capable of locating where the congestion is in the network within several cycles. Then, the mechanism is applied to routing algorithm selection and task scheduling. The most suitable routing algorithm is selected according to the detection result. If the congestion status changes, the mechanism can also detect the change and judge whether the routing algorithm needs to be changed. Moreover, from the detection result, we can know the effect of task scheduling and judge whether it needs to be changed. Experimental results show that with the proposed detecting mechanism, the suitable routing algorithm is able to be successfully selected according to the congestion status and the better choice of task scheduling can be made. Consequently, the performance of NoC with the proposed congestion detection mechanism increases.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"509 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120932916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-10-01, DOI: 10.1109/MCSoC.2019.00049
Graph Transformations and Derivation of Scheduling Constraints Applied to the Mapping of Real-Time Distributed Applications
Stéphane Louise
Synchronous Data-Flow, as a deterministic variant of Kahn Process Networks, is a good model for distributed applications because it allows the properties of an application to be verified both at design time and at run-time. Real-Time extensions exist that allow Real-Time clocks to be specified for some of the processes in the graph. With the addition of Real-Time clocks, the behavior of the complete system can easily be separated into a nominal Real-Time mode, in which all the data required for processing is available when a clock tick occurs, and an error mode that is triggered when this condition is not met. Our contribution in this paper is, first, to show a set of graph transformations that account for execution and communication time on a real platform while maximizing the parallelism of execution and, second, on top of these transformations, to provide the execution constraints as a linear program that must be satisfied at run-time to guarantee the real-time requirements. The approach is illustrated on a subset of a real-life automotive example.
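A toy example of how such execution constraints can be expressed and checked as a linear program, here with SciPy for two actors with assumed worst-case execution times and a deadline; this is only a sketch of the idea, not the paper's exact formulation.

```python
from scipy.optimize import linprog

# Minimal sketch: two actors A -> B with assumed worst-case execution times and a
# deadline. Variables x = [start_A, start_B]. The constraint set is illustrative.
WCET_A, WCET_B, DEADLINE = 2.0, 3.0, 10.0

# Objective: schedule B as early as possible (minimize start_B).
c = [0.0, 1.0]

# Constraints in the form A_ub @ x <= b_ub:
#   start_A + WCET_A <= start_B   ->  start_A - start_B <= -WCET_A
#   start_B + WCET_B <= DEADLINE  ->  start_B <= DEADLINE - WCET_B
A_ub = [[1.0, -1.0],
        [0.0,  1.0]]
b_ub = [-WCET_A, DEADLINE - WCET_B]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print("feasible:", res.success, "start times:", res.x)   # e.g. start_A=0, start_B=2
```

If the LP is infeasible, the real-time requirement cannot be met on the given platform, which is exactly the kind of run-time guarantee the constraints are meant to encode.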
{"title":"Graph Transformations and Derivation of Scheduling Constraints Applied to the Mapping of Real-Time Distributed Applications","authors":"Stéphane Louise","doi":"10.1109/MCSoC.2019.00049","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00049","url":null,"abstract":"Synchronous Data-Flow as a deterministic variation of Khan Process Networks is a good model for distributed applications that allows for verification of the properties of the applications both at the design level and at run-time. Real-Time extensions exist which allow to specify Real-Time clocks for some of the processes in the graph. With the addition of Real-Time clocks, the behavior of the complete system can be easily differentiated from a nominal Real-Time mode where all the required data for processing is available when a clock tick occurs, and an error mode can be triggered when the condition is not met. Our contribution in this paper is –first– to show a set of graph transformations that allow to account for the execution and communication time on a real platform while at the same time maximizing the parallelism of execution and –second– on top of these transformations to provide the execution constraints as a linear program that must be met at run-time to guarantee the real-time requirements. It is illustrated on a subset of a real-life automotive example.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"162 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117139526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-10-01, DOI: 10.1109/MCSoC.2019.00028
A Semi-Lossless Image Compression Procedure using a Lossless Mode of JPEG
Md.Atiqur Rahman, Mohamed Hamada
Communication over the internet continues to grow day by day, particularly video calling, for which sending a massive amount of data over the internet and storing it on a computer is a big challenge. Many compression algorithms, such as block transform, vector quantization, and JPEG, are used to convert a large dataset into a format that can be sent over the internet at high speed and stored in a small space on a computer. In this paper, a new procedure is proposed that uses a lossless mode of JPEG and removes the first and last bits from the exact binary pattern of each pixel, which provides a better result than the state-of-the-art techniques. In this technique, the two bits are removed after a little preprocessing, and the remaining binary pattern is then replaced by a fixed value. Lastly, average code word length, compression ratio, and PSNR are used to compare the performance of the proposed procedure with the state-of-the-art techniques. The experimental results indicate that the proposed procedure provides better results than the state-of-the-art techniques.
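A minimal numpy sketch of the bit-removal idea, dropping the most and least significant bit of each 8-bit pixel and re-expanding with fixed bits; the paper's preprocessing step and exact fixed replacement values are not given in the abstract, so this is only an assumption-laden illustration of the principle, not the reported procedure.

```python
import numpy as np

# Toy illustration: keep the middle 6 bits of each uint8 pixel (drop bit 7 and bit 0).
# The paper's preprocessing and its fixed replacement values are unspecified here,
# so the reconstruction below simply fixes the dropped MSB to 0 and the LSB to 1.
def strip_first_last_bits(pixels: np.ndarray) -> np.ndarray:
    """Return 6-bit values (0..63) obtained by dropping each pixel's MSB and LSB."""
    return (pixels.astype(np.uint8) >> 1) & 0x3F

def approximate_reconstruction(stripped: np.ndarray) -> np.ndarray:
    """Re-expand to 8 bits with fixed values for the two removed bits."""
    return (stripped.astype(np.uint8) << 1) | 0x01

img = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
restored = approximate_reconstruction(strip_first_last_bits(img))
mse = np.mean((img.astype(float) - restored.astype(float)) ** 2)
psnr = 10 * np.log10(255.0 ** 2 / mse) if mse > 0 else float("inf")
print("toy PSNR (dB):", round(psnr, 2))
```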
{"title":"A Semi-Lossless Image Compression Procedure using a Lossless Mode of JPEG","authors":"Md.Atiqur Rahman, Mohamed Hamada","doi":"10.1109/MCSoC.2019.00028","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00028","url":null,"abstract":"Day by day, communication through the internet is progressing; particularly video calling for which sending a massive amount of data over the internet and saving on a computer is being a big challenge. For which, there are many compression algorithms; such as block transform, vector quantization and JPEG are used to convert a big dataset in such a format so that it can be sent over the internet at a high speed and stored in a small space in computer. In this paper, a new procedure has been proposed using a lossless mode of JPEG by removing the first and last bits from the exact binary pattern of each pixel which provides a better result than the state-of-the-art techniques. In this technique, the two bits are removed after a little bit preprocessing and then the remaining binary pattern is replaced by a fixed value each. Lastly, average code word, compression ratio and PSNR are used to assess the performance of the proposed procedure with the state-of-the-art techniques. From the experimental results, it looks that the proposed procedure provides better results than the state-of-the-art techniques.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117242255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-10-01, DOI: 10.1109/MCSoC.2019.00026
Unified Symbol Framework to Improve UI Comprehension
R. Yoshioka, Naoyuki Murata
The objective of this research is to develop a method to serve a generic and unified set of symbology (or words) to software UI elements, where each symbol is backed by multi-view explanations to support comprehension. The problem of selecting appropriate symbols and words for the UI is solved by reusable components connected to a dictionary of common symbols and words. The approach is unique in that the meaning of each symbol is explained by a set of pictures providing multiple explanations of its context, in other words, Multi-view Symbols. The method is realized by an online dictionary of Multi-view Symbols and a UI framework that provides seamless on-demand access to the dictionary. The design and implementation of the symbol and dictionary system are described, and an example of embedding it into a client application is provided.
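A hypothetical sketch of what a Multi-view Symbol dictionary entry and an on-demand lookup could look like in Python; the entry fields, symbol names, and view names are illustrative assumptions rather than the framework's actual API.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical dictionary entry for a Multi-view Symbol: each view maps to a
# picture (or caption) that explains the symbol in a particular context.
@dataclass
class SymbolEntry:
    symbol_id: str
    label: str
    views: Dict[str, str]   # view name -> explanation picture reference

DICTIONARY = {
    "save": SymbolEntry("save", "Save",
                        {"office": "floppy-disk.png", "cloud": "upload-arrow.png"}),
}

def explain(symbol_id: str, view: str) -> str:
    """Return the explanation for a symbol in the requested view, falling back to its label."""
    entry = DICTIONARY.get(symbol_id)
    if entry is None:
        return "unknown symbol"
    return entry.views.get(view, entry.label)

print(explain("save", "cloud"))   # -> upload-arrow.png
```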
{"title":"Unified Symbol Framework to Improve UI Comprehension","authors":"R. Yoshioka, Naoyuki Murata","doi":"10.1109/MCSoC.2019.00026","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00026","url":null,"abstract":"The objective of this research is to develop a method to serve a generic and unified set of symbology (or words) to software UI elements where each symbol is backed by multi-view explanations to support comprehension. The problem of selecting appropriate symbols and words for the UI is solved by reusable components that are connected to a dictionary of common symbols and words. It is unique from other approaches such that the meaning of symbols is explained by a set of pictures that provide multiple explanations of the context, in other words, Multi-view Symbols. The method is realized by an online dictionary of Multi-view Symbols and a UI framework that provide seamless on-demand access to the dictionary. The design and implementation of the symbol and dictionary system is described and an example of embedding it into a client application is provided.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127686378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-10-01, DOI: 10.1109/MCSoC.2019.00025
Implementation of Content-Based Anonymization Edge Router on NetFPGA
Akihiro Fukuhara, Tomomu Iwai, Yuiko Sakuma, H. Nishi
In recent years, a large number of Internet of Things (IoT) devices have appeared, and various services using data from such devices have been proposed. However, the collected raw data include private information, and thus privacy problems arise. Data anonymization is a method for removing privacy-sensitive information from raw data. Data anonymization for IoT data services should satisfy the following requirements. First, the raw data should be anonymized between a device and the cloud server. Second, the anonymization methods and the destinations of the collected data should be flexibly configurable, as they depend on data types and agreements with data suppliers. Third, network transparency is necessary for ease of installation. However, conventional data anonymization systems do not satisfy these requirements. We propose anonymization hardware that functions as a network router at the network edge. It directly anonymizes data in network packets. Moreover, it decides the destination IP address of the packets and anonymizes data based on their content. For high-throughput, low-power processing of packets, the proposed hardware was implemented using a field-programmable gate array. The throughput of the proposed hardware achieved 10 Gbps wire speed, and its power consumption was lower than that of a software implementation.
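The following Python sketch models the router's content-based behavior in software, anonymizing selected payload fields and choosing a forwarding destination from the packet content; the field names, anonymization rules, and destination addresses are illustrative assumptions, not the hardware's actual configuration.

```python
import hashlib
import json

# Software model of content-based anonymization and routing. All rules, field
# names, and IP addresses below are illustrative assumptions.
RULES = {
    "name": lambda v: hashlib.sha256(v.encode()).hexdigest()[:12],  # pseudonymize identifier
    "temperature": lambda v: round(float(v)),                       # coarsen sensor precision
}
DESTINATIONS = {"temperature": "10.0.0.10", "power": "10.0.0.20"}
DEFAULT_DEST = "10.0.0.1"

def anonymize_and_route(payload: bytes) -> tuple:
    """Anonymize sensitive fields in a JSON payload and pick a destination by content."""
    record = json.loads(payload)
    for name, rule in RULES.items():
        if name in record:
            record[name] = rule(record[name])
    dest = next((ip for key, ip in DESTINATIONS.items() if key in record), DEFAULT_DEST)
    return dest, json.dumps(record).encode()

dest, out = anonymize_and_route(b'{"name": "Alice", "temperature": 21.73}')
print(dest, out)
```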
{"title":"Implementation of Content-Based Anonymization Edge Router on NetFPGA","authors":"Akihiro Fukuhara, Tomomu Iwai, Yuiko Sakuma, H. Nishi","doi":"10.1109/MCSoC.2019.00025","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00025","url":null,"abstract":"In recent years, a large number of Internet of Things (IoT) devices have appeared. Accordingly, various services using data from such devices have been proposed. However, the collected raw data include private information, and thus, privacy problems arise. Data anonymization is a method for removing privacy-sensitive information from raw data. Data anonymization for IoT data services should satisfy the following requirements. First, the raw data should be anonymized between a device and the cloud server. Second, the anonymization methods and the destinations of the collected data should be flexibly configured, as they depend on data types and agreements with data suppliers. Third, network transparency is necessary for ease of installation. However, conventional data anonymization systems do not satisfy these requirements. We propose anonymization hardware that functions as a network router on network edges. It directly anonymizes data in network packets. Moreover, it decides the destination IP address of the packets and anonymizes data based on their content. For high-throughput and low-power processing of the packets, the proposed hardware was implemented by using a field-programmable gate array. The throughput of the proposed hardware achieved 10 Gbps wire speed, and the power consumption was lower than that of software implementation.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133913251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-10-01, DOI: 10.1109/MCSoC.2019.00039
An on-Communication Multiple-TSV Defects Detection and Localization for Real-Time 3D-ICs
K. Dang, Akram Ben Ahmed, Xuan-Tu Tran
This paper presents "On Communication Through-Silicon-Via Test" (OCTT), an ECC-based method to localize faults without halting the operation of TSV-based 3D-IC systems. OCTT consists of two major parts named Statistical Detector and Isolation and Check. While Statistical Detector could detect open and short defects in TSVs that work without interrupting data transactions, the Isolation and Check algorithm enhances the ability to localize fault position. The Monte-Carlo simulations of Statistical Detector show ×2 increment in the number of detected faults when compared to conventional ECC-based techniques. While Isolation and Check helps localize the number of defects up to ×4 and ×5 higher. In addition, the worst case execution time is below 65,000 cycles with no performance degradation for testing which could be easily integrated into real-time applications.
{"title":"An on-Communication Multiple-TSV Defects Detection and Localization for Real-Time 3D-ICs","authors":"K. Dang, Akram Ben Ahmed, Xuan-Tu Tran","doi":"10.1109/MCSoC.2019.00039","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00039","url":null,"abstract":"This paper presents \"On Communication Through-Silicon-Via Test\" (OCTT), an ECC-based method to localize faults without halting the operation of TSV-based 3D-IC systems. OCTT consists of two major parts named Statistical Detector and Isolation and Check. While Statistical Detector could detect open and short defects in TSVs that work without interrupting data transactions, the Isolation and Check algorithm enhances the ability to localize fault position. The Monte-Carlo simulations of Statistical Detector show ×2 increment in the number of detected faults when compared to conventional ECC-based techniques. While Isolation and Check helps localize the number of defects up to ×4 and ×5 higher. In addition, the worst case execution time is below 65,000 cycles with no performance degradation for testing which could be easily integrated into real-time applications.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129565979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-10-01, DOI: 10.1109/MCSoC.2019.00024
Prototype of FPGA Dynamic Reconfiguration Based-on Context-Oriented Programming
Takeshi Ohkawa, Ikuta Tanigawa, Mikiko Sato, K. Hisazumi, Nobuhiko Ogura, Harumi Watanabe
Acceleration by FPGA is expected for real-time edge processing as well as for server applications in the cloud. A robot is one example that needs such acceleration, for processing such as image recognition and actuation based on visual feedback. As systems become more complex, a management mechanism for FPGA dynamic reconfiguration is required. In this paper, we propose a system development method that includes FPGA acceleration. The key idea of the proposed method is FPGA reconfiguration based on a context, as defined in Context-Oriented Programming (COP). This idea helps solve the cross-cutting concern problem at runtime, a problem that decreases development efficiency; it thus makes FPGA reconfiguration easy to manage from software when the whole system changes. In the evaluation, we compare the FPGA reconfiguration time for switching a context with the context-switching time of COP software written in C++. The results indicate that the proposed method is feasible for handling FPGA contexts.
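A minimal Python sketch of the context-oriented idea, in which activating a context selects between a software layer and a placeholder FPGA reconfiguration call; the class, function names, and bitstream file are hypothetical and stand in for whatever driver interface a real system would use.

```python
# Minimal sketch of context-oriented dispatch: the active context decides whether a
# layer runs in software or triggers a (stubbed) FPGA bitstream load.
def reconfigure_fpga(bitstream):
    # Placeholder for a partial-reconfiguration driver call; not a real API.
    print(f"[stub] loading bitstream {bitstream}")

class ContextManager:
    def __init__(self):
        self.active = "software"
        self.implementations = {}

    def register(self, context, func):
        self.implementations[context] = func

    def activate(self, context):
        if context != self.active and context == "fpga":
            reconfigure_fpga("image_recognition.bit")   # hypothetical bitstream name
        self.active = context

    def call(self, *args):
        return self.implementations[self.active](*args)

manager = ContextManager()
manager.register("software", lambda frame: f"sw-recognized {frame}")
manager.register("fpga", lambda frame: f"hw-recognized {frame}")

print(manager.call("frame-0"))     # runs the software layer
manager.activate("fpga")           # context switch triggers reconfiguration
print(manager.call("frame-1"))     # now runs the hardware-backed layer
```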
{"title":"Prototype of FPGA Dynamic Reconfiguration Based-on Context-Oriented Programming","authors":"Takeshi Ohkawa, Ikuta Tanigawa, Mikiko Sato, K. Hisazumi, Nobuhiko Ogura, Harumi Watanabe","doi":"10.1109/MCSoC.2019.00024","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00024","url":null,"abstract":"Acceleration by FPGA is expected for real-time edge processing as well as server applications in the cloud. A robot is one of the examples which need the acceleration of processing such as image recognition processing and actuation based on its visual feedback. As the system is more complex, it is required to introduce a management mechanism of FPGA dynamic reconfiguration. In this paper, we propose a method of system development which includes FPGA acceleration. The key idea of the proposed method is the FPGA reconfiguration based on a context, which is defined in Context-Oriented Programming (COP). This idea contributes to solve the cross-cutting concern problem at runtime. The problem causes to decrease the efficiency of development. Thus, this idea makes easily manage to FPGA reconfiguration with software in case of changing a whole system. In evaluation, we compare the reconfiguration time of FPGA to switch a context with the context switching time of the COP software written in C++ language. It indicates that the proposed method is feasible to handle FPGA context.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114089177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-10-01, DOI: 10.1109/MCSoC.2019.00013
Multicore Power Estimation using Independent Component Analysis Based Modeling
Mark Sagi, N. Doan, Thomas Wild, A. Herkersdorf
State-of-the-art power estimation research for multicore processors combines performance counters that collect run-time activity information with an offline-generated power model. To generate these power models, the package power is measured and the activity information is traced while synthetic workloads are executed. These workloads stress distinct core components in order to expose power responses so that the activity information has low collinearity. The measurements are then combined into a power model describing the general power behavior. However, one of the main drawbacks of these synthetic workloads is that they are usually custom-designed for a given multicore architecture and are hardly available. In this paper, we present a methodology to generate power models using freely available benchmarks, e.g. PARSEC/Splash-2. To minimize the collinearity of the activity information caused by the uncontrolled/unspecified behavior of these more general benchmarks, we propose to use independent component analysis. This avoids the use of synthetic workloads and reduces the relative error by 24% in the average case compared to prior state-of-the-art work. Although we also observe a 22% increase in relative error in the worst case for our approach, this can easily be improved by using either different or more training benchmarks. These promising results give a strong indication that independent component analysis could be used directly with real application workloads, opening the possibility of building and improving power models at runtime.
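A sketch of the modeling flow under simplifying assumptions, using scikit-learn's FastICA to decorrelate synthetic, partly collinear counter traces before fitting a linear power model; the counter data, dimensions, and number of components are chosen only for illustration.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.linear_model import LinearRegression

# Illustrative flow: performance-counter samples (deliberately collinear) are
# transformed with ICA, then regressed against measured package power.
rng = np.random.default_rng(0)
n_samples, n_counters = 500, 8
counters = rng.random((n_samples, n_counters))
counters[:, 3] = 0.9 * counters[:, 2] + 0.1 * rng.random(n_samples)   # induce collinearity
power = 5.0 + counters @ rng.random(n_counters) + 0.05 * rng.standard_normal(n_samples)

ica = FastICA(n_components=5, random_state=0)
components = ica.fit_transform(counters)       # statistically independent activity signals

model = LinearRegression().fit(components, power)
pred = model.predict(components)
rel_err = np.mean(np.abs(pred - power) / power)
print(f"mean relative error: {rel_err:.3%}")
```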
{"title":"Multicore Power Estimation using Independent Component Analysis Based Modeling","authors":"Mark Sagi, N. Doan, Thomas Wild, A. Herkersdorf","doi":"10.1109/MCSoC.2019.00013","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00013","url":null,"abstract":"State-of-the-art power estimation research for multicore processors combine performance counters that collect run-time activity information with an offline-generated power model. To generate these power models, the package power is measured and the activity information is traced while synthetic workloads are executed. These workloads stress distinct core components in order to expose power responses so that the activity information has low collinearity. The measurements are then combined into a power model describing the general power behavior. However, one of the main drawbacks of these synthetic workloads is that they are most of the time custom-designed for a given multi-core architecture and are hardly available. In this paper, we present a methodology to generate power models using freely available benchmarks, e.g. PARSEC/Splash-2. To minimize the collinearity of the activity information due to the uncontrolled/unspecified behavior of these more general benchmarks, we propose to use independent component analysis. This allows to avoid the use of synthetic workloads and a reduction of the relative error by 24% in the average case, when compared to prior state-of-the-art work. Although, we also observe an increase of 22% relative error in the worst case for our approach, this can easily be improved by using either different or more training benchmarks. These promising results give a strong indication that independent component analysis could directly be used with real application workload, leading to the possibility to build/improve power models during runtime.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121105426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}