
Latest Publications: 2021 ACM/IEEE 3rd Workshop on Machine Learning for CAD (MLCAD)

Using Deep Neural Networks And Derivative Free Optimization To Accelerate Coverage Closure
Pub Date : 2021-08-30 DOI: 10.1109/MLCAD52597.2021.9531234
Raviv Gal, E. Haber, Brian Irwin, Marwa Mouallem, Bilal Saleh, A. Ziv
In computer aided design (CAD), a core task is to optimize the parameters of noisy simulations. Derivative free optimization (DFO) methods are the most common choice for this task. In this paper, we show how four DFO methods, specifically implicit filtering (IF), simulated annealing (SA), genetic algorithms (GA), and particle swarm (PS), can be accelerated using a deep neural network (DNN) that acts as a surrogate model of the objective function. In particular, we demonstrate the applicability of the DNN accelerated DFO approach to the coverage directed generation (CDG) problem that is commonly solved by hardware verification teams.
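The abstract only names the techniques; the following is a minimal sketch of how a DNN surrogate can pre-screen candidate parameter vectors inside a derivative-free loop so that the expensive noisy simulation is called sparingly. The network size, candidate counts, and the toy objective are illustrative assumptions, not the authors' implementation.

```python
# Sketch: a DNN surrogate pre-screens candidates so the expensive noisy
# simulation is only run on the most promising ones. All details
# (network size, candidate counts, the toy objective) are assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def noisy_simulation(x):
    # Stand-in for an expensive, noisy CAD simulation.
    return np.sum((x - 0.3) ** 2) + rng.normal(scale=0.05)

dim, n_init, n_iters = 8, 64, 20
X = rng.uniform(-1, 1, size=(n_init, dim))
y = np.array([noisy_simulation(x) for x in X])

surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)

best_x, best_y = X[np.argmin(y)], y.min()
for _ in range(n_iters):
    surrogate.fit(X, y)                       # refit surrogate on all data so far
    cand = best_x + rng.normal(scale=0.2, size=(256, dim))  # perturb incumbent
    scores = surrogate.predict(cand)          # cheap ranking by the DNN
    for x in cand[np.argsort(scores)[:4]]:    # simulate only the top few
        fx = noisy_simulation(x)
        X, y = np.vstack([X, x]), np.append(y, fx)
        if fx < best_y:
            best_x, best_y = x, fx

print("best objective found:", best_y)
```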
Citations: 1
On the Effectiveness of Quantization and Pruning on the Performance of FPGAs-based NN Temperature Estimation
Pub Date : 2021-08-30 DOI: 10.1109/MLCAD52597.2021.9531256
V. V. R. M. K. Muvva, Martin Rapp, J. Henkel, H. Amrouch, M. Wolf
A well-functioning thermal management system on a chip requires knowledge of the current temperature as well as of potential temperature changes in the near future. This information is important for ensuring proactive thermal management on the chip. However, the limited number of sensors on the chip makes it difficult to accomplish this task. Hence, we propose a neural-network-based approach to predict the temperature map of the chip. To solve the problem, we implemented two different neural networks, one a feedforward network and the other based on recurrent neural networks. Our proposed method requires only performance counter measurements to predict the temperature map of the chip at runtime. Both models show promising results for estimating the on-chip temperature map, with the recurrent neural network outperforming the feedforward neural network. Furthermore, both networks have been quantized and pruned, and the feedforward network has been compiled into FPGA logic. Therefore, the network could be embedded in the chip, whether it is an ASIC or an FPGA.
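As an illustration of the quantization and pruning step described above, the following is a minimal PyTorch sketch that prunes and dynamically quantizes a small feedforward counter-to-temperature model. The layer sizes, the number of counters, and the temperature-map resolution are assumptions; the FPGA compilation step is outside the sketch.

```python
# Sketch: prune and quantize a small feedforward model that maps
# performance counters to a temperature map. Sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

n_counters, map_cells = 16, 64  # assumed input/output dimensions

model = nn.Sequential(
    nn.Linear(n_counters, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, map_cells),
)

# Unstructured magnitude pruning: zero out 50% of weights per linear layer.
for module in model:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # make the pruning permanent

# Post-training dynamic quantization of the linear layers to int8.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

counters = torch.randn(1, n_counters)     # one sample of counter readings
temperature_map = quantized(counters)     # predicted on-chip temperature map
print(temperature_map.shape)
```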
Citations: 0
Ensemble Learning Based Electric Components Footprint Analysis
Pub Date : 2021-08-30 DOI: 10.1109/MLCAD52597.2021.9531219
Peng Huang, Xuan-Yi Lin, Yan-Jhih Wang, Tsung-Yi Ho
Along with the rapid growth in the market for the Internet of Things and electrical devices, the design flow of Printed Circuit Boards (PCBs) requires a more effective design methodology. To design a PCB, it is first necessary to build footprints of its components, which contain manufacturing information such as outline, height, and other constraints for placing components on the board. Footprint design can vary between manufacturers, depending on their production technology, which means an electronic component can have several distinct footprints. Therefore, analyzing PCB footprint libraries can help to extract footprint design rules, which can then be used for designing new footprints of the same type of components. In this paper, we adopt StackNet, an ensemble-learning-based model, using footprint images and numerical information for classification. Furthermore, we apply hierarchical clustering to the classification results to analyze the footprint design rules. Experimental results show our method achieves higher accuracy than previous works.
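StackNet itself is not reproduced here; as a rough analogue, the sketch below stacks two scikit-learn base classifiers over combined image and numerical footprint features and then applies agglomerative clustering within one predicted class, mirroring the two-step flow described above. All data shapes and estimator choices are assumptions.

```python
# Sketch: a stacked ensemble classifies footprints from image + numeric
# features, then hierarchical clustering groups footprints within a class
# to expose design-rule families. All data shapes are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
n, img_feat, num_feat = 500, 256, 6            # assumed dataset sizes
X = np.hstack([rng.random((n, img_feat)),      # flattened footprint images
               rng.random((n, num_feat))])     # pad count, pitch, height, ...
y = rng.integers(0, 4, size=n)                 # assumed footprint classes

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100)),
                ("lr", LogisticRegression(max_iter=1000))],
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X, y)

# Cluster the numeric features of one predicted class to surface
# groups that likely share the same footprint design rules.
pred = stack.predict(X)
members = X[pred == 0][:, img_feat:]
clusters = AgglomerativeClustering(n_clusters=3).fit_predict(members)
print(np.bincount(clusters))
```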
Citations: 1
Learning-Based Workload Phase Classification and Prediction Using Performance Monitoring Counters
Pub Date : 2021-08-30 DOI: 10.1109/MLCAD52597.2021.9531161
Erika S. Alcorta, A. Gerstlauer
Predicting coarse-grain variations in workload behavior during execution is essential for dynamic resource optimization of processor systems. Researchers have proposed various methods to first classify workloads into phases and then learn their long-term phase behavior to predict and anticipate phase changes. Early studies on phase prediction proposed table-based phase predictors. More recently, simple learning-based techniques such as decision trees have been explored. However, more recent advances in machine learning have not yet been applied to phase prediction. Furthermore, existing phase predictors have been studied only in connection with specific phase classifiers, even though there is a wide range of classification methods. Early work in phase classification proposed various clustering methods that required access to source code. Some later studies used performance monitoring counters, but they only evaluated classifiers for specific contexts such as thermal modeling. In this work, we perform a comprehensive study of source-oblivious phase classification and prediction methods using hardware counters. We adapt classification techniques that were used with different inputs in the past and compare them to state-of-the-art hardware-counter-based classifiers. We further evaluate the accuracy of various phase predictors when coupled with different phase classifiers and evaluate a range of advanced machine learning techniques, including SVMs and LSTMs, for workload phase prediction. We apply classification and prediction approaches to SPEC workloads running on an Intel Core-i9 platform. Results show that two-level k-means clustering combined with SVM-based phase-change prediction provides the best tradeoff between accuracy and long-term stability. Additionally, the SVM predictor reduces the average prediction error by 80% compared to a table-based predictor.
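A minimal sketch of the best-performing combination reported above: k-means clustering of hardware-counter windows into phases, followed by an SVM that predicts the next phase from a short phase history. The counter dimensionality, window counts, and history length are assumptions, and the two-level clustering refinement is simplified to a single level.

```python
# Sketch: classify execution windows into phases with k-means, then train
# an SVM to predict the next phase from a short phase history.
# Counter dimensions, window counts and history length are assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_windows, n_counters, n_phases, hist = 2000, 12, 5, 4
counters = rng.random((n_windows, n_counters))   # per-window counter averages

# Coarse phase labels from clustering the counter vectors.
phases = KMeans(n_clusters=n_phases, n_init=10).fit_predict(counters)

# Build (history of phases) -> (next phase) training pairs.
X = np.array([phases[i:i + hist] for i in range(n_windows - hist)])
y = phases[hist:]

split = int(0.8 * len(X))
predictor = SVC(kernel="rbf").fit(X[:split], y[:split])
acc = predictor.score(X[split:], y[split:])
print(f"next-phase prediction accuracy: {acc:.2f}")
```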
Citations: 1
Learning based Memory Interference Prediction for Co-running Applications on Multi-Cores
Pub Date : 2021-08-30 DOI: 10.1109/MLCAD52597.2021.9531245
Ahsan Saeed, Daniel Mueller-Gritschneder, Falk Rehm, A. Hamann, D. Ziegenbein, Ulf Schlichtmann, A. Gerstlauer
Early run-time prediction of co-running independent applications prior to application integration is challenging on multi-core processors. One of the most notable causes is interference at the main memory subsystem, which results in significant degradation of application performance and response time compared to standalone execution. Currently available techniques for run-time prediction, such as traditional cycle-accurate simulations, are slow, while analytical models are inaccurate and time-consuming to build. By contrast, existing machine-learning-based approaches for run-time prediction simply do not account for interference. In this paper, we use a machine-learning-based approach to train a model that correlates performance data (instructions and hardware performance counters) for a set of benchmark applications between the standalone and interference scenarios. The trained model is then used to predict the run-time of co-running applications in interference scenarios. In general, there is no straightforward one-to-one correspondence between samples obtained from the standalone and interference scenarios due to the different run-times, i.e., execution speeds. To address this, we developed a simple yet effective sample alignment algorithm, which is a key component in transforming interference prediction into a machine learning problem. In addition, we systematically identify the subset of features that has the highest positive impact on model performance. Our approach is demonstrated to be effective, with an average run-time prediction error as low as 0.3% and 0.1% for two co-running applications.
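The paper's sample alignment algorithm is not detailed in the abstract; the sketch below shows one plausible reading of the idea, aligning standalone and interference traces by normalized retired-instruction progress so that each interference sample gets a standalone counterpart. Everything here, including the alignment criterion, is an assumption.

```python
# Sketch: align samples from a standalone run and an interference run by
# normalized progress (fraction of retired instructions), so that each
# interference sample gets a standalone counterpart. This is an assumed
# interpretation of "sample alignment", not the paper's algorithm.
import numpy as np

def align_by_progress(standalone_instr, interference_instr):
    """Return, for every interference sample, the index of the standalone
    sample whose cumulative-instruction progress is closest."""
    s_prog = np.cumsum(standalone_instr) / np.sum(standalone_instr)
    i_prog = np.cumsum(interference_instr) / np.sum(interference_instr)
    # For each interference progress point, find the nearest standalone point.
    idx = np.searchsorted(s_prog, i_prog)
    return np.clip(idx, 0, len(s_prog) - 1)

rng = np.random.default_rng(0)
standalone = rng.integers(1_000, 2_000, size=300)    # instr. per sample period
interference = rng.integers(600, 1_500, size=420)    # slower under contention

pairs = align_by_progress(standalone, interference)
print(pairs[:10])  # standalone sample index matched to each interference sample
```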
Citations: 2
Massive Figure Extraction and Classification in Electronic Component Datasheets for Accelerating PCB Design Preparation
Pub Date : 2021-08-30 DOI: 10.1109/MLCAD52597.2021.9531275
Kuan-Chun Chen, Chou-Chen Lee, Mark Po-Hung Lin, Yan-Jhih Wang, Yi-Ting Chen
Before starting printed-circuit-board (PCB) design, it is usually very time-consuming for PCB and system designers to review a large number of electronic component datasheets in order to determine the best integration of electronic components for the target electronic systems. Each datasheet may contain over a hundred figures and tables, which usually present the most important electronic component specifications. This paper categorizes the various figures, including tables, in electronic component datasheets and proposes the ECS-YOLO model for massive figure extraction and classification in order to accelerate the PCB design preparation process. The experimental results show that, compared with a state-of-the-art object detection model, the proposed ECS-YOLO consistently achieves better accuracy for figure extraction and classification in electronic component datasheets.
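ECS-YOLO itself is not publicly described in the abstract; the sketch below only illustrates the surrounding pipeline, rendering datasheet pages and running an off-the-shelf YOLO detector over them. The model weights, class names, and the file name datasheet.pdf are placeholders.

```python
# Sketch: a generic figure-extraction pipeline over a datasheet PDF using an
# off-the-shelf YOLO model. ECS-YOLO is not available here; the weights,
# detected classes and "datasheet.pdf" are placeholders.
from pdf2image import convert_from_path
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                       # placeholder generic detector
pages = convert_from_path("datasheet.pdf", dpi=200)

for page_no, page in enumerate(pages, start=1):
    results = model.predict(page, verbose=False)
    for box in results[0].boxes:
        cls = results[0].names[int(box.cls)]
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        print(f"page {page_no}: {cls} at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
```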
Citations: 1
Delving into Macro Placement with Reinforcement Learning
Pub Date : 2021-08-30 DOI: 10.1109/MLCAD52597.2021.9531313
Zixuan Jiang, Ebrahim M. Songhori, Shen Wang, Anna Goldie, Azalia Mirhoseini, J. Jiang, Young-Joon Lee, David Z. Pan
In physical design, human designers typically place macros via trial and error, a process that can be formulated as a Markov decision process. Reinforcement learning (RL) methods have demonstrated superhuman performance on macro placement. In this paper, we propose an extension to this prior work [1]. We first describe the details of the policy and value network architecture. We then replace the force-directed method with DREAMPlace for placing standard cells in the RL environment. We also compare our improved method with other academic placers on public benchmarks.
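As a minimal illustration of macro placement as a sequential decision process, the sketch below defines a toy environment in which each action places one macro on a free grid cell and the episode reward is a negative wirelength proxy. The grid size, netlist, and random policy are toy assumptions; the paper's policy/value networks and the DREAMPlace-based standard-cell placement are not reproduced.

```python
# Sketch: macro placement as a sequential decision process. State is the
# grid occupancy, an action places the next macro in a free cell, and the
# final reward is a negative half-perimeter wirelength proxy.
import numpy as np

rng = np.random.default_rng(0)
GRID, N_MACROS = 8, 6
nets = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]  # assumed macro netlist

def episode(policy):
    occupancy = np.zeros((GRID, GRID), dtype=bool)
    positions = []
    for macro in range(N_MACROS):
        free = np.argwhere(~occupancy)            # legal actions: free cells
        cell = policy(occupancy, macro, free)
        occupancy[tuple(cell)] = True
        positions.append(cell)
    # Reward: negative half-perimeter wirelength over the macro nets.
    hpwl = sum(abs(positions[a][0] - positions[b][0]) +
               abs(positions[a][1] - positions[b][1]) for a, b in nets)
    return -float(hpwl)

def random_policy(occupancy, macro, free):
    return free[rng.integers(len(free))]

rewards = [episode(random_policy) for _ in range(100)]
print("mean reward of random policy:", np.mean(rewards))
```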
Citations: 7
A Survey of Graph Neural Networks for Electronic Design Automation
Pub Date : 2021-08-30 DOI: 10.1109/MLCAD52597.2021.9531070
Daniela Sánchez Lopera, Lorenzo Servadei, Gamze Naz Kiprit, Souvik Hazra, R. Wille, W. Ecker
Driven by Moore's law, chip design complexity is steadily increasing. Electronic Design Automation (EDA) has been able to cope with the challenging very-large-scale integration process, assuring scalability, reliability, and proper time-to-market. However, EDA approaches are time- and resource-demanding, and they often do not guarantee optimal solutions. To alleviate this, Machine Learning (ML) has been incorporated into many stages of the design flow, such as placement and routing. Many solutions employ Euclidean data and ML techniques without considering that many EDA objects are naturally represented as graphs. Graph Neural Networks (GNNs) thus present an opportunity to solve EDA problems directly on graph structures for circuits, intermediate RTLs, and netlists. In this paper, we present a comprehensive review of existing works linking the EDA flow for chip design with Graph Neural Networks.
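To make the graph formulation concrete, the sketch below runs one graph-convolution step over a tiny gate-level netlist represented as a normalized adjacency matrix, in plain PyTorch. The netlist, feature sizes, and single-layer model are illustrative assumptions rather than any method from the surveyed works.

```python
# Sketch: one graph-convolution step over a gate-level netlist represented
# as an adjacency matrix. The tiny netlist and feature sizes are assumptions.
import torch
import torch.nn as nn

# Assumed 5-cell netlist: edges connect driver and sink cells.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
n_cells, in_feat, out_feat = 5, 8, 16

adj = torch.zeros(n_cells, n_cells)
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0
adj += torch.eye(n_cells)                        # self-loops
deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)
norm_adj = deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]  # D^-1/2 A D^-1/2

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, a_norm):
        # Aggregate neighbour features, then transform.
        return torch.relu(self.lin(a_norm @ x))

x = torch.randn(n_cells, in_feat)                # per-cell features (type, fan-in, ...)
embeddings = GCNLayer(in_feat, out_feat)(x, norm_adj)
print(embeddings.shape)                          # torch.Size([5, 16])
```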
Citations: 27
Domain-Adaptive Soft Real-Time Hybrid Application Mapping for MPSoCs
Pub Date : 2021-08-30 DOI: 10.1109/MLCAD52597.2021.9531269
J. Spieck, S. Wildermann, Jürgen Teich
The mapping of soft real-time applications onto heterogeneous MPSoC architectures can have a strong influence on execution properties such as energy consumption or the number of deadline violations. In recent years, scenario-aware hybrid application mapping (HAM) has emerged as the state-of-the-art mapping method for input-dependent applications whose execution characteristics depend strongly on the input to be processed. In this work, we propose an extension of scenario-aware HAM that is capable of transferring a mapping strategy, learned from a labeled source data domain via supervised learning, into an unlabeled target domain that exhibits a shift in its data distribution. Our domain-adaptive HAM employs a run-time manager (RTM) that performs mapping selection and reconfiguration at run time based on general domain-invariant knowledge learned at design time, valid for both the source and target domain. Evaluation based on two input-dependent applications and two MPSoC architectures demonstrates that our domain-adaptive HAM consistently outperforms state-of-the-art mapping procedures with regard to the number of deadline misses and energy consumption in the presence of a domain shift. Furthermore, our HAM approach obtains results close to an explicit optimization for the target domain in a fraction of the necessary optimization time and without requiring target labels.
Citations: 4
Feeding Hungry Models Less: Deep Transfer Learning for Embedded Memory PPA Models (Special Session)
Pub Date : 2021-08-30 DOI: 10.1109/MLCAD52597.2021.9531299
F. Last, Ulf Schlichtmann
Supervised machine learning requires large amounts of labeled data for training. In power, performance and area (PPA) estimation of embedded memories, every new memory compiler version is considered independently of previous versions. Since the data of different memory compilers originate from similar domains, transfer learning can reduce the amount of supervised data required by pre-training PPA estimation neural networks on related domains. We show that the provisioning time of PPA models for new compiler versions can be reduced significantly by exploiting similarities across versions and technology nodes. Through transfer learning, we shorten the time to provision PPA models for new compiler versions by 50% to 90%, which speeds up time-critical periods of the design cycle. This is achieved by requiring fewer than 6,500 ground-truth samples for the target compiler, instead of 13,000, to reach average estimation errors of 0.35%. Using only 1,300 samples is sufficient to achieve an almost worst-case (98th-percentile) error of approximately 3% and allows us to shorten model provisioning times from over 40 days to less than one week.
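A minimal sketch of the transfer-learning recipe implied above: pre-train a PPA estimator on data from an older compiler version, freeze the early layers, and fine-tune only the remaining layers on a small target-version sample set. The data, network sizes, and freezing depth are assumptions.

```python
# Sketch: pre-train a PPA estimator on an older compiler version, then
# freeze the early layers and fine-tune the head on a small sample set
# from the new version. Data, sizes and the freezing depth are assumptions.
import torch
import torch.nn as nn

def make_model(n_params=10):
    return nn.Sequential(
        nn.Linear(n_params, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 3),            # outputs: power, performance, area
    )

def train(model, X, y, params, epochs=200, lr=1e-3):
    opt = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    return loss.item()

torch.manual_seed(0)
X_src, y_src = torch.randn(13000, 10), torch.randn(13000, 3)   # old compiler
X_tgt, y_tgt = torch.randn(1300, 10), torch.randn(1300, 3)     # new compiler

model = make_model()
train(model, X_src, y_src, model.parameters())                 # pre-train

for layer in list(model)[:2]:                                  # freeze first block
    for p in layer.parameters():
        p.requires_grad = False
head_params = [p for p in model.parameters() if p.requires_grad]
print("target loss:", train(model, X_tgt, y_tgt, head_params)) # fine-tune head
```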
Citations: 4