Towards Faster Execution of Ensemble ML Bootstrap Based Techniques.

Proceedings of the ... ICPP Workshops on. International Conference on Parallel Processing Workshops. Pub Date: 2021-08-01. DOI: 10.1145/3458744.3473365

Vinay B Gavirangaswamy, Vasilije Perović, Ajay Gupta, Hisham M Saleh

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8547788/pdf/nihms-1728656.pdf


Abstract

Algorithms for ensemble methods (EM) based on bootstrap aggregation often perform copious amounts of redundant computation (RC), which limits their practicality. Given this constraint, we propose a framework that views these algorithms as a collection of computational units (cu), each a tightly coupled set of mathematical operations and data. This view facilitates a reduction in RC (RRC), thereby allowing for faster execution plans. Inspired by the floor tiling approach in VLSI, we look to engineer solutions for RRC while possibly reconfiguring the underlying computing system's compiler technology stack. We start by showing that, under the assumptions that the computational system has unbounded but finite memory (i.e., the memory is large enough to hold all intermediate values) and that each cu has a uniform cost, our approach reduces to a well-studied directed bandwidth problem for directed acyclic graphs (DAGs). Next, we consider a more realistic scenario where the computing system has limited memory and concurrent execution, while still assuming a uniform cost. Using a new notion of an (r,s) set cover of a DAG (with nodes representing computational units and edges representing their interdependencies), we formulate the problem of reducing redundant computational steps in EM as a variation of a directed bandwidth problem. We show that the graph's minimum bandwidth is closely related to memory requirements for studying RRC. Finally, our preliminary experimental results support the proposed approach to RRC and suggest that it can be applied to a broader set of algorithms in decision sciences.
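To make the directed bandwidth objective mentioned in the abstract concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the standard textbook definition: over all topological orderings of a DAG, minimize the largest distance between the positions of the two endpoints of any edge. The function name `directed_bandwidth` and the toy four-node DAG are hypothetical, and the brute-force search is only practical for very small graphs.

```python
from itertools import permutations


def directed_bandwidth(nodes, edges):
    """Brute-force the minimum directed bandwidth of a small DAG.

    nodes: list of hashable node labels (hypothetical computational units).
    edges: list of (u, v) pairs meaning "v depends on the result of u".
    Returns (bandwidth, ordering), or None if no topological order exists.
    """
    best = None
    for order in permutations(nodes):
        pos = {n: i for i, n in enumerate(order)}
        # Skip orderings that violate a dependency (not topological).
        if any(pos[u] >= pos[v] for u, v in edges):
            continue
        # Longest span between a producer and its consumer in this order.
        width = max((pos[v] - pos[u] for u, v in edges), default=0)
        if best is None or width < best[0]:
            best = (width, order)
    return best


# Toy diamond-shaped DAG: a feeds b and c, which both feed d.
toy_nodes = ["a", "b", "c", "d"]
toy_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
bandwidth, ordering = directed_bandwidth(toy_nodes, toy_edges)
print(bandwidth)  # 2
```

Under the uniform-cost assumption, a smaller bandwidth means each intermediate result is consumed sooner after it is produced, which is one informal way to read the abstract's link between minimum bandwidth and memory requirements.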


