Towards Faster Execution of Ensemble ML Bootstrap Based Techniques.

Vinay B Gavirangaswamy, Vasilije Perović, Ajay Gupta, Hisham M Saleh
Published in: Proceedings of the ... ICPP Workshops on. International Conference on Parallel Processing Workshops, volume 2021
Publication date: 2021-08-01
DOI: 10.1145/3458744.3473365 (https://doi.org/10.1145/3458744.3473365)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8547788/pdf/nihms-1728656.pdf

Abstract

Algorithms for ensemble methods (EM) based on bootstrap aggregation often perform copious amounts of redundant computation (RC), which limits their practicality. Given this constraint, we propose a framework that views these algorithms as a collection of computational units (cu), where each cu is a tightly coupled set of both mathematical operations and data. This view facilitates a reduction in RC (RRC), thereby allowing for faster execution plans. Inspired by the floor tiling approach in VLSI, we look to engineer solutions for RRC while possibly reconfiguring the underlying computing system's compiler technology stack. We start by showing that, under the assumptions that the computational system has unbounded but finite memory (i.e., the memory is large enough to hold all intermediate values) and that each cu has a uniform cost, our approach reduces to the well-studied directed bandwidth problem for directed acyclic graphs (DAGs). Next, we consider a more realistic scenario in which the computing system has limited memory and concurrent execution, while still assuming a uniform cost. Using a new notion of an (r,s) set cover of a DAG (with nodes representing computational units and edges representing their interdependencies), we formulate the problem of reducing redundant computational steps in EM as a variation of the directed bandwidth problem. We show that the graph's minimum bandwidth is closely related to the memory requirements for studying RRC. Finally, our preliminary experimental results support the proposed approach to RRC and suggest that it can be applied to a broader set of algorithms in decision sciences.
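
As a rough illustration of the directed bandwidth problem named in the abstract, the sketch below uses the standard textbook definition: among all topological orderings of a DAG, find one that minimizes the largest distance between the positions of the two endpoints of any edge. It is not the authors' method. The function name `directed_bandwidth`, the brute-force search, and the toy four-node graph are all assumptions made for illustration, and the search is only feasible for very small graphs.

```python
from itertools import permutations


def directed_bandwidth(nodes, edges):
    """Brute-force minimum directed bandwidth of a small DAG.

    Returns (bandwidth, ordering), or None when no topological
    ordering exists (i.e. the graph contains a cycle).
    """
    best = None
    for order in permutations(nodes):
        pos = {node: i for i, node in enumerate(order)}
        # Keep only topological orderings: every edge must point forward.
        if any(pos[u] >= pos[v] for u, v in edges):
            continue
        # Bandwidth of this ordering: the longest edge span in the layout.
        width = max((pos[v] - pos[u] for u, v in edges), default=0)
        if best is None or width < best[0]:
            best = (width, list(order))
    return best


# Hypothetical toy DAG: nodes stand in for computational units,
# edges for their interdependencies.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
result = directed_bandwidth(nodes, edges)
print(result)
```

Informally, a small bandwidth means every intermediate value is consumed soon after it is produced, which is one way to read the abstract's claim that minimum bandwidth is closely related to memory requirements.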
