
Proceedings of the 1st Workshop on Machine Learning and Systems: Latest Publications

DISC: A Dynamic Shape Compiler for Machine Learning Workloads
Pub Date: 2021-03-09 | DOI: 10.1145/3437984.3458838
Kai Zhu, Wenyi Zhao, Zhen Zheng, Tianyou Guo, Pengzhan Zhao, Junjie Bai, Jun Yang, Xiaoyong Liu, Lansong Diao, Wei Lin
Many recent machine learning models exhibit dynamic shape characteristics. However, existing AI compiler optimization systems handle dynamic shape models poorly, suffering from compilation overhead, high memory usage, complicated optimization pipelines, and deployment complexity. This paper presents DISC, a compiler system that natively supports optimization of dynamic shape workloads. DISC enriches a set of IRs to form a fully dynamic shape representation. It generates the runtime flow at compile time to support processing dynamic-shape-based logic, which avoids interpretation overhead at runtime and enlarges the opportunities for host-device co-optimization. It addresses the kernel fusion problem for dynamic shapes with shape propagation and constraint collection methods. This is the first work to demonstrate how to build an end-to-end dynamic shape compiler on the MLIR infrastructure. Experiments show that DISC achieves up to a 3.3× speedup over TensorFlow/PyTorch and 1.8× over Nimble.
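
To make the abstract's "shape propagation and constraint collection" concrete, here is a minimal Python sketch of the underlying idea: equality constraints collected over symbolic dimensions can prove that two element-wise ops with runtime-unknown shapes are safe to fuse. This is a loose analogue under assumed names, not DISC's actual MLIR-based implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SymDim:
    """One tensor dimension: a named symbol (dynamic) or a constant (static)."""
    name: Optional[str] = None    # e.g. "batch"; None when static
    value: Optional[int] = None   # e.g. 512; None when dynamic

class ShapeConstraints:
    """Union-find over symbolic dims, populated during shape propagation."""
    def __init__(self):
        self.parent = {}

    def find(self, d):
        while self.parent.get(d, d) != d:
            d = self.parent[d]
        return d

    def assert_equal(self, a, b):
        # Record that two dims were observed equal, e.g. from an op's
        # shape-transfer function during propagation.
        self.parent[self.find(a)] = self.find(b)

    def provably_equal(self, a, b):
        if a.value is not None and a.value == b.value:
            return True               # both static and identical
        return self.find(a) == self.find(b)

def can_fuse(shape_a, shape_b, constraints):
    """Element-wise ops are fusion candidates only if every dimension pair
    is provably equal under the collected constraints."""
    return len(shape_a) == len(shape_b) and all(
        constraints.provably_equal(x, y) for x, y in zip(shape_a, shape_b)
    )

batch = SymDim(name="batch")                      # unknown until runtime
c = ShapeConstraints()
print(can_fuse([batch, SymDim(value=512)],
               [batch, SymDim(value=512)], c))    # True: provably same shape
```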
Citations: 14
Towards a General Framework for ML-based Self-tuning Databases
Pub Date: 2020-11-16 | DOI: 10.1145/3437984.3458830
Thomas Schmied, Diego Didona, Andreas Doring, Thomas Parnell, Nikolas Ioannou
Machine learning (ML) methods have recently emerged as an effective way to perform automated parameter tuning of databases. State-of-the-art approaches include Bayesian optimization (BO) and reinforcement learning (RL). In this work, we describe our experience applying these methods to a database not yet studied in this context: FoundationDB. First, we describe the challenges we faced, such as unknown valid ranges of configuration parameters and combinations of parameter values that result in invalid runs, and how we mitigated them. While these issues are typically overlooked, we argue that they are a crucial barrier to the adoption of ML self-tuning techniques in databases, and thus deserve more attention from the research community. Second, we present experimental results obtained when tuning FoundationDB using ML methods. Unlike prior work in this domain, we also compare against the simplest of baselines: random search. Our results show that, while BO and RL methods can improve the throughput of FoundationDB by up to 38%, random search is a highly competitive baseline, finding a configuration whose throughput is only 4% worse than that found by the vastly more complex ML methods. We conclude that future work in this area may want to focus more on randomized, model-free optimization algorithms.
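
The random-search baseline is simple enough to sketch. The Python below uses hypothetical knob names and an invented invalid-combination check, not FoundationDB's real configuration surface; it shows the sample-evaluate-skip loop the abstract describes, including tolerating configurations that fail to run.

```python
import random

# Hypothetical knobs and ranges, for illustration only.
SEARCH_SPACE = {
    "cache_size_mb": (64, 4096),
    "num_threads": (1, 64),
    "batch_size": (16, 1024),
}

def sample_config(rng):
    return {k: rng.randint(lo, hi) for k, (lo, hi) in SEARCH_SPACE.items()}

def run_benchmark(config):
    """Stand-in for deploying `config` and measuring throughput.
    Returns None when the parameter combination is invalid."""
    if config["num_threads"] > config["batch_size"]:   # invented invalid rule
        return None
    return config["cache_size_mb"] * 0.1 + config["num_threads"] * 2.0

def random_search(budget, seed=0):
    rng = random.Random(seed)
    best_cfg, best_tput = None, float("-inf")
    for _ in range(budget):
        cfg = sample_config(rng)
        tput = run_benchmark(cfg)
        if tput is None:                               # invalid run: skip it
            continue
        if tput > best_tput:
            best_cfg, best_tput = cfg, tput
    return best_cfg, best_tput

print(random_search(budget=100))
```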
Citations: 8
μNAS: Constrained Neural Architecture Search for Microcontrollers
Pub Date: 2020-10-27 | DOI: 10.1145/3437984.3458836
Edgar Liberis, L. Dudziak, N. Lane
IoT devices are powered by microcontroller units (MCUs), which are extremely resource-scarce: a typical MCU may have an underpowered processor and around 64 KB of memory and persistent storage. Designing neural networks for such a platform requires an intricate balance between high predictive performance (accuracy) and low memory usage, storage usage, and inference latency. This is extremely challenging to achieve manually, so in this work we build a neural architecture search (NAS) system, called μNAS, to automate the design of such small-yet-powerful MCU-level networks. μNAS explicitly targets the three primary aspects of MCU resource scarcity: the size of RAM, persistent storage, and processor speed. μNAS represents a significant advance in resource-efficient models, especially for "mid-tier" MCUs with memory requirements ranging from 0.5 KB to 64 KB. We show that on a variety of image classification datasets, μNAS is able to (a) improve top-1 classification accuracy by up to 4.8%, or (b) reduce memory footprint by 4-13×, or (c) reduce the number of multiply-accumulate operations by at least 2×, compared to existing MCU specialist literature and resource-efficient models. μNAS is freely available for download at https://github.com/eliberis/uNAS
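
A minimal sketch of the constrained-search predicate the abstract describes: a candidate architecture is admissible only if its estimated peak RAM, model size, and multiply-accumulate count fit the target MCU. The cost estimators below are deliberately crude placeholders, not μNAS's actual cost models.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    layer_widths: list       # channels per conv layer, input first
    input_hw: int = 32       # square input resolution

def peak_ram_bytes(c):
    # Crude: the largest pair of adjacent int8 activation tensors (in + out).
    acts = [w * c.input_hw * c.input_hw for w in c.layer_widths]
    return max(a + b for a, b in zip(acts, acts[1:]))

def model_size_bytes(c):
    # Crude: 3x3 int8 conv kernels between consecutive layers.
    return sum(9 * a * b for a, b in zip(c.layer_widths, c.layer_widths[1:]))

def mac_count(c):
    return sum(9 * a * b * c.input_hw * c.input_hw
               for a, b in zip(c.layer_widths, c.layer_widths[1:]))

def fits_mcu(c, ram_limit=64_000, storage_limit=64_000, mac_limit=5_000_000):
    """The NAS loop would only score candidates that pass this predicate."""
    return (peak_ram_bytes(c) <= ram_limit
            and model_size_bytes(c) <= storage_limit
            and mac_count(c) <= mac_limit)

print(fits_mcu(Candidate(layer_widths=[3, 8, 16])))   # True for this tiny net
```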
Citations: 66
DPD-InfoGAN: Differentially Private Distributed InfoGAN
Pub Date: 2020-10-22 | DOI: 10.1145/3437984.3458826
Vaikkunth Mugunthan, V. Gokul, Lalana Kagal, S. Dubnov
Generative Adversarial Networks (GANs) are deep learning architectures capable of generating synthetic datasets. Despite producing high-quality synthetic images, the default GAN has no control over the kinds of images it generates. The Information Maximizing GAN (InfoGAN) is a variant of the default GAN that introduces feature-control variables that are automatically learned by the framework, providing greater control over the kinds of images produced. Due to the high model complexity of InfoGAN, the generative distribution tends to concentrate around the training data points. This is a critical problem, as the model may inadvertently expose sensitive and private information present in the dataset. To address this problem, we propose a differentially private version of InfoGAN (DP-InfoGAN). We also extend our framework to a distributed setting (DPD-InfoGAN) to allow clients to learn different attributes present in other clients' datasets in a privacy-preserving manner. In our experiments, we show that both DP-InfoGAN and DPD-InfoGAN can synthesize high-quality images with flexible control over image attributes while preserving privacy.
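
The standard mechanism for making a GAN update differentially private is DP-SGD-style gradient sanitisation: clip each per-example gradient to an L2 bound, then add Gaussian noise calibrated to that bound. The NumPy sketch below illustrates that general step under assumed parameter names; it is not the paper's exact DP-InfoGAN training loop.

```python
import numpy as np

def sanitize_gradients(per_example_grads, clip_norm, noise_multiplier, rng):
    """per_example_grads: array of shape (batch, num_params).
    Returns a noisy average gradient suitable for a private update."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale          # each row now has norm <= C
    summed = clipped.sum(axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
    batch = per_example_grads.shape[0]
    return (summed + noise) / batch

rng = np.random.default_rng(0)
grads = rng.normal(size=(8, 100))                # fake per-example gradients
private_grad = sanitize_gradients(grads, clip_norm=1.0,
                                  noise_multiplier=1.1, rng=rng)
print(private_grad.shape)                        # (100,)
```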
Citations: 8
AutoAblation
DOI: 10.1145/3437984.3458834
Sina Sheikholeslami, Moritz Meister, Tianze Wang, A. H. Payberah, Vladimir Vlassov, J. Dowling
Ablation studies provide insights into the relative contribution of different architectural and regularization components to machine learning models' performance. In this paper, we introduce AutoAblation, a new framework for the design and parallel execution of ablation experiments. AutoAblation provides a declarative approach to defining ablation experiments on model architectures and training datasets, and enables the parallel execution of ablation trials. This reduces the execution time and allows more comprehensive experiments by exploiting larger amounts of computational resources. We show that AutoAblation can provide near-linear scalability by performing an ablation study on the modules of the Inception-v3 network trained on the TenGeoPSAR dataset.
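
A minimal sketch of the declarative-plus-parallel workflow the abstract describes: declare which components to ablate, expand the declaration into one trial per ablated component plus a baseline, and run the trials in parallel. The configuration keys and training stub are illustrative, not AutoAblation's real API.

```python
from concurrent.futures import ProcessPoolExecutor

# Hypothetical declarative spec: which components exist, which to knock out.
ABLATION_SPEC = {
    "base_layers": ["conv1", "conv2", "dropout", "dense1"],
    "ablate": ["conv2", "dropout"],   # one trial per ablated component
}

def train_and_eval(layers):
    """Stand-in for building and training a model with `layers` present."""
    return 0.9 - 0.01 * ("dropout" not in layers)   # fake accuracy

def expand_trials(spec):
    base = spec["base_layers"]
    trials = {"baseline": base}
    for component in spec["ablate"]:
        trials[f"without_{component}"] = [l for l in base if l != component]
    return trials

if __name__ == "__main__":
    trials = expand_trials(ABLATION_SPEC)
    with ProcessPoolExecutor() as pool:            # trials run in parallel
        futures = {name: pool.submit(train_and_eval, layers)
                   for name, layers in trials.items()}
        for name, fut in futures.items():
            print(name, fut.result())
```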
Citations: 22
Vate
DOI: 10.1145/3437984.3458835
D. Goodman, A. Pocock, Jason Peck, Guy L. Steele
Inspired by earlier work on Augur, Vate is a probabilistic programming language for constructing JVM-based probabilistic models with an object-oriented interface. As a compiled language, it can examine the dependency graph of the model to produce optimised code that can be dynamically targeted to different platforms. Using Gibbs sampling, Metropolis-Hastings, and variable marginalisation, it can handle a range of model types and is able to efficiently infer values, estimate probabilities, and execute models.
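
To illustrate the Metropolis-Hastings inference the abstract mentions, here is a toy random-walk sampler over the posterior mean of Gaussian data. Vate itself is a compiled JVM language; this Python sketch shows only the algorithm, not Vate's syntax or runtime.

```python
import math
import random

def log_posterior(mu, data, prior_sd=10.0, noise_sd=1.0):
    lp = -0.5 * (mu / prior_sd) ** 2                  # N(0, prior_sd^2) prior
    lp += sum(-0.5 * ((x - mu) / noise_sd) ** 2 for x in data)
    return lp

def metropolis_hastings(data, steps=5000, step_sd=0.5, seed=0):
    rng = random.Random(seed)
    mu, samples = 0.0, []
    for _ in range(steps):
        proposal = mu + rng.gauss(0.0, step_sd)       # symmetric random walk
        log_accept = log_posterior(proposal, data) - log_posterior(mu, data)
        if rng.random() < math.exp(min(0.0, log_accept)):
            mu = proposal                             # accept the move
        samples.append(mu)
    return samples

data = [2.1, 1.9, 2.3, 2.0, 1.8]
samples = metropolis_hastings(data)
print(sum(samples[1000:]) / len(samples[1000:]))      # posterior mean, ~2.0
```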
Citations: 0