DLHub: Model and Data Serving for Science

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) | Pub Date: 2018-11-27 | DOI: 10.1109/IPDPS.2019.00038

Ryan Chard, Zhuozhao Li, K. Chard, Logan T. Ward, Y. Babuji, A. Woodard, S. Tuecke, B. Blaiszik, M. Franklin, Ian T Foster


Citations: 66

Abstract

While the Machine Learning (ML) landscape is evolving rapidly, there has been a relative lag in the development of the "learning systems" needed to enable broad adoption. Furthermore, few such systems are designed to support the specialized requirements of scientific ML. Here we present the Data and Learning Hub for science (DLHub), a multi-tenant system that provides both model repository and serving capabilities with a focus on science applications. DLHub addresses two significant shortcomings in current systems. First, its self-service model repository allows users to share, publish, verify, reproduce, and reuse models, and addresses concerns related to model reproducibility by packaging and distributing models and all constituent components. Second, it implements scalable and low-latency serving capabilities that can leverage parallel and distributed computing resources to democratize access to published models through a simple web interface. Unlike other model serving frameworks, DLHub can store and serve any Python 3-compatible model or processing function, plus multiple-function pipelines. We show that relative to other model serving systems including TensorFlow Serving, SageMaker, and Clipper, DLHub provides greater capabilities, comparable performance without memoization and batching, and significantly better performance when the latter two techniques can be employed. We also describe early uses of DLHub for scientific applications.
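To make the serving ideas in the abstract concrete, here is a minimal in-process sketch of publishing arbitrary Python functions, memoizing repeated inputs, accepting batched inputs, and chaining functions into a pipeline. This is not DLHub's API: `ToyServer`, `publish`, `run`, `run_batch`, and `run_pipeline` are hypothetical names chosen for illustration, and the sketch assumes hashable inputs and ignores the networking, containers, and distributed execution that a real serving system involves.

```python
from functools import lru_cache
from typing import Any, Callable, Dict, Hashable, List, Sequence


class ToyServer:
    """Hypothetical in-process stand-in for a model-serving endpoint."""

    def __init__(self) -> None:
        self._functions: Dict[str, Callable[[Any], Any]] = {}
        # Number of times each published function actually executed.
        self.calls: Dict[str, int] = {}

    def publish(self, name: str, func: Callable[[Any], Any],
                memoize: bool = False) -> None:
        """Register any Python callable under a name."""
        self.calls[name] = 0

        def counted(x: Any) -> Any:
            self.calls[name] += 1
            return func(x)

        # Memoization: cache results so repeated inputs skip re-execution.
        # Assumes inputs are hashable.
        self._functions[name] = (
            lru_cache(maxsize=None)(counted) if memoize else counted
        )

    def run(self, name: str, x: Hashable) -> Any:
        """Serve a single input."""
        return self._functions[name](x)

    def run_batch(self, name: str, xs: Sequence[Hashable]) -> List[Any]:
        """Batching: one request carries many inputs, so any per-request
        overhead would be paid once. Here it is simply a loop."""
        f = self._functions[name]
        return [f(x) for x in xs]

    def run_pipeline(self, names: Sequence[str], x: Hashable) -> Any:
        """Multi-function pipeline: each output feeds the next function."""
        for name in names:
            x = self.run(name, x)
        return x
```

For example, after `server.publish("square", lambda v: v * v, memoize=True)`, calling `server.run("square", 3)` twice executes the underlying function only once; the second call is answered from the cache. The benefit of batching in a real deployment comes from amortizing network and dispatch costs, which this in-process sketch does not model.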

