End-to-End Machine Learning with Apache AsterixDB

Wail Y. Alkowaileet, Sattam Alsubaiee, M. Carey, Chen Li, H. Ramampiaro, Phanwadee Sinthong, Xikui Wang
Proceedings of the Second Workshop on Data Management for End-to-End Machine Learning (2nd : 2018 : Houston, Tex.)
Published: 2018-06-15
DOI: 10.1145/3209889.3209894 (https://doi.org/10.1145/3209889.3209894)
Citation count: 11

Abstract

Recent developments in machine learning and data science provide a foundation for extracting underlying information from Big Data. Unfortunately, current platforms and tools often require data scientists to glue together and maintain custom-built platforms consisting of multiple Big Data component technologies. In this paper, we explain how Apache AsterixDB, an open source Big Data Management System, can help to reduce the burden involved in using machine learning algorithms in Big Data analytics. In particular, we describe how AsterixDB's built-in support for user-defined functions (UDFs), the availability of UDFs in data ingestion pipelines and queries, and the provision of machine learning platform and notebook inter-operation capabilities can together enable data analysts to more easily create and manage end-to-end analytical dataflows.
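
The idea of making one user-defined function available both in an ingestion pipeline and in queries can be illustrated with a minimal sketch. This is plain Python, not AsterixDB's UDF API: the names `sentiment_udf`, `ingest`, and `POSITIVE_WORDS` are hypothetical, and the keyword lookup is only a stand-in for a trained model.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]

# Hypothetical stand-in for a trained model: a tiny keyword list.
POSITIVE_WORDS = {"good", "great", "love"}


def sentiment_udf(record: Record) -> Record:
    """Toy UDF: return a copy of the record annotated with a sentiment label."""
    words = str(record["text"]).lower().split()
    score = sum(1 for word in words if word in POSITIVE_WORDS)
    label = "positive" if score > 0 else "neutral"
    return {**record, "sentiment": label}


def ingest(feed: Iterable[Record], udf: Callable[[Record], Record]) -> List[Record]:
    """Apply the UDF to every incoming record before it is stored."""
    return [udf(record) for record in feed]


feed = [
    {"id": 1, "text": "I love this phone"},
    {"id": 2, "text": "Battery died again"},
]

# Ingestion-time use: records are enriched once, as they arrive.
stored = ingest(feed, sentiment_udf)
positive_at_ingest = [r["id"] for r in stored if r["sentiment"] == "positive"]

# Query-time use: the same function is applied to raw records on demand.
positive_at_query = [
    r["id"] for r in feed if sentiment_udf(r)["sentiment"] == "positive"
]
```

Both paths select the same records; the difference is whether the cost of running the function is paid once at ingestion or on each query.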