Extending Apache Spark with a Mediation Layer

Dimitris Stripelis, Chrysovalantis Anastasiou, J. Ambite
{"title":"Extending Apache Spark with a Mediation Layer","authors":"Dimitris Stripelis, Chrysovalantis Anastasiou, J. Ambite","doi":"10.1145/3208352.3208354","DOIUrl":null,"url":null,"abstract":"With the recent growth of data volumes in many disciplines of both industry and academia many new Big Data Management systems have emerged to provide scalable tools for efficient data storing, processing and analysis. However, most of these systems offer little support for efficiently integrating multiple external sources under a uniform schema and a single query access point, which greatly simplifies further analytics. In this work, we present Spark Mediator, a system that extends the logical data integration capabilities of Apache Spark. As a use case, we show the application of Spark Mediator to the integration of schizophrenia neuroimaging data and compare with previous data integration systems.","PeriodicalId":210506,"journal":{"name":"Proceedings of the International Workshop on Semantic Big Data","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2018-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the International Workshop on Semantic Big Data","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3208352.3208354","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

With the recent growth of data volumes in many disciplines of both industry and academia many new Big Data Management systems have emerged to provide scalable tools for efficient data storing, processing and analysis. However, most of these systems offer little support for efficiently integrating multiple external sources under a uniform schema and a single query access point, which greatly simplifies further analytics. In this work, we present Spark Mediator, a system that extends the logical data integration capabilities of Apache Spark. As a use case, we show the application of Spark Mediator to the integration of schizophrenia neuroimaging data and compare with previous data integration systems.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
用中介层扩展Apache Spark
随着近年来工业界和学术界许多学科数据量的增长,出现了许多新的大数据管理系统,为有效的数据存储、处理和分析提供了可扩展的工具。然而,这些系统中的大多数都不支持在统一模式和单个查询访问点下有效地集成多个外部源,这极大地简化了进一步的分析。在这项工作中,我们提出了Spark Mediator,一个扩展Apache Spark的逻辑数据集成功能的系统。作为一个用例,我们展示了Spark Mediator在精神分裂症神经成像数据集成中的应用,并与以前的数据集成系统进行了比较。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Triag, a framework based on triangles of RDF triples Towards making sense of Spark-SQL performance for processing vast distributed RDF datasets What is the schema of your knowledge graph?: leveraging knowledge graph embeddings and clustering for expressive taxonomy learning Ten ways of leveraging ontologies for natural language processing and its enterprise applications Relaxing global-as-view in mediated data integration from linked data
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1