Towards large-scale data discovery: position paper

R. Fernandez, Ziawasch Abedjan, S. Madden, M. Stonebraker
Published in: Proceedings of the Third International Workshop on Exploratory Search in Databases and the Web, 2016-06-26
DOI: 10.1145/2948674.2948675 (https://doi.org/10.1145/2948674.2948675)
Citations: 11

Abstract

With thousands of data sources spread across multiple databases and data lakes, modern organizations face a data discovery challenge. Analysts spend more time finding relevant data to answer the questions at hand than analyzing it. In this paper we introduce a data discovery system that facilitates locating relevant data among thousands of data sources. We represent data sources succinctly through signatures, and then create search paths that permit quick execution of a set of data discovery primitives used for finding relevant data. We have built a prototype that is being used to solve data discovery challenges of two big organizations.
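
As a rough illustration of the signature idea, the sketch below summarizes each column by a fixed-size signature and answers a "which columns have similar content?" primitive by comparing signatures rather than raw data. The abstract does not say how signatures or search paths are built, so everything here is an assumption: MinHash is swapped in as one plausible signature, the linear scan stands in for a real search path, and the names `signature`, `estimated_similarity`, and `similar_columns` are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

NUM_HASHES = 64  # signature length; an arbitrary choice for this sketch


def _hash(value: str, seed: int) -> int:
    """Deterministic 64-bit hash of a value, varied by seed."""
    digest = hashlib.blake2b(
        value.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def signature(values: Iterable[str]) -> Tuple[int, ...]:
    """MinHash signature of a column (assumed technique, not the paper's)."""
    distinct = set(values)
    if not distinct:
        raise ValueError("cannot build a signature for an empty column")
    # For each seeded hash function, keep the minimum hash over the column.
    return tuple(
        min(_hash(v, seed) for v in distinct) for seed in range(NUM_HASHES)
    )


def estimated_similarity(a: Tuple[int, ...], b: Tuple[int, ...]) -> float:
    """Fraction of matching positions, which estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similar_columns(
    query: str, index: Dict[str, Tuple[int, ...]], threshold: float
) -> List[Tuple[str, float]]:
    """Hypothetical discovery primitive: columns whose content resembles query.

    A linear scan over all signatures; a real system would need an index
    to avoid comparing against every column.
    """
    q = index[query]
    hits = [
        (name, estimated_similarity(q, sig))
        for name, sig in index.items()
        if name != query
    ]
    return sorted(
        [h for h in hits if h[1] >= threshold], key=lambda h: -h[1]
    )


# Build signatures once per column, then query without touching the data.
index = {
    "hr.employees.dept": signature(["sales", "eng", "ops"]),
    "finance.budget.department": signature(["eng", "ops", "sales"]),
    "crm.customers.country": signature(["de", "us", "fr"]),
}
matches = similar_columns("hr.employees.dept", index, threshold=0.5)
```

The point of the sketch is only that signatures are small and comparable, so a discovery query never has to read the underlying sources; which signature and which index structure to use are design choices the abstract leaves open.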