Monitoring Evolution of Dependency Discovery Results

Loredana Caruccio, Stefano Cirillo
Journal: J. Vis. Lang. Sentient Syst.
DOI: https://doi.org/10.18293/jvlc2020-n2-007

Abstract

The automatic discovery of Functional Dependencies (FDs), and of their extensions, Relaxed Functional Dependencies (RFDs), from data is one of the main tasks in data profiling research. Several algorithms addressing the complex problem of RFD discovery have been recognized as fundamental tools for automatically collecting such dependencies from data. Moreover, scenarios involving big data require profiling tasks to become continuous, so that they can dynamically collect and update the set of RFDs holding on the analyzed data. In this context, one of the most critical scenarios is the discovery of RFDs over data streams. However, although discovery algorithms primarily aim at fast execution, analyzing the resulting RFDs also requires methods for continuously monitoring discovery results. One of the main goals is therefore to reduce the effort users spend navigating the potentially huge number of holding RFDs. To this end, we present DEVICE, a tool for continuously monitoring the resulting RFDs during the execution of discovery processes. In particular, it lets users analyze the evolution of results through a lattice representation of the search space, while zooming and filtering functionalities let them focus the analysis on a specific portion of that space. The effectiveness of the tool has been evaluated in a scenario that applies different discovery strategies to a well-known real-world dataset.
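
To make the notions of a holding dependency and a lattice-shaped search space concrete, the following is a minimal sketch in Python. It is not the DEVICE implementation and not one of the discovery algorithms the abstract refers to: it handles only exact FDs (no relaxation), works on a static in-memory table rather than a data stream, and all function names and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if the exact FD lhs -> rhs holds on rows.

    The FD holds when any two rows that agree on every attribute in lhs
    also agree on rhs.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = row[rhs]
        # setdefault stores the first rhs value seen for this lhs key
        # and returns the stored one on later encounters.
        if seen.setdefault(key, value) != value:
            return False
    return True


def lattice_levels(attributes):
    """Yield the levels of the attribute-set lattice, smallest sets first."""
    for size in range(1, len(attributes) + 1):
        yield [frozenset(c) for c in combinations(attributes, size)]


def holding_fds(rows, attributes):
    """Naively collect minimal FDs by walking the lattice level by level."""
    found = []
    for rhs in attributes:
        others = [a for a in attributes if a != rhs]
        minimal = []  # left-hand sides already known to determine rhs
        for level in lattice_levels(others):
            for lhs in level:
                # Skip supersets of a holding lhs: they hold trivially.
                if any(m <= lhs for m in minimal):
                    continue
                if fd_holds(rows, sorted(lhs), rhs):
                    minimal.append(lhs)
                    found.append((tuple(sorted(lhs)), rhs))
    return found


# Hypothetical toy table, invented for this example.
rows = [
    {"city": "Rome", "zip": "00100", "country": "IT"},
    {"city": "Rome", "zip": "00118", "country": "IT"},
    {"city": "Paris", "zip": "75001", "country": "FR"},
    {"city": "Milan", "zip": "20121", "country": "IT"},
]

for lhs, rhs in holding_fds(rows, ["city", "zip", "country"]):
    print(", ".join(lhs), "->", rhs)
```

On this toy table the sketch reports zip -> city, city -> country, and zip -> country. The number of candidate left-hand sides grows exponentially with the number of attributes, which is one reason a view of the search space with zooming and filtering can help when many dependencies hold.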