Efficient discovery of functional and approximate dependencies using partitions

Proceedings 14th International Conference on Data Engineering. Pub Date: 1998-02-23. DOI: 10.1109/ICDE.1998.655802

Ykä Huhtala, Juha Kärkkäinen, P. Porkka, Hannu (TT) Toivonen


Citations: 217

Abstract

Discovery of functional dependencies from relations has been identified as an important database analysis technique. We present a new approach for finding functional dependencies from large databases, based on partitioning the set of rows with respect to their attribute values. The use of partitions makes the discovery of approximate functional dependencies easy and efficient, and the erroneous or exceptional rows can be identified easily. Experiments show that the new algorithm is efficient in practice. For benchmark databases the running times are improved by several orders of magnitude over previously published results. The algorithm is also applicable to much larger datasets than the previous methods.
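The partition idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm or code. It only shows, on a toy relation, how grouping rows by attribute values gives a test for an exact dependency, an error measure for an approximate one, and the list of exceptional rows. The function names, the sample rows, and the choice of error measure (the minimum fraction of rows that must be removed for the dependency to hold) are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def partition(rows, attrs):
    """Group row indices into classes that agree on every attribute in attrs."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(i)
    return list(classes.values())

def holds(rows, lhs, rhs):
    """lhs -> rhs holds exactly when adding rhs does not split any class of lhs."""
    return len(partition(rows, lhs)) == len(partition(rows, lhs + [rhs]))

def exceptional_rows(rows, lhs, rhs):
    """Rows outside the most common rhs value of their lhs class (illustrative)."""
    bad = []
    for cls in partition(rows, lhs):
        majority, _ = Counter(rows[i][rhs] for i in cls).most_common(1)[0]
        bad.extend(i for i in cls if rows[i][rhs] != majority)
    return sorted(bad)

def error(rows, lhs, rhs):
    """Fraction of rows to remove so that lhs -> rhs holds (assumes rows is non-empty)."""
    return len(exceptional_rows(rows, lhs, rhs)) / len(rows)

# Hypothetical toy relation; row 2 violates zip -> city.
rows = [
    {"zip": "00100", "city": "Helsinki"},
    {"zip": "00100", "city": "Helsinki"},
    {"zip": "00100", "city": "Espoo"},
    {"zip": "33100", "city": "Tampere"},
]

print(holds(rows, ["zip"], "city"))             # False
print(error(rows, ["zip"], "city"))             # 0.25
print(exceptional_rows(rows, ["zip"], "city"))  # [2]
print(holds(rows, ["city"], "zip"))             # True
```

On this data, zip -> city fails as an exact dependency but holds approximately with error 0.25, and the single offending row is reported directly from the partition classes. The sketch recomputes each partition from scratch, which is fine for a toy example but would not scale to the large databases the abstract refers to.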


