Targeted Least Cardinality Candidate Key for Relational Databases

Vasileios Nakos, Hung Q. Ngo, Charalampos E. Tsourakakis
arXiv - CS - Databases, published 2024-08-24. DOI: arxiv-2408.13540 (https://doi.org/arxiv-2408.13540)
Citations: 0

Abstract

Functional dependencies (FDs) are a central theme in databases, playing a major role in the design of database schemas and the optimization of queries. In this work, we introduce the targeted least cardinality candidate key problem (TCAND). The problem is defined over a set of functional dependencies $F$ on a variable set $V$ and a target variable set $T \subseteq V$, and it asks for the smallest set $X \subseteq V$ such that the FD $X \to T$ can be derived from $F$. TCAND generalizes the well-known NP-hard problem of finding the least cardinality candidate key~\cite{lucchesi1978candidate}, which has previously been shown to be at least as hard as the set cover problem. We present an integer programming (IP) formulation for TCAND, analogous to a layered set cover problem. We analyze its linear programming (LP) relaxation from two perspectives: we propose two approximation algorithms and investigate the integrality gap. Our findings indicate that the approximation upper bounds for our algorithms cannot be significantly improved through LP rounding, a notable distinction from the standard set cover problem. Additionally, we show that a generalization of TCAND is equivalent to a variant of the set cover problem, named red-blue set cover~\cite{carr1999red}, which cannot be approximated within a sub-polynomial factor in polynomial time under plausible conjectures~\cite{chlamtavc2023approximating}. Despite the long history of the least cardinality candidate key problem, our work contributes new theoretical insights and novel algorithms, and demonstrates that the general TCAND problem poses complexities beyond those encountered in the set cover problem.
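To make the problem statement concrete, the following is a minimal sketch of TCAND on a toy instance. It is not one of the paper's algorithms (those are based on an IP formulation and its LP relaxation); it is a plain brute-force search, exponential in $|V|$, that only illustrates the definition. The function names `closure` and `smallest_determining_set`, the representation of each FD as a pair of frozensets, and the example attributes are all assumptions made for illustration. The sketch relies on the standard fact that $X \to T$ is derivable from $F$ exactly when $T$ is contained in the attribute closure of $X$ under $F$.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs of frozensets (hypothetical encoding).
    """
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its left side is covered and it adds something new.
            if lhs <= closed and not rhs <= closed:
                closed |= rhs
                changed = True
    return closed


def smallest_determining_set(universe, fds, target):
    """Brute force: smallest X within `universe` whose closure contains `target`."""
    target = set(target)
    for size in range(len(universe) + 1):
        for candidate in combinations(sorted(universe), size):
            if target <= closure(candidate, fds):
                return set(candidate)
    return None  # not reached when target is a subset of universe


# Toy instance (illustrative only): A -> B, B -> C, CD -> E, with T = {C, E}.
V = {"A", "B", "C", "D", "E"}
F = [
    (frozenset("A"), frozenset("B")),
    (frozenset("B"), frozenset("C")),
    (frozenset("CD"), frozenset("E")),
]
T = {"C", "E"}

best = smallest_determining_set(V, F, T)
print(sorted(best))  # ['A', 'D']
```

Setting `T = V` in this sketch recovers the classical least cardinality candidate key problem that TCAND generalizes.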