Pragmatic estimation of join sizes and attribute correlations
Authors: D. Bell, D. H. O. Ling, S. McClean
Published in: [1989] Proceedings. Fifth International Conference on Data Engineering, 1989-02-06
DOI: https://doi.org/10.1109/ICDE.1989.47202
A method is presented for modeling attribute value distributions in database relations for the purpose of obtaining accurate estimates of intermediate relation sizes during query evaluation. The basic idea is that instead of keeping a single (average) value to represent the number of occurrences of each attribute value, m (typically ten) parameters are kept, each representing the number of occurrences of attribute values in a piece, or partition, corresponding to a subrange of 1/m of the original value range. The uniformity assumption, taken as an estimation technique rather than as an assumption, is applied within each partition, hence the name piecewise uniform. The distribution method is extended to the modeling of important intrarelational attribute correlations. This extension and other enhancements to the technique, such as application to the semijoin operation, are suggested. The technique is being used on two multidatabase management systems.
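The abstract does not give the estimation formulas, so the following Python sketch is only one plausible reading of the piecewise uniform idea, not the authors' implementation. It assumes an integer-valued join attribute, the same value range [lo, hi] and the same m partitions for both relations, and that every integer in a partition is a possible attribute value. The function names (`bucket_widths`, `build_histogram`, `estimate_join_size`, `estimate_join_size_single_value`) are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def bucket_widths(lo: int, hi: int, m: int = 10) -> List[int]:
    """Number of integer values covered by each of m subranges of [lo, hi]."""
    if m <= 0 or hi < lo:
        raise ValueError("need m > 0 and lo <= hi")
    span = hi - lo + 1
    # edges[i] = ceil(i * span / m), computed with integer arithmetic only.
    edges = [-(-i * span // m) for i in range(m + 1)]
    return [edges[i + 1] - edges[i] for i in range(m)]


def build_histogram(values: Sequence[int], lo: int, hi: int, m: int = 10) -> List[int]:
    """Keep m parameters: the tuple count in each subrange of [lo, hi]."""
    if m <= 0 or hi < lo:
        raise ValueError("need m > 0 and lo <= hi")
    span = hi - lo + 1
    counts = [0] * m
    for v in values:
        if not lo <= v <= hi:
            raise ValueError(f"value {v} lies outside [{lo}, {hi}]")
        counts[(v - lo) * m // span] += 1
    return counts


def estimate_join_size(r_counts: Sequence[int], s_counts: Sequence[int],
                       widths: Sequence[int]) -> float:
    """Apply the uniformity estimate inside each partition and sum the results.

    Assumption (not from the abstract): within a partition, tuples are spread
    evenly over all `width` possible values, so the partition contributes
    r * s / width matching pairs.
    """
    total = 0.0
    for r, s, w in zip(r_counts, s_counts, widths):
        if w > 0:
            total += r * s / w
    return total


def estimate_join_size_single_value(r_total: int, s_total: int, span: int) -> float:
    """Baseline: one average value for the whole range (m = 1)."""
    return r_total * s_total / span


# Illustrative, made-up data: R is skewed towards the value 5.
R = [5] * 50 + list(range(50, 100))
S = [5] * 20 + [95] * 30
widths = bucket_widths(0, 99, 10)
r_hist = build_histogram(R, 0, 99, 10)
s_hist = build_histogram(S, 0, 99, 10)
print(estimate_join_size(r_hist, s_hist, widths))
print(estimate_join_size_single_value(len(R), len(S), 100))
```

On this made-up data the per-partition estimate is 130.0, while the single-average baseline gives 50.0; the partitioned estimate moves in the direction of the skew because the heavy value 5 is confined to the first subrange. A real optimizer would likely also track the number of distinct values per partition rather than assuming every value in the subrange occurs.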