A method is presented for modeling attribute value distributions in database relations for the purpose of obtaining accurate estimates of intermediate relation sizes during query evaluation. The basic idea is that instead of keeping a single (average) value to represent the number of occurrences of each attribute value, m (typically ten) parameters are kept, each representing the number of occurrences of attribute values in a piece, or partition, corresponding to a subrange of 1/mth of the original value range. The uniformity assumption, taken as an estimation technique rather than as an assumption, holds for each partition, hence the name piecewise uniform. The distribution method is extended to the modeling of important intrarelational attribute correlations. This and other enhancements to the technique, such as application to the semijoin operation, are suggested. The technique is being used on two multidatabase management systems.
{"title":"Pragmatic estimation of join sizes and attribute correlations","authors":"D. Bell, D. H. O. Ling, S. McClean","doi":"10.1109/ICDE.1989.47202","DOIUrl":"https://doi.org/10.1109/ICDE.1989.47202","url":null,"abstract":"A method is presented for modeling attribute value distributions in database relations for the purpose of obtaining accurate estimates of intermediate relation sizes during query evaluation. The basic idea is that instead of keeping a single (average) value to represent the number of occurrences of each attribute value, m (typically ten) parameters are kept, each representing the number of occurrences of attribute values in a piece, or partition, corresponding to a subrange of 1/mth of the original value range. The uniformity assumption, taken as an estimation technique rather than as an assumption, holds for each partition, hence the name piecewise uniform. The distribution method is extended to the modeling of important intrarelational attribute correlations. This and other enhancements to the technique such as application to semijoin operation are suggested. The technique is being used on two multidatabase management systems.<<ETX>>","PeriodicalId":329505,"journal":{"name":"[1989] Proceedings. Fifth International Conference on Data Engineering","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115343045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
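As a rough illustration of the piecewise uniform idea in the Bell, Ling, and McClean abstract above, the following Python sketch keeps m tuple counts per attribute and estimates an equijoin size partition by partition. It is not the authors' implementation: the function names, the integer value domain, the shared value range for both relations, and the assumption that every value in a subrange actually occurs are all choices made here for illustration.

```python
from collections import Counter


def build_histogram(values, lo, hi, m=10):
    """Count tuples in each of m equal-width partitions of the range [lo, hi)."""
    width = (hi - lo) / m
    counts = [0] * m
    for v in values:
        # Clamp so that a value on the upper edge lands in the last partition.
        idx = min(int((v - lo) / width), m - 1)
        counts[idx] += 1
    return counts


def estimate_join_size(hist_r, hist_s, lo, hi):
    """Estimate |R join S| assuming uniformity inside each partition.

    Assumption: the domain is integer and every value of a partition occurs,
    so a partition of the given width holds `width` distinct values and each
    value appears count / width times.
    """
    width = (hi - lo) / len(hist_r)
    return sum(cr * cs / width for cr, cs in zip(hist_r, hist_s))


def exact_join_size(r_values, s_values):
    """Exact equijoin size, used only to check the estimate."""
    freq_r, freq_s = Counter(r_values), Counter(s_values)
    return sum(freq_r[v] * freq_s[v] for v in freq_r)
```

With a skewed attribute (values 0 to 9 occurring five times each, values 10 to 99 once each, in both relations), ten partitions recover the exact join size of 340, whereas a single average over the whole range (m = 1) gives 196.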
A multiattribute index structure called the hB-tree is introduced. The hB-tree internode search and growth processes are precisely analogous to the corresponding processes in B-trees. The intranode processes are unique. A k-d tree is used as the structure within nodes for very efficient searching. Node splitting requires that this k-d tree be split. This produces nodes which do not represent brick-like regions in k-space but that can be characterized as holey bricks, i.e. bricks in which subregions have been extracted. Results are presented that guarantee hB-tree users decent storage utilization, reasonable-size index terms, and good search and insert performance regardless of key distribution.
{"title":"A robust multi-attribute search structure","authors":"D. Lomet, B. Salzberg","doi":"10.1109/ICDE.1989.47229","DOIUrl":"https://doi.org/10.1109/ICDE.1989.47229","url":null,"abstract":"A multiattribute index structure called the hB-tree is introduced. The hB-tree internode search and growth processes are precisely analogous to the corresponding processes in B-trees. The intranode processes are unique. A k-d tree is used as the structure within nodes for very efficient searching. Node splitting requires that this k-d tree be split. This produces nodes which do not represent brick-like regions in k-space but that can be characterized as holey bricks, i.e. bricks in which subregions have been extracted. Results are presented that guarantee hB-tree users decent storage utilization, reasonable-size index terms, and good search and insert performance regardless of key distribution.<<ETX>>","PeriodicalId":329505,"journal":{"name":"[1989] Proceedings. Fifth International Conference on Data Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130970972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The read and write availabilities of replicated data managed by the regeneration algorithm, a replica control protocol based on file regeneration, are evaluated, and two regeneration protocols are presented that overcome some of its limitations. The first protocol combines regeneration and the available copy approach to improve availability of replicated data. The second combines regeneration and the dynamic voting approach to guarantee data consistency in the presence of network partitions while maintaining a high availability. Expressions for the availabilities of replicated data managed by both protocols are derived and found to improve significantly on the availability achieved using extant consistency protocols.
{"title":"Regeneration protocols for replicated objects","authors":"D. Long, Jehan-Francois Pâris","doi":"10.1109/ICDE.1989.47260","DOIUrl":"https://doi.org/10.1109/ICDE.1989.47260","url":null,"abstract":"The read and write availabilities of replicated data managed by the regeneration algorithm, a replica control protocol based on file regeneration, are evaluated, and two regeneration protocols are presented that overcome some of its limitations. The first protocol combines regeneration and the available copy approach to improve availability of replicated data. The second combines regeneration and the dynamic voting approach to guarantee data consistency in the presence of network partitions while maintaining a high availability. Expressions for the availabilities of replicated data managed by both protocols are derived and found to improve significantly on the availability achieved using extant consistency protocols.<<ETX>>","PeriodicalId":329505,"journal":{"name":"[1989] Proceedings. Fifth International Conference on Data Engineering","volume":" 12","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113952530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Three known algorithms for relational division, the algebra operator used to express universal quantification (for-all conditions), and an algorithm called hash-division are outlined. By comparing the algorithms analytically and experimentally, it is shown that hash-division provides performance competitive with or superior to that of techniques used to date, namely techniques using sorting or aggregate functions. Furthermore, the algorithm can eliminate duplicates in the divisor on the fly, ignores duplicates in the dividend, and allows two kinds of partitioning, either of which can be used to resolve hash table overflow or to efficiently implement the algorithm on a multiprocessor system.
{"title":"Relational division: four algorithms and their performance","authors":"G. Graefe","doi":"10.1109/ICDE.1989.47204","DOIUrl":"https://doi.org/10.1109/ICDE.1989.47204","url":null,"abstract":"Three known algorithms for relational division, the algebra operator used to express universal quantification (for-all conditions) and an algorithm called hash-division are outlined. By comparing the algorithms analytically and experimentally, it is shown that the algorithm provides performance competitive with or superior to that of techniques used to date, namely techniques using sorting or aggregate functions. Furthermore, the algorithm can eliminate duplicates in the divisor on the fly, ignores duplicates in the dividend, and allows two kinds of partitioning, either of which can be used to resolve hash table overflow or to efficiently implement the algorithm on a multiprocessor system.<<ETX>>","PeriodicalId":329505,"journal":{"name":"[1989] Proceedings. Fifth International Conference on Data Engineering","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114877515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
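The properties listed in the Graefe abstract above (duplicate divisor tuples dropped on the fly, duplicate dividend tuples ignored) can be seen in a small in-memory sketch of a hash-based division. This is an illustration only, assuming the dividend is a list of (quotient, divisor) pairs and using a Python integer as the bitmap; the function name is hypothetical, and partitioning for overflow or multiprocessors is not shown.

```python
def hash_division(dividend, divisor):
    """Return the quotient values that are paired with every divisor value.

    dividend: iterable of (quotient_value, divisor_value) pairs
    divisor:  iterable of divisor values (duplicates allowed)
    """
    # Divisor table: give each distinct divisor value a bit position.
    # A repeated divisor value is simply skipped.
    divisor_index = {}
    for d in divisor:
        if d not in divisor_index:
            divisor_index[d] = len(divisor_index)
    full_bitmap = (1 << len(divisor_index)) - 1

    # Quotient table: one bitmap per candidate quotient value.
    bitmaps = {}
    for q, d in dividend:
        bit = divisor_index.get(d)
        if bit is None:
            continue  # this dividend tuple matches no divisor value
        # Setting a bit twice changes nothing, so dividend duplicates are harmless.
        bitmaps[q] = bitmaps.get(q, 0) | (1 << bit)

    # Note: with an empty divisor this sketch returns an empty set.
    return {q for q, bits in bitmaps.items() if bits == full_bitmap}
```

For example, dividing enrolment pairs by the course list `["db", "os"]` keeps only the students enrolled in both courses, regardless of any other courses they take.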
The problem of determining the vote assignment and quorum that yield the highest availability is discussed for a system where node availabilities can differ and the mix of read and write operations is arbitrary. For this purpose, an enumeration algorithm is presented that can be used to find the vote and quorum assignments that need to be considered for achieving optimal availability. An analytical method is derived to evaluate the availability of a given system for any vote and quorum assignment. This method and the enumeration algorithm are used to find the optimal vote and quorum assignment for several systems. The algorithm can also be used to obtain the optimal performance when other measures are considered.
{"title":"Optimizing vote and quorum assignments for reading and writing replicated data","authors":"S. Y. Cheung, M. Ahamad, M. Ammar","doi":"10.1109/ICDE.1989.47226","DOIUrl":"https://doi.org/10.1109/ICDE.1989.47226","url":null,"abstract":"The problem is discussed of determining the vote assignment and quorum that yields the highest availability in a system where node availabilities can be different and the mix of the read and write operations is arbitrary. For this purpose, an enumeration algorithm is presented that can be used to find the vote and quorum assignments that need to be considered for achieving optimal availability. An analytical method is derived to evaluate the availability of a given system for any vote and quorum assignment. This method and the enumeration algorithm are used to find the optimal vote and quorum assignment for several systems. The algorithm can also be used to obtain the optimal performance when other measures are considered.<<ETX>>","PeriodicalId":329505,"journal":{"name":"[1989] Proceedings. Fifth International Conference on Data Engineering","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123367087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
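To make the quantity being optimized in the Cheung, Ahamad, and Ammar abstract above concrete, the sketch below evaluates availability for a given vote assignment by brute-force enumeration of node states and then picks read and write quorums for a given operation mix. It is not the paper's enumeration algorithm or analytical method. It assumes independent node failures, no network partitions, and the usual weighted-voting constraints (read quorum plus write quorum exceeds the total votes, and the write quorum exceeds half of them); all names are hypothetical.

```python
from itertools import product


def quorum_availability(votes, up_probs, quorum):
    """Probability that the nodes that are up hold at least `quorum` votes."""
    total = 0.0
    for state in product([True, False], repeat=len(votes)):
        prob = 1.0
        gathered = 0
        for is_up, v, p in zip(state, votes, up_probs):
            prob *= p if is_up else (1.0 - p)
            if is_up:
                gathered += v
        if gathered >= quorum:
            total += prob
    return total


def best_quorums(votes, up_probs, read_fraction):
    """Return (availability, read_quorum, write_quorum) maximizing the mix.

    Availability is read_fraction * A_read + (1 - read_fraction) * A_write.
    """
    total_votes = sum(votes)
    best = None
    for r in range(1, total_votes + 1):
        # Smallest write quorum satisfying r + w > T and 2w > T.
        w = max(total_votes + 1 - r, total_votes // 2 + 1)
        score = (read_fraction * quorum_availability(votes, up_probs, r)
                 + (1.0 - read_fraction) * quorum_availability(votes, up_probs, w))
        if best is None or score > best[0]:
            best = (score, r, w)
    return best
```

With three nodes of availability 0.9 holding one vote each, an even read/write mix favors quorums of 2 and 2, while a 95% read mix favors a read quorum of 1 and a write quorum of 3.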
The authors present a portion of the EARL semantic data model, an augmentation of the entity-relationship model, which uses notational constructs based on propositional logic to allow a greater selection of semantic integrity constraints to be captured during the specification and design of a domain schema. The use of these logic-based constructs is shown to allow more subtle constraints to be specified, so that ambiguity can be reduced when placing constraints on the universe of discourse for database or knowledge-base applications. The notion of what constitutes a semantic constraint is discussed, and notational constructs with informal semantics are defined for the EARL model. Several examples are presented.
{"title":"Modeling semantic constraints with logic in the EARL data model","authors":"James P. Davis, R. Bonnell","doi":"10.1109/ICDE.1989.47218","DOIUrl":"https://doi.org/10.1109/ICDE.1989.47218","url":null,"abstract":"The authors present a portion of the EARL semantic data model, an augmentation of the entity-relationship model, which uses notational constructs based on propositional logic to allow a greater selection of semantic integrity constraints to be captured during the specification and design of a domain schema. The use of these logic-based constructs is shown to allow more subtle constraints to be specified, so that ambiguity can be reduced when placing constraints on the universe of discourse for database or knowledge-base applications. The notion of what are semantic constraints is discussed, and notational constructs with informal semantics are defined for the EARL model. Several examples are presented.<<ETX>>","PeriodicalId":329505,"journal":{"name":"[1989] Proceedings. Fifth International Conference on Data Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130119625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Methods are discussed to enhance the efficiency and speed of data compression techniques in DBMS (database management systems). Arithmetic coding utilizes the skewness of character distribution by assigning larger intervals (code ranges) to characters having higher probabilities of occurrence. A scheme is presented which effectively increases the code ranges of individual characters by splitting the interval assignment into different groups. This decreases the rate of interval narrowing and hence improves the compression efficiency. Hardware assistance for arithmetic and tree-based coding is also discussed and high-speed VLSI algorithms for data compression are presented. The proposed algorithms give rates that are an order of magnitude faster than currently attainable encoding speeds.
{"title":"On software and hardware techniques of data engineering","authors":"M. Bassiouni, A. Mukherjee, N. Ranganathan","doi":"10.1109/ICDE.1989.47216","DOIUrl":"https://doi.org/10.1109/ICDE.1989.47216","url":null,"abstract":"Methods are discussed to enhance the efficiency and speed of data compression techniques in DBMS (database management systems). Arithmetic coding utilizes the skewness of character distribution by assigning larger intervals (code ranges) to characters having higher probabilities of occurrence. A scheme is presented which effectively increases the code ranges of individual characters by splitting the interval assignment into different groups. This decreases the rate of interval narrowing and hence improves the compression efficiency. Hardware assistance for arithmetic and tree-based coding is also discussed and high-speed VLSI algorithms for data compression are presented. The proposed algorithms give rates that are an order of magnitude faster than currently attainable encoding speeds.<<ETX>>","PeriodicalId":329505,"journal":{"name":"[1989] Proceedings. Fifth International Conference on Data Engineering","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130123815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
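The interval narrowing mentioned in the Bassiouni, Mukherjee, and Ranganathan abstract above can be traced with a minimal arithmetic coder over exact fractions. This shows only plain arithmetic coding with a fixed probability table, not the group-splitting scheme or the VLSI algorithms of the paper; the function names and the two-symbol alphabet in the usage note are assumptions for illustration.

```python
from fractions import Fraction


def build_intervals(probabilities):
    """Assign each symbol a subinterval of [0, 1) as wide as its probability."""
    intervals, low = {}, Fraction(0)
    for symbol, p in probabilities.items():
        intervals[symbol] = (low, low + p)
        low += p
    return intervals


def encode(message, intervals):
    """Narrow [0, 1) once per symbol; return the final interval [low, high)."""
    low, width = Fraction(0), Fraction(1)
    for symbol in message:
        s_low, s_high = intervals[symbol]
        low = low + width * s_low
        width = width * (s_high - s_low)  # likely symbols shrink the width less
    return low, low + width


def decode(value, length, intervals):
    """Recover `length` symbols from any value inside the final interval."""
    out = []
    for _ in range(length):
        for symbol, (s_low, s_high) in intervals.items():
            if s_low <= value < s_high:
                out.append(symbol)
                # Rescale the value back to [0, 1) for the next symbol.
                value = (value - s_low) / (s_high - s_low)
                break
    return "".join(out)
```

With probabilities 3/4 for `a` and 1/4 for `b`, encoding `aab` ends in the interval [27/64, 9/16). Its width, 9/64, is the product of the three symbol probabilities, so a wider final interval (fewer bits needed to name a point in it) follows directly from wider per-symbol code ranges.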
The author maintains that loose AI-DBMS (artificial intelligence/database management system) coupling will thrive in the future. The specific approaches adopted for the coupling (loose or tight) of AI-DBMS systems will depend on engineering, economic, social, and organizational factors. A number of applications involving the processing of knowledge and data are presented and classified in terms of their need for loose or tight coupling. The author asserts that both approaches are valid expert database system architectures, and both have a role in present and future systems and applications.
{"title":"The role of loose coupling in expert database system architectures","authors":"L. Kerschberg","doi":"10.1109/ICDE.1989.47222","DOIUrl":"https://doi.org/10.1109/ICDE.1989.47222","url":null,"abstract":"The author maintains that loose AI-DBMS (artificial intelligence-database-management system) coupling will thrive in the future. The specific approaches adopted for the coupling (loose or tight) of AI-DBMS systems will depend on engineering, economic, social, organizational factors. A number of applications involving the processing of knowledge and data are presented and classified in terms of their need for loose or tight coupling. The author asserts that both approaches are valid expert database system architectures, and both have a role in present and future systems and applications.<<ETX>>","PeriodicalId":329505,"journal":{"name":"[1989] Proceedings. Fifth International Conference on Data Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129987376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An approach known as constraint analysis is presented for specifying updatable perspectives of class objects in an object-oriented data schema. Perspectives provide a way of defining the scope of a user's view of a schema. Operations on class objects are then defined in terms of specific perspectives based on the semantics associated with the schema. Schema semantics are represented as a set of integrity constraints expressed in Horn logic. The constraint analysis process then reasons about schema constraints to support a flexible approach to update propagation. The advantage of constraint analysis is that both inherent and explicit constraints can be used to support the automated specification of updatable perspectives that preserve object integrity.
{"title":"Constraint analysis for specifying perspectives of class objects","authors":"S. Urban, L. Delcambre","doi":"10.1109/ICDE.1989.47195","DOIUrl":"https://doi.org/10.1109/ICDE.1989.47195","url":null,"abstract":"An approach known as constraint analysis is presented for specifying updatable perspectives of class objects in an object-oriented data schema. Perspectives provide a way of defining the scope of a user's view of a schema. Operations on class objects are then defined in terms of specific perspectives based on the semantics associated with the schema. Schema semantics are represented as a set of integrity constraints expressed in Horn logic. The constraint analysis process then reasons about schema constraints to support a flexible approach to update propagation. The advantage of constraint analysis is that both inherent and explicit constraints can be used to support the automated specification of updatable perspectives that preserve object integrity.<<ETX>>","PeriodicalId":329505,"journal":{"name":"[1989] Proceedings. Fifth International Conference on Data Engineering","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125987368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The use of the Steiner tree (or the shortest connection) as a model for automatic logical navigation among relations is described. An efficient approximation algorithm is introduced for finding shortest connections. The model is extended to allow for ambiguous words in queries. The capability of this method is demonstrated by a simple database interface which is both easy to build and easy to use. The method can also be used in other query systems as a tool for finding navigation routes.
{"title":"Automated logical navigation among relations using Steiner trees","authors":"Dekang Lin","doi":"10.1109/ICDE.1989.47265","DOIUrl":"https://doi.org/10.1109/ICDE.1989.47265","url":null,"abstract":"The use of the Steiner tree (or the shortest connection) as a model for automatic logical navigation among relations is described. An efficient approximation algorithm is introduced for finding shortest connections. The model is extended to allow for ambiguous words in queries. The capability of this method is demonstrated by a simple database interface which is both easy to build and easy to use. The method can also be used in other query systems as a tool for finding navigation routes.<<ETX>>","PeriodicalId":329505,"journal":{"name":"[1989] Proceedings. Fifth International Conference on Data Engineering","volume":"181 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123189921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
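The Lin abstract above does not say which approximation algorithm is used, so the sketch below substitutes a well-known shortest-path heuristic for Steiner trees: start from one queried relation and repeatedly attach the nearest remaining one by a shortest path. It assumes the schema is an unweighted, undirected graph given as an adjacency dictionary whose nodes are relation names; the function name and the example schema in the usage note are hypothetical.

```python
from collections import deque


def connect_relations(graph, terminals):
    """Approximate a Steiner tree over `terminals`; return a set of edges.

    graph: dict mapping each node to an iterable of neighbouring nodes
    terminals: the relations mentioned in the query
    Each edge is a frozenset of its two endpoints.
    """
    terminals = list(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals) - tree_nodes
    while remaining:
        # Breadth-first search started from every node already in the tree.
        parent = {node: None for node in tree_nodes}
        queue = deque(sorted(tree_nodes))
        found = None
        while queue:
            node = queue.popleft()
            if node in remaining:
                found = node
                break
            for neighbour in graph[node]:
                if neighbour not in parent:
                    parent[neighbour] = node
                    queue.append(neighbour)
        if found is None:
            raise ValueError("the queried relations are not connected")
        # Walk back to the tree, adding the path's nodes and edges.
        node = found
        while parent[node] is not None:
            tree_edges.add(frozenset((node, parent[node])))
            tree_nodes.add(node)
            node = parent[node]
        remaining -= tree_nodes
    return tree_edges
```

On a chain schema Student, Enrolled, Course, Teaches, Professor, a query that mentions only Student and Professor yields the four edges of the chain, which is the join path a navigation tool would follow.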