A feasibility and performance study of dependency inference (database design)
D. Bitton, Jeffrey Millman, Solveig Torgersen. In Proceedings of the Fifth International Conference on Data Engineering (ICDE 1989). doi:10.1109/ICDE.1989.47271

The feasibility of inferring functional dependencies from an example relation is investigated. The problem occurs in the context of automatic database design, when a tool is needed to assist the database designer in specifying logical dependencies. The complexity of the dependency inference problem is inherently exponential. However, algorithms can be developed that perform well when the input relation has certain characteristics. Two such algorithms for dependency inference are implemented and optimized. An extensive set of experiments is presented, in which dependencies were inferred from example relations with varying cardinalities, numbers of attributes, and degrees of normalization. It is concluded that for practical example relations, an adequate implementation of a dependency inference function leads to acceptable interactive response times.
Negotiating data access in federated database systems
R. Alonso, Daniel Barbará. In Proceedings of the Fifth International Conference on Data Engineering (ICDE 1989). doi:10.1109/ICDE.1989.47200

A protocol is presented for negotiating access to data in a federated database. This protocol deals with several important aspects of data sharing in an environment where databases belonging to different organizations coexist and cooperate. It is based on the concept of quasicopies: cached values that are allowed to deviate from the central value in a controlled fashion. The degree of consistency of a quasicopy is established by the entity where the quasicopy is to reside. Two techniques are presented, one for numerical data and one for copies that rely on version numbers. It is believed that these two ideas encompass many of the interesting cases in practice; it is also possible to fine-tune these techniques once the exact semantics of the data involved are known.
Aggregative closure: an extension of transitive closure
I. Cruz, T. Norvell. In Proceedings of the Fifth International Conference on Data Engineering (ICDE 1989). doi:10.1109/ICDE.1989.47239

The aggregative closure operator is defined and its usefulness is demonstrated in a wide variety of applications. The concepts and definitions of closed semirings and the aggregating relational operators provide a mathematical framework for presenting algorithms for these applications. A novel algorithm for computing the aggregative closure is also presented. All of the algorithms but this last one are generalizations of existing transitive-closure algorithms.
Join strategies on KD-tree indexed relations
M. Kitsuregawa, L. Harada, M. Takagi. In Proceedings of the Fifth International Conference on Data Engineering (ICDE 1989). doi:10.1109/ICDE.1989.47203

Join algorithms on KD-tree indexed relations are proposed. The algorithms are based on a concept called the wave: a set of pages that is currently being joined and that propagates over the relation space in the direction of the join-attribute axis. Four basic join algorithms that determine the wave from one of the relations, and one algorithm that determines the wave from both relations, are proposed. The algorithms are described and analyzed extensively with analytical formulas and simulation results. A garbage-collection mechanism is then introduced that discards data no longer needed in main memory, extending the basic algorithms with efficient memory management. It is shown that the proposed algorithms perform the join of very large relations in a single scan.
Evolutionary database design
Fredy Oertly, G. Schiller. In Proceedings of the Fifth International Conference on Data Engineering (ICDE 1989). doi:10.1109/ICDE.1989.47269

An approach to evolutionary database design is presented that tries to remedy some of the shortcomings of previous design methods. The approach distinguishes clearly between conceptual and logical database design. A conceptual schema models the relevant aspects of reality; a logical schema describes the structure of the database as generic tables and reflects the design decisions taken to map the objects of the conceptual schema into those tables. Supporting this strategy with tools requires a version concept and a mechanism for recording design decisions, called protocolling. These concepts are realized in Presto, a development environment for the evolutionary design of database applications, which is described.
Processor scheduling for multiprocessor joins
M. C. Murphy, D. Rotem. In Proceedings of the Fifth International Conference on Data Engineering (ICDE 1989). doi:10.1109/ICDE.1989.47209

A family of practical algorithms is presented for scheduling join execution in a shared-memory multiprocessor environment. The algorithms are based on page connectivity graphs and determine when to read each data page into memory and how to schedule page joins on the available processors. The goal is to overlap page reads with parallel join execution so that both the number of processors and the total join processing time are minimized. Upper and lower bounds are derived on the number of processors required to complete join execution in optimal time. A general strategy is described for generating read schedules conjectured to be processable in minimal time (over all read schedules on any number of processors), together with a family of practical algorithms that use an arbitrary number of lookahead steps to approximate it.
Searching in a hyperlibrary
B. Schatz, M. Caplinger. In Proceedings of the Fifth International Conference on Data Engineering (ICDE 1989). doi:10.1109/ICDE.1989.47214

A novel facility for hypermedia systems is proposed that provides query-based search. The search mechanism is shown to complement linking, making the creation of richly linked structures easier. A prototype hypermedia system that provides search is described, and user experiences to date and some future directions are discussed.
Relational storage and efficient retrieval of rules in a deductive DBMS
J. Cheiney, C. D. Maindreville. In Proceedings of the Fifth International Conference on Data Engineering (ICDE 1989). doi:10.1109/ICDE.1989.47272

A method is proposed for storing and efficiently retrieving rules in a deductive database. The rules are compiled into an execution model called a PCN (production compilation network), which is stored as a relational database consisting of three relations. For a given end-user query, the first step of the inference process is to find the relevant rules by traversing the PCN structure; this traversal is computed as a transitive closure over part of the PCN, performed by a loop of joins over the relations storing it. In this context, a transitive closure algorithm based on a physical clustering of the relations is proposed. The clustering uses a double hashing technique and ensures a linear cost for the join operation under weak assumptions on the size of main memory. The approach reduces the number of I/O operations for the semi-naive transitive closure operation. The algorithm applies both to the search for relevant rules and to the transitive computation of linear recursive relations.
Inference-security analysis using resolution theorem-proving
N. Rowe. In Proceedings of the Fifth International Conference on Data Engineering (ICDE 1989). doi:10.1109/ICDE.1989.47242

A nonnumeric analog of statistical methods for proving security is described and implemented. The approach analyzes the facts and inference rules assumed to be known to a compromiser and derives all their possible consequences using resolution theorem-proving, a technique argued to be far more appropriate to this problem than rule-based expert systems or information-flow analysis. An important contribution of the method is an augmentation of resolution that handles time intervals and probabilities associated with statements being true. The augmentation is simple for domain experts untrained in computers to use, and it is believed to provide the first truly practical tool for analyzing indirect logical inferences in information systems. Its capabilities are demonstrated with an example from military security.
Materialization and incremental update of path information
R. Agrawal, H. Jagadish. In Proceedings of the Fifth International Conference on Data Engineering (ICDE 1989). doi:10.1109/ICDE.1989.47238

The problem of efficiently processing recursive path queries in deductive database systems is discussed, and a semimaterialized encoding structure is proposed as an attractive approach that balances efficiency of retrieval against feasibility of storage. Incremental algorithms are presented that enable the effects of updates to the underlying database to be reflected in the materialized information. Performance simulations indicate that these techniques can significantly speed up the processing of path queries at an acceptable level of storage overhead.