Pub Date: 1995-03-06
DOI: 10.1109/ICDE.1995.380381
P. Hsu, D. S. Parker
A generalized quantifier is a particular kind of operator on sets. Generalized quantifiers have recently come under increasing attention from linguists and logicians, and they correspond to many useful natural-language phrases, such as: three, Chamberlin's three, more than three, fewer than three, at most three, all but three, no more than three, not more than half the, at least two and not more than three, no student's, and most male and all female. Reasoning about quantifiers is a recurring source of problems for most SQL users, leading to both confusion and incorrectly expressed queries. Adopting a more modern and natural model of quantification can alleviate these problems. We show how generalized quantifiers can be used to improve the SQL interface.
Title: Improving SQL with generalized quantifiers
Published in: Proceedings of the Eleventh International Conference on Data Engineering
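As a rough illustration of the idea (not the SQL syntax the paper proposes), a generalized quantifier can be read as a test on two sets A and B, for example "at least three A are B". The Python sketch below uses that standard set-theoretic reading; the function names and the sample data are hypothetical.

```python
# Each quantifier is a predicate over two sets: the restriction A
# ("students") and the scope B ("those who passed").

def at_least(n):
    """'At least n A are B': the overlap has n or more members."""
    return lambda a, b: len(a & b) >= n

def all_but(n):
    """'All but n A are B': exactly n members of A fall outside B."""
    return lambda a, b: len(a - b) == n

def most(a, b):
    """'Most A are B': more members of A are in B than outside it."""
    return len(a & b) > len(a - b)

def not_more_than_half(a, b):
    """'Not more than half the A are B'."""
    return 2 * len(a & b) <= len(a)

# Hypothetical sample data.
students = {"ann", "bob", "cy", "dee", "eve"}
passed = {"ann", "bob", "cy"}

three_passed = at_least(3)(students, passed)
all_but_two_passed = all_but(2)(students, passed)
```

Each phrase maps to one small predicate, which is the directness that a nested EXISTS or NOT EXISTS formulation tends to obscure.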
Pub Date: 1995-03-06
DOI: 10.1109/ICDE.1995.380408
S. Dao
Future large and complex information systems create new challenges and opportunities for research and advanced development in data management. We briefly summarize Hughes research and prototype efforts to meet these challenges.
Title: Toward scalability and interoperability of heterogeneous information sources
Published in: Proceedings of the Eleventh International Conference on Data Engineering
Pub Date: 1995-03-06
DOI: 10.1109/ICDE.1995.380355
P. Seshadri, A. Swami
This paper demonstrates the use of generalized partial indexes for efficient query processing. We propose that partial indexes be built on those portions of the database that are statistically likely to be the most useful for query processing. We identify three classes of statistical information, and two levels at which it may be available. We describe indexing strategies that use this information to significantly improve average query performance. Results from simulation experiments demonstrate that the proposed generalized partial indexing strategies perform well compared to the traditional approach to indexing.
Title: Generalized partial indexes
Published in: Proceedings of the Eleventh International Conference on Data Engineering
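To make the notion of a partial index concrete, here is a minimal Python sketch under stated assumptions: the index covers only the column values selected by a predicate (standing in for "statistically likely to be useful"), and lookups outside that coverage fall back to a full scan. The class name, the `covers` predicate, and the sample rows are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

class PartialIndex:
    """Index only the rows whose column value satisfies `covers`."""

    def __init__(self, rows, column, covers):
        self.rows = rows
        self.column = column
        self.covers = covers            # predicate choosing the indexed portion
        self.index = defaultdict(list)  # value -> positions of matching rows
        for pos, row in enumerate(rows):
            if covers(row[column]):
                self.index[row[column]].append(pos)

    def lookup(self, value):
        """Return (matching rows, access path used)."""
        if self.covers(value):
            # Covered value: answer from the index alone.
            return [self.rows[p] for p in self.index.get(value, [])], "index"
        # Uncovered value: fall back to scanning every row.
        return [r for r in self.rows if r[self.column] == value], "scan"

# Hypothetical data: assume queries mostly ask for Oslo and Rome.
rows = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Lima"},
    {"id": 3, "city": "Rome"},
    {"id": 4, "city": "Oslo"},
]
idx = PartialIndex(rows, "city", covers=lambda v: v in {"Oslo", "Rome"})
```

The trade-off the sketch exposes is that the index stays small (two keys here) while frequent lookups still avoid the scan; choosing `covers` well is where the statistical information would come in.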
Pub Date: 1995-03-06
DOI: 10.1109/ICDE.1995.380396
Young-Gook Ra, Elke A. Rundensteiner
When a database is shared by many users, updates to the database schema are almost always prohibited because there is a risk of making existing application programs obsolete when they run against the modified schema. This paper addresses the problem by integrating schema evolution with view facilities. When new requirements necessitate schema updates for a particular user, the user specifies schema changes to the personal view rather than to the shared base schema. Our view evolution approach then computes a new view schema that reflects the semantics of the desired schema change, and replaces the old view with the new one. We present algorithms that implement the set of schema evolution operations typically supported by OODB systems as view definitions. This approach provides the means for schema change without affecting other views (and thus without affecting existing application programs). The persistent data is shared by different views of the schema, i.e., both old and newly developed applications can continue to interoperate. In this paper, we present examples that demonstrate our approach.
Title: A transparent object-oriented schema change approach using view evolution
Published in: Proceedings of the Eleventh International Conference on Data Engineering
Pub Date: 1995-03-06
DOI: 10.1109/ICDE.1995.380370
Duen-Ren Liu, S. Shekhar
We propose a new similarity-based technique for declustering data. The proposed method can adapt to available information about query distributions, data distributions, data sizes and partition-size constraints. The method is based on max-cut partitioning of a similarity graph defined over the given set of data, under constraints on the partition sizes. It maximizes the chances that a pair of data items that are to be accessed together by queries are allocated to distinct disks. We show that the proposed method can achieve optimal speed-up for a query set if any other declustering method exists that achieves the optimal speed-up. Experiments in parallelizing grid files show that the proposed method outperforms mapping-function-based methods for interesting query distributions as well as for non-uniform data distributions.
Title: A similarity graph-based approach to declustering problems and its application towards parallelizing grid files
Published in: Proceedings of the Eleventh International Conference on Data Engineering
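The objective described above (put items that are accessed together on different disks, subject to a size limit per disk) can be sketched with a simple greedy heuristic. This is an assumption-laden illustration, not the paper's max-cut algorithm: the function name, the dictionary encoding of the similarity graph, and the sample weights are all hypothetical.

```python
def greedy_decluster(items, similarity, num_disks, capacity):
    """Place each item on the non-full disk where it is least similar
    to the items already there, so heavy edges tend to cross disks."""
    disks = [[] for _ in range(num_disks)]
    for item in items:
        candidates = [d for d in range(num_disks) if len(disks[d]) < capacity]
        if not candidates:
            raise ValueError("total capacity is smaller than the item count")

        def cost(d):
            # Similarity weight that would stay inside disk d (uncut edges).
            return sum(similarity.get(frozenset((item, other)), 0.0)
                       for other in disks[d])

        # Ties are broken by disk load, then by disk number.
        best = min(candidates, key=lambda d: (cost(d), len(disks[d]), d))
        disks[best].append(item)
    return disks

# Hypothetical similarity graph: a weight reflects how often two
# items are requested by the same query.
similarity = {
    frozenset(("a", "b")): 5.0,
    frozenset(("c", "d")): 5.0,
    frozenset(("a", "c")): 1.0,
}
layout = greedy_decluster(["a", "b", "c", "d"], similarity,
                          num_disks=2, capacity=2)
```

With these weights the strongly related pairs (a, b) and (c, d) end up on different disks, so a query touching either pair can read both items in parallel. A greedy pass gives no optimality guarantee; it only illustrates the objective.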
Pub Date: 1995-03-06
DOI: 10.1109/ICDE.1995.380374
M. Yap
The National Information Infrastructure (NII) is an infrastructure consisting of efficient transport, information processing and service facilities that combine both computer and communication technologies. The needs of business and the public in general drive the definition of the infrastructure. Its goal is to increase the well-being of people as a whole. To deliver the promise of a more effective way to do business, the NII must strive to bring as many of these services or their equivalents as possible to end users. Further, these services must be easily accessible and easy to use in addition to being affordable. The NII attempts to (re)engineer the real-world capabilities over the physical telecommunication network. In addition, the NII provides for a rich set of common computing services and supports the reuse of large components, over and above providing a physical telecommunication network.
Title: Singapore NII: building the electronic universe
Published in: Proceedings of the Eleventh International Conference on Data Engineering
Pub Date: 1995-03-06
DOI: 10.1109/ICDE.1995.380362
Sibel Adali, Ross Emery
Integrating knowledge from multiple sources is an important aspect of automated reasoning systems. Wiederhold and his colleagues (1993) have proposed the concept of a mediator: a device that will express how such an integration is to be achieved. Subrahmanian et al. (1994) presented a uniform declarative and operational framework for mediators for amalgamating multiple knowledge bases and data structures (e.g. relational, object-oriented, spatial, and temporal structures) when these knowledge bases (possibly) contain inconsistencies, uncertainties, and nonmonotonic modes of negation. We specify the programming environment for this framework and show that it can be used to extract and integrate information obtained from different sources of data and resolve conflicts. We also show that it can be extended easily to integrate new knowledge bases.
Title: A uniform framework for integrating knowledge in heterogeneous knowledge systems
Published in: Proceedings of the Eleventh International Conference on Data Engineering
Pub Date: 1995-03-06
DOI: 10.1109/ICDE.1995.380413
M. Houtsma, A. Swami
We describe set-oriented algorithms for mining association rules. Such algorithms imply performing multiple joins and may appear to be inherently less efficient than special-purpose algorithms. We develop new algorithms that can be expressed as SQL queries, and discuss the optimization of these algorithms. After analytical evaluation, an algorithm named SETM emerges as the algorithm of choice. SETM uses only simple database primitives, viz. sorting and merge-scan join. SETM is simple, fast and stable over the range of parameter values. The major contribution of this paper is that it shows that at least some aspects of data mining can be carried out by using general query languages such as SQL, rather than by developing specialized black-box algorithms. The set-oriented nature of SETM facilitates the development of extensions.
Title: Set-oriented mining for association rules in relational databases
Published in: Proceedings of the Eleventh International Conference on Data Engineering
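The set-oriented style the abstract describes (repeatedly join candidate itemsets back to the sales data, then group, count and filter by support) can be approximated in memory as follows. This is a simplified Python sketch of that join-then-count pattern under my own assumptions, not the paper's SQL or its exact SETM formulation; the function name and the sample transactions are hypothetical.

```python
from collections import Counter

def frequent_itemsets(sales, min_support):
    """sales: iterable of (transaction_id, item) rows.
    Returns {itemset tuple: number of supporting transactions}."""
    sales = sorted(set(sales))          # the sort step; also drops duplicates
    by_tid = {}
    for tid, item in sales:
        by_tid.setdefault(tid, []).append(item)

    # Pass 1: count single items and keep the frequent ones.
    counts = Counter(item for _, item in sales)
    result = {(i,): c for i, c in counts.items() if c >= min_support}
    current = [(tid, (item,)) for tid, item in sales if (item,) in result]

    while current:
        # "Join" each (tid, itemset) row with the same transaction's items,
        # extending only with larger items so each set is built once.
        extended = [(tid, items + (item,))
                    for tid, items in current
                    for item in by_tid[tid] if item > items[-1]]
        # Group by itemset, count, and filter on minimum support.
        counts = Counter(items for _, items in extended)
        result.update({s: c for s, c in counts.items() if c >= min_support})
        current = [(tid, s) for tid, s in extended if counts[s] >= min_support]
    return result

# Hypothetical sales relation: four transactions over items A, B, C.
sales = [(1, "A"), (1, "B"), (1, "C"),
         (2, "A"), (2, "B"),
         (3, "A"), (3, "C"),
         (4, "B"), (4, "C")]
found = frequent_itemsets(sales, min_support=2)
```

Every step is a relational operation (sort, join on transaction id, group and count), which is why a loop of this shape can in principle be pushed into a general query language.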
Pub Date: 1995-03-06
DOI: 10.1109/ICDE.1995.380398
Walid G. Aref, Daniel Barbará, Stephen Johnson, S. Mehrotra
Emerging multimedia applications require database systems to provide support for new types of objects and to process queries that may have no parallel in traditional database applications. One important class of such queries is proximity queries, which aim to retrieve objects in the database that are related by a distance metric in a way that is specified by the query. The importance of proximity queries has earlier been realized in developing constructs for visual languages. In this paper, we present algorithms for answering a class of proximity queries: fixed-radius nearest-neighbor queries over point objects. Processing proximity queries using existing query processing techniques results in high CPU and I/O costs. We develop new algorithms to answer proximity queries over objects that lie in one-dimensional space (e.g., words in a document). The algorithms exploit query semantics to reduce the CPU and I/O costs, and hence improve performance. We also show how our algorithms can be generalized to handle d-dimensional objects.
Title: Efficient processing of proximity queries for large databases
Published in: Proceedings of the Eleventh International Conference on Data Engineering
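For the one-dimensional case mentioned above (for example, word positions in a document), a fixed-radius query can be answered with a sort followed by a single sliding-window merge. The Python sketch below shows that generic technique; it is not claimed to be the algorithm from the paper, and the function name and sample positions are hypothetical.

```python
def within_radius(xs, ys, radius):
    """Return all pairs (x, y) with |x - y| <= radius, for 1-D points."""
    xs, ys = sorted(xs), sorted(ys)
    pairs = []
    start = 0  # first y that can still be within range of the current x
    for x in xs:
        # Advance the window start past y values too far to the left.
        # Because xs is sorted, `start` never needs to move backwards.
        while start < len(ys) and ys[start] < x - radius:
            start += 1
        j = start
        while j < len(ys) and ys[j] <= x + radius:
            pairs.append((x, ys[j]))
            j += 1
    return pairs

# Hypothetical word positions of two terms in one document.
positions_a = [3, 20, 41]
positions_b = [1, 5, 22, 60]
near = within_radius(positions_a, positions_b, radius=2)
# near == [(3, 1), (3, 5), (20, 22)]
```

After sorting, the work is proportional to the input sizes plus the number of reported pairs, instead of comparing every x with every y.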
Pub Date: 1995-03-06
DOI: 10.1109/ICDE.1995.380380
A. Biliris, E. Panagos
The paper describes the client-server software architecture of the EOS storage manager and the concurrency control and recovery mechanisms it employs. Unlike most client-server storage systems, which use the standard two-phase locking protocol, EOS offers a semi-optimistic locking scheme based on a multigranularity two-version two-phase locking protocol. Under this scheme, many readers are allowed to access a data item while it is being updated by a single writer. For recovery, EOS maintains a write-ahead redo-only log because of the potential benefits it offers in a client-server environment. First, there are no undo records, as log records of aborted transactions are never inserted in the log; this minimizes the I/O and network transfer costs associated with logging during normal transaction execution. Second, it reduces the space required for the log. Third, it facilitates fast recovery from system crashes because only one forward scan of the log is required for installing the updates performed by transactions that committed prior to the crash. Performance results of the EOS recovery subsystem are also presented.
Title: Transactions in the client-server EOS object store
Published in: Proceedings of the Eleventh International Conference on Data Engineering
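The recovery property described above (no undo records, and a single forward scan of the log after a crash) can be illustrated with a toy model. This Python sketch is not EOS code: it assumes each commit reaches the log as one atomic entry, and the class and method names are hypothetical.

```python
class RedoOnlyStore:
    """Toy store with a write-ahead, redo-only log."""

    def __init__(self):
        self.log = []    # stands in for the durable log
        self.pages = {}  # stands in for volatile, in-memory data

    def begin(self):
        # Updates are buffered privately until commit, so nothing an
        # aborted transaction wrote ever reaches the log.
        return {}

    def write(self, txn, key, value):
        txn[key] = value

    def commit(self, txn_id, txn):
        # Write-ahead: the log entry is appended before data is updated.
        self.log.append((txn_id, dict(txn)))
        self.pages.update(txn)

    def crash(self):
        self.pages = {}  # volatile state is lost; the log survives

    def recover(self):
        # One forward scan: every logged entry belongs to a committed
        # transaction, so each is simply redone. There is no undo pass.
        self.pages = {}
        for _txn_id, writes in self.log:
            self.pages.update(writes)
        return self.pages

store = RedoOnlyStore()
t1 = store.begin(); store.write(t1, "x", 1); store.commit("t1", t1)
t2 = store.begin(); store.write(t2, "x", 2)   # aborted: never committed
t3 = store.begin(); store.write(t3, "y", 3); store.commit("t3", t3)
store.crash()
state = store.recover()
```

The aborted transaction leaves no trace in the log, so recovery only replays committed work, which is the source of the smaller log and the single-scan recovery that the abstract lists as benefits.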