Proceedings 17th International Conference on Data Engineering: Latest Articles

Workflow and process synchronization with interaction expressions and graphs
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914835
C. Heinlein
Current workflow management technology does not provide adequate means for inter-workflow coordination, as concurrently executing workflows are considered completely independent. While this simplified view might suffice for some application domains, there are many real-world application scenarios where workflows, though independently modeled in order to remain comprehensible and manageable, are semantically interrelated. As pragmatic approaches, like merging interdependent workflows or inter-workflow message passing, do not satisfactorily solve the inter-workflow coordination problem, interaction expressions and graphs are proposed as a simple yet powerful formalism for the specification and implementation of synchronization conditions in general and inter-workflow dependencies in particular. In addition to a graph-based semi-formal interpretation of the formalism, a precise formal semantics, an equivalent operational semantics, an efficient implementation of the latter, and detailed complexity analyses have been developed, allowing the formalism to be applied to real-world problems like inter-workflow coordination.
Citations: 59
fAST refresh using mass query optimization
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914852
Wolfgang Lehner, R. Cochrane, H. Pirahesh, Markos Zaharioudakis
Automatic summary tables (ASTs), more commonly known as materialized views, are widely used to enhance query performance, particularly for aggregate queries. Such queries access a huge number of rows to retrieve aggregated summary data while performing multiple joins in the context of a typical data warehouse star schema. To keep ASTs consistent with their underlying base data, the ASTs are either immediately synchronized or fully recomputed. This paper proposes an optimization strategy for simultaneously refreshing multiple ASTs, thus avoiding multiple scans of a large fact table (one pass for AST computation). A query stacking strategy detects common sub-expressions using the available query matching technology of DB2. Since exact common sub-expressions are rare, the novel query sharing approach systematically generates common sub-expressions for a given set of "related" queries, considering different predicates, grouping expressions, and sets of base tables. The theoretical framework, a prototype implementation of both strategies in the IBM DB2 UDB/UWO database system, and performance evaluations based on the TPC/R data schema are presented in this paper.
Citations: 34
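The "one pass" goal named in the fAST refresh abstract can be illustrated with a minimal sketch. It only shows several SUM ... GROUP BY summaries being rebuilt from a single scan of the fact rows; it does not model the query stacking or query sharing strategies, which the abstract does not detail. The function name `refresh_summaries`, the dictionary row format, and the column names are assumptions made for this illustration.

```python
from collections import defaultdict


def refresh_summaries(fact_rows, summary_defs):
    """Recompute several SUM ... GROUP BY summaries in one pass.

    fact_rows: any iterable of dict-like rows (hypothetical row format).
    summary_defs: maps a summary name to (group_by_columns, measure_column).
    """
    results = {name: defaultdict(float) for name in summary_defs}
    for row in fact_rows:  # the fact data is iterated exactly once
        for name, (group_cols, measure) in summary_defs.items():
            key = tuple(row[col] for col in group_cols)
            results[name][key] += row[measure]
    return {name: dict(groups) for name, groups in results.items()}


if __name__ == "__main__":
    rows = [
        {"store": "A", "month": 1, "amount": 10.0},
        {"store": "A", "month": 2, "amount": 5.0},
        {"store": "B", "month": 1, "amount": 7.0},
    ]
    defs = {
        "by_store": (("store",), "amount"),
        "by_month": (("month",), "amount"),
    }
    print(refresh_summaries(rows, defs))
```

Because the rows are consumed only once, the function also accepts a one-shot iterator, which is the property a shared scan over a large fact table relies on.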
Data management support of Web applications
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914841
D. H. Fishman
Automating the interactions between trusted business partners is a major goal of businesses today. This is often called "supply-chain integration". The intent is to make the businesses more responsive to customer needs and more efficient in their business or manufacturing processes. This paper describes an infrastructure that facilitates the collaboration of trusted business partners to achieve common business goals.
Citations: 0
Integrating data mining with SQL databases: OLE DB for data mining
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914850
Amir Netz, S. Chaudhuri, U. Fayyad, J. Bernhardt
The integration of data mining with traditional database systems is key to making it convenient and easy to deploy in real applications, and to growing its user base. We describe the new API for data mining proposed by Microsoft as extensions to the OLE DB standard. We illustrate the basic notions that motivated the API's design and describe the key components of an OLE DB for Data Mining provider. We also include usage examples and treat the problems of data representation and integration with the SQL framework. We believe this new API will go a long way in enabling deployment of data mining in enterprise data warehouses. A reference implementation of a provider is available with the recent release of the Microsoft SQL Server 2000 database system.
Citations: 84
Spatial clustering in the presence of obstacles
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914848
A. Tung, Jean Hou, Jiawei Han
Clustering in spatial data mining groups similar objects based on their distance, connectivity, or relative density in space. In the real world there exist many physical obstacles such as rivers, lakes and highways, and their presence may affect the result of clustering substantially. We study the problem of clustering in the presence of obstacles and define it as a COD (Clustering with Obstructed Distance) problem. As a solution to this problem, we propose a scalable clustering algorithm, called COD-CLARANS. We discuss various forms of pre-processed information that could enhance the efficiency of COD-CLARANS. In the strictest sense, the COD problem can be treated as a change in distance function and thus could be handled by current clustering algorithms by changing the distance function. However, we show that by pushing the task of handling obstacles into COD-CLARANS instead of abstracting it at the distance function level, more optimization can be done in the form of a pruning function E'. We conduct various performance studies to show that COD-CLARANS is both efficient and effective.
Citations: 207
Overcoming limitations of sampling for aggregation queries
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914867
S. Chaudhuri, Gautam Das, Mayur Datar, R. Motwani, Vivek R. Narasayya
We study the problem of approximately answering aggregation queries using sampling. We observe that uniform sampling performs poorly when the distribution of the aggregated attribute is skewed. To address this issue, we introduce a technique called outlier indexing. Uniform sampling is also ineffective for queries with low selectivity. We rely on weighted sampling based on workload information to overcome this shortcoming. We demonstrate that a combination of outlier indexing with weighted sampling can be used to answer aggregation queries with a significantly reduced approximation error compared to either uniform sampling or weighted sampling alone. We discuss the implementation of these techniques on Microsoft's SQL Server and present experimental results that demonstrate the merits of our techniques.
Citations: 163
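The skew problem described in the sampling abstract above can be made concrete with a small sketch. It compares a plain uniform-sample estimate of SUM with an estimate that aggregates a handful of outliers exactly and samples only the remaining values. The rule used here to pick outliers (the values farthest from the mean) is an assumption for illustration; the abstract does not say how the paper selects them, and weighted sampling is not modeled. All function names are hypothetical.

```python
import random
import statistics


def estimate_sum_uniform(values, sample_size, rng):
    """Estimate SUM(values) by scaling up the sum of a uniform sample."""
    sample = rng.sample(values, sample_size)
    return sum(sample) * len(values) / sample_size


def estimate_sum_with_outlier_index(values, sample_size, num_outliers, rng):
    """Sum the outliers exactly and sample only the remaining values.

    Assumption: the outliers are the num_outliers values farthest from
    the mean. This is a stand-in for the paper's selection procedure.
    """
    mean = statistics.mean(values)
    ranked = sorted(values, key=lambda v: abs(v - mean), reverse=True)
    outliers, rest = ranked[:num_outliers], ranked[num_outliers:]
    return sum(outliers) + estimate_sum_uniform(rest, sample_size, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [1.0] * 990 + [1000.0] * 10  # heavily skewed attribute
    print("exact:        ", sum(data))
    print("uniform:      ", estimate_sum_uniform(data, 50, rng))
    print("outlier index:", estimate_sum_with_outlier_index(data, 50, 10, rng))
```

With this toy data the non-outlier values are all identical, so the second estimator has no sampling error left, while the uniform estimate depends on how many of the ten large values happen to land in the sample.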
Mobile data management: challenges of wireless and offline data access
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914831
Eric Giguère
Applications require access to database servers for many purposes. Mobile users, those who use their computing devices away from a traditional local area network, require access to data even when central database servers are unavailable. iAnywhere Solutions provides a number of solutions that address the challenges of offline and wireless data access. The article discusses those challenges and presents solutions.
Citations: 15
TAR: temporal association rules on evolving numerical attributes
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914839
Wei Wang, Jiong Yang, R. Muntz
Data mining has been an area of increasing interest. The association rule discovery problem in particular has been widely studied. However, there are still some unresolved problems. For example, research on mining patterns in the evolution of numerical attributes is still lacking. This is both a challenging problem and one with significant practical applications in business, science, and medicine. In this paper we present a temporal association rule model for evolving numerical attributes. Metrics for qualifying a temporal association rule include the familiar measures of support and strength used in traditional association rule mining and a new metric called density. The density metric not only gives us a way to extract the rules that best represent the data, but also provides an effective mechanism to prune the search space. An efficient algorithm is devised for mining temporal association rules, which utilizes all three thresholds (especially the strength) to prune the search space drastically. Moreover, the resulting rules are represented in a concise manner via rule sets to reduce the output size. Experimental results on real and synthetic data sets demonstrate the efficiency of our algorithm.
Citations: 46
An XML indexing structure with relative region coordinate
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914843
Dao Dinh Kha, Masatoshi Yoshikawa, Shunsuke Uemura
For most of the index structures for XML data proposed so far, updating is a problem, because an XML element's coordinates are expressed using absolute values. Due to the structural relationship among the elements in XML documents, we have to re-compute these absolute values if the content of the source data is updated. The reconstruction requires the updating of a large portion of the index files, which causes a serious problem, especially when the XML data content is updated frequently. In this paper, we propose an indexing structure scheme based on the relative region coordinates that can effectively deal with the update problem. The main idea is that we express the coordinates of an XML element based on the region of its parent element. We present an algorithm to construct a tree-structured index in which related coordinates are stored together. In consequence, our indexing scheme requires the updating of only a small portion of the index file.
Citations: 84
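The main idea in the XML indexing abstract above, expressing an element's coordinates relative to its parent's region, can be sketched as follows. The `(rel_start, length)` encoding, the `Node` class, and the `grow` update routine are assumptions chosen for illustration; the abstract does not specify the paper's encoding, and its tree-structured index that stores related coordinates together is not modeled here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)  # identity comparison; avoids recursing through parents
class Node:
    name: str
    rel_start: int  # offset of this element from its parent's start
    length: int     # size of this element's region
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add_child(self, child: "Node") -> "Node":
        """Append a child; children are assumed to be added in document order."""
        child.parent = self
        self.children.append(child)
        return child


def absolute_region(node: Node) -> Tuple[int, int]:
    """Recover absolute (start, end) by summing offsets up to the root."""
    start, current = 0, node
    while current is not None:
        start += current.rel_start
        current = current.parent
    return start, start + node.length


def grow(node: Node, delta: int) -> List[str]:
    """Enlarge node by delta and return the names of the nodes rewritten.

    Only the node, its ancestors, and their following siblings change;
    descendants of shifted siblings keep their stored coordinates.
    """
    touched, current = [], node
    while current is not None:
        current.length += delta
        touched.append(current.name)
        if current.parent is not None:
            siblings = current.parent.children
            for sibling in siblings[siblings.index(current) + 1:]:
                sibling.rel_start += delta
                touched.append(sibling.name)
        current = current.parent
    return touched
```

In an absolute-coordinate scheme every element that follows the change in document order would need new values. In this sketch the subtree under a shifted sibling is left untouched, and its absolute positions still come out right because they are derived from the parent chain.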
Database managed external file update
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914870
N. Mittal, Hui-I Hsiao
Relational DBMSs (RDBMSs) have evolved to an extent that they are used to manage almost all traditional business data in a robust fashion. Nevertheless, a large fraction of unstructured and semi-structured data continues to be managed by file systems. As companies increasingly depend on non-traditional data for their daily business operations, it becomes more and more important to provide a higher degree of integrity, security and reliability to the data stored in file systems. DataLinks technology, developed at IBM Almaden Research Center, achieves this by providing a vital integration between an RDBMS and a file system. It enables the DBMS to manage files residing in file systems as though they are logically within the database. Current DataLinks technology supports only read access to external files that are being managed by the DBMS. This severely restricts the applicability of DataLinks technology in transaction-oriented and/or e-business applications. Traditional database systems enforce ACID properties for database updates. Extending these properties to cover both external files stored outside of a DBMS and metadata stored in the DBMS is a hard problem. This is because files are updated through a standard file-system API while metadata, which references the files, is updated through a database API. This paper describes our experiences in the design and prototyping of an advanced DataLinks technology that supports database-managed external file updates. This enhanced capability makes DataLinks technology an even more attractive solution for managing the world's data.
Citations: 3