Dynamic Workload for Schema Evolution in Data Warehouses

Complex Data Warehousing and Knowledge Discovery for Advanced Retrieval Development. Pub Date: 1900-01-01. DOI: 10.4018/978-1-60566-748-5.CH002

F. Bentayeb, Cécile Favre, Omar Boussaïd


Citations: 2

Abstract

A data warehouse integrates heterogeneous data sources for identified analysis purposes. Its schema is designed according to the available data sources and the users' analysis requirements. To answer new individual analysis needs, we proposed in previous work a solution for on-line analysis personalization. That solution relies on a user-driven approach to data warehouse schema evolution, which consists in creating new hierarchy levels in OLAP (On-Line Analytical Processing) dimensions. One of the main objectives of OLAP, as the acronym suggests, is performance during the analysis process. Since data warehouses contain large volumes of data, answering decision queries efficiently requires particular access methods, chiefly redundant optimization structures such as views and indices. This implies selecting an appropriate set of materialized views and indices that minimizes total query response time within a limited storage space. A judicious selection must be cost-driven and based on a workload, that is, a set of users' queries on the data warehouse. In this chapter, we address the evolution and maintenance of the workload in data warehouse systems in response to the new requirements that result from users' personalized analysis needs. The main goal is to avoid regenerating the workload from scratch. Hence, we propose a workload management system that helps the administrator maintain and dynamically adapt the workload according to changes in the data warehouse schema. To achieve this maintenance, we propose two types of workload updates: (1) keeping existing queries consistent with the new data warehouse schema and (2) creating new queries based on the new dimension hierarchy levels. Our system helps the administrator adopt a proactive approach to managing data warehouse performance. To validate our workload management system, we address the implementation issues of our prototype, which was developed within a client/server architecture, with a web client interfaced with the Oracle 10g Database Management System.
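
The two types of workload update named in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the `Query` class, the three helper functions, and the level and measure names (`agency`, `branch`, `region`, `agency_type`, `SUM(amount)`) are all hypothetical, and a query is reduced to one measure grouped by a tuple of hierarchy levels. The sketch assumes schema changes arrive as level renames, level deletions, and level creations.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Query:
    """Hypothetical simplified query: one measure grouped by hierarchy levels."""
    measure: str
    group_by: Tuple[str, ...]


def rename_level(workload: List[Query], old: str, new: str) -> List[Query]:
    """Update type (1): rewrite queries so they stay valid after a level is renamed."""
    return [
        replace(q, group_by=tuple(new if lvl == old else lvl for lvl in q.group_by))
        for q in workload
    ]


def drop_level(workload: List[Query], level: str) -> List[Query]:
    """Update type (1): discard queries that reference a level that no longer exists."""
    return [q for q in workload if level not in q.group_by]


def add_level(workload: List[Query], new_level: str, measures: List[str]) -> List[Query]:
    """Update type (2): add one query per measure on the new level, skipping duplicates."""
    existing = set(workload)
    candidates = [Query(m, (new_level,)) for m in measures]
    return workload + [q for q in candidates if q not in existing]


# Assumed starting workload; all names are illustrative only.
workload = [
    Query("SUM(amount)", ("agency",)),
    Query("AVG(amount)", ("agency", "year")),
    Query("SUM(amount)", ("region",)),
]

workload = rename_level(workload, "agency", "branch")           # level renamed
workload = drop_level(workload, "region")                       # level deleted
workload = add_level(workload, "agency_type", ["SUM(amount)"])  # level created

for q in workload:
    print(q.measure, "GROUP BY", ", ".join(q.group_by))
```

Each schema change is applied to the existing workload instead of rebuilding it, which is the point the abstract makes about avoiding workload generation from scratch. A real system would operate on SQL text or a parsed query representation, not on this reduced form.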
