Logical design of data warehouses from XML

M. Banek, Z. Skocir, B. Vrdoljak
Proceedings of the 8th International Conference on Telecommunications, 2005. ConTEL 2005.
Published: 2005-06-15
DOI: 10.1109/CONTEL.2005.185875 (https://doi.org/10.1109/CONTEL.2005.185875)
Citations: 10

Abstract

A data warehouse is a database that collects and integrates data from heterogeneous sources in order to support decision making. Data exchanged over the Internet and intranets has recently become an important data source, with XML as the standard exchange format. The ability to integrate available XML data into data warehouses plays an important role in providing enterprise managers with up-to-date and relevant information about their business domain. We have developed a methodology for data warehouse design from source XML Schemas and conforming XML documents. As XML data is semi-structured, data warehouse design from XML brings many particular challenges. This paper describes the final steps of deriving a conceptual multidimensional scheme, followed by the logical design, in which a set of tables is created according to the derived conceptual scheme. A prototype tool has been developed to test and verify the proposed methodology.

A data warehousing system is a set of technologies and tools that enable decision-makers (managers and analysts) to acquire, integrate and flexibly analyze information coming from different sources. The central part of the system is a data warehouse: a large database specialized for complex analysis of historical data. Building a data warehousing system involves analyzing the data sources, designing a warehouse model that can successfully integrate them, and then constructing the warehouse according to that model. Decision-makers use OLAP (OnLine Analytical Processing) tools to query the warehouse in a quick, intuitive and interactive way. OLAP tools use the multidimensional data model, which focuses on small pieces of data, generally a few numerical parameters, that are most interesting for the decision-making process. Other data in the warehouse are organized hierarchically into several independent groups, called dimensions, and used to perform calculations with the few important parameters.

Data warehouses, owned by large enterprises and organizations, integrate data from heterogeneous sources: relational databases or other legacy database models, semi-structured data and different file formats. Recently, the World Wide Web, Web services and different information systems for exchanging data over the Internet and private networks have become an important data source.
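To make the logical design step more concrete, the sketch below turns a small conceptual multidimensional scheme (one fact with numerical measures, plus dimensions with hierarchy levels) into a set of tables and then runs an OLAP-style roll-up over one dimension. It is only an illustration under stated assumptions, not the authors' methodology or prototype tool: the star-schema layout, the `Fact` and `Dimension` classes, the `sales` example and all table and column names are hypothetical, and the derivation of the scheme from XML Schemas is not shown.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Dimension:
    name: str          # hypothetical dimension name, e.g. "product"
    levels: list[str]  # hierarchy attributes, finest level first


@dataclass
class Fact:
    name: str
    measures: list[str]  # the few numerical parameters of interest
    dimensions: list[Dimension] = field(default_factory=list)


def star_schema_ddl(fact: Fact) -> list[str]:
    """Return CREATE TABLE statements: one table per dimension, then the fact table."""
    statements = []
    for dim in fact.dimensions:
        cols = [f"{dim.name}_id INTEGER PRIMARY KEY"]
        cols += [f"{level} TEXT" for level in dim.levels]
        statements.append(f"CREATE TABLE dim_{dim.name} ({', '.join(cols)})")
    # The fact table holds one foreign key per dimension plus the measures.
    cols = [
        f"{d.name}_id INTEGER REFERENCES dim_{d.name}({d.name}_id)"
        for d in fact.dimensions
    ]
    cols += [f"{m} REAL" for m in fact.measures]
    statements.append(f"CREATE TABLE fact_{fact.name} ({', '.join(cols)})")
    return statements


def build_demo() -> sqlite3.Connection:
    """Create the hypothetical sales star schema in memory and load sample rows."""
    sales = Fact(
        "sales",
        ["quantity", "revenue"],
        [
            Dimension("product", ["product_name", "category"]),
            Dimension("time", ["calendar_day", "calendar_month", "calendar_year"]),
        ],
    )
    conn = sqlite3.connect(":memory:")
    for stmt in star_schema_ddl(sales):
        conn.execute(stmt)
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "pen", "office"), (2, "ink", "office"), (3, "tea", "food")],
    )
    conn.executemany(
        "INSERT INTO dim_time VALUES (?, ?, ?, ?)",
        [(1, "2005-06-15", "2005-06", "2005")],
    )
    # Column order: product_id, time_id, quantity, revenue.
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 1, 10, 20.0), (2, 1, 2, 30.0), (3, 1, 5, 12.5)],
    )
    return conn


def revenue_by_category(conn: sqlite3.Connection) -> dict[str, float]:
    """Roll the revenue measure up from product level to category level."""
    rows = conn.execute(
        "SELECT p.category, SUM(f.revenue) "
        "FROM fact_sales f JOIN dim_product p ON f.product_id = p.product_id "
        "GROUP BY p.category ORDER BY p.category"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(revenue_by_category(build_demo()))
```

Keeping each dimension in a single denormalized table is one common choice for a logical design; a normalized (snowflake) layout with one table per hierarchy level would be an equally valid reading of the same conceptual scheme.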