Testing Extract-Transform-Load Process in Data Warehouse Systems

Hajar Homayouni
Published in: 2018 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)
Publication date: 2018-10-01
DOI: 10.1109/ISSREW.2018.000-6 (https://doi.org/10.1109/ISSREW.2018.000-6)
Citations: 10

Abstract

Enterprises use data warehouses to accumulate data from multiple sources for analysis and research. A data warehouse is populated using the Extract, Transform, and Load (ETL) process, which (1) extracts data from various sources, (2) integrates, cleans, and transforms it into a common form, and (3) loads it into the data warehouse. Faults in the ETL implementation and execution can lead to incorrect data in the data warehouse, which renders it useless irrespective of the quality of the applications accessing it and the quality of the source data. Thus, ETL processes must be thoroughly tested to validate the correctness of the ETL implementation. This project develops and evaluates two types of functional testing approaches: data quality tests and balancing tests. Data quality tests validate the data in the target data warehouse in isolation, whereas balancing tests check for discrepancies between the source and target data. This paper describes the proposed approach, the work accomplished to date, and the expected contributions of this research.
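The distinction between the two test types can be illustrated with a minimal sketch. It is not the paper's implementation: the table names (`src_patient`, `dw_patient`), the `weight_kg` column, the bounds, and both helper functions are hypothetical, and an in-memory SQLite database stands in for both the source and the warehouse. The data quality check looks only at the target table, while the balancing check compares source and target row by row.

```python
import sqlite3


def data_quality_violations(conn, table, column, low, high):
    """Data quality test: count target rows whose value is NULL or outside
    the assumed valid range [low, high]. Only the target table is read."""
    query = (
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE {column} IS NULL OR {column} < ? OR {column} > ?"
    )
    return conn.execute(query, (low, high)).fetchone()[0]


def balancing_mismatches(conn, source, target, key, column):
    """Balancing test: return source keys that are missing from the target
    or whose value differs from the source value."""
    query = (
        f"SELECT s.{key} FROM {source} AS s "
        f"LEFT JOIN {target} AS t ON s.{key} = t.{key} "
        # SQLite's IS NOT is a NULL-safe inequality comparison.
        f"WHERE t.{key} IS NULL OR s.{column} IS NOT t.{column} "
        f"ORDER BY s.{key}"
    )
    return [row[0] for row in conn.execute(query)]


# Hypothetical data: a faulty ETL run negated one value and dropped one row.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE src_patient (id INTEGER PRIMARY KEY, weight_kg REAL);
    CREATE TABLE dw_patient  (id INTEGER PRIMARY KEY, weight_kg REAL);
    INSERT INTO src_patient VALUES (1, 70.0), (2, 82.5), (3, 64.0);
    INSERT INTO dw_patient  VALUES (1, 70.0), (2, -82.5);
    """
)

quality = data_quality_violations(conn, "dw_patient", "weight_kg", 0, 500)
balance = balancing_mismatches(conn, "src_patient", "dw_patient", "id", "weight_kg")
print("data quality violations:", quality)
print("balancing mismatches (source ids):", balance)
```

In this toy data the data quality check flags the negative weight without consulting the source, but it cannot see the dropped row; the balancing check reports both the altered row and the missing one because it compares the target against the source.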