Testing Extract-Transform-Load Process in Data Warehouse Systems
Hajar Homayouni
2018 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), 2018-10-01
https://doi.org/10.1109/ISSREW.2018.000-6
Enterprises use data warehouses to accumulate data from multiple sources for analysis and research. A data warehouse is populated using the Extract, Transform, and Load (ETL) process, which (1) extracts data from various sources, (2) integrates, cleans, and transforms it into a common form, and (3) loads it into the data warehouse. Faults in the ETL implementation and execution can lead to incorrect data in the data warehouse, which renders it useless irrespective of the quality of the applications accessing it and the quality of the source data. Thus, ETL processes must be thoroughly tested to validate the correctness of the ETL implementation. This project develops and evaluates two types of functional testing approaches: data quality tests and balancing tests. Data quality tests validate the data in the target data warehouse in isolation, whereas balancing tests check for discrepancies between the source and target data. This paper describes the proposed approach, the work accomplished to date, and the expected contributions of this research.
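The distinction between the two test types can be illustrated with a minimal sketch. It is not the approach developed in the paper. It assumes an in-memory SQLite database and hypothetical tables `src_orders` (source) and `dw_orders` (target). The data quality check inspects only the target table, while the balancing check compares source and target keys.

```python
import sqlite3


def null_count(conn, table, column):
    """Data quality test: count NULLs in one column of the target table alone."""
    # Table and column names are interpolated directly, so they must be trusted.
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(query).fetchone()[0]


def missing_keys(conn, source, target, key):
    """Balancing test: list keys present in the source but absent from the target."""
    query = f"SELECT {key} FROM {source} EXCEPT SELECT {key} FROM {target}"
    return sorted(row[0] for row in conn.execute(query))


def build_demo():
    """Create hypothetical source and target tables with two seeded faults."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE src_orders (order_id INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE dw_orders  (order_id INTEGER PRIMARY KEY, amount REAL);
        INSERT INTO src_orders VALUES (1, 10.0), (2, 20.0), (3, 30.0);
        -- Seeded faults: order 2 lost its amount, order 3 was never loaded.
        INSERT INTO dw_orders  VALUES (1, 10.0), (2, NULL);
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_demo()
    print("NULL amounts in target:", null_count(conn, "dw_orders", "amount"))
    print("Keys missing from target:",
          missing_keys(conn, "src_orders", "dw_orders", "order_id"))
```

In this sketch the NULL check needs no access to the source data, whereas the missing-key check cannot run without it. That is the practical difference between the two categories described above.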