Bhargavinath Dornadula, S. Geetha, L. Anbarasi, Seifedine Kadry
The coronavirus (COVID-19) outbreak created an alarming situation for the whole world and has been marked as one of the most severe and acute medical crises of the last hundred years. Various medical imaging modalities, including computed tomography (CT) and chest x-rays, are employed for diagnosis. This paper presents an overview of recently developed COVID-19 detection systems that apply deep learning to chest x-ray images. The review explores and analyses the data sets, feature engineering techniques, image pre-processing methods, and experimental results of various works in the literature. It also highlights the transfer learning techniques and performance metrics used by researchers in this field. This information helps point out future research directions in the automatic diagnosis of COVID-19 using deep learning techniques.
{"title":"A Survey of COVID-19 Detection From Chest X-Rays Using Deep Learning Methods","authors":"Bhargavinath Dornadula, S. Geetha, L. Anbarasi, Seifedine Kadry","doi":"10.4018/ijdwm.314155","DOIUrl":"https://doi.org/10.4018/ijdwm.314155","url":null,"abstract":"The coronavirus (COVID-19) outbreak has opened an alarming situation for the whole world and has been marked as one of the most severe and acute medical conditions in the last hundred years. Various medical imaging modalities including computer tomography (CT) and chest x-rays are employed for diagnosis. This paper presents an overview of the recently developed COVID-19 detection systems from chest x-ray images using deep learning approaches. This review explores and analyses the data sets, feature engineering techniques, image pre-processing methods, and experimental results of various works carried out in the literature. It also highlights the transfer learning techniques and different performance metrics used by researchers in this field. This information is helpful to point out the future research direction in the domain of automatic diagnosis of COVID-19 using deep learning techniques.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"1 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41458501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nasser Allheeib, Marine Alraqdi, Mohammed Almukaynizi
With the urbanization of various regions, many historical sites may be misrepresented or totally neglected. As more people move to urban areas, heritage areas are being abandoned or ignored: the roads leading to them are less maintained, and the sites are not adequately promoted. Over the years, the emergence and evolution of digital maps have played a significant role in tourist and cultural exploration and are an important source of information for tourists considering specific destinations. In this paper, the authors propose an initiative for the development and implementation of a geographic information system (GIS) in the tourism industry. They create an interactive map of tourist sites and suggest a means of retrieving tourist data. They select the Aseer region as a case study since it is rich in cultural heritage, comprising almost 4,000 heritage villages, and is considered one of the most important tourist destinations in the country.
{"title":"Data Warehouse and Interactive Map for Promoting Cultural Heritage in Saudi Arabia Using GIS","authors":"Nasser Allheeib, Marine Alraqdi, Mohammed Almukaynizi","doi":"10.4018/ijdwm.314236","DOIUrl":"https://doi.org/10.4018/ijdwm.314236","url":null,"abstract":"With the urbanization of various regions, many historical sites may be misrepresented or totally neglected. As more people move to urban areas with time, heritage areas are being abandoned or ignored. The roads leading to such areas are less maintained, and they are not being adequately promoted. Over the years, the emergence and evolution of digital maps have played a significant role in tourist and cultural exploration and are important sources of information for tourists who are considering specific destinations. In this paper, the authors discuss the development and implementation of a geographic information system (GIS) in the tourism industry. They create an interactive map for tourist sites and suggest a means of retrieving tourist data. They select the Aseer region as a case study since it is rich with deep cultural heritage, comprising almost 4,000 heritage villages, and is considered to be one of the most important tourist destinations in the country. In this paper, the authors propose an initiative for the development and implementation of GIS in the tourism industry.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":" ","pages":""},"PeriodicalIF":1.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48390256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Z. Ye, Wenhui Cai, Mingwei Wang, Aixin Zhang, Wen-hua Zhou, Na Deng, Zimei Wei, Daxin Zhu
Association rule mining (ARM) is one of the most significant and active research areas in data mining. Recently, the whale optimization algorithm (WOA) has been successfully applied in data mining; however, it easily falls into local optima. Therefore, an improved WOA based on an adaptive parameter strategy and a Lévy flight mechanism (LWOA) is applied to mine association rules. Meanwhile, a hybrid strategy that blends LWOA with other metaheuristics to balance the exploration and exploitation phases is put forward: the grey wolf optimizer (GWO), the artificial bee colony algorithm (ABC), and the cuckoo search algorithm (CS) are used to improve the convergence of LWOA. The approach performs a global search and finds association rule sets by modeling the rule mining task as a multi-objective problem that simultaneously satisfies support, confidence, lift, and certainty factor, and it is examined on multiple data sets. Experimental results verify that the proposed method has better mining performance than the other algorithms considered in the paper.
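To fix ideas, the rule-quality measures that the multi-objective formulation optimizes (support, confidence, and lift) can be computed directly for a candidate rule A → B. The sketch below is illustrative, not the authors' code, and the toy transaction set is invented.

```python
# Compute support, confidence, and lift for a candidate rule A -> B
# over a set of transactions (each transaction is a set of items).
def rule_metrics(transactions, antecedent, consequent):
    n = len(transactions)
    a = sum(1 for t in transactions if antecedent <= t)                   # count(A)
    b = sum(1 for t in transactions if consequent <= t)                   # count(B)
    ab = sum(1 for t in transactions if (antecedent | consequent) <= t)   # count(A and B)
    support = ab / n
    confidence = ab / a if a else 0.0
    lift = confidence / (b / n) if b else 0.0
    return support, confidence, lift

transactions = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
]
s, c, l = rule_metrics(transactions, {"milk"}, {"bread"})
# {milk} -> {bread}: support 0.5, confidence 2/3, lift (2/3)/(3/4) = 8/9
```

A metaheuristic such as LWOA searches the space of candidate rules for those that score well on all of these objectives at once.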
{"title":"Association Rule Mining Based on Hybrid Whale Optimization Algorithm","authors":"Z. Ye, Wenhui Cai, Mingwei Wang, Aixin Zhang, Wen-hua Zhou, Na Deng, Zimei Wei, Daxin Zhu","doi":"10.4018/ijdwm.308817","DOIUrl":"https://doi.org/10.4018/ijdwm.308817","url":null,"abstract":"Association Rule Mining(ARM) is one of the most significant and active research areas in data mining. Recently, Whale Optimization Algorithm (WOA) has been successfully applied in the field of data mining, however, it easily falls into the local optimum. Therefore, an improved WOA based adaptive parameter strategy and Levy Flight mechanism (LWOA) is applied to mine association rules. Meanwhile, a hybrid strategy that blends two algorithms to balance the exploration and exploitation phases is put forward, that is, grey wolf optimization algorithm (GWO), artificial bee colony algorithm (ABC) and cuckoo search algorithm (CS) are devoted to improving the convergence of LWOA. The approach performs a global search and finds the association rules sets by modeling the rule mining task as a multi-objective problem that simultaneously meets support, confidence, lift, and certain factor, which is examined on multiple data sets. Experimental results verify that the proposed method has better mining performance compared to other algorithms involved in the paper.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"89 1","pages":"1-22"},"PeriodicalIF":1.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91026072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weihao Huang, Jiaojiao Chen, Qianhua Cai, Xuejie Liu, Yu-dong Zhang, Xiaohui Hu
Document classification aims to predict the overall sentiment polarity of a text, a task reshaped by the advent of deep neural networks. Various deep learning algorithms have been employed in current studies to improve classification performance. To this end, this paper proposes a hierarchical hybrid neural network with multi-head attention (HHNN-MHA) for document classification. The proposed model contains two layers that handle the word-sentence level and the sentence-document level, respectively. In the first layer, a CNN is integrated into a Bi-GRU and a multi-head attention mechanism is employed in order to exploit both local and global features. Then, both a Bi-GRU and an attention mechanism are applied to document processing and classification in the second layer. Experiments on four datasets demonstrate the effectiveness of the proposed method: compared to state-of-the-art methods, the model achieves competitive results in document classification.
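The multi-head attention step at the heart of such models can be sketched in NumPy: each head projects the token sequence to queries, keys, and values, attends via scaled dot-product softmax, and the heads' outputs are concatenated. This is a generic illustration with random stand-in weights and invented dimensions, not the paper's trained configuration.

```python
import numpy as np

def multi_head_attention(x, num_heads, rng):
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    outputs = []
    for _ in range(num_heads):
        # Random projections stand in for learned weight matrices.
        wq, wk, wv = (rng.standard_normal((d_model, d_head)) for _ in range(3))
        q, k, v = x @ wq, x @ wk, x @ wv
        scores = q @ k.T / np.sqrt(d_head)              # scaled dot-product
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
        outputs.append(weights @ v)                     # attend to values
    return np.concatenate(outputs, axis=-1)             # concatenate heads

rng = np.random.default_rng(0)
sentence = rng.standard_normal((6, 8))  # 6 tokens, model dimension 8
out = multi_head_attention(sentence, num_heads=2, rng=rng)
# out has the same shape as the input: one contextualized vector per token
```

In the full model, this output would feed the CNN/Bi-GRU pipeline rather than being used directly.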
{"title":"Hierarchical Hybrid Neural Networks With Multi-Head Attention for Document Classification","authors":"Weihao Huang, Jiaojiao Chen, Qianhua Cai, Xuejie Liu, Yu-dong Zhang, Xiaohui Hu","doi":"10.4018/ijdwm.303673","DOIUrl":"https://doi.org/10.4018/ijdwm.303673","url":null,"abstract":"Document classification is a research topic aiming to predict the overall text sentiment polarity with the advent of deep neural networks. Various deep learning algorithms have been employed in the current studies to improve classification performance. To this end, this paper proposes a hierarchical hybrid neural network with multi-head attention (HHNN-MHA) model on the task of document classification. The proposed model contains two layers to deal with the word-sentence level and sentence-document level classification respectively. In the first layer, CNN is integrated into Bi-GRU and a multi-head attention mechanism is employed, in order to exploit local and global features. Then, both Bi-GRU and attention mechanism are applied to document processing and classification in the second layer. Experiments on four datasets demonstrate the effectiveness of the proposed method. Compared to the state-of-art methods, our model achieves competitive results in document classification in terms of experimental performance.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"1 1","pages":"1-16"},"PeriodicalIF":1.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89573483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-01 | DOI: 10.4018/ijdwm.2021100101
Waqas Ahmed, E. Zimányi, A. Vaisman, R. Wrembel
Data warehouses (DWs) evolve in both their content and schema due to changes in user requirements, business processes, or external sources, to name a few. Although multiple approaches using temporal and/or multiversion DWs have been proposed to handle these changes, an efficient solution is still lacking. The authors' approach is to separate concerns: temporal DWs deal with content changes, and multiversion DWs deal with schema changes. To address the former, they have previously proposed a temporal multidimensional (MD) model. In this paper, they propose a multiversion MD model for schema evolution to tackle the latter problem. The two models complement each other and allow managing both content and schema evolution. The paper gives the semantics of schema modification operators (SMOs) used to derive the various schema versions and shows how online analytical processing (OLAP) operations such as roll-up work on the model. Finally, the mapping from the multiversion MD model to a relational schema is given, along with the OLAP operations in standard SQL.
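A hypothetical miniature of the idea, run through SQLite: one schema modification operator (adding an attribute to a fact/dimension table) derives a new schema version while keeping old rows queryable, after which a roll-up is an ordinary GROUP BY in standard SQL. Table and column names here are invented, and this is only a sketch of the relational mapping, not the paper's formal SMO semantics.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales_v1 (store TEXT, amount REAL)")
con.executemany("INSERT INTO sales_v1 VALUES (?, ?)",
                [("A", 10.0), ("A", 5.0), ("B", 7.0)])

# SMO: derive schema version 2 by adding a 'region' attribute; existing
# rows receive a default so queries against either version keep working.
con.execute("ALTER TABLE sales_v1 ADD COLUMN region TEXT DEFAULT 'unknown'")
con.execute("UPDATE sales_v1 SET region = 'north' WHERE store IN ('A', 'B')")

# Roll-up from the store level to the region level in standard SQL.
rows = con.execute(
    "SELECT region, SUM(amount) FROM sales_v1 GROUP BY region"
).fetchall()
# rows -> [('north', 22.0)]
```

The paper's contribution is to give such operators precise semantics over a multiversion MD model rather than ad hoc DDL.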
{"title":"Schema Evolution in Multiversion Data Warehouses","authors":"Waqas Ahmed, E. Zimányi, A. Vaisman, R. Wrembel","doi":"10.4018/ijdwm.2021100101","DOIUrl":"https://doi.org/10.4018/ijdwm.2021100101","url":null,"abstract":"Data warehouses (DWs) evolve in both their content and schema due to changes of user requirements, business processes, or external sources to name a few. Although multiple approaches using temporal and/or multiversion DWs have been proposed to handle these changes, an efficient solution for this problem is still lacking. The authors' approach is to separate concerns and use temporal DWs to deal with content changes, and multiversion DWs to deal with schema changes. To address the former, previously, they have proposed a temporal multidimensional (MD) model. In this paper, they propose a multiversion MD model for schema evolution to tackle the latter problem. The two models complement each other and allow managing both content and schema evolution. In this paper, the semantics of schema modification operators (SMOs) to derive various schema versions are given. It is also shown how online analytical processing (OLAP) operations like roll-up work on the model. Finally, the mapping from the multiversion MD model to a relational schema is given along with OLAP operations in standard SQL.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"11 1","pages":"1-28"},"PeriodicalIF":1.2,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73133877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-01 | DOI: 10.4018/ijdwm.2021100103
Han Li, Zhao Liu, P. Zhu
Missing values in industrial data restrict its applications. Although such incomplete data contains enough information to support engineers in subsequent development, it still holds too many missing values for algorithms to establish precise models. This is because engineering domain knowledge is not considered and valuable information is not fully captured. Therefore, this article proposes an engineering domain knowledge-based framework for modelling incomplete industrial data. The raw datasets are partitioned and processed at different scales. First, hierarchical features are combined to decrease the missing ratio. To fill the missing values in the special data identified for classifying the samples, samples with only part of their features present are fully utilized, rather than removed, to establish a local imputation model. The samples are then divided into different groups to transfer information between them. A series of industrial datasets is analyzed to verify the feasibility of the proposed method.
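A simplified sketch of the group-then-impute idea: samples are divided into groups (here by a categorical feature) and missing values are filled from a model local to each group (here simply the group mean). The real framework forms these groups from engineering domain knowledge; the field names and data below are invented.

```python
def group_mean_impute(samples, group_key, value_key):
    # Fit one local "model" (a mean) per group from the observed values.
    sums, counts = {}, {}
    for s in samples:
        v = s[value_key]
        if v is not None:
            g = s[group_key]
            sums[g] = sums.get(g, 0.0) + v
            counts[g] = counts.get(g, 0) + 1
    # Impute each missing entry from its own group's local model,
    # instead of discarding the partially observed sample.
    for s in samples:
        if s[value_key] is None:
            g = s[group_key]
            s[value_key] = sums[g] / counts[g]
    return samples

data = [
    {"material": "steel", "strength": 400.0},
    {"material": "steel", "strength": None},
    {"material": "steel", "strength": 500.0},
    {"material": "alloy", "strength": 300.0},
]
filled = group_mean_impute(data, "material", "strength")
# the missing steel strength is imputed as (400 + 500) / 2 = 450.0
```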
{"title":"An Engineering Domain Knowledge-Based Framework for Modelling Highly Incomplete Industrial Data","authors":"Han Li, Zhao Liu, P. Zhu","doi":"10.4018/ijdwm.2021100103","DOIUrl":"https://doi.org/10.4018/ijdwm.2021100103","url":null,"abstract":"The missing values in industrial data restrict the applications. Although this incomplete data contains enough information for engineers to support subsequent development, there are still too many missing values for algorithms to establish precise models. This is because the engineering domain knowledge is not considered, and valuable information is not fully captured. Therefore, this article proposes an engineering domain knowledge-based framework for modelling incomplete industrial data. The raw datasets are partitioned and processed at different scales. Firstly, the hierarchical features are combined to decrease the missing ratio. In order to fill the missing values in special data, which is identified for classifying the samples, samples with only part of the features presented are fully utilized instead of being removed to establish local imputation model. Then samples are divided into different groups to transfer the information. A series of industrial data is analyzed for verifying the feasibility of the proposed method.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"16 1","pages":"48-66"},"PeriodicalIF":1.2,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81853124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-01 | DOI: 10.4018/ijdwm.2021100104
Thang Truong Nguyen, Long Giang Nguyen, D. T. Tran, T. T. Nguyen, Huy Quang Nguyen, Anh Viet Pham, T. D. Vu
Attribute reduction from decision tables is one of the crucial topics in data mining. The problem is NP-hard, and many approximation algorithms based on the filter or filter-wrapper approaches have been designed to find the reducts. The intuitionistic fuzzy set (IFS) has been regarded as an effective tool for this problem because it attaches two degrees, membership and non-membership, to each data element. Viewing attributes through these two complementary degrees can increase classification quality and shrink the reducts. From this motivation, this paper proposes a new filter-wrapper algorithm based on the IFS for attribute reduction from decision tables. The contributions include a new intuitionistic fuzzy distance between partitions, accompanied by theoretical analysis. The filter-wrapper algorithm is designed around that distance, with a new stopping condition based on the concept of delta-equality. Experiments are conducted on benchmark datasets from the UCI machine learning repository.
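For intuition, one standard normalized Hamming-style distance between two intuitionistic fuzzy sets is shown below, where each element carries a membership degree mu, a non-membership degree nu, and a hesitancy pi = 1 - mu - nu. This is a textbook measure to fix ideas only; the paper defines its own new distance between partitions.

```python
# Normalized Hamming-style distance between two IFSs given as lists of
# (membership, non-membership) pairs; hesitancy is derived per element.
def ifs_distance(a, b):
    n = len(a)
    total = 0.0
    for (mu1, nu1), (mu2, nu2) in zip(a, b):
        pi1, pi2 = 1 - mu1 - nu1, 1 - mu2 - nu2
        total += abs(mu1 - mu2) + abs(nu1 - nu2) + abs(pi1 - pi2)
    return total / (2 * n)

A = [(0.6, 0.2), (0.3, 0.5)]
B = [(0.5, 0.3), (0.3, 0.5)]
d = ifs_distance(A, B)
# only the first element differs: (0.1 + 0.1 + 0.0) / (2 * 2) = 0.05
```

A filter-wrapper algorithm can rank candidate attributes by how much removing them changes such a distance between the induced partitions, stopping once the change falls within a delta threshold.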
{"title":"A Novel Filter-Wrapper Algorithm on Intuitionistic Fuzzy Set for Attribute Reduction From Decision Tables","authors":"Thang Truong Nguyen, Long Giang Nguyen, D. T. Tran, T. T. Nguyen, Huy Quang Nguyen, Anh Viet Pham, T. D. Vu","doi":"10.4018/ijdwm.2021100104","DOIUrl":"https://doi.org/10.4018/ijdwm.2021100104","url":null,"abstract":"Attribute reduction from decision tables is one of the crucial topics in data mining. This problem belongs to NP-hard and many approximation algorithms based on the filter or the filter-wrapper approaches have been designed to find the reducts. Intuitionistic fuzzy set (IFS) has been regarded as the effective tool to deal with such the problem by adding two degrees, namely the membership and non-membership for each data element. The separation of attributes in the view of two counterparts as in the IFS set would increase the quality of classification and reduce the reducts. From this motivation, this paper proposes a new filter-wrapper algorithm based on the IFS for attribute reduction from decision tables. The contributions include a new instituitionistics fuzzy distance between partitions accompanied with theoretical analysis. The filter-wrapper algorithm is designed based on that distance with the new stopping condition based on the concept of delta-equality. 
Experiments are conducted on the benchmark UCI machine learning repository datasets.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"1 1","pages":"67-100"},"PeriodicalIF":1.2,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84367662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-01 | DOI: 10.4018/ijdwm.2021100105
R. Abirami, M. DuraiRajVincentP., S. Kadry
Early and automatic segmentation of lung infections from computed tomography (CT) images of COVID-19 patients is crucial for timely quarantine and effective treatment. However, automating the segmentation of lung infection from CT slices is challenging due to the lack of contrast between normal and infected tissues. A CNN- and GAN-based framework is presented to classify and then segment lung infections automatically from COVID-19 lung CT slices. In this work, the authors propose a novel method named P2P-COVID-SEG to automatically classify COVID-19 and normal CT images and then segment COVID-19 lung infections from CT images using a GAN. The proposed model outperformed existing classification models with an accuracy of 98.10%. The segmentation results likewise outperformed existing methods, delineating infections with accurate boundaries and achieving a Dice coefficient of 81.11%, which demonstrates state-of-the-art performance.
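The Dice coefficient used to score the segmentation is twice the overlap between the predicted and ground-truth masks, normalized by their total size. The sketch below uses toy binary arrays, not CT data, purely to show the metric.

```python
import numpy as np

def dice_coefficient(pred, truth):
    # Treat any nonzero pixel as part of the mask.
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    # Two empty masks agree perfectly by convention.
    return 2.0 * intersection / denom if denom else 1.0

pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
truth = np.array([[1, 0, 0],
                  [0, 1, 1]])
score = dice_coefficient(pred, truth)
# overlap of 2 pixels, masks of 3 pixels each: 2*2 / (3+3) = 2/3
```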
{"title":"P2P-COVID-GAN: Classification and Segmentation of COVID-19 Lung Infections From CT Images Using GAN","authors":"R. Abirami, M. DuraiRajVincentP., S. Kadry","doi":"10.4018/ijdwm.2021100105","DOIUrl":"https://doi.org/10.4018/ijdwm.2021100105","url":null,"abstract":"Early and automatic segmentation of lung infections from computed tomography images of COVID-19 patients is crucial for timely quarantine and effective treatment. However, automating the segmentation of lung infection from CT slices is challenging due to a lack of contrast between the normal and infected tissues. A CNN and GAN-based framework are presented to classify and then segment the lung infections automatically from COVID-19 lung CT slices. In this work, the authors propose a novel method named P2P-COVID-SEG to automatically classify COVID-19 and normal CT images and then segment COVID-19 lung infections from CT images using GAN. The proposed model outperformed the existing classification models with an accuracy of 98.10%. The segmentation results outperformed existing methods and achieved infection segmentation with accurate boundaries. The Dice coefficient achieved using GAN segmentation is 81.11%. The segmentation results demonstrate that the proposed model outperforms the existing models and achieves state-of-the-art performance.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"2 1","pages":"101-118"},"PeriodicalIF":1.2,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79083588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-01 | DOI: 10.4018/ijdwm.2021100102
Bruno Oliveira, Óscar Oliveira, O. Belo
Since extract-transform-load (ETL) is a complex and evolving process, development teams must conscientiously and rigorously design logging strategies that retrieve the most value from the information gathered from events occurring throughout the ETL workflow. Efficient logging strategies must be structured so that metrics, logs, and alerts can, beyond their troubleshooting capabilities, provide insights about the system. This paper presents a configurable and flexible ETL component for creating logging mechanisms in ETL workflows. A pattern-oriented approach is followed as a way to abstract ETL activities and enable their mapping to physical primitives that can be interpreted by commercial ETL tools.
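A minimal sketch of the pattern-oriented idea: a single generic logging pattern wraps any ETL activity and emits structured start/end/error events, so the same primitive can later be mapped onto whatever a concrete tool provides. The decorator, step names, and data are illustrative, not the paper's component.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl")

def logged_step(name):
    """Generic logging pattern: wrap an ETL activity with structured events."""
    def wrap(func):
        def inner(*args, **kwargs):
            start = time.time()
            log.info("step=%s event=start", name)
            try:
                result = func(*args, **kwargs)
                log.info("step=%s event=end duration=%.3fs",
                         name, time.time() - start)
                return result
            except Exception:
                log.error("step=%s event=error", name)
                raise
        return inner
    return wrap

@logged_step("extract_orders")
def extract():
    # Stand-in for a real extraction activity.
    return [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]

rows = extract()
```

Because every activity logs through the same pattern, metrics and alerts can be derived uniformly from the event stream rather than from tool-specific log formats.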
{"title":"ETL Logs Under a Pattern-Oriented Approach","authors":"Bruno Oliveira, Óscar Oliveira, O. Belo","doi":"10.4018/ijdwm.2021100102","DOIUrl":"https://doi.org/10.4018/ijdwm.2021100102","url":null,"abstract":"Considering extract-transform-load (ETL) as a complex and evolutionary process, development teams must conscientiously and rigorously create log strategies for retrieving the most value of the information that can be gathered from the events that occur through the ETL workflow. Efficient logging strategies must be structured so that metrics, logs, and alerts can, beyond their troubleshooting capabilities, provide insights about the system. This paper presents a configurable and flexible ETL component for creating logging mechanisms in ETL workflows. A pattern-oriented approach is followed as a way to abstract ETL activities and enable its mapping to physical primitives that can be interpreted by ETL commercial tools.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"7 1","pages":"29-47"},"PeriodicalIF":1.2,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85506565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-07-01 | DOI: 10.4018/IJDWM.2021070102
S. Chakraborty
Data from multiple sources are loaded into the organization's data warehouse for analysis. Since some OLAP queries are fired quite frequently on the warehouse data, their execution time is reduced by storing the queries and their results in a relational database, referred to as a materialized query database (MQDB). If the tables, fields, functions, and criteria of an input query and a stored query are the same but the criteria values specified in the WHERE or HAVING clause do not match, the queries are considered non-synonymous. In the present research, the results of non-synonymous queries are generated by reusing the existing stored results after applying UNION or MINUS operations on them, which reduces the execution time of non-synonymous queries. For superset criteria values of the input query, the UNION operation is applied; for subset values, the MINUS operation is applied. Incremental processing of the existing stored results, if required, is performed using data marts.
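The superset case can be illustrated with a toy SQLite example: the input query's criteria cover the stored query's criteria plus one extra value, so only the delta is computed from the base data and UNIONed with the stored result. (SQLite spells the MINUS operation EXCEPT.) The schema and data are invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (year INT, amount INT)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [(2019, 5), (2020, 10), (2021, 20)])

# Materialized result of the stored query:
#   SELECT year, amount FROM sales WHERE year = 2020
con.execute("CREATE TABLE stored_result AS "
            "SELECT year, amount FROM sales WHERE year = 2020")

# Input query with superset criteria WHERE year IN (2020, 2021):
# reuse the stored rows and UNION in only the missing year's rows.
rows = con.execute("""
    SELECT year, amount FROM stored_result
    UNION
    SELECT year, amount FROM sales WHERE year = 2021
    ORDER BY year
""").fetchall()
# rows -> [(2020, 10), (2021, 20)]
```

For the subset case, the same stored result would instead be trimmed with EXCEPT (MINUS) against the rows matching the values that fall outside the input query's criteria.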
{"title":"A Novel Approach Using Non-Synonymous Materialized Queries for Data Warehousing","authors":"S. Chakraborty","doi":"10.4018/IJDWM.2021070102","DOIUrl":"https://doi.org/10.4018/IJDWM.2021070102","url":null,"abstract":"Data from multiple sources are loaded into the organization data warehouse for analysis. Since some OLAP queries are quite frequently fired on the warehouse data, their execution time is reduced by storing the queries and results in a relational database, referred as materialized query database (MQDB). If the tables, fields, functions, and criteria of input query and stored query are the same but the query criteria specified in WHERE or HAVING clause do not match, then they are considered non-synonymous to each other. In the present research, the results of non-synonymous queries are generated by reusing the existing stored results after applying UNION or MINUS operations on them. This will reduce the execution time of non-synonymous queries. For superset criteria values of input query, UNION operation is applied, and for subset values, MINUS operation is applied. Incremental result processing of existing stored results, if required, is performed using Data Marts.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"34 1","pages":"22-43"},"PeriodicalIF":1.2,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74450217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}