Aspect-based sentiment analysis (ABSA) aims to classify the sentiment polarity of a given aspect in a sentence or document and is a fine-grained natural language processing task. Recent ABSA methods mainly exploit syntactic information, semantic information, or both. Research in cognition theory reveals that syntax and semantics affect each other. In this work, a graph convolutional network (GCN)-based model that fuses syntactic and semantic information in line with this cognitive practice is proposed. First, a GCN extracts syntactic information from the syntactic dependency tree. Then, a semantic graph is constructed via a multi-head self-attention mechanism and encoded by a GCN. Furthermore, a parameter-sharing GCN is developed to capture the information common to semantics and syntax. Experiments on three benchmark datasets (Laptop14, Restaurant14, and Twitter) validate that the proposed model achieves compelling performance compared with state-of-the-art models.
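The core operation the abstract describes, graph convolution over a dependency tree, can be sketched as follows. This is a minimal illustration, not the paper's model: the tiny adjacency matrix, embeddings, and weights are made-up stand-ins for a parsed sentence.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)                   # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric normalisation
    return np.maximum(0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

# Toy dependency tree for "service was great": edges great-service, great-was.
A = np.array([[0, 0, 1],
              [0, 0, 1],
              [1, 1, 0]], dtype=float)      # symmetric (undirected) edges
H = np.random.default_rng(0).normal(size=(3, 4))  # word embeddings
W = np.random.default_rng(1).normal(size=(4, 2))  # layer weights
out = gcn_layer(A, H, W)
print(out.shape)  # (3, 2)
```

Stacking such layers propagates each aspect word's representation along dependency edges, which is what lets syntactic neighbours influence the aspect's sentiment representation.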
Jinhui Feng, Shaohua Cai, Kuntao Li, Yifan Chen, Qianhua Cai, Hongya Zhao (2023). "Fusing Syntax and Semantics-Based Graph Convolutional Network for Aspect-Based Sentiment Analysis." International Journal of Data Warehousing and Mining, 22(1), 1-15. https://doi.org/10.4018/ijdwm.319803
Network representation learning is an important task in network analysis. Its purpose is to learn a vector for each node in the network and map it into a vector space whose dimensionality is much smaller than the number of nodes. Most current work considers only local structural features and ignores other features of the network, such as attribute features. To address this problem, this paper proposes a novel mechanism that combines network topology with node text information and node clustering information, using the latter two to constrain the representation learning process and obtain optimal node vectors. The method is experimentally verified on three datasets: Citeseer (M10), DBLP (V4), and SDBLP. Experimental results show that the proposed method outperforms algorithms based on network topology and text features alone, verifying the feasibility of the algorithm.
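One simple way to picture "topology plus text plus cluster" node vectors is to build each feature block separately and concatenate them. This sketch is only illustrative, under assumed data: a 4-node graph, bag-of-words counts, and precomputed cluster labels; the paper's actual constrained-learning objective is not reproduced here.

```python
import numpy as np

# Hypothetical 4-node graph: adjacency, bag-of-words text counts, cluster ids.
A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]], float)
T = np.array([[2, 0, 1], [1, 1, 0], [0, 2, 1], [0, 0, 3]], float)
c = np.array([0, 0, 0, 1])                  # node cluster labels

# Structure block: top-2 singular vectors of the adjacency matrix.
U, s, _ = np.linalg.svd(A)
Z_struct = U[:, :2] * s[:2]

def l2norm(X):
    """Row-normalise so no block dominates the concatenation."""
    return X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)

one_hot = np.eye(c.max() + 1)[c]            # cluster membership block
Z = np.hstack([l2norm(Z_struct), l2norm(T), one_hot])
print(Z.shape)  # (4, 7)
```

The resulting vectors are low-dimensional relative to the node count, which matches the dimensionality-reduction goal stated above.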
Yanlong Tang, Zhonglin Ye, Haixing Zhao, Yi Ji (2023). "CTNRL: A Novel Network Representation Learning With Three Feature Integrations." International Journal of Data Warehousing and Mining, 19(1), 1-14. https://doi.org/10.4018/ijdwm.318696
The higher-order and temporal characteristics of tweet sequences are often ignored in rumor detection. In this paper, a new rumor detection method (T-BiGAT) is proposed that captures the temporal features between tweets by combining a graph attention network (GAT) and a gated recurrent unit (GRU). First, timestamps are computed for each tweet within the same event. For each timestamp, two different propagation subgraphs are constructed according to the reply relationships between tweets. Then, a GRU captures intralayer dependencies between sibling nodes in each subtree, and global features of each subtree are extracted using an improved GAT. Furthermore, a GRU is reused to capture the temporal dependencies of the subgraphs across timestamps. Finally, weights are assigned to the global feature vectors of the different timestamp subtrees for aggregation, and a mapping function classifies the aggregated vectors.
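The final step above, weighting per-timestamp subtree vectors before aggregation, is essentially attention pooling. A minimal sketch, with random stand-in features and a hypothetical score vector `w` rather than the paper's learned parameters:

```python
import numpy as np

def attention_pool(H, w):
    """Aggregate per-timestamp subtree vectors H (T, d) via softmax weights."""
    scores = H @ w                       # one scalar score per timestamp
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                 # softmax attention weights
    return alpha, alpha @ H              # weighted sum -> event-level vector

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))              # 5 timestamps, 8-dim subtree features
w = rng.normal(size=8)
alpha, event_vec = attention_pool(H, w)
print(alpha.sum().round(6), event_vec.shape)  # 1.0 (8,)
```

The event-level vector would then go through the classification mapping described above.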
Xiaohui Yang, Hailong Ma, Miao Wang (2023). "Research on Rumor Detection Based on a Graph Attention Network With Temporal Features." International Journal of Data Warehousing and Mining, 47(1), 1-17. https://doi.org/10.4018/ijdwm.319342
Sundus Naji Alaziz, Bakr Albayati, A. A. El-Bagoury, Wasswa Shafik
The COVID-19 pandemic is one of the current universal threats to humanity, and the entire world is cooperating persistently to find ways to decrease its effect. Time series analysis is one of the basic tools for building accurate models to predict the expansion of this infectious virus. The authors discuss the goals of the study, the problem, definitions, and previous work, then cover the theory of clustering multiple time series using both K-means and time series clustering. Finally, ARIMA is used to build a prototype that gives predictions of the impact of the COVID-19 pandemic 90 to 140 days ahead. Modeling and prediction use the dataset available from the Saudi Ministry of Health for Riyadh, Jeddah, Makkah, and Dammam over the preceding four months, and the model is evaluated in Python. Based on this proposed method, the authors present their conclusions.
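The pipeline of clustering several city series and then forecasting one of them can be sketched with a plain K-means and a least-squares AR(1) stand-in for ARIMA. Everything here is synthetic and illustrative; the real study uses the Saudi Ministry of Health data and a full ARIMA model.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's K-means over whole series treated as feature vectors."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        lab = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        C = np.array([X[lab == j].mean(0) if (lab == j).any() else C[j]
                      for j in range(k)])
    return lab

def ar1_forecast(y, horizon):
    """Least-squares AR(1) fit y_t = a + b*y_{t-1}, iterated `horizon` steps."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    a, b = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    out, last = [], y[-1]
    for _ in range(horizon):
        last = a + b * last
        out.append(last)
    return np.array(out)

# Four synthetic city case curves: two trending up, two trending down.
rng = np.random.default_rng(1)
series = np.vstack([rng.normal(m, 1, 30).cumsum() for m in (0.5, 0.6, -0.4, -0.5)])
labels = kmeans(series, k=2)
fc = ar1_forecast(series[0], horizon=5)
print(labels.shape, fc.shape)  # (4,) (5,)
```

Clustering first groups cities with similar epidemic curves, so a forecast model can then be fitted per group rather than per city.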
Sundus Naji Alaziz, Bakr Albayati, A. A. El-Bagoury, Wasswa Shafik (2023). "Clustering of COVID-19 Multi-Time Series-Based K-Means and PCA With Forecasting." International Journal of Data Warehousing and Mining, 3(1), 1-25. https://doi.org/10.4018/ijdwm.317374
W. Yang, Xianghan Zheng, Qiongxia Huang, Yu Liu, Yimi Chen, ZhiGang Song
It is widely known that long non-coding RNA (lncRNA) plays an important role in gene expression and regulation. However, several characteristics of lncRNA data (huge volume, high dimensionality, lack of annotated samples, etc.) make identifying the key lncRNAs closely related to a specific disease nearly impossible. In this paper, the authors propose a computational method to predict the key lncRNAs closely related to a given disease. The proposed solution uses a BPSO-based intelligent algorithm to select candidate optimal lncRNA subsets and an ML-ELM-based deep learning model to evaluate each subset. A wrapper feature extraction method then selects, from massive data, the lncRNAs closely related to the pathophysiology of the disease. Experiments on three typical open datasets demonstrate the feasibility and efficiency of the proposed solution, which achieves above 93% accuracy, the best result reported so far.
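The evaluator at the heart of this pipeline, an extreme learning machine, is simple enough to sketch: a fixed random hidden layer with a closed-form readout. This shows a single-layer ELM on made-up two-class data, not the paper's multi-layer ML-ELM or its lncRNA features.

```python
import numpy as np

class ELM:
    """Extreme learning machine: random hidden layer, pseudo-inverse readout."""
    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden, self.rng = n_hidden, np.random.default_rng(seed)

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)      # random feature map
        self.beta = np.linalg.pinv(H) @ y     # closed-form output weights
        return self

    def predict(self, X):
        return np.tanh(X @ self.W + self.b) @ self.beta

# Hypothetical toy data standing in for a candidate lncRNA feature subset.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
pred = (ELM().fit(X, y).predict(X) > 0.5).astype(float)
acc = (pred == y).mean()
print(acc)
```

Because only the readout is trained, fitting is a single pseudo-inverse, which is what makes ELMs cheap enough to evaluate the many subsets a BPSO search proposes.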
W. Yang, Xianghan Zheng, Qiongxia Huang, Yu Liu, Yimi Chen, ZhiGang Song (2023). "Combining BPSO and ELM Models for Inferring Novel lncRNA-Disease Associations." International Journal of Data Warehousing and Mining, 14(1), 1-18. https://doi.org/10.4018/ijdwm.317092
Spatiotemporal data prediction is of great significance in smart cities and smart manufacturing. Current spatiotemporal prediction models rely heavily on traditional spatial views or a single temporal granularity and therefore miss knowledge such as dynamic spatial correlations, periodicity, and mutability. This paper addresses these challenges by proposing a multi-layer attention-based predictive model. The key idea is to use a multi-layer attention mechanism to model the dynamic spatial correlations of different features, and then fuse multi-granularity historical features to predict future spatiotemporal data. Experiments on real-world data show that the proposed model outperforms six state-of-the-art benchmark methods.
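The building block such attention layers stack is scaled dot-product attention, where each feature dynamically weights all others. A minimal self-attention sketch on random stand-in features (the paper's multi-layer architecture and learned projections are not reproduced):

```python
import numpy as np

def scaled_dot_attention(Q, K, V):
    """Soft attention: each query attends over all keys; rows sum to 1."""
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)       # row-wise softmax
    return A, A @ V

rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4))             # 6 spatial features, 4-dim each
A, out = scaled_dot_attention(feats, feats, feats)
print(A.shape, out.shape)                   # (6, 6) (6, 4)
```

The attention matrix `A` is recomputed per input, which is what makes the spatial correlations dynamic rather than fixed by a static adjacency.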
Man Jiang, Qilong Han, Haitao Zhang, Hexiang Liu (2023). "Spatiotemporal Data Prediction Model Based on a Multi-Layer Attention Mechanism." International Journal of Data Warehousing and Mining, 8(1), 1-15. https://doi.org/10.4018/ijdwm.315822
Zhongping Zhang, Sen Li, Weixiong Liu, Y. Wang, Daisy Xin Li
Outlier detection is an important field in data mining, with applications in fraud detection, fault detection, and other areas. This article addresses two problems of the density peak clustering algorithm: its need for manual parameter setting and its high time complexity. First, k-nearest-neighbor density estimation replaces the original density estimate, with a KD-tree index used to compute each data object's k nearest neighbors. Cluster centers are then selected automatically using the product of density and distance. In addition, the central relative distance and the fast density peak clustering outlier factor are defined to characterize the degree to which each data object is an outlier. An outlier detection algorithm is then devised based on this factor. Experiments on artificial and real datasets validate the algorithm, confirming its effectiveness and time efficiency compared with several conventional and recent algorithms.
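The two quantities the method builds on, kNN-based density and distance to the nearest denser point, can be sketched directly (brute-force distances here instead of a KD-tree, and a simple delta/rho ratio as a stand-in for the paper's exact outlier factor):

```python
import numpy as np

def knn_density_peak_scores(X, k=3):
    """kNN density rho and distance-to-denser-point delta for each object."""
    D = np.linalg.norm(X[:, None] - X[None], axis=-1)
    knn_d = np.sort(D, axis=1)[:, 1:k + 1]      # skip self-distance
    rho = 1.0 / (knn_d.mean(axis=1) + 1e-12)    # denser -> larger rho
    delta = np.empty(len(X))
    for i in range(len(X)):
        denser = D[i][rho > rho[i]]
        delta[i] = denser.min() if len(denser) else D[i].max()
    return rho, delta

# Tight cluster plus one far-away point: the outlier gets low rho, high delta.
X = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [5, 5]], float)
rho, delta = knn_density_peak_scores(X, k=2)
score = delta / rho                              # simple outlier factor
print(int(np.argmax(score)))  # 4
```

Replacing the brute-force distance matrix with a KD-tree query is exactly what reduces the neighbor-search cost in the proposed algorithm.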
Zhongping Zhang, Sen Li, Weixiong Liu, Y. Wang, Daisy Xin Li (2023). "A New Outlier Detection Algorithm Based on Fast Density Peak Clustering Outlier Factor." International Journal of Data Warehousing and Mining, 21(1), 1-19. https://doi.org/10.4018/ijdwm.316534
Clustering is a basic primitive of exploratory data analysis. To obtain valuable results, the key parameter of a clustering algorithm, the number of clusters, must be set appropriately. Existing methods for determining the number of clusters perform well on small low-dimensional datasets, but effectively determining the optimal number of clusters on large high-dimensional datasets remains a challenging problem. In this paper, the authors design a method that overcomes the shortcomings of existing estimation methods and accurately and quickly estimates the optimal number of clusters on large-scale high-dimensional datasets. Extensive experiments show that it (1) outperforms existing estimation methods in accuracy and efficiency, (2) generalizes across different datasets, and (3) is suitable for large high-dimensional datasets.
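The classic baseline such estimators improve on is the elbow heuristic: run K-means for increasing k and stop when the inertia gain becomes negligible. A small sketch on synthetic blobs (the threshold of 0.1 and the farthest-point seeding are illustrative choices, not the paper's method):

```python
import numpy as np

def kmeans_inertia(X, k, iters=30, seed=0):
    """Lloyd's algorithm with farthest-point seeding; returns final inertia."""
    rng = np.random.default_rng(seed)
    C = [X[rng.integers(len(X))]]
    for _ in range(k - 1):                  # spread initial centroids out
        d = np.min([np.linalg.norm(X - c, axis=1) for c in C], axis=0)
        C.append(X[np.argmax(d)])
    C = np.array(C)
    lab = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        lab = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        C = np.array([X[lab == j].mean(0) if (lab == j).any() else C[j]
                      for j in range(k)])
    return ((X - C[lab]) ** 2).sum()

# Three well-separated blobs; inertia stops improving much after k = 3.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.2, size=(50, 2)) for c in ((0, 0), (5, 5), (0, 5))])
inertia = {k: kmeans_inertia(X, k) for k in range(1, 6)}
drop = {k: inertia[k] - inertia[k + 1] for k in range(1, 5)}
# Elbow heuristic: the first k whose next drop is under 10% of the previous one.
best_k = next(k for k in range(2, 5) if drop[k] < 0.1 * drop[k - 1])
print(best_k)  # 3
```

On large high-dimensional data every such full K-means sweep is expensive, which is precisely the cost the proposed estimator is designed to avoid.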
Xutong Zhu, Lingli Li (2023). "Estimating the Number of Clusters in High-Dimensional Large Datasets." International Journal of Data Warehousing and Mining, 42(1), 1-14. https://doi.org/10.4018/ijdwm.316142
Sibo Prasad Patro, Neelamadhab Padhy, Rahul Deo Sah
Few studies have investigated the potential of hybrid ensemble machine learning techniques for building a model that detects and predicts heart disease. In this research, the authors address this classification problem with a fusion-based ensemble model hybridized with machine learning approaches, which produces a more trustworthy ensemble than the original ensemble model and outperforms previous heart disease prediction models. The proposed model is evaluated on the Cleveland heart disease dataset using six boosting techniques: XGBoost, AdaBoost, Gradient Boosting, LightGBM, CatBoost, and Histogram-Based Gradient Boosting. Hybridization produces superior results across the classification algorithms considered; remarkable accuracies of 96.51% for training and 93.37% for testing are achieved by the Meta-XGBoost classifier.
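To make the boosting idea behind all six techniques concrete, here is a from-scratch AdaBoost with decision stumps on made-up data; it illustrates the reweighting mechanism only, not the paper's Meta-XGBoost ensemble or the Cleveland features.

```python
import numpy as np

def stump(X, y, w):
    """Best weighted single-feature threshold classifier (decision stump)."""
    best = (np.inf, 0, 0.0, 1)              # (error, feature, threshold, sign)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                err = w[np.where(X[:, j] <= t, s, -s) != y].sum()
                if err < best[0]:
                    best = (err, j, t, s)
    return best

def adaboost(X, y, rounds=20):
    """Each round upweights the samples the previous stumps got wrong."""
    w = np.full(len(y), 1 / len(y))
    model = []
    for _ in range(rounds):
        err, j, t, s = stump(X, y, w)
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        pred = np.where(X[:, j] <= t, s, -s)
        w = w * np.exp(-alpha * y * pred)
        w /= w.sum()
        model.append((alpha, j, t, s))
    return model

def predict(model, X):
    agg = sum(a * np.where(X[:, j] <= t, s, -s) for a, j, t, s in model)
    return np.where(agg >= 0, 1, -1)

# Hypothetical two-feature toy data with labels in {-1, +1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] + 2 * X[:, 1] > 0, 1, -1)
acc = (predict(adaboost(X, y), X) == y).mean()
print(acc)
```

A meta classifier in the stacking sense then trains a second-level model on the predictions of several such boosted learners.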
Sibo Prasad Patro, Neelamadhab Padhy, Rahul Deo Sah (2023). "An Ensemble Approach for Prediction of Cardiovascular Disease Using Meta Classifier Boosting Algorithms." International Journal of Data Warehousing and Mining. https://doi.org/10.4018/ijdwm.316145
Spatial keyword queries have attracted the attention of many researchers. Most existing spatial keyword indexes do not consider differences in keyword distribution, so they are inefficient when the data are skewed. To this end, this paper proposes a novel association-rule-mining-based spatial keyword index, ARM-SQ, whose inverted lists are materialized from the frequent itemsets mined by association rules; intersections of long lists can thus be avoided. To prevent the excessive space cost caused by materialization, a depth-based materialization strategy is introduced that maintains a good balance between query and space costs. To select the right frequent itemsets for answering a query, the authors further implement a benefit-based greedy frequent itemset selection algorithm, BGF-Selection. Experimental results show that this algorithm significantly outperforms existing algorithms, with efficiency up to an order of magnitude higher than SFC-Quad.
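The core trick, precomputing posting lists for frequent keyword combinations so queries skip list intersection, can be sketched with plain dictionaries. The tiny document set and the pair-only, support-2 materialization are illustrative assumptions; ARM-SQ's depth-based strategy and BGF-Selection are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations

docs = {1: {"coffee", "wifi"}, 2: {"coffee", "wifi", "park"},
        3: {"coffee"}, 4: {"wifi", "park"}, 5: {"coffee", "wifi"}}

# Standard inverted lists: keyword -> posting list of object ids.
inv = defaultdict(set)
for oid, kws in docs.items():
    for kw in kws:
        inv[kw].add(oid)

# Materialise posting lists for frequent keyword pairs (min support 2),
# so frequent multi-keyword queries skip the intersection entirely.
min_sup = 2
materialized = {}
for a, b in combinations(sorted(inv), 2):
    common = inv[a] & inv[b]
    if len(common) >= min_sup:
        materialized[(a, b)] = common

def query(k1, k2):
    key = tuple(sorted((k1, k2)))
    if key in materialized:             # one lookup, no intersection work
        return materialized[key]
    return inv[k1] & inv[k2]            # fall back to list intersection

print(sorted(query("coffee", "wifi")))  # [1, 2, 5]
```

The space/query trade-off is visible even here: every materialized pair costs storage, which is why a depth- or benefit-based selection of what to materialize matters.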
Lianyin Jia, Haotian Tang, Mengjuan Li, Bingxin Zhao, S. Wei, Haihe Zhou (2023). "An Efficient Association Rule Mining-Based Spatial Keyword Index." International Journal of Data Warehousing and Mining, 32(1), 1-19. https://doi.org/10.4018/ijdwm.316161