International Journal of Data Warehousing and Mining: Latest Articles

A Survey of COVID-19 Detection From Chest X-Rays Using Deep Learning Methods

IF 1.2 · Zone 4 (Computer Science) · Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING

International Journal of Data Warehousing and Mining

Pub Date : 2022-01-01 DOI: 10.4018/ijdwm.314155

Bhargavinath Dornadula, S. Geetha, L. Anbarasi, Seifedine Kadry

The coronavirus (COVID-19) outbreak has created an alarming situation for the whole world and has been marked as one of the most severe and acute medical conditions in the last hundred years. Various medical imaging modalities, including computed tomography (CT) and chest x-rays, are employed for diagnosis. This paper presents an overview of recently developed systems that detect COVID-19 from chest x-ray images using deep learning approaches. The review explores and analyses the data sets, feature engineering techniques, image pre-processing methods, and experimental results of various works in the literature. It also highlights the transfer learning techniques and the performance metrics used by researchers in this field. This information helps point out future research directions in the automatic diagnosis of COVID-19 using deep learning techniques.

Citations: 0

Data Warehouse and Interactive Map for Promoting Cultural Heritage in Saudi Arabia Using GIS

Pub Date : 2022-01-01 DOI: 10.4018/ijdwm.314236

Nasser Allheeib, Marine Alraqdi, Mohammed Almukaynizi

With the urbanization of various regions, many historical sites may be misrepresented or totally neglected. As more people move to urban areas over time, heritage areas are being abandoned or ignored. The roads leading to such areas are poorly maintained, and the areas are not being adequately promoted. Over the years, the emergence and evolution of digital maps have played a significant role in tourist and cultural exploration, and digital maps are important sources of information for tourists who are considering specific destinations. In this paper, the authors discuss the development and implementation of a geographic information system (GIS) in the tourism industry. They create an interactive map for tourist sites and suggest a means of retrieving tourist data. They select the Aseer region as a case study since it has a deep cultural heritage, comprising almost 4,000 heritage villages, and is considered one of the most important tourist destinations in the country. Overall, the authors propose an initiative for the development and implementation of GIS in the tourism industry.

Citations: 0

Association Rule Mining Based on Hybrid Whale Optimization Algorithm

Pub Date : 2022-01-01 DOI: 10.4018/ijdwm.308817

Z. Ye, Wenhui Cai, Mingwei Wang, Aixin Zhang, Wen-hua Zhou, Na Deng, Zimei Wei, Daxin Zhu

Association rule mining (ARM) is one of the most significant and active research areas in data mining. Recently, the whale optimization algorithm (WOA) has been successfully applied in the field of data mining; however, it easily falls into local optima. Therefore, an improved WOA based on an adaptive parameter strategy and a Levy flight mechanism (LWOA) is applied to mine association rules. Meanwhile, a hybrid strategy that blends two algorithms to balance the exploration and exploitation phases is put forward: the grey wolf optimization algorithm (GWO), the artificial bee colony algorithm (ABC), and the cuckoo search algorithm (CS) are used to improve the convergence of LWOA. The approach performs a global search and finds association rule sets by modeling the rule mining task as a multi-objective problem that simultaneously considers support, confidence, lift, and certainty factor, and it is examined on multiple data sets. Experimental results verify that the proposed method has better mining performance than the other algorithms considered in the paper.
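
As an illustration of the four objectives named in the abstract, the following sketch computes support, confidence, lift, and certainty factor for a single rule. It is not the authors' implementation: the function name `rule_metrics`, the toy transactions, and the certainty-factor formula (a commonly used definition, assumed here because the abstract does not give one) are all assumptions.

```python
from typing import Dict, Iterable, List, Set


def rule_metrics(
    transactions: List[Set[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Dict[str, float]:
    """Score the rule antecedent -> consequent on a list of item sets."""
    n = len(transactions)
    if n == 0:
        raise ValueError("at least one transaction is required")

    a = frozenset(antecedent)
    c = frozenset(consequent)

    # Count transactions containing the antecedent, the consequent, and both.
    count_a = sum(1 for t in transactions if a <= t)
    count_c = sum(1 for t in transactions if c <= t)
    count_ac = sum(1 for t in transactions if (a | c) <= t)

    support = count_ac / n
    support_c = count_c / n
    confidence = count_ac / count_a if count_a else 0.0
    lift = confidence / support_c if support_c else 0.0

    # Certainty factor (assumed definition): how much the antecedent raises
    # or lowers belief in the consequent relative to its prior frequency.
    if confidence > support_c and support_c < 1.0:
        cf = (confidence - support_c) / (1.0 - support_c)
    elif confidence < support_c and support_c > 0.0:
        cf = (confidence - support_c) / support_c
    else:
        cf = 0.0

    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "certainty_factor": cf,
    }


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    print(rule_metrics(baskets, {"bread"}, {"milk"}))
```

A multi-objective optimizer such as the one described would evaluate candidate rules with scores of this kind; how the four values are combined or traded off is specific to the paper and is not shown here.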

Citations: 1

Hierarchical Hybrid Neural Networks With Multi-Head Attention for Document Classification

Pub Date : 2022-01-01 DOI: 10.4018/ijdwm.303673

Weihao Huang, Jiaojiao Chen, Qianhua Cai, Xuejie Liu, Yu-dong Zhang, Xiaohui Hu

Document classification, here the task of predicting the overall sentiment polarity of a text, has become an active research topic with the advent of deep neural networks. Various deep learning algorithms have been employed in current studies to improve classification performance. To this end, this paper proposes a hierarchical hybrid neural network with multi-head attention (HHNN-MHA) model for document classification. The proposed model contains two layers that deal with word-sentence level and sentence-document level classification, respectively. In the first layer, a CNN is integrated into a Bi-GRU and a multi-head attention mechanism is employed in order to exploit local and global features. Then, both a Bi-GRU and an attention mechanism are applied to document processing and classification in the second layer. Experiments on four datasets demonstrate the effectiveness of the proposed method. Compared with state-of-the-art methods, the model achieves competitive results in document classification.

Citations: 2

Schema Evolution in Multiversion Data Warehouses

Pub Date : 2021-10-01 DOI: 10.4018/ijdwm.2021100101

Waqas Ahmed, E. Zimányi, A. Vaisman, R. Wrembel

Data warehouses (DWs) evolve in both their content and schema due to changes in user requirements, business processes, or external sources, to name a few. Although multiple approaches using temporal and/or multiversion DWs have been proposed to handle these changes, an efficient solution to this problem is still lacking. The authors' approach is to separate concerns, using temporal DWs to deal with content changes and multiversion DWs to deal with schema changes. To address the former, they previously proposed a temporal multidimensional (MD) model. In this paper, they propose a multiversion MD model for schema evolution to tackle the latter problem. The two models complement each other and allow managing both content and schema evolution. The paper gives the semantics of the schema modification operators (SMOs) used to derive various schema versions. It also shows how online analytical processing (OLAP) operations such as roll-up work on the model. Finally, the mapping from the multiversion MD model to a relational schema is given, along with OLAP operations in standard SQL.

Citations: 0

An Engineering Domain Knowledge-Based Framework for Modelling Highly Incomplete Industrial Data

Pub Date : 2021-10-01 DOI: 10.4018/ijdwm.2021100103

Han Li, Zhao Liu, P. Zhu

Missing values in industrial data restrict its applications. Although such incomplete data contains enough information for engineers to support subsequent development, there are still too many missing values for algorithms to establish precise models. This is because engineering domain knowledge is not considered, and valuable information is not fully captured. Therefore, this article proposes an engineering domain knowledge-based framework for modelling incomplete industrial data. The raw datasets are partitioned and processed at different scales. Firstly, the hierarchical features are combined to decrease the missing ratio. To fill the missing values in special data, which is identified for classifying the samples, samples with only part of the features present are fully utilized, instead of being removed, to establish a local imputation model. Then samples are divided into different groups to transfer the information. A series of industrial data is analyzed to verify the feasibility of the proposed method.

Citations: 1

A Novel Filter-Wrapper Algorithm on Intuitionistic Fuzzy Set for Attribute Reduction From Decision Tables

Pub Date : 2021-10-01 DOI: 10.4018/ijdwm.2021100104

Thang Truong Nguyen, Long Giang Nguyen, D. T. Tran, T. T. Nguyen, Huy Quang Nguyen, Anh Viet Pham, T. D. Vu

Attribute reduction from decision tables is one of the crucial topics in data mining. This problem is NP-hard, and many approximation algorithms based on the filter or the filter-wrapper approaches have been designed to find reducts. The intuitionistic fuzzy set (IFS) has been regarded as an effective tool for dealing with such problems because it adds two degrees, namely membership and non-membership, for each data element. Separating attributes according to these two counterparts, as in an IFS, would increase the quality of classification and reduce the size of the reducts. From this motivation, this paper proposes a new filter-wrapper algorithm based on the IFS for attribute reduction from decision tables. The contributions include a new intuitionistic fuzzy distance between partitions, accompanied by theoretical analysis. The filter-wrapper algorithm is designed based on that distance, with a new stopping condition based on the concept of delta-equality. Experiments are conducted on benchmark datasets from the UCI machine learning repository.

Citations: 3

P2P-COVID-GAN: Classification and Segmentation of COVID-19 Lung Infections From CT Images Using GAN

Pub Date : 2021-10-01 DOI: 10.4018/ijdwm.2021100105

R. Abirami, M. DuraiRajVincentP., S. Kadry

Early and automatic segmentation of lung infections from computed tomography (CT) images of COVID-19 patients is crucial for timely quarantine and effective treatment. However, automating the segmentation of lung infections from CT slices is challenging due to a lack of contrast between normal and infected tissues. A CNN- and GAN-based framework is presented to classify and then automatically segment lung infections from COVID-19 lung CT slices. In this work, the authors propose a novel method named P2P-COVID-SEG to automatically classify COVID-19 and normal CT images and then segment COVID-19 lung infections from CT images using a GAN. The proposed model outperformed existing classification models with an accuracy of 98.10%. The segmentation results outperformed existing methods and achieved infection segmentation with accurate boundaries. The Dice coefficient achieved using GAN segmentation is 81.11%. The segmentation results demonstrate that the proposed model outperforms the existing models and achieves state-of-the-art performance.
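
For readers unfamiliar with the Dice coefficient quoted above, the sketch below shows how it is typically computed for two binary masks: twice the overlap divided by the total number of foreground pixels in both masks. The function name `dice_coefficient`, the nested-list mask representation, and the convention of returning 1.0 for two empty masks are assumptions made for illustration; the paper's own evaluation code is not shown in the abstract.

```python
from typing import Sequence


def dice_coefficient(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
) -> float:
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks."""
    intersection = 0
    total = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p_on = 1 if p else 0
            t_on = 1 if t else 0
            intersection += p_on & t_on  # pixel is foreground in both masks
            total += p_on + t_on         # foreground pixels in either mask
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    # Hypothetical 2x3 masks: 1 marks an "infected" pixel.
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 0, 0], [0, 1, 1]]
    print(dice_coefficient(predicted, reference))
```

A value of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.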

Citations: 11

ETL Logs Under a Pattern-Oriented Approach

Pub Date : 2021-10-01 DOI: 10.4018/ijdwm.2021100102

Bruno Oliveira, Óscar Oliveira, O. Belo

Considering extract-transform-load (ETL) as a complex and evolutionary process, development teams must conscientiously and rigorously create logging strategies for extracting the most value from the information that can be gathered from the events that occur throughout the ETL workflow. Efficient logging strategies must be structured so that metrics, logs, and alerts can, beyond their troubleshooting capabilities, provide insights about the system. This paper presents a configurable and flexible ETL component for creating logging mechanisms in ETL workflows. A pattern-oriented approach is followed as a way to abstract ETL activities and enable their mapping to physical primitives that can be interpreted by commercial ETL tools.

Citations: 1

A Novel Approach Using Non-Synonymous Materialized Queries for Data Warehousing

Pub Date : 2021-07-01 DOI: 10.4018/IJDWM.2021070102

S. Chakraborty

Data from multiple sources are loaded into the organization's data warehouse for analysis. Since some OLAP queries are fired quite frequently on the warehouse data, their execution time is reduced by storing the queries and results in a relational database, referred to as a materialized query database (MQDB). If the tables, fields, functions, and criteria of an input query and a stored query are the same but the criteria values specified in the WHERE or HAVING clause do not match, then the two queries are considered non-synonymous. In the present research, the results of non-synonymous queries are generated by reusing the existing stored results after applying UNION or MINUS operations to them. This reduces the execution time of non-synonymous queries. For superset criteria values of the input query, the UNION operation is applied, and for subset values, the MINUS operation is applied. Incremental result processing of existing stored results, if required, is performed using data marts.
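
The UNION and MINUS reuse described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions, not the paper's MQDB design: the `sales` table, the stored-result table `mq_north_south`, and the helper names are hypothetical, and SQLite spells the MINUS operation as `EXCEPT`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER);
    INSERT INTO sales (region, amount) VALUES
        ('North', 10), ('South', 20), ('East', 30), ('North', 40);

    -- Stored (materialized) result of a query whose criteria were
    -- region IN ('North', 'South').
    CREATE TABLE mq_north_south AS
        SELECT id, region, amount FROM sales
        WHERE region IN ('North', 'South');
    """
)


def superset_result(db: sqlite3.Connection) -> list:
    """Input criteria ('North', 'South', 'East') are a superset of the stored
    ones: reuse the stored rows and UNION in only the missing 'East' rows."""
    return db.execute(
        """
        SELECT id, region, amount FROM mq_north_south
        UNION
        SELECT id, region, amount FROM sales WHERE region = 'East'
        ORDER BY id
        """
    ).fetchall()


def subset_result(db: sqlite3.Connection) -> list:
    """Input criteria ('North') are a subset of the stored ones: subtract the
    'South' rows from the stored result (EXCEPT is SQLite's MINUS)."""
    return db.execute(
        """
        SELECT id, region, amount FROM mq_north_south
        EXCEPT
        SELECT id, region, amount FROM mq_north_south WHERE region = 'South'
        ORDER BY id
        """
    ).fetchall()


if __name__ == "__main__":
    print(superset_result(conn))
    print(subset_result(conn))
```

In the subset case the warehouse table is not touched at all, and in the superset case only the rows for the newly added criteria value are read from it, which is where the reduction in execution time would come from.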

Citations: 1
