The rabbit breeding industry exhibits vast economic potential and growth opportunities. Nevertheless, ineffective prediction of environmental conditions in rabbit houses often leads to the spread of infectious diseases, causing illness and death among rabbits. This paper presents a multi-parameter predictive model for environmental conditions in rabbit houses, including temperature, humidity, illumination, CO2 concentration, NH3 concentration, and dust levels. The model distinguishes between day and night forecasts, thereby improving the adaptive adjustment of environmental data trends. Importantly, the model forecasts the environmental parameters jointly to improve precision, given the high degree of interrelation among them. The model's performance is assessed through RMSE, MAE, and MAPE metrics, yielding values of 0.018, 0.031, and 6.31%, respectively, in predicting rabbit house environmental factors. In experimental comparisons with BERT, Seq2Seq, and conventional Transformer models, the proposed method demonstrates superior performance.
{"title":"Research on Multi-Parameter Prediction of Rabbit Housing Environment Based on Transformer","authors":"Feiqi Liu, Dong Yang, Yuyang Zhang, Chengcai Yang, Jingjing Yang","doi":"10.4018/ijdwm.336286","DOIUrl":"https://doi.org/10.4018/ijdwm.336286","url":null,"abstract":"The rabbit breeding industry exhibits vast economic potential and growth opportunities. Nevertheless, the ineffective prediction of environmental conditions in rabbit houses often leads to the spread of infectious diseases, causing illness and death among rabbits. This paper presents a multi-parameter predictive model for environmental conditions such as temperature, humidity, illumination, CO2 concentration, NH3 concentration, and dust conditions in rabbit houses. The model adeptly distinguishes between day and night forecasts, thereby improving the adaptive adjustment of environmental data trends. Importantly, the model encapsulates multi-parameter environmental forecasting to heighten precision, given the high degree of interrelation among parameters. The model's performance is assessed through RMSE, MAE, and MAPE metrics, yielding values of 0.018, 0.031, and 6.31% respectively in predicting rabbit house environmental factors. Experimentally juxtaposed with Bert, Seq2seq, and conventional transformer models, the method demonstrates superior performance.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"78 11","pages":""},"PeriodicalIF":1.2,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139440702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The study quantitatively examines how AI-generated cosmetic packaging design impacts consumer satisfaction, offering strategies for database-driven development and design based on this evaluation. A comprehensive evaluation system consisting of 18 indicators in five dimensions was constructed by combining a literature review and user interviews with expert opinions. On this basis, a questionnaire survey on AI-generated packaging design was conducted based on three types of datasets. In addition, importance-performance analysis was used to analyze satisfaction with the AI-generated packaging design indicators. The study found that while consumers are highly satisfied with the information transmission and creative attraction of AI-generated packaging design, the design's functional availability and user experience still need improvement. It is suggested that the public model be integrated with the data warehouse to build an AI packaging service platform. Focusing on the interpretability and controllability of the design process will also help increase consumer satisfaction and trust.
{"title":"Analyzing AI-Generated Packaging's Impact on Consumer Satisfaction With Three Types of Datasets","authors":"Tao Chen, D. Luh, J. Wang","doi":"10.4018/ijdwm.334024","DOIUrl":"https://doi.org/10.4018/ijdwm.334024","url":null,"abstract":"The study quantitatively examines how AI-generated cosmetic packaging design impact consumer satisfaction, offering strategies for database-driven development and design based on this evaluation. A comprehensive evaluation system consisting of 18 indicators in five dimensions was constructed by combining literature review and user interviews with expert opinions. On this basis, a questionnaire survey on AI-generated packaging design was conducted based on three types of datasets. In addition, importance-performance analysis was used to analyze the satisfaction of AI-generated packaging design indicators. The study found that while consumers are highly satisfied with the information transmission and creative attraction of AI-generated packaging design, the design's functional availability and user experience still have to be improved. It is suggested that the public model be combined into the data warehouse to build an AI packaging service platform. Focusing on the interpretability and controllability of the design process will also help increase consumer satisfaction and trust.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"76 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139218299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Existing book recommendation methods often overlook the rich information contained in comment text, which can limit their effectiveness. Therefore, a cross-domain recommender system for literary books that leverages multi-head self-attention interaction and knowledge transfer learning is proposed. Firstly, the BERT model is employed to obtain word vectors, and a CNN is used to extract user and item features. Then, higher-level features are captured through the fusion of multi-head self-attention and addition pooling. Finally, knowledge transfer learning is introduced to conduct joint modeling across domains by simultaneously extracting domain-specific features and features shared between domains. On the Amazon dataset, the proposed model achieved MAE and MSE of 0.801 and 1.058 in the “movie-book” recommendation task and 0.787 and 0.805 in the “music-book” recommendation task, respectively. This performance is significantly better than that of other advanced recommendation models. Moreover, the proposed model also generalizes well to the Chinese dataset.
{"title":"A Cross-Domain Recommender System for Literary Books Using Multi-Head Self-Attention Interaction and Knowledge Transfer Learning","authors":"Yuan Cui, Yuexing Duan, Yueqin Zhang, Li Pan","doi":"10.4018/ijdwm.334122","DOIUrl":"https://doi.org/10.4018/ijdwm.334122","url":null,"abstract":"Existing book recommendation methods often overlook the rich information contained in the comment text, which can limit their effectiveness. Therefore, a cross-domain recommender system for literary books that leverages multi-head self-attention interaction and knowledge transfer learning is proposed. Firstly, the BERT model is employed to obtain word vectors, and CNN is used to extract user and project features. Then, higher-level features are captured through the fusion of multi-head self-attention and addition pooling. Finally, knowledge transfer learning is introduced to conduct joint modeling between different domains by simultaneously extracting domain-specific features and shared features between domains. On the Amazon dataset, the proposed model achieved MAE and MSE of 0.801 and 1.058 in the “movie-book” recommendation task and 0.787 and 0.805 in the “music-book” recommendation task, respectively. This performance is significantly superior to other advanced recommendation models. Moreover, the proposed model also has good universality on the Chinese dataset.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"23 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139220014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Outlier detection for batch and streaming data is an important branch of data mining, but existing algorithms have shortcomings. For batch data, outlier detection algorithms that label only a few data points are not accurate enough because they use a histogram strategy to generate feature vectors. For streaming data, outlier detection algorithms are sensitive to data distance, resulting in low accuracy when sparse clusters and dense clusters lie close to each other; moreover, they require parameter tuning, which takes a lot of time. To address this, the authors propose a new outlier detection algorithm, called PDC, which uses probability density to generate feature vectors that train a lightweight machine learning model, which is finally applied to detect outliers. PDC takes advantage of the accuracy and distance-insensitivity of probability density, so it can overcome the aforementioned drawbacks.
{"title":"An Outlier Detection Algorithm Based on Probability Density Clustering","authors":"Wei Wang, Yongjian Ren, Renjie Zhou, Jilin Zhang","doi":"10.4018/ijdwm.333901","DOIUrl":"https://doi.org/10.4018/ijdwm.333901","url":null,"abstract":"Outlier detection for batch and streaming data is an important branch of data mining. However, there are shortcomings for existing algorithms. For batch data, the outlier detection algorithm, only labeling a few data points, is not accurate enough because it uses histogram strategy to generate feature vectors. For streaming data, the outlier detection algorithms are sensitive to data distance, resulting in low accuracy when sparse clusters and dense clusters are close to each other. Moreover, they require tuning of parameters, which takes a lot of time. With this, the manuscript per the authors propose a new outlier detection algorithm, called PDC which use probability density to generate feature vectors to train a lightweight machine learning model that is finally applied to detect outliers. PDC takes advantages of accuracy and insensitivity-to-data-distance of probability density, so it can overcome the aforementioned drawbacks.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"31 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2023-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139253281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cardiovascular diseases (CVD) rank among the leading global causes of mortality, and early detection and diagnosis are paramount in minimizing their impact. The application of machine learning (ML) and deep learning (DL) to classifying the occurrence of cardiovascular diseases holds significant potential for reducing diagnostic errors. This research endeavors to construct a model capable of accurately predicting cardiovascular diseases, thereby mitigating the fatalities associated with CVD. In this paper, the authors introduce a novel approach that combines an artificial intelligence network (AIN)-based feature selection (FS) technique with cutting-edge DL and ML classifiers for the early detection of heart diseases based on patient medical histories. The proposed model is rigorously evaluated using two real-world datasets sourced from the University of California. The authors conduct extensive data preprocessing and analysis, and the findings demonstrate that the proposed methodology surpasses the performance of existing state-of-the-art methods, achieving an exceptional accuracy rate of 99.99%.
{"title":"An Intelligent Heart Disease Prediction Framework Using Machine Learning and Deep Learning Techniques","authors":"Nasser Allheeib, Summrina Kanwal, Sultan Alamri","doi":"10.4018/ijdwm.333862","DOIUrl":"https://doi.org/10.4018/ijdwm.333862","url":null,"abstract":"Cardiovascular diseases (CVD) rank among the leading global causes of mortality. Early detection and diagnosis are paramount in minimizing their impact. The application of ML and DL in classifying the occurrence of cardiovascular diseases holds significant potential for reducing diagnostic errors. This research endeavors to construct a model capable of accurately predicting cardiovascular diseases, thereby mitigating the fatality associated with CVD. In this paper, the authors introduce a novel approach that combines an artificial intelligence network (AIN)-based feature selection (FS) technique with cutting-edge DL and ML classifiers for the early detection of heart diseases based on patient medical histories. The proposed model is rigorously evaluated using two real-world datasets sourced from the University of California. The authors conduct extensive data preprocessing and analysis, and the findings from this study demonstrate that the proposed methodology surpasses the performance of existing state-of-the-art methods, achieving an exceptional accuracy rate of 99.99%.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"1 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2023-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139265941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Many existing image-text sentiment analysis methods consider only the interaction between the image and text modalities while ignoring the inconsistency and correlation of image and text data. To address this issue, an aspect-level multimodal sentiment analysis model for images and text using a transformer and multi-layer attention interaction is proposed. Firstly, ResNet50 is used to extract image features, and RoBERTa-BiLSTM is used to extract text and aspect-level features. Then, through an aspect direct interaction mechanism and a deep attention interaction mechanism, multi-level fusion of aspect information and image-text information is carried out to remove text and image content unrelated to the given aspect. The emotional representations of the text data, the image data, and the aspect-level sentiment are concatenated, fused, and passed through fully connected layers. Finally, the designed sentiment classifier performs sentiment analysis on the image-text pairs. This effectively improves the performance of sentiment discrimination for combined image and text content.
{"title":"Image and Text Aspect Level Multimodal Sentiment Classification Model Using Transformer and Multilayer Attention Interaction","authors":"Xiuye Yin, Liyong Chen","doi":"10.4018/ijdwm.333854","DOIUrl":"https://doi.org/10.4018/ijdwm.333854","url":null,"abstract":"Many existing image and text sentiment analysis methods only consider the interaction between image and text modalities, while ignoring the inconsistency and correlation of image and text data, to address this issue, an image and text aspect level multimodal sentiment analysis model using transformer and multi-layer attention interaction is proposed. Firstly, ResNet50 is used to extract image features, and RoBERTa-BiLSTM is used to extract text and aspect level features. Then, through the aspect direct interaction mechanism and deep attention interaction mechanism, multi-level fusion of aspect information and graphic information is carried out to remove text and images unrelated to the given aspect. The emotional representations of text data, image data, and aspect type sentiments are concatenated, fused, and fully connected. Finally, the designed sentiment classifier is used to achieve sentiment analysis in terms of images and texts. This effectively has improved the performance of sentiment discrimination in terms of graphics and text.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"64 16","pages":""},"PeriodicalIF":1.2,"publicationDate":"2023-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139275602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mounting Automatic Identification System (AIS) equipment on low-orbit satellites meets the demand for higher-capacity exchange of AIS data from ships in deep waters that land-based stations cannot cover. Satellite AIS data contain a large number of latent features of ship activity. Using ship satellite AIS data for typical months in the South China Sea in 2020, data mining, geographic information systems, and traffic flow theory are applied to visualize and analyze ship activity in the region. The study shows that the distribution of ship routes in the South China Sea closely matches the recommended routes for merchant ships, and the track belts have clearly characterized widths. The number of ships passing through the southern waters of the Taiwan Strait has increased significantly, and traffic safety management in the South China Sea should therefore focus on the major route belts and important straits.
{"title":"Mining and Analysis of the Traffic Information Situation in the South China Sea Based on Satellite AIS Data","authors":"Tianyu Pu","doi":"10.4018/ijdwm.332864","DOIUrl":"https://doi.org/10.4018/ijdwm.332864","url":null,"abstract":"The loading of Automatic Identification System equipment on low-orbiting satellites can adapt to the demand of exchanging data and information with greater “capacity” brought by the AIS data information of ships in deep waters that cannot be covered by land-based stations. The information in the satellite AIS data contains a large number of potential features of ship activities, and by selecting the ship satellite AIS data of typical months in the South China Sea in 2020. Data mining, geographic information system, and traffic flow theory are used to visualize and analyze the ship activities in the South China Sea. The study shows that the distribution of ship routes in the South China Sea is highly compatible with the recommended routes of merchant ships, and the width of the track belt is obviously characterized. The number of ships passing through the southern waters of the Taiwan Strait has increased significantly, and the focus of traffic safety in the South China Sea should also focus on major route belt and important straits.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"48 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136234799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Brainstorming is a widely used problem-solving method that generates a large number of innovative ideas by guiding and stimulating intuitive and divergent thinking. In practice, however, the method is limited by the capacity and particular capabilities of the human brain, especially by the experience and knowledge participants possess. How does the brain create ideas in such a storm-like way? Based on the new discipline of Extenics, the authors propose a new model that explores how ideas are created in the brain, with the goal of helping people think multi-dimensionally and generate more ideas. With the support of information technology and artificial intelligence, more information and knowledge than ever before can be systematically collected to form a basic-element information base and to build human-computer interaction models, making up for the lack of information and knowledge in the human brain. In addition, the authors provide a methodology to help people think positively and multi-dimensionally during brainstorming, guided by Extenics.
{"title":"An Integration Model on Brainstorming and Extenics for Intelligent Innovation in Big Data Environment","authors":"Xingsen Li, Haibin Pi, Junwen Sun, Hao Lan Zhang, Zhencheng Liang","doi":"10.4018/ijdwm.332413","DOIUrl":"https://doi.org/10.4018/ijdwm.332413","url":null,"abstract":"Brainstorming is a widely used problem-solving method that generates a large number of innovative ideas by guiding and stimulating intuitive and divergent thinking. However, in practice, the method is limited by the human brain's capacity or special capabilities, especially by the experience and knowledge they possess. How does our brain create ideas like storming? Based on the new discipline of Extenics, the authors propose a new model that explores the process of how ideas are created in our brain, with the goal of helping people think multi-dimensionally and getting more ideas. With the support of information technology and artificial intelligence, we can systematically collect more information and knowledge than ever before to form a basic-element information base and build human-computer interaction models, to make up for the lack of information and knowledge in the human brain. In addition, the authors provide a methodology to help people think positively in a multidimensional way based on the guidance of Extenics in the brainstorming process.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"18 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135168344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As a new mobile communication technology in the internet of things (IoT) era, 5G is characterized by high speed, low delay, and massive connectivity, and it serves as the network infrastructure for connecting people, machines, and things. Power quality describes the efficiency with which a power grid delivers electricity to users and how well a piece of equipment uses the electricity it receives; maintaining the waveform at nominal voltage and frequency is the goal of power quality research and improvement. The power internet of things is an intelligent service platform that fully uses cutting-edge technology to enable user-machine interaction, data-driven decision-making, real-time analytics, and adaptive software design. An encryption algorithm is the process by which plaintext is converted into ciphertext; the ciphertext may appear completely random, but it can be decrypted using the key produced by the same mechanism.
{"title":"Secure Transmission Method of Power Quality Data in Power Internet of Things Based on the Encryption Algorithm","authors":"Xin Liu, Yingxian Chang, Honglei Yao, Bing Su","doi":"10.4018/ijdwm.330014","DOIUrl":"https://doi.org/10.4018/ijdwm.330014","url":null,"abstract":"As a new mobile communication technology in the era of the internet of things, 5G is characterized by high speed, low delay, and large connection. It is a network infrastructure to realize human-computer and internet of things in the era of the internet of things. Power quality data is the efficiency with which a power grid delivers electricity to users and expresses how well a piece of machinery uses the electricity it receives. The waveform at the nominal voltage and frequency is the goal of power quality research and improvement. The power internet of things (IoT) is an intelligent service platform that fully uses cutting-edge tech to enable user-machine interaction, data-driven decision-making, real-time analytics, and adaptive software design. The process by which plaintext is converted into cipher text is called an encryption algorithm. The cipher text may seem completely random, but it can be decrypted using the exact mechanism that created the encryption key.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":" ","pages":""},"PeriodicalIF":1.2,"publicationDate":"2023-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44637794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Clustering is a commonly used tool for discovering knowledge in data mining. Density peak clustering (DPC) has recently gained attention for its ability to detect clusters of various shapes in noisy data using just one parameter. DPC has shown advantages over other methods, such as DBSCAN and k-means, but it struggles with datasets that contain both high- and low-density clusters. To overcome this limitation, the paper introduces a new semi-supervised DPC method that improves clustering results with a small set of constraints expressed as must-link and cannot-link pairs. The proposed method combines the constraints with a k-nearest-neighbor graph to filter out peaks and find the center of each cluster. The constraints are also used to support label assignment during the clustering procedure. The efficacy of the method is demonstrated through experiments on well-known UCI datasets and benchmarked against contemporary semi-supervised clustering techniques.
{"title":"Constrained Density Peak Clustering","authors":"Viet-Thang Vu, T. T. Q. Bui, Tien Loi Nguyen, Doan-Vinh Tran, Quan Hong, V. Vu, S. Avdoshin","doi":"10.4018/ijdwm.328776","DOIUrl":"https://doi.org/10.4018/ijdwm.328776","url":null,"abstract":"Clustering is a commonly used tool for discovering knowledge in data mining. Density peak clustering (DPC) has recently gained attention for its ability to detect clusters with various shapes and noise, using just one parameter. DPC has shown advantages over other methods, such as DBSCAN and K-means, but it struggles with datasets that have both high and low-density clusters. To overcome this limitation, the paper introduces a new semi-supervised DPC method that improves clustering results with a small set of constraints expressed as must-link and cannot-link. The proposed method combines constraints and a k-nearest neighbor graph to filter out peaks and find the center for each cluster. Constraints are also used to support label assignment during the clustering procedure. The efficacy of this method is demonstrated through experiments on well-known data sets from UCI and benchmarked against contemporary semi-supervised clustering techniques.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":" ","pages":""},"PeriodicalIF":1.2,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46687825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}