International Journal of Data Warehousing and Mining: Latest Articles

Fishing Vessel Type Recognition Based on Semantic Feature Vector
IF 0.5 Zone 4 (Computer Science) Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-07-26 DOI: 10.4018/ijdwm.349222
Junfeng Yuan, Qianqian Zhang, Jilin Zhang, Youhuizi Li, Zhen Liu, Meiting Xue, Y. Zeng
Identifying fishing vessel types with artificial intelligence has become a key technology in marine resource management. However, classical feature modeling lacks the ability to express time series features, and the feature extraction is insufficient. Hence, this work focuses on the identification of trawlers, gillnetters, and purse seiners based on semantic feature vectors. First, we extract trajectories from massive and complex historical Vessel Monitoring System data that contain a large amount of dirty data and then extract the semantic features of fishing vessel trajectories. Finally, we input the semantic feature vectors into the LightGBM classification model for classification of fishing vessel types. In this experiment, the F1 measure of our proposed method on the East China Sea fishing vessel dataset reached 96.25, which was 6.82% higher than that of the classical feature-modeling method based on fishing vessel trajectories. Experiments show that this method is accurate and effective for the classification of fishing vessels.
Citations: 0
Optimizing Cadet Squad Organizational Satisfaction by Integrating Leadership Factor Data Mining and Integer Programming
IF 0.5 Zone 4 (Computer Science) Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-07-17 DOI: 10.4018/ijdwm.349226
Hyunho Kim, Eunmi Lee, S. Cha
Military academy cadets reside in a brigade organized by cadets. Despite their importance, squads have traditionally been organized based on the personal preferences of the fourth-year squad leader without considering the compatibility of the squad members. This study proposes a more scientific approach to increase cadet satisfaction with their squads and foster their leadership development. First, a multiple linear regression analysis was conducted to identify the leadership factors of squad leaders that significantly affect squad organizational satisfaction. An integer programming model then maximized the sum of the factor scores among squad leaders to enhance squad organizational satisfaction and maximized the difference in factor scores to improve the effectiveness of leadership discipline. Applying the squad formation algorithm to data from cadets at the Korea Military Academy showed that squad organizational satisfaction and leadership discipline effectiveness increased significantly compared with the existing squad formation methods.
Citations: 0
Hybrid Inductive Graph Method for Matrix Completion
IF 1.2 Zone 4 (Computer Science) Q3 Computer Science Pub Date: 2024-05-07 DOI: 10.4018/ijdwm.345361
Jayun Yong, Chulyun Kim
The recommender system can be viewed as a matrix completion problem, which aims to predict unknown values within a matrix. Solutions to this problem are categorized into two approaches: transductive and inductive reasoning. In transductive reasoning, the model cannot be applied to new cases unseen during training. In contrast, IGMC, the state-of-the-art inductive algorithm, only requires subgraphs for target users and items, without needing any other content information. While the absence of a requirement for content information simplifies the model and enhances transferability to new tasks, incorporating content information could still improve the model's performance. In this article, the authors introduce Hi-GMC, a hybrid version of the IGMC model that incorporates content information alongside users and items. They present a novel graph model to encapsulate the side information related to users and items and develop a learning method based on graph neural networks. This proposed method achieves state-of-the-art performance on the MovieLens-100K dataset for both warm and cold start scenarios.
Citations: 0
A Fuzzy Portfolio Model With Cardinality Constraints Based on Differential Evolution Algorithms
IF 1.2 Zone 4 (Computer Science) Q3 Computer Science Pub Date: 2024-03-27 DOI: 10.4018/ijdwm.341268
JianDong He
Uncertain information in the securities market exhibits fuzziness. In this article, expected returns and liquidity are treated as trapezoidal fuzzy numbers. The possibility mean and mean absolute deviation of expected returns represent the returns and risks of securities assets, while the possibility mean of expected turnover represents the liquidity of securities assets. Taking into account practical constraints such as cardinality and transaction costs, this article establishes a fuzzy portfolio model with cardinality constraints and solves it using the differential evolution algorithm. Finally, using the fuzzy c-means clustering algorithm, 12 stocks are selected as empirical samples to provide numerical calculation examples. The fuzzy c-means clustering algorithm is also used to cluster the stock yield data and analyse the stock data comprehensively and accurately, which provides a reference for establishing an effective portfolio.
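As a small illustration of the possibility mean mentioned in this abstract, the sketch below assumes the Carlsson and Fullér definition for a trapezoidal fuzzy number with core [a, b], left spread alpha, and right spread beta. The abstract does not state which definition the paper uses, so this is an assumption, and the function name and sample numbers are hypothetical.

```python
def possibilistic_mean(a: float, b: float, alpha: float, beta: float) -> float:
    """Possibilistic mean of a trapezoidal fuzzy number.

    [a, b] is the core, alpha the left spread, beta the right spread.
    Assumed definition (Carlsson and Fuller): M = (a + b) / 2 + (beta - alpha) / 6.
    """
    if a > b or alpha < 0 or beta < 0:
        raise ValueError("need a <= b and non-negative spreads")
    return (a + b) / 2 + (beta - alpha) / 6


# Hypothetical fuzzy expected return of a single stock (not data from the paper).
print(possibilistic_mean(0.01, 0.02, 0.003, 0.006))
```

When the two spreads are equal the result is simply the midpoint of the core; a wider right spread shifts the mean upward.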
Citations: 0
Dynamic Research on Youth Thought, Behavior, and Growth Law Based on Deep Learning Algorithm
IF 1.2 Zone 4 (Computer Science) Q3 Computer Science Pub Date: 2024-01-31 DOI: 10.4018/ijdwm.333518
Qi Fu
The growth and development of youth is related to the destiny of the country and the nation, and the cultivation of young people in the new era is of great significance. Youth is a special stage in life, and the ideological and moral concepts of young people during this period are still very malleable. In educating them, it is necessary to grasp the characteristics and laws of their ideological and moral growth so that education can be targeted. The characteristics of young people's ideological and moral growth are variability, subjectivity, practicality, and hierarchy; the laws of their ideological and moral growth include the law of guiding transcendence, the law of edification and internalization, the law of mutual promotion of knowledge, will, and action, and the law of gradual progression. With the intensification of economic globalization, the Internet, and big data, the thoughts, behaviors, and psychology of young people are constantly changing, which poses a huge challenge to youth ideological and political education.
Citations: 0
Research on Multi-Parameter Prediction of Rabbit Housing Environment Based on Transformer
IF 1.2 Zone 4 (Computer Science) Q3 Computer Science Pub Date: 2024-01-10 DOI: 10.4018/ijdwm.336286
Feiqi Liu, Dong Yang, Yuyang Zhang, Chengcai Yang, Jingjing Yang
The rabbit breeding industry exhibits vast economic potential and growth opportunities. Nevertheless, the ineffective prediction of environmental conditions in rabbit houses often leads to the spread of infectious diseases, causing illness and death among rabbits. This paper presents a multi-parameter predictive model for environmental conditions such as temperature, humidity, illumination, CO2 concentration, NH3 concentration, and dust conditions in rabbit houses. The model distinguishes between day and night forecasts, thereby improving the adaptive adjustment of environmental data trends. Because the parameters are highly interrelated, the model forecasts them jointly to improve precision. The model's performance is assessed through RMSE, MAE, and MAPE metrics, yielding values of 0.018, 0.031, and 6.31%, respectively, in predicting rabbit house environmental factors. In experimental comparisons with BERT, Seq2seq, and conventional Transformer models, the method demonstrates superior performance.
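For readers unfamiliar with the three error metrics named in this abstract, the following is a minimal sketch of their standard definitions. The function names and the sample readings are illustrative assumptions and are not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; undefined if a true value is 0."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical temperature readings (degrees Celsius), not data from the paper.
observed = [21.0, 22.5, 23.0]
forecast = [20.5, 22.0, 24.0]
print(rmse(observed, forecast), mae(observed, forecast), mape(observed, forecast))
```

RMSE and MAE are expressed in the units of the predicted quantity, while MAPE is scale-free, which is why it is usually the one reported as a percentage.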
Citations: 0
Analyzing AI-Generated Packaging's Impact on Consumer Satisfaction With Three Types of Datasets
IF 1.2 Zone 4 (Computer Science) Q3 Computer Science Pub Date: 2023-11-28 DOI: 10.4018/ijdwm.334024
Tao Chen, D. Luh, J. Wang
The study quantitatively examines how AI-generated cosmetic packaging design impacts consumer satisfaction, offering strategies for database-driven development and design based on this evaluation. A comprehensive evaluation system consisting of 18 indicators in five dimensions was constructed by combining literature review and user interviews with expert opinions. On this basis, a questionnaire survey on AI-generated packaging design was conducted based on three types of datasets. In addition, importance-performance analysis was used to analyze satisfaction with the AI-generated packaging design indicators. The study found that while consumers are highly satisfied with the information transmission and creative attraction of AI-generated packaging design, the design's functional availability and user experience still need improvement. It is suggested that the public model be combined into the data warehouse to build an AI packaging service platform. Focusing on the interpretability and controllability of the design process will also help increase consumer satisfaction and trust.
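Importance-performance analysis, as used in this abstract, places each indicator in one of four quadrants according to its importance and performance ratings. The sketch below is a generic version of that idea, not the authors' procedure: placing the crosshairs at the mean of each axis is an assumption, and the indicator names and scores are hypothetical.

```python
from statistics import mean


def ipa_quadrants(scores):
    """Assign each indicator to an importance-performance quadrant.

    scores maps indicator name -> (importance, performance).
    The crosshairs sit at the mean of each axis (an assumption).
    """
    imp_cut = mean(i for i, _ in scores.values())
    perf_cut = mean(p for _, p in scores.values())
    labels = {}
    for name, (imp, perf) in scores.items():
        if imp >= imp_cut:
            labels[name] = "keep up the good work" if perf >= perf_cut else "concentrate here"
        else:
            labels[name] = "possible overkill" if perf >= perf_cut else "low priority"
    return labels


# Hypothetical survey means on a 5-point scale (not results from the study).
survey = {
    "indicator_a": (4.5, 4.4),
    "indicator_b": (4.4, 3.2),
    "indicator_c": (3.0, 4.3),
    "indicator_d": (3.1, 3.0),
}
for name, quadrant in ipa_quadrants(survey).items():
    print(name, "->", quadrant)
```

Indicators that land in "concentrate here" (high importance, low performance) are the ones such an analysis flags for improvement first.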
Citations: 0
A Cross-Domain Recommender System for Literary Books Using Multi-Head Self-Attention Interaction and Knowledge Transfer Learning
IF 1.2 Zone 4 (Computer Science) Q3 Computer Science Pub Date: 2023-11-28 DOI: 10.4018/ijdwm.334122
Yuan Cui, Yuexing Duan, Yueqin Zhang, Li Pan
Existing book recommendation methods often overlook the rich information contained in the comment text, which can limit their effectiveness. Therefore, a cross-domain recommender system for literary books that leverages multi-head self-attention interaction and knowledge transfer learning is proposed. Firstly, the BERT model is employed to obtain word vectors, and CNN is used to extract user and project features. Then, higher-level features are captured through the fusion of multi-head self-attention and addition pooling. Finally, knowledge transfer learning is introduced to conduct joint modeling between different domains by simultaneously extracting domain-specific features and shared features between domains. On the Amazon dataset, the proposed model achieved MAE and MSE of 0.801 and 1.058 in the “movie-book” recommendation task and 0.787 and 0.805 in the “music-book” recommendation task, respectively. This performance is significantly superior to other advanced recommendation models. Moreover, the proposed model also has good universality on the Chinese dataset.
Citations: 0
An Outlier Detection Algorithm Based on Probability Density Clustering
IF 1.2 Zone 4 (Computer Science) Q3 Computer Science Pub Date: 2023-11-21 DOI: 10.4018/ijdwm.333901
Wei Wang, Yongjian Ren, Renjie Zhou, Jilin Zhang
Outlier detection for batch and streaming data is an important branch of data mining. However, existing algorithms have shortcomings. For batch data, the outlier detection algorithm that labels only a few data points is not accurate enough because it uses a histogram strategy to generate feature vectors. For streaming data, the outlier detection algorithms are sensitive to data distance, resulting in low accuracy when sparse clusters and dense clusters are close to each other. Moreover, they require parameter tuning, which takes a lot of time. To address this, the authors propose a new outlier detection algorithm, called PDC, which uses probability density to generate feature vectors for training a lightweight machine learning model that is then applied to detect outliers. PDC takes advantage of the accuracy of probability density and its insensitivity to data distance, so it can overcome the aforementioned drawbacks.
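To make the idea of density-based outlier scoring concrete, the sketch below estimates a probability density at each point with a plain Gaussian kernel density estimate and reports the lowest-density points. This is a simplified stand-in rather than the PDC algorithm: the abstract does not describe how PDC builds its feature vectors or which model it trains, so the estimator, the bandwidth, the function names, and the sample data are all assumptions.

```python
import math


def kde_density(points, bandwidth=1.0):
    """Leave-one-out Gaussian kernel density estimate at each 1-D point."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    norm = (n - 1) * bandwidth * math.sqrt(2 * math.pi)
    densities = []
    for i, x in enumerate(points):
        total = sum(
            math.exp(-0.5 * ((x - y) / bandwidth) ** 2)
            for j, y in enumerate(points)
            if j != i
        )
        densities.append(total / norm)
    return densities


def lowest_density_indices(points, k=1, bandwidth=1.0):
    """Indices of the k points with the lowest estimated density."""
    dens = kde_density(points, bandwidth)
    return sorted(range(len(points)), key=dens.__getitem__)[:k]


# Hypothetical data: four clustered values and one isolated value.
data = [0.0, 0.1, 0.2, 0.1, 10.0]
print(lowest_density_indices(data))  # prints [4], the isolated point
```

In a pipeline like the one the abstract describes, density values of this kind would serve as input features to a trained classifier rather than being thresholded directly.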
Citations: 0
An Intelligent Heart Disease Prediction Framework Using Machine Learning and Deep Learning Techniques
IF 1.2 Zone 4 (Computer Science) Q3 Computer Science Pub Date: 2023-11-17 DOI: 10.4018/ijdwm.333862
Nasser Allheeib, Summrina Kanwal, Sultan Alamri
Cardiovascular diseases (CVD) rank among the leading global causes of mortality. Early detection and diagnosis are paramount in minimizing their impact. The application of machine learning (ML) and deep learning (DL) in classifying the occurrence of cardiovascular diseases holds significant potential for reducing diagnostic errors. This research endeavors to construct a model capable of accurately predicting cardiovascular diseases, thereby mitigating the fatality associated with CVD. In this paper, the authors introduce a novel approach that combines an artificial intelligence network (AIN)-based feature selection (FS) technique with cutting-edge DL and ML classifiers for the early detection of heart diseases based on patient medical histories. The proposed model is rigorously evaluated using two real-world datasets sourced from the University of California. The authors conduct extensive data preprocessing and analysis, and the findings from this study demonstrate that the proposed methodology surpasses the performance of existing state-of-the-art methods, achieving an exceptional accuracy rate of 99.99%.
Citations: 0