Identifying fishing vessel types with artificial intelligence has become a key technology in marine resource management. However, classical feature modeling lacks the ability to express time series features, and the feature extraction is insufficient. Hence, this work focuses on the identification of trawlers, gillnetters, and purse seiners based on semantic feature vectors. First, we extract trajectories from massive and complex historical Vessel Monitoring System data that contain a large amount of dirty data and then extract the semantic features of fishing vessel trajectories. Finally, we input the semantic feature vectors into the LightGBM classification model for classification of fishing vessel types. In this experiment, the F1 measure of our proposed method on the East China Sea fishing vessel dataset reached 96.25, which was 6.82% higher than that of the classical feature-modeling method based on fishing vessel trajectories. Experiments show that this method is accurate and effective for the classification of fishing vessels.
Junfeng Yuan, Qianqian Zhang, Jilin Zhang, Youhuizi Li, Zhen Liu, Meiting Xue, Y. Zeng. "Fishing Vessel Type Recognition Based on Semantic Feature Vector." International Journal of Data Warehousing and Mining, 2024-07-26. doi:10.4018/ijdwm.349222
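The trajectory-to-feature step described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the specific features (mean speed, speed variability, share of slow fixes) and the 1 km/h threshold are placeholders, not the paper's actual semantic features, and the resulting vectors would then be fed to a classifier such as LightGBM's `LGBMClassifier`.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two lat/lon points, in kilometres.
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def trajectory_features(points):
    """Summarise a VMS trajectory [(t_seconds, lat, lon), ...] into a
    fixed-length feature vector: mean speed, speed std, and the share of
    slow (< 1 km/h) fixes -- a rough proxy for time spent actively fishing."""
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(points, points[1:]):
        dt_h = (t1 - t0) / 3600.0
        if dt_h > 0:
            speeds.append(haversine_km(la0, lo0, la1, lo1) / dt_h)
    mean = sum(speeds) / len(speeds)
    var = sum((s - mean) ** 2 for s in speeds) / len(speeds)
    return [mean, math.sqrt(var), sum(1 for s in speeds if s < 1.0) / len(speeds)]
```

One vector per vessel trajectory, stacked into a matrix, is the shape of input the LightGBM stage would consume.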
Military academy cadets live in a cadet brigade organized into squads. Despite their importance, squads have traditionally been formed according to the personal preferences of the fourth-year squad leader, without considering the compatibility of squad members. This study proposes a more scientific approach to increase cadet satisfaction with their squads and foster leadership development. First, a multiple linear regression analysis was conducted to identify the leadership factors of squad leaders that significantly affect squad organizational satisfaction. The model maximized the sum of the factor scores among squad leaders to enhance squad organizational satisfaction and maximized the difference in factor scores to improve the effectiveness of leadership discipline. Applying the squad formation algorithm to data from cadets at the Korea Military Academy showed that squad organizational satisfaction and leadership discipline effectiveness increased significantly compared to the existing squad formation methods.
Hyunho Kim, Eunmi Lee, S. Cha. "Optimizing Cadet Squad Organizational Satisfaction by Integrating Leadership Factor Data Mining and Integer Programming." International Journal of Data Warehousing and Mining, 2024-07-17. doi:10.4018/ijdwm.349226
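The assignment step in the study above can be illustrated with a toy version. This sketch brute-forces the choice of one leader per squad to maximize the total factor score; the pool structure, score dictionary, and no-double-assignment rule are illustrative assumptions, and the paper solves the full problem (including the score-difference objective) with integer programming rather than enumeration.

```python
from itertools import product

def best_assignment(pools, scores):
    """Pick one leader per squad (pools = list of candidate-id lists) so the
    total leadership factor score is maximised and no cadet leads two squads.
    Brute force -- only workable for a handful of squads."""
    best, best_val = None, float("-inf")
    for combo in product(*pools):
        if len(set(combo)) < len(combo):   # a cadet cannot lead two squads
            continue
        val = sum(scores[c] for c in combo)
        if val > best_val:
            best, best_val = combo, val
    return best, best_val
```

An integer-programming formulation replaces the enumeration with binary decision variables x[cadet, squad] and the same objective, which scales to academy-sized instances.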
The recommender system can be viewed as a matrix completion problem, which aims to predict unknown values within a matrix. Solutions to this problem are categorized into two approaches: transductive and inductive reasoning. In transductive reasoning, the model cannot be applied to new cases unseen during training. In contrast, IGMC, the state-of-the-art inductive algorithm, only requires subgraphs for target users and items, without needing any other content information. While the absence of a requirement for content information simplifies the model and enhances transferability to new tasks, incorporating content information could still improve the model's performance. In this article, the authors introduce Hi-GMC, a hybrid version of the IGMC model that incorporates content information alongside users and items. They present a novel graph model to encapsulate the side information related to users and items and develop a learning method based on graph neural networks. This proposed method achieves state-of-the-art performance on the MovieLens-100K dataset for both warm and cold start scenarios.
Jayun Yong, Chulyun Kim. "Hybrid Inductive Graph Method for Matrix Completion." International Journal of Data Warehousing and Mining, 2024-05-07. doi:10.4018/ijdwm.345361
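The subgraph requirement mentioned above is the core of IGMC's inductiveness and can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: it collects the h-hop enclosing subgraph around a target (user, item) pair from a plain rating dictionary, which is the only input an inductive model of this kind needs.

```python
def enclosing_subgraph(ratings, user, item, hops=1):
    """Collect the h-hop enclosing subgraph around a target (user, item)
    pair from a bipartite rating dict {(u, i): rating}. An inductive model
    like IGMC is trained on such subgraphs alone, so it transfers to users
    and items unseen during training."""
    users, items = {user}, {item}
    for _ in range(hops):
        items |= {i for (u, i) in ratings if u in users}
        users |= {u for (u, i) in ratings if i in items}
    return {(u, i): r for (u, i), r in ratings.items()
            if u in users and i in items}
```

Hi-GMC's hybrid step would then attach user/item side-information features to the nodes of this subgraph before the GNN pass.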
Uncertain information in the securities market exhibits fuzziness. In this article, expected returns and liquidity are modeled as trapezoidal fuzzy numbers. The possibility mean and mean absolute deviation of expected returns represent the returns and risks of securities assets, while the possibility mean of expected turnover represents their liquidity. Taking into account practical constraints such as cardinality and transaction costs, this article establishes a fuzzy portfolio model with cardinality constraints and solves it using the differential evolution algorithm. Finally, the fuzzy c-means clustering algorithm is used to select 12 stocks as empirical samples for numerical examples. The same clustering algorithm is also applied to the stock yield data so that the stocks can be analysed comprehensively and accurately, providing a reference for constructing an effective portfolio.
JianDong He. "A Fuzzy Portfolio Model With Cardinality Constraints Based on Differential Evolution Algorithms." International Journal of Data Warehousing and Mining, 2024-03-27. doi:10.4018/ijdwm.341268
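The two building blocks above can be sketched concretely: the possibility (possibilistic) mean of a trapezoidal fuzzy number, here in the Carlsson-Fullér form, and a minimal differential evolution loop. The DE hyperparameters and the unconstrained box bounds are illustrative assumptions; the paper's model layers cardinality and transaction-cost constraints on top of an objective like this.

```python
import random

def possibilistic_mean(a, b, c, d):
    # Possibilistic mean of a trapezoidal fuzzy number with
    # support [a, d] and core [b, c] (Carlsson-Fuller form).
    return (a + 2 * b + 2 * c + d) / 6.0

def differential_evolution(f, dim, lo=0.0, hi=1.0, pop_size=20, f_w=0.5,
                           cr=0.9, gens=200, seed=0):
    """Minimal DE/rand/1/bin minimiser over a box [lo, hi]^dim."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(gens):
        for i in range(pop_size):
            r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
            # Mutate-and-crossover, clipped back into the box.
            trial = [max(lo, min(hi, pop[r1][k] + f_w * (pop[r2][k] - pop[r3][k])))
                     if rng.random() < cr else pop[i][k]
                     for k in range(dim)]
            if f(trial) < f(pop[i]):   # greedy selection
                pop[i] = trial
    return min(pop, key=f)
```

For the portfolio model, `f` would be the negative possibilistic-mean return plus penalty terms for the risk, cardinality, and cost constraints.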
The growth and development of youth are tied to the destiny of the country and the nation, and cultivating young people in the new era is of great significance. Youth is a special stage of life, and the ideological and moral concepts of young people during this period are still highly malleable. Educating them therefore requires grasping the characteristics and laws of their ideological and moral growth so that education can be targeted. The characteristics of young people's ideological and moral growth are variability, subjectivity, practicality, and hierarchy; the laws of their ideological and moral growth include the law of guiding transcendence, the law of edification and internalization, the law of mutual promotion of knowledge, will, and action, and the law of gradual progression. With the intensification of economic globalization, the Internet, and big data, the thoughts, behaviors, and psychology of young people are constantly changing, which poses a huge challenge to youth ideological and political education.
Qi Fu. "Dynamic Research on Youth Thought, Behavior, and Growth Law Based on Deep Learning Algorithm." International Journal of Data Warehousing and Mining, 2024-01-31. doi:10.4018/ijdwm.333518
The rabbit breeding industry exhibits vast economic potential and growth opportunities. Nevertheless, ineffective prediction of environmental conditions in rabbit houses often leads to the spread of infectious diseases, causing illness and death among rabbits. This paper presents a multi-parameter predictive model for environmental conditions such as temperature, humidity, illumination, CO2 concentration, NH3 concentration, and dust conditions in rabbit houses. The model distinguishes between day and night forecasts, thereby improving the adaptive adjustment of environmental data trends. Importantly, the model performs multi-parameter environmental forecasting jointly to heighten precision, given the high degree of interrelation among parameters. The model's performance is assessed through RMSE, MAE, and MAPE metrics, yielding values of 0.018, 0.031, and 6.31%, respectively, in predicting rabbit house environmental factors. Experimentally compared with BERT, Seq2seq, and conventional transformer models, the method demonstrates superior performance.
Feiqi Liu, Dong Yang, Yuyang Zhang, Chengcai Yang, Jingjing Yang. "Research on Multi-Parameter Prediction of Rabbit Housing Environment Based on Transformer." International Journal of Data Warehousing and Mining, 2024-01-10. doi:10.4018/ijdwm.336286
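The data preparation implied by the abstract above (multi-parameter time series in, future readings out) can be sketched as a sliding-window slicer. The window and horizon sizes are illustrative assumptions; the paper's transformer would consume these (window, target) pairs, and its day/night distinction could be added as an extra feature column per timestep.

```python
def make_windows(series, history, horizon):
    """Slice a multi-parameter series [[temp, humidity, ...], ...] into
    (input window, target) pairs for a sequence model: predict the reading
    `horizon` steps ahead from the previous `history` readings."""
    xs, ys = [], []
    for t in range(history, len(series) - horizon + 1):
        xs.append(series[t - history:t])   # history x n_params window
        ys.append(series[t + horizon - 1])  # n_params target vector
    return xs, ys
```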
The study quantitatively examines how AI-generated cosmetic packaging design impacts consumer satisfaction, offering strategies for database-driven development and design based on this evaluation. A comprehensive evaluation system consisting of 18 indicators in five dimensions was constructed by combining a literature review and user interviews with expert opinions. On this basis, a questionnaire survey on AI-generated packaging design was conducted using three types of datasets. In addition, importance-performance analysis was used to analyze satisfaction with the AI-generated packaging design indicators. The study found that while consumers are highly satisfied with the information transmission and creative attraction of AI-generated packaging design, the designs' functional availability and user experience still need improvement. It is suggested that the public model be combined with the data warehouse to build an AI packaging service platform. Focusing on the interpretability and controllability of the design process will also help increase consumer satisfaction and trust.
Tao Chen, D. Luh, J. Wang. "Analyzing AI-Generated Packaging's Impact on Consumer Satisfaction With Three Types of Datasets." International Journal of Data Warehousing and Mining, 2023-11-28. doi:10.4018/ijdwm.334024
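The importance-performance analysis used above can be sketched as a quadrant split. This is the textbook IPA scheme with grand means as crosshairs; the indicator names and scores below are invented for illustration and are not the study's 18 indicators.

```python
def ipa_quadrants(indicators):
    """Importance-performance analysis: split indicators
    (name -> (importance, performance)) into the four classic quadrants,
    using the grand means as the crosshairs."""
    imp_bar = sum(i for i, _ in indicators.values()) / len(indicators)
    perf_bar = sum(p for _, p in indicators.values()) / len(indicators)
    labels = {}
    for name, (imp, perf) in indicators.items():
        if imp >= imp_bar and perf >= perf_bar:
            labels[name] = "keep up the good work"
        elif imp >= imp_bar:
            labels[name] = "concentrate here"
        elif perf >= perf_bar:
            labels[name] = "possible overkill"
        else:
            labels[name] = "low priority"
    return labels
```

Indicators landing in "concentrate here" (high importance, low performance) are exactly the functional-availability and user-experience gaps the study highlights.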
Existing book recommendation methods often overlook the rich information contained in comment text, which can limit their effectiveness. Therefore, a cross-domain recommender system for literary books that leverages multi-head self-attention interaction and knowledge transfer learning is proposed. First, the BERT model is employed to obtain word vectors, and a CNN is used to extract user and item features. Then, higher-level features are captured through the fusion of multi-head self-attention and additive pooling. Finally, knowledge transfer learning is introduced to conduct joint modeling between different domains by simultaneously extracting domain-specific features and features shared between domains. On the Amazon dataset, the proposed model achieved MAE and MSE of 0.801 and 1.058 in the "movie-book" recommendation task and 0.787 and 0.805 in the "music-book" recommendation task, respectively. This performance is significantly superior to other advanced recommendation models. Moreover, the proposed model also generalizes well to the Chinese dataset.
Yuan Cui, Yuexing Duan, Yueqin Zhang, Li Pan. "A Cross-Domain Recommender System for Literary Books Using Multi-Head Self-Attention Interaction and Knowledge Transfer Learning." International Journal of Data Warehousing and Mining, 2023-11-28. doi:10.4018/ijdwm.334122
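The attention mechanism at the heart of the model above can be sketched in plain Python. This is single-head scaled dot-product attention only, a deliberate simplification: the paper's multi-head interaction layer runs several such heads in parallel on learned projections and concatenates the results.

```python
import math

def attention(queries, keys, values):
    """Scaled dot-product attention over plain lists of vectors:
    each query attends to all keys, and the softmax weights mix the values."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        m = max(scores)                       # stabilise the softmax
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        w = [wi / z for wi in w]
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out
```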
Outlier detection for batch and streaming data is an important branch of data mining, but existing algorithms have shortcomings. For batch data, outlier detection algorithms that label only a few data points are not accurate enough because they use a histogram strategy to generate feature vectors. For streaming data, outlier detection algorithms are sensitive to data distance, resulting in low accuracy when sparse clusters and dense clusters lie close to each other; moreover, they require parameter tuning, which takes a lot of time. The authors therefore propose a new outlier detection algorithm, called PDC, which uses probability density to generate feature vectors for training a lightweight machine learning model that is then applied to detect outliers. PDC exploits the accuracy and distance-insensitivity of probability density estimates, allowing it to overcome the aforementioned drawbacks.
Wei Wang, Yongjian Ren, Renjie Zhou, Jilin Zhang. "An Outlier Detection Algorithm Based on Probability Density Clustering." International Journal of Data Warehousing and Mining, 2023-11-21. doi:10.4018/ijdwm.333901
Cardiovascular diseases (CVD) rank among the leading global causes of mortality. Early detection and diagnosis are paramount in minimizing their impact, and applying machine learning (ML) and deep learning (DL) to classify the occurrence of cardiovascular diseases holds significant potential for reducing diagnostic errors. This research endeavors to construct a model capable of accurately predicting cardiovascular diseases, thereby reducing CVD-related mortality. In this paper, the authors introduce a novel approach that combines an artificial intelligence network (AIN)-based feature selection (FS) technique with cutting-edge DL and ML classifiers for the early detection of heart diseases based on patient medical histories. The proposed model is rigorously evaluated using two real-world datasets sourced from the University of California. The authors conduct extensive data preprocessing and analysis, and the findings from this study demonstrate that the proposed methodology surpasses the performance of existing state-of-the-art methods, achieving an exceptional accuracy rate of 99.99%.
Nasser Allheeib, Summrina Kanwal, Sultan Alamri. "An Intelligent Heart Disease Prediction Framework Using Machine Learning and Deep Learning Techniques." International Journal of Data Warehousing and Mining, 2023-11-17. doi:10.4018/ijdwm.333862
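The feature-selection stage described above can be illustrated with a simple filter-style stand-in. This ranks feature columns by absolute Pearson correlation with the binary label; it is explicitly not the paper's AIN-based technique, just the minimal version of "keep the features most associated with the outcome" that any FS pipeline could start from.

```python
def rank_features(X, y):
    """Rank the feature columns of X (list of rows) by |Pearson correlation|
    with the binary label vector y, strongest first."""
    def corr(a, b):
        n = len(a)
        ma, mb = sum(a) / n, sum(b) / n
        cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
        va = sum((x - ma) ** 2 for x in a) ** 0.5
        vb = sum((z - mb) ** 2 for z in b) ** 0.5
        return cov / (va * vb) if va and vb else 0.0
    cols = list(zip(*X))   # transpose rows -> columns
    return sorted(range(len(cols)), key=lambda j: -abs(corr(cols[j], y)))
```

The top-ranked columns would then be handed to the downstream DL/ML classifiers for training and evaluation.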