Pub Date: 2024-09-05 | DOI: 10.1007/s13042-024-02292-3
Tao Wang, LiYun Jia, JiaLing Xu, Ahmed G. Gad, Hai Ren, Ahmed Salem
Identifying disease-related genes is an ongoing research problem in biomedical analysis. Many recent studies have presented strategies for predicting disease-related genes, but only a handful of them can identify or select relevant genes at a low computational cost. To tackle this issue, we introduce a new filter–wrapper-based gene selection (GS) method built on metaheuristic algorithms (MHAs) in conjunction with the k-nearest neighbors (k-NN) classifier. Specifically, we hybridize two MHAs, the bat algorithm (BA) and the JAYA algorithm (JA), embedded with a new perturbation-based exploration strategy (PES), to obtain the JAYA–bat algorithm (JBA). JBA outperforms 10 state-of-the-art GS methods on 12 high-dimensional microarray datasets (ranging from 2000 to 22,283 features, or genes). Relevant genes are first selected via a filter method based on mutual information (MI) and then further refined by JBA to select near-optimal genes in a timely fashion. Compared with 11 well-known original MHAs, including BA and JA, the proposed JBA achieves significantly better results, with improvement rates of 12.36%, 12.45%, 97.88%, 9.84%, 12.45%, and 12.17% in fitness, accuracy, gene selection ratio, precision, recall, and F1-score, respectively. Wilcoxon's signed-rank test at a significance level of α = 0.05 further validates the superiority of JBA over its peers on most of the datasets. The use of PES and the combination of BA's and JA's strengths appear to enhance JBA's exploration and exploitation capabilities, giving it a significant advantage in gene selection ratio while also ensuring the highest classification accuracy and the lowest computational time among all competing algorithms. Thus, this research could make a significant contribution to the field of biomedical analysis.
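The filter stage and the wrapper's fitness signal described above can be illustrated with a minimal, self-contained sketch (not the authors' JBA): mutual information estimated from a joint histogram ranks genes, and a plain k-NN classifier scores the surviving subset. The synthetic "microarray", the bin count, and all helper names are illustrative assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=4):
    """Estimate I(X;Y) in nats from a joint histogram of a discretised
    continuous feature x against discrete class labels y."""
    edges = np.histogram_bin_edges(x, bins=bins)[1:-1]   # interior edges
    x_binned = np.digitize(x, edges)                      # values in 0..bins-1
    classes = {c: i for i, c in enumerate(np.unique(y))}
    joint = np.zeros((bins, len(classes)))
    for xi, yi in zip(x_binned, y):
        joint[xi, classes[yi]] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def mi_filter(X, y, k_keep):
    """Filter stage: rank genes (columns of X) by MI with y, keep the top k_keep."""
    scores = np.array([mutual_information(X[:, j], y) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k_keep]

def knn_accuracy(X_train, y_train, X_test, y_test, k=3):
    """Plain k-NN classifier used as the wrapper's fitness signal."""
    correct = 0
    for x, t in zip(X_test, y_test):
        d = np.linalg.norm(X_train - x, axis=1)
        votes = y_train[np.argsort(d)[:k]]
        vals, counts = np.unique(votes, return_counts=True)
        correct += vals[np.argmax(counts)] == t
    return correct / len(y_test)

# Synthetic "microarray": 2 informative genes out of 50.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 120)
X = rng.normal(size=(120, 50))
X[:, 0] += 3 * y                      # informative gene
X[:, 1] -= 3 * y                      # informative gene
keep = mi_filter(X, y, k_keep=2)
acc = knn_accuracy(X[:80][:, keep], y[:80], X[80:][:, keep], y[80:])
```

In the paper's pipeline a metaheuristic would then search over subsets of the MI-filtered genes, using the k-NN accuracy (and subset size) as fitness; the sketch stops at the filter plus one wrapper evaluation.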
Title: A hybrid intelligent optimization algorithm to select discriminative genes from large-scale medical data (International Journal of Machine Learning and Cybernetics)
Pub Date: 2024-09-04 | DOI: 10.1007/s13042-024-02330-0
Juncheng Li, Hanhui Yang, Lok Ming Lui, Guixu Zhang, Jun Shi, Tieyong Zeng
Improving the speed of MRI acquisition is a key issue in modern medical practice. However, existing deep learning-based methods are often accompanied by a large number of parameters and ignore the use of deep features. In this work, we propose a novel Self-Ensemble Feedback Recurrent Network (SEFRN) for fast MRI reconstruction inspired by recursive learning and ensemble learning strategies. Specifically, a lightweight but powerful Data Consistency Residual Group (DCRG) is proposed for feature extraction and data stabilization. Meanwhile, an efficient Wide Activation Module (WAM) is introduced between different DCRGs to encourage more activated features to pass through the model. In addition, a Feedback Enhancement Recurrent Architecture (FERA) is designed to reuse the model parameters and deep features. Moreover, combined with the specially designed Automatic Selection and Integration Module (ASIM), different stages of the recurrent model can elegantly implement self-ensemble learning and synergize the sub-networks to improve the overall performance. Extensive experiments demonstrate that our model achieves competitive results and strikes a good balance between the size, complexity, and performance of the model.
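The Data Consistency Residual Group is specific to SEFRN, but the underlying data-consistency operation is a standard building block in learned MRI reconstruction: wherever k-space was actually sampled, the network estimate's spectrum is overwritten with the measurements. A minimal numpy sketch of that generic step follows; the 8×8 image and random mask are toy assumptions, not the paper's setup.

```python
import numpy as np

def data_consistency(x_est, k_measured, mask):
    """Project an image estimate onto the set of images that agree with the
    sampled k-space data: keep the estimate's spectrum at unsampled
    locations, replace it with measurements where samples exist."""
    k_est = np.fft.fft2(x_est)
    k_dc = np.where(mask, k_measured, k_est)
    return np.fft.ifft2(k_dc)

rng = np.random.default_rng(1)
truth = rng.normal(size=(8, 8))           # ground-truth image
mask = rng.random((8, 8)) < 0.4           # undersampling pattern
k_full = np.fft.fft2(truth)
k_measured = np.where(mask, k_full, 0)    # acquired k-space samples

x0 = rng.normal(size=(8, 8))              # crude stand-in for a network output
x1 = data_consistency(x0, k_measured, mask)
```

By Parseval's theorem the step can only shrink the reconstruction error, since it zeroes the spectral error at every sampled frequency while leaving the rest untouched.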
Title: A lightweight self-ensemble feedback recurrent network for fast MRI reconstruction
Pub Date: 2024-09-04 | DOI: 10.1007/s13042-024-02353-7
Tianhang Liu, Hui Zhou, Chao Li, Zhongying Zhao
Multi-behavior recommendation (MBR) aims to enhance the accuracy of predicting a target behavior by considering multiple behaviors simultaneously. Recent studies have attempted to capture the dependencies within behavioral sequences to improve recommendation outcomes, exemplified by the sequential pattern "click → cart → buy". However, their performance is still limited by two problems. First, potential leapfrogging relations among behaviors are underexplored, notably in cases where users purchase directly after a click, bypassing the cart stage; modeling such skipped intermediate behaviors better reflects real-world interactions. Second, the uneven distribution of user behaviors and item popularity presents a challenge for model training, resulting in prevalence bias and over-reliance issues. To this end, we propose a self-supervised progressive graph neural network model, SSPGNN. The model captures a broader range of behavioral dependencies by using a dual-behavior chain. In addition, we design a self-supervised learning mechanism with intra- and inter-behavior components (the former within a single behavior, the latter across multiple behaviors) to address prevalence bias and over-reliance. Extensive experiments on real-world datasets and comparative analyses with state-of-the-art algorithms demonstrate the effectiveness of the proposed SSPGNN. The source code of this work is available at https://github.com/ZZY-GraphMiningLab/SSPGNN.
Title: Self-supervised progressive graph neural network for enhanced multi-behavior recommendation
Pub Date: 2024-09-04 | DOI: 10.1007/s13042-024-02351-9
Ziyun Zhang, Jing Wang, Xin Geng
Label Distribution Learning (LDL) is a machine learning paradigm that focuses on the description degrees of labels for a particular instance. Existing LDL algorithms generally learn in the original input space; that is, all features are employed in the discrimination process for every class label. However, this commonly used data representation strategy ignores the fact that each label possesses specific characteristics of its own, and it may therefore lead to sub-optimal performance. In this paper, we propose label distribution learning utilizing a common and label-specific feature fusion space (LDL-CLSFS). It first partitions all instances by label-value rankings. Second, it constructs label-specific features for each label by conducting clustering analysis on the different instance categories. Third, it performs training and testing by querying the clustering results. Comprehensive experiments on several real-world label distribution data sets validate the superiority of our method over other LDL algorithms, as well as the effectiveness of label-specific features.
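In LDL, each instance carries a full distribution of description degrees rather than a single label, and predictions are conventionally scored by distribution-level measures. A small sketch of two common ones, Chebyshev distance and KL divergence; the example distributions are made up for illustration.

```python
import numpy as np

def chebyshev(p, q):
    """Maximum per-label gap between two label distributions (lower is better)."""
    return float(np.max(np.abs(p - q)))

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between true and predicted description degrees (lower is better)."""
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

# Description degrees over three labels; each distribution sums to 1.
true_dist = np.array([0.6, 0.3, 0.1])
good_pred = np.array([0.55, 0.35, 0.1])
bad_pred = np.array([0.1, 0.2, 0.7])
```

Both measures should rank `good_pred` closer to the ground truth than `bad_pred`, which is how LDL papers typically compare algorithms across datasets.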
Title: Label distribution learning by utilizing common and label-specific feature fusion space
Pub Date: 2024-09-04 | DOI: 10.1007/s13042-024-02356-4
ZiHao Zhao, XiangHong Tang, JianGuang Lu, Yong Huang
A graph neural network is a deep learning model for processing graph data. As research progresses, graph neural network architectures have become increasingly complex, making their design an important task in itself. Graph neural architecture search aims to automate the design of graph neural network architectures. However, current methods require large computational resources, cannot be applied in lightweight scenarios, and have an opaque search process. To address these challenges, this paper proposes a graph neural network architecture search method based on a heuristic algorithm combining tabu search and evolutionary strategies (Gnas-Te). Gnas-Te consists mainly of a tabu search module and an evolutionary strategy module. The tabu search module designs and implements, for the first time, a tabu search algorithm suited to searching graph neural network architectures, using the maintenance of a tabu table to guide the search process. The evolutionary strategy module implements an evolution strategy for architecture search with lightweight operation as a design goal. To evaluate the neural architecture search process accurately, a new metric, EASI, is also proposed. Experimental results on three real datasets show that, on a graph node classification task, Gnas-Te achieves a 1.37% improvement in search accuracy and a 37.7% reduction in search time over the state-of-the-art graph neural network architecture search method, and that it finds architectures with strong all-round performance, comparable to well-crafted human-designed graph neural network architectures. Gnas-Te thus provides a lightweight and efficient search method that reduces the computational resources needed to search graph neural network structures and meets the need for high-accuracy architecture search when computational resources are limited.
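The tabu-table mechanism the abstract describes can be sketched generically. The toy tabu search below explores a tiny discrete "architecture" space with a recency-bounded tabu list; the space, the stand-in score function, and its peak are assumptions for illustration, not Gnas-Te itself.

```python
import random
from collections import deque
from itertools import product

# Toy discrete "architecture" space: (num_layers, hidden_dim, aggregator).
SPACE = list(product([1, 2, 3], [16, 32, 64], ["mean", "max", "sum"]))

def score(arch):
    """Stand-in for validation accuracy; peaks at (2, 32, 'mean')."""
    layers, dim, agg = arch
    return -abs(layers - 2) - abs(dim - 32) / 16 - (0 if agg == "mean" else 1)

def neighbours(arch):
    """All architectures differing from `arch` in exactly one choice."""
    return [c for c in SPACE if sum(a != b for a, b in zip(arch, c)) == 1]

def tabu_search(start, iters=30, tabu_len=5):
    tabu = deque(maxlen=tabu_len)          # recently visited architectures
    best = current = start
    for _ in range(iters):
        cands = [c for c in neighbours(current) if c not in tabu]
        if not cands:
            break                          # every neighbour is tabu
        current = max(cands, key=score)    # best admissible neighbour
        tabu.append(current)
        if score(current) > score(best):
            best = current
    return best

random.seed(0)
best = tabu_search(random.choice(SPACE))
```

The tabu list is what distinguishes this from plain hill climbing: recently visited architectures are temporarily forbidden, so the search can escape plateaus instead of cycling.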
Title: Lightweight graph neural network architecture search based on heuristic algorithms
Pub Date: 2024-09-04 | DOI: 10.1007/s13042-024-02352-8
Shixin Peng, Can Xiong, Leyuan Liu, Laurence T. Yang, Jingying Chen
Image captioning is a multimodal task involving both computer vision and natural language processing. Recently, the performance of image captioning has improved substantially with the introduction of multi-feature extraction methods. However, existing single-feature and multi-feature methods still face challenges such as a low degree of refinement, weak feature complementarity, and the lack of an end-to-end model. To tackle these issues, we propose an end-to-end image captioning model called GRPIC (Grid-Region-Pixel Image Captioning), which integrates three types of image features: region features, grid features, and pixel features. Our model utilizes the Swin Transformer to extract grid features, DETR to extract region features, and DeepLab to extract pixel features. We merge pixel-level features with region and grid features to extract more refined contextual and detailed information. Additionally, we incorporate absolute position information and pairwise-align the three features to fully leverage their complementarity. Qualitative and quantitative experiments on the MSCOCO dataset demonstrate that our model achieves a 2.3% improvement in CIDEr over traditional dual-feature methods, reaching 136.1 CIDEr on the Karpathy test split. Furthermore, inspection of the actual generated descriptions shows that the model also produces more refined captions.
Title: GRPIC: an end-to-end image captioning model using three visual features
Pub Date: 2024-09-03 | DOI: 10.1007/s13042-024-02349-3
Akram Mustafa, Mostafa Rahimi Azghadi
Clinical coding is a time-consuming task that involves manually identifying and classifying patients’ diseases. This task becomes even more challenging when classifying across multiple diagnoses and performing multi-label classification. Automated Machine Learning (AutoML) techniques can improve this classification process. However, no previous study has developed an AutoML-based approach for multi-label clinical coding. To address this gap, a novel approach, called Clustered Automated Machine Learning (CAML), is introduced in this paper. CAML utilizes the AutoML library Auto-Sklearn and cTAKES feature extraction method. CAML clusters binary diagnosis labels using Hamming distance and employs the AutoML library to select the best algorithm for each cluster. The effectiveness of CAML is evaluated by comparing its performance with that of the Auto-Sklearn model on five different datasets from the Medical Information Mart for Intensive Care (MIMIC III) database of reports. These datasets vary in size, label set, and related diseases. The results demonstrate that CAML outperforms Auto-Sklearn in terms of Micro F1-score and Weighted F1-score, with an overall improvement ratio of 35.15% and 40.56%, respectively. The CAML approach offers the potential to improve healthcare quality by facilitating more accurate diagnoses and treatment decisions, ultimately enhancing patient outcomes.
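CAML's key preprocessing step is clustering binary diagnosis-label vectors by Hamming distance. A greedy single-pass grouping sketch of that idea follows; the threshold and the greedy rule are illustrative stand-ins, not necessarily the exact procedure CAML uses.

```python
import numpy as np

def hamming(a, b):
    """Fraction of positions where two binary label vectors disagree."""
    return float(np.mean(a != b))

def greedy_cluster(label_vectors, threshold=0.25):
    """Assign each label vector to the first cluster whose seed vector is
    within `threshold` Hamming distance; otherwise open a new cluster."""
    seeds, assignment = [], []
    for v in label_vectors:
        for idx, s in enumerate(seeds):
            if hamming(v, s) <= threshold:
                assignment.append(idx)
                break
        else:
            seeds.append(v)
            assignment.append(len(seeds) - 1)
    return assignment

# Binary diagnosis codes for four patients over four conditions.
codes = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],   # one bit from the first  -> joins cluster 0
    [0, 0, 1, 1],
    [0, 0, 1, 0],   # one bit from the third  -> joins cluster 1
])
groups = greedy_cluster(codes)
```

In the paper, the AutoML library then selects a classifier per cluster, so each model specializes in a coherent group of diagnosis patterns.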
Title: Clustered Automated Machine Learning (CAML) model for clinical coding multi-label classification
Pub Date: 2024-09-02 | DOI: 10.1007/s13042-024-02326-w
Dai Shi, Andi Han, Lequan Lin, Yi Guo, Zhiyong Wang, Junbin Gao
Physics-informed graph neural networks have achieved remarkable performance in learning from graph-structured data by mitigating common GNN challenges such as over-smoothing, over-squashing, and heterophily adaptation. Despite these advancements, a simple yet effective paradigm that appropriately integrates previous methods for handling all of these challenges is still lacking. In this paper, we draw an analogy between the propagation of GNNs and particle systems in physics, and propose a model-agnostic enhancement framework. The framework enriches the graph structure by introducing additional nodes and rewiring connections with both positive and negative weights, guided by node labeling information. We theoretically verify that GNNs enhanced through our approach can effectively circumvent over-smoothing and exhibit robustness against over-squashing. Moreover, we conduct a spectral analysis of the rewired graph to demonstrate that the corresponding GNNs can fit both homophilic and heterophilic graphs. Empirical validation on benchmarks for homophilic graphs, heterophilic graphs, and long-term graph datasets shows that GNNs enhanced by our method significantly outperform their original counterparts.
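The label-guided signed rewiring idea (positive weights attracting same-label nodes, negative weights repelling different-label nodes) can be sketched in a few lines. The dense double loop, the ±1 weights, and the tiny graph are simplifying assumptions for illustration, not the paper's full framework.

```python
import numpy as np

def signed_rewire(adj, labels, train_mask, w_pos=1.0, w_neg=-1.0):
    """Overlay signed shortcut edges on an adjacency matrix: connect nodes
    that share a known label with positive weight (attraction) and nodes
    with different known labels with negative weight (repulsion). Nodes
    whose labels are unknown are left untouched."""
    out = adj.astype(float).copy()
    idx = np.where(train_mask)[0]
    for i in idx:
        for j in idx:
            if i != j:
                out[i, j] = w_pos if labels[i] == labels[j] else w_neg
    return out

labels = np.array([0, 0, 1, 1])
train_mask = np.array([True, True, True, False])  # node 3's label is withheld
A = signed_rewire(np.zeros((4, 4)), labels, train_mask)
```

Message passing over such a signed graph averages same-class neighbors together while pushing different-class representations apart, which is one intuition for why the rewiring helps on both homophilic and heterophilic graphs.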
Design your own universe: a physics-informed agnostic method for enhancing graph neural networks
Dai Shi, Andi Han, Lequan Lin, Yi Guo, Zhiyong Wang, Junbin Gao
Pub Date : 2024-09-02 DOI: 10.1007/s13042-024-02326-w
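The label-guided signed rewiring described in the abstract can be illustrated with a toy sketch. This is purely illustrative: the function name, the weight values, and the rule "same label → positive edge weight, different label → negative edge weight" are assumptions for the sake of the example, not the paper's actual construction (which also introduces additional nodes).

```python
import numpy as np

def rewire_signed(adj, labels, w_pos=1.0, w_neg=-1.0):
    """Toy label-guided rewiring: give edges between same-label nodes a
    positive weight and edges between different-label nodes a negative
    weight. Keeps the original topology; only edge signs change."""
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]   # True where labels agree
    signed = np.where(same, w_pos, w_neg)       # +/- sign pattern
    rewired = adj * signed                      # zero entries stay zero
    np.fill_diagonal(rewired, 0.0)              # no self-loops
    return rewired

# 4-node path graph 0-1-2-3 with labels [0, 0, 1, 1]
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
R = rewire_signed(A, [0, 0, 1, 1])
# Edge (0,1) joins same-label nodes -> +1; edge (1,2) crosses labels -> -1
```

Intuitively, negative weights let a propagation step push differently-labeled neighbors apart rather than averaging them together, which is one way such rewiring can counteract over-smoothing on heterophilic graphs.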
Pub Date : 2024-09-02 DOI: 10.1007/s13042-024-02365-3
Hossein Hassani, Ehsan Hallaji, Roozbeh Razavi-Far, Mehrdad Saif
The quality of data and the complexity of decision boundaries in high-dimensional data streams collected from cyber-physical power systems can greatly influence the process of learning from data and diagnosing faults in such critical systems. These systems generate massive amounts of data, burdening the learning process with excessive computational costs. Another issue is noise in the recorded measurements, which hampers learning and degrades fault-diagnosis performance. Furthermore, the diagnostic model is often fed a mixture of redundant measurements that can divert it from learning the normal and fault distributions. This paper examines the effect of feature engineering on mitigating these challenges when learning from data streams collected from cyber-physical systems. A data-driven fault diagnosis framework for a 118-bus power system is constructed by integrating feature selection, dimensionality reduction methods, and decision models, enabling a comparative study of several advanced techniques in both domains. Dimensionality reduction and feature selection methods are compared both jointly and separately. Finally, based on the experiments, a setting is suggested that enhances data quality for fault diagnosis.
Learning from high-dimensional cyber-physical data streams: a case of large-scale smart grid
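The filter-then-reduce pipeline this study compares can be sketched in miniature. This is a hypothetical stand-in: variance-based selection and SVD-based PCA are generic choices used here for illustration, not necessarily the specific selectors and reducers evaluated in the paper.

```python
import numpy as np

def select_top_variance(X, k):
    """Filter-style feature selection: keep the k features with the
    highest variance (a simple stand-in for the selectors compared)."""
    idx = np.sort(np.argsort(X.var(axis=0))[::-1][:k])
    return X[:, idx], idx

def pca_reduce(X, n_components):
    """Dimensionality reduction via PCA, computed with an SVD of the
    mean-centered data matrix."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T    # project onto top components

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))         # 200 samples, 50 raw measurements
X[:, 3] *= 10.0                        # feature 3 carries most variance
Xs, kept = select_top_variance(X, 10)  # filter step: 50 -> 10 features
Z = pca_reduce(Xs, 2)                  # reduction step: 10 -> 2 components
print(Xs.shape, Z.shape)               # (200, 10) (200, 2)
```

Applying the two stages jointly (as above) versus separately is exactly the kind of comparison the abstract describes; the reduced representation `Z` would then feed the downstream decision model.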
Influenced by the urban road network, traffic flow exhibits complex temporal and spatial correlations. Traffic flow forecasting is an important problem in intelligent transportation systems, bearing directly on the safety and stability of the transportation system. At present, much research overlooks the need to forecast traffic flow beyond one hour. To address long-term traffic flow prediction, this paper proposes a traffic flow prediction model (HSTGCNN) based on hybrid spatial–temporal gated convolution. A spatial–temporal attention mechanism and gated convolution are the main components of HSTGCNN: the attention mechanism effectively captures the spatial–temporal features of traffic flow, while gated convolution plays an important role in extracting longer-term features. The use of dilated causal convolution further improves the model's long-term prediction ability. HSTGCNN predicts traffic conditions 1 h, 1.5 h, and 2 h ahead on two common traffic flow datasets. Experimental results show that the prediction accuracy of HSTGCNN is generally better than that of the Temporal Graph Convolutional Network (T-GCN), Graph WaveNet, and other baselines.
A traffic flow forecasting method based on hybrid spatial–temporal gated convolution
Ying Zhang, Songhao Yang, Hongchao Wang, Yongqiang Cheng, Jinyu Wang, Liping Cao, Ziying An
Pub Date : 2024-09-02 DOI: 10.1007/s13042-024-02364-4
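The two temporal building blocks the abstract names, dilated causal convolution and gated convolution, can be sketched as follows. This is a generic illustration of the standard techniques (the gating follows the common WaveNet-style tanh × sigmoid form), not HSTGCNN's actual layers; the function names and kernel values are assumptions.

```python
import numpy as np

def dilated_causal_conv(x, w, dilation):
    """1-D dilated causal convolution: the output at time t depends only
    on x[t], x[t-d], x[t-2d], ... so no future information leaks in, and
    the receptive field grows with the dilation factor d."""
    y = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        for i in range(len(w)):
            tau = t - i * dilation      # look back i*d steps
            if tau >= 0:
                y[t] += w[i] * x[tau]
    return y

def gated_conv(x, w_filter, w_gate, dilation):
    """Gated activation: tanh(filter branch) * sigmoid(gate branch).
    The sigmoid gate controls how much of each filtered value passes."""
    f = np.tanh(dilated_causal_conv(x, w_filter, dilation))
    g = 1.0 / (1.0 + np.exp(-dilated_causal_conv(x, w_gate, dilation)))
    return f * g

x = np.arange(8, dtype=float)
y = dilated_causal_conv(x, np.array([1.0, -1.0]), dilation=2)
# y[t] = x[t] - x[t-2] for t >= 2: a dilated difference over the past
z = gated_conv(x, np.array([0.5, 0.5]), np.array([0.2, -0.2]), dilation=2)
```

Stacking such layers with increasing dilation (1, 2, 4, ...) is what lets this family of models cover long horizons, such as the 1 h to 2 h forecasts reported here, without very deep networks.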