Automatic rib segmentation and sequential labeling via multi-axial slicing and 3D reconstruction
Pub Date: 2024-09-26 | DOI: 10.1007/s10489-024-05785-4 | Applied Intelligence 54(24): 12644-12660
Hyunsung Kim, Seonghyeon Ko, Junghyun Bum, Duc-Tai Le, Hyunseung Choo
Radiologists often inspect hundreds of two-dimensional computed tomography (CT) slices, classifying and labeling the ribs, to accurately locate lesions and make diagnoses. However, this task is repetitive and time-consuming. To address this problem effectively, we propose a multi-axial rib segmentation and sequential labeling (MARSS) method. First, we slice the CT volume along the sagittal, frontal, and transverse planes for segmentation. The segmentation masks generated for each plane are then reconstructed into a single 3D segmentation mask using binarization techniques. After separating the left and right rib volumes from the entire CT volume, we cluster the connected components identified as bones and sequentially assign labels to each rib. The segmentation and sequential labeling performance of this method outperformed existing methods by up to 4.2%. The proposed automatic rib sequential labeling method enhances the efficiency of radiologists. In addition, it opens opportunities for advances not only in rib segmentation but also in bone-fracture detection and lesion-diagnosis research.
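As a rough illustration of the reconstruction and labeling stages, the sketch below fuses per-plane 2D masks by majority vote and numbers rib components by vertical position. The `seg2d` callable, the midline left/right split, and the two-of-three vote threshold are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy import ndimage

def segment_multiaxial(volume, seg2d):
    """Run a 2D segmentation model over each axis and fuse by majority vote.

    seg2d is a hypothetical per-slice model returning a binary mask.
    """
    masks = []
    for axis in range(3):  # sagittal, frontal, transverse
        planes = np.moveaxis(volume, axis, 0)
        pred = np.stack([seg2d(p) for p in planes])
        masks.append(np.moveaxis(pred, 0, axis))
    # Binarize the fused prediction: a voxel is bone if >= 2 of 3 axes agree.
    return (np.sum(masks, axis=0) >= 2).astype(np.uint8)

def label_ribs_sequentially(mask, side):
    """Cluster connected components and number ribs top to bottom."""
    half = mask.copy()
    mid = mask.shape[0] // 2
    if side == "left":          # crude midline split, for illustration only
        half[mid:] = 0
    else:
        half[:mid] = 0
    comps, n = ndimage.label(half)
    # Order components by the mean z-coordinate of their voxels.
    order = sorted(range(1, n + 1),
                   key=lambda c: np.mean(np.nonzero(comps == c)[2]))
    labeled = np.zeros_like(comps)
    for rib_no, c in enumerate(order, start=1):
        labeled[comps == c] = rib_no
    return labeled
```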
{"title":"Automatic rib segmentation and sequential labeling via multi-axial slicing and 3D reconstruction","authors":"Hyunsung Kim, Seonghyeon Ko, Junghyun Bum, Duc-Tai Le, Hyunseung Choo","doi":"10.1007/s10489-024-05785-4","DOIUrl":"10.1007/s10489-024-05785-4","url":null,"abstract":"<div><p>Radiologists often inspect hundreds of two-dimensional computed-tomography (CT) images to accurately locate lesions and make diagnoses, by classifying and labeling the ribs. However, this task is repetitive and time consuming. To effectively address this problem, we propose a multi-axial rib segmentation and sequential labeling (MARSS) method. First, we slice the CT volume into sagittal, frontal, and transverse planes for segmentation. The segmentation masks generated for each plane are then reconstructed into a single 3D segmentation mask using binarization techniques. After separating the left and right rib volumes from the entire CT volume, we cluster the connected components identified as bones and sequentially assign labels to each rib. The segmentation and sequential labeling performance of this method outperformed existing methods by up to 4.2%. The proposed automatic rib sequential labeling method enhances the efficiency of radiologists. In addition, this method provides an extended opportunity for advancements not only in rib segmentation but also in bone-fracture detection and lesion-diagnosis research.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12644 - 12660"},"PeriodicalIF":3.4,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A framework based on physics-informed graph neural ODE: for continuous spatial-temporal pandemic prediction
Pub Date: 2024-09-26 | DOI: 10.1007/s10489-024-05834-y | Applied Intelligence 54(24): 12661-12675
Haodong Cheng, Yingchi Mao, Xiao Jia
Physics-informed spatial-temporal discrete sequence learning networks have great potential for solving partial differential equations and for time series prediction compared to traditional fully connected PINN algorithms, and they can serve as the foundation for data-driven sequence prediction modeling and inverse problem analysis. However, existing models of this kind cannot handle inverse-problem scenarios in which the parameters of the physical process are time-varying and unknown, and they usually fail to make predictions in continuous time. In this paper, we propose a continuous time series prediction algorithm built on a physics-informed graph neural ordinary differential equation (PGNODE). The proposed parameterized GNODE-GRU and physics-informed loss constraints are used to explicitly characterize and solve for the unknown time-varying hyperparameters. The GNODE solver integrates this physical parameter to predict the sequence value at any time. Using epidemic prediction as a case study, experimental results demonstrate that the proposed algorithm effectively improves prediction accuracy for epidemic spread over future continuous time.
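A minimal sketch of the core idea, assuming a fixed-step Euler solver and a small network for a time-varying physical parameter; the paper's GNODE-GRU and its physics-informed loss constraints are not reproduced here.

```python
import torch
import torch.nn as nn

class GraphODEFunc(nn.Module):
    """dx/dt = f(x, t; beta(t)) on a graph; a sketch, not the paper's PGNODE."""
    def __init__(self, adj, hidden):
        super().__init__()
        self.register_buffer("adj", adj)     # normalized adjacency, shape (N, N)
        self.lin = nn.Linear(hidden, hidden)
        self.beta_net = nn.Sequential(       # hypothetical time-varying rate beta(t)
            nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))

    def forward(self, t, x):
        beta = self.beta_net(t.reshape(1, 1))
        return beta * torch.tanh(self.adj @ self.lin(x))

def odeint_euler(func, x0, ts):
    """Fixed-step Euler solver; ts is a 1-D tensor of increasing query times,
    so states can be read out at arbitrary continuous time points."""
    xs, x = [x0], x0
    for t0, t1 in zip(ts[:-1], ts[1:]):
        x = x + (t1 - t0) * func(t0, x)
        xs.append(x)
    return torch.stack(xs)
```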
{"title":"A framework based on physics-informed graph neural ODE: for continuous spatial-temporal pandemic prediction","authors":"Haodong Cheng, Yingchi Mao, Xiao Jia","doi":"10.1007/s10489-024-05834-y","DOIUrl":"10.1007/s10489-024-05834-y","url":null,"abstract":"<div><p>Physics-informed spatial-temporal discrete sequence learning networks have great potential in solving partial differential equations and time series prediction compared to traditional fully connected PINN algorithms, and can serve as the foundation for data-driven sequence prediction modeling and inverse problem analysis. However, such existing models are unable to deal with inverse problem scenarios in which the parameters of the physical process are time-varying and unknown, while usually failing to make predictions in continuous time. In this paper, we propose a continuous time series prediction algorithm constructed by the physics-informed graph neural ordinary differential equation (PGNODE). Proposed parameterized GNODE-GRU and physics-informed loss constraints are used to explicitly characterize and solve unknown time-varying hyperparameters. The GNODE solver integrates this physical parameter to predict the sequence value at any time. This paper uses epidemic prediction tasks as a case study, and experimental results demonstrate that the proposed algorithm can effectively improve the prediction accuracy of the spread of epidemics in the future continuous time.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12661 - 12675"},"PeriodicalIF":3.4,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FMCF: Few-shot Multimodal aspect-based sentiment analysis framework based on Contrastive Finetuning
Pub Date: 2024-09-25 | DOI: 10.1007/s10489-024-05841-z | Applied Intelligence 54(24): 12629-12643
Yongping Du, Runfeng Xie, Bochao Zhang, Zihao Yin
Multimodal aspect-based sentiment analysis (MABSA) aims to predict the sentiment of an aspect by fusing different modalities such as images and text. However, the availability of high-quality multimodal data remains limited, which makes few-shot MABSA a new challenge, and previous works are rarely able to cope with low-resource and few-shot scenarios. To address these problems, we design a Few-shot Multimodal aspect-based sentiment analysis framework based on Contrastive Finetuning (FMCF). First, the image modality is transformed into a corresponding textual caption to capture the semantic information it entails, and a contrastive dataset is constructed via similarity retrieval for finetuning in the following stage. Then, a sentence encoder is trained based on SBERT, combining supervised contrastive learning with sentence-level multi-feature fusion to complete MABSA. Experiments demonstrate that our framework achieves excellent performance in few-shot scenarios. Importantly, with only 256 training samples and limited computational resources, the proposed method outperforms fine-tuned models that use all available data on the Twitter dataset.
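The two finetuning ingredients (similarity-retrieved contrastive pairs and a supervised contrastive objective) can be sketched as follows; the encoder checkpoint, retrieval depth `k`, and loss temperature are placeholder choices, not the paper's configuration.

```python
import torch
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in, not the paper's SBERT

def build_contrastive_pairs(texts, labels, k=3):
    """Retrieve the k most similar same-label texts for each sample as positives."""
    emb = encoder.encode(texts, convert_to_tensor=True)
    sim = util.cos_sim(emb, emb)
    pairs = []
    for i in range(len(texts)):
        cand = [j for j in range(len(texts)) if j != i and labels[j] == labels[i]]
        cand.sort(key=lambda j: float(sim[i, j]), reverse=True)
        pairs += [(i, j) for j in cand[:k]]
    return pairs

def supcon_loss(emb, labels, tau=0.07):
    """Supervised contrastive loss over L2-normalized embeddings.

    Assumes each class has at least two samples in the batch.
    """
    z = torch.nn.functional.normalize(emb, dim=1)
    logits = z @ z.T / tau
    logits.fill_diagonal_(float("-inf"))              # exclude self-similarity
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    same.fill_diagonal_(False)
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    return -(log_prob[same]).mean()
```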
{"title":"FMCF: Few-shot Multimodal aspect-based sentiment analysis framework based on Contrastive Finetuning","authors":"Yongping Du, Runfeng Xie, Bochao Zhang, Zihao Yin","doi":"10.1007/s10489-024-05841-z","DOIUrl":"10.1007/s10489-024-05841-z","url":null,"abstract":"<div><p>Multimodal aspect-based sentiment analysis (MABSA) aims to predict the sentiment of aspect by the fusion of different modalities such as image, text and so on. However, the availability of high-quality multimodal data remains limited. Therefore, few-shot MABSA is a new challenge. Previous works are rarely able to cope with low-resource and few-shot scenarios. In order to address the above problems, we design a <b>F</b>ew-shot <b>M</b>ultimodal aspect-based sentiment analysis framework based on <b>C</b>ontrastive <b>F</b>inetuning (FMCF). Initially, the image modality is transformed to the corresponding textual caption to achieve the entailed semantic information and a contrastive dataset is constructed based on similarity retrieval for finetuning in the following stage. Further, a sentence encoder is trained based on SBERT, which combines supervised contrastive learning and sentence-level multi-feature fusion to complete MABSA. The experiments demonstrate that our framework achieves excellent performance in the few-shot scenarios. Importantly, with only 256 training samples and limited computational resources, the proposed method outperforms fine-tuned models that use all available data on the Twitter dataset.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12629 - 12643"},"PeriodicalIF":3.4,"publicationDate":"2024-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Explainable cognitive decline detection in free dialogues with a Machine Learning approach based on pre-trained Large Language Models
Pub Date: 2024-09-24 | DOI: 10.1007/s10489-024-05808-0 | Applied Intelligence 54(24): 12613-12628
Francisco de Arriba-Pérez, Silvia García-Méndez, Javier Otero-Mosquera, Francisco J. González-Castaño
Cognitive and neurological impairments are very common, but only a small proportion of affected individuals are diagnosed and treated, partly because of the high costs associated with frequent screening. Detecting pre-illness stages and analyzing the progression of neurological disorders through effective and efficient intelligent systems can support timely diagnosis and early intervention. We propose using Large Language Models to extract features from free dialogues to detect cognitive decline. These features comprise high-level, content-independent reasoning features (such as comprehension, decreased awareness, increased distraction, and memory problems). Our solution comprises (i) preprocessing, (ii) feature engineering via Natural Language Processing techniques and prompt engineering, (iii) feature analysis and selection to optimize performance, and (iv) classification, supported by automatic explainability. We also explore how to improve ChatGPT's direct cognitive-impairment prediction capabilities using the best features in our models. The evaluation metrics obtained endorse the effectiveness of a mixed approach that combines feature extraction with ChatGPT and a specialized Machine Learning model to detect cognitive decline in free-form conversational dialogues with older adults. Ultimately, our work may facilitate the development of an inexpensive, non-invasive, and rapid means of detecting and explaining cognitive decline.
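A schematic of stages (ii) and (iv): `llm_score` is a hypothetical stand-in for the prompt-engineered LLM call, and a random forest stands in for the specialized ML model; neither is the paper's exact choice.

```python
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["comprehension", "decreased_awareness",
            "increased_distraction", "memory_problems"]

def llm_score(dialogue: str, feature: str) -> float:
    """Hypothetical stand-in: prompt an LLM to rate one feature from 0 to 1.

    In practice this would call a chat-completion API with a prompt such as
    "Rate the speaker's <feature> in this dialogue on a 0-1 scale."
    """
    raise NotImplementedError("wire up your LLM client here")

def featurize(dialogues):
    # One row per dialogue, one column per high-level reasoning feature.
    return [[llm_score(d, f) for f in FEATURES] for d in dialogues]

# Classification stage: a standard ML model over the LLM-derived features.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
# clf.fit(featurize(train_dialogues), train_labels)
# pred = clf.predict(featurize(test_dialogues))
```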
{"title":"Explainable cognitive decline detection in free dialogues with a Machine Learning approach based on pre-trained Large Language Models","authors":"Francisco de Arriba-Pérez, Silvia García-Méndez, Javier Otero-Mosquera, Francisco J. González-Castaño","doi":"10.1007/s10489-024-05808-0","DOIUrl":"10.1007/s10489-024-05808-0","url":null,"abstract":"<div><p>Cognitive and neurological impairments are very common, but only a small proportion of affected individuals are diagnosed and treated, partly because of the high costs associated with frequent screening. Detecting pre-illness stages and analyzing the progression of neurological disorders through effective and efficient intelligent systems can be beneficial for timely diagnosis and early intervention. We propose using Large Language Models to extract features from free dialogues to detect cognitive decline. These features comprise high-level reasoning content-independent features (such as comprehension, decreased awareness, increased distraction, and memory problems). Our solution comprises (<i>i</i>) preprocessing, (<i>ii</i>) feature engineering via Natural Language Processing techniques and prompt engineering, (<i>iii</i>) feature analysis and selection to optimize performance, and (<i>iv</i>) classification, supported by automatic explainability. We also explore how to improve Chat<span>gpt</span>’s direct cognitive impairment prediction capabilities using the best features in our models. Evaluation metrics obtained endorse the effectiveness of a mixed approach combining feature extraction with Chat<span>gpt</span> and a specialized Machine Learning model to detect cognitive decline within free-form conversational dialogues with older adults. Ultimately, our work may facilitate the development of an inexpensive, non-invasive, and rapid means of detecting and explaining cognitive decline.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12613 - 12628"},"PeriodicalIF":3.4,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-label feature selection for missing labels by granular-ball based mutual information
Pub Date: 2024-09-23 | DOI: 10.1007/s10489-024-05809-z | Applied Intelligence 54(23): 12589-12612
Wenhao Shu, Yichen Hu, Wenbin Qian
Multi-label feature selection serves as an effective dimensionality reduction technique for high-dimensional multi-label data. However, most feature selection methods regard the label set as complete. In real-world applications, labels in a multi-label dataset may be missing because collecting sufficient labels is difficult, which causes valuable information to be overlooked and leads to inaccurate predictions in classification. To address these issues, this paper proposes a feature selection algorithm based on granular-ball mutual information for multi-label data with missing labels. First, to improve classification ability, a label recovery model is proposed to impute missing labels, utilizing the correlation between labels, the properties of label-specific features, and global common features. Second, to avoid computing a neighborhood radius, a granular-ball based mutual information metric for evaluating candidate features is proposed, which fits the data distribution well. Finally, the corresponding feature selection algorithm is developed to select a feature subset from multi-label data with missing labels. Experiments on different datasets demonstrate that, compared with state-of-the-art algorithms, the proposed algorithm considerably improves classification accuracy. The code is publicly available online at https://github.com/skylark-leo/MLMLFS.git
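A simplified sketch of the granular-ball idea: generate balls by recursive 2-means splitting, then score a feature by the mutual information between ball membership and a label. The stopping criteria below are placeholders for the paper's purity and quality rules.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import mutual_info_score

def granular_balls(X, min_size=8, max_radius=1.0):
    """Recursively split data into balls via 2-means until each ball is compact."""
    balls, stack = [], [np.arange(len(X))]
    while stack:
        idx = stack.pop()
        center = X[idx].mean(axis=0)
        radius = np.linalg.norm(X[idx] - center, axis=1).mean()
        if len(idx) <= min_size or radius <= max_radius:
            balls.append(idx)
            continue
        km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X[idx])
        for c in (0, 1):
            stack.append(idx[km.labels_ == c])
    return balls

def ball_mutual_information(X, y, feature):
    """Score one feature: MI between ball membership on that feature and a label."""
    balls = granular_balls(X[:, [feature]])
    assign = np.empty(len(X), dtype=int)
    for b, idx in enumerate(balls):
        assign[idx] = b
    return mutual_info_score(y, assign)
```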
{"title":"Multi-label feature selection for missing labels by granular-ball based mutual information","authors":"Wenhao Shu, Yichen Hu, Wenbin Qian","doi":"10.1007/s10489-024-05809-z","DOIUrl":"10.1007/s10489-024-05809-z","url":null,"abstract":"<p>Multi-label feature selection serves an effective dimensionality reduction technique in the high-dimensional multi-label data. However, most feature selection methods regard the label as complete. In fact, in real-world applications, labels in a multi-label dataset may be missing due to various difficulties in collecting sufficient labels, which enables some valuable information to be overlooked and leads to an inaccurate prediction in the classification. To address these issues, a feature selection algorithm based on the granular-ball based mutual information is proposed for the multi-label data with missing labels in this paper. At first, to improve the classification ability, a label recovery model is proposed to calculate some labels, which utilizes the correlation between labels, the properties of label specific features and global common features. Secondly, to avoid computing the neighborhood radius, a granular-ball based mutual information metric for evaluating candidate features is proposed, which well fits the data distribution. Finally, the corresponding feature selection algorithm is developed for selecting a subset from the multi-label data with missing labels. Experiments on the different datasets demonstrate that compared with the state-of-the-art algorithms the proposed algorithm considerably improves the classification accuracy. The code is publicly available online at https://github.com/skylark-leo/MLMLFS.git</p>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 23","pages":"12589 - 12612"},"PeriodicalIF":3.4,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142413386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Domain adaptation of time series via contrastive learning with task-specific consistency
Pub Date: 2024-09-20 | DOI: 10.1007/s10489-024-05799-y | Applied Intelligence 54(23): 12576-12588
Tao Wu, Qiushu Chen, Dongfang Zhao, Jinhua Wang, Linhua Jiang
Unsupervised domain adaptation (UDA) for time series analysis remains challenging due to the lack of labeled data in target domains. Existing methods rely heavily on auxiliary data yet often fail to fully exploit the intrinsic task consistency between different domains. To address this limitation, we propose a novel time series UDA framework called CLTC that enhances feature transferability by capturing semantic context and reconstructing class-wise representations. Specifically, contrastive learning is first utilized to capture contextual representations that enable label transfer across domains. Dual reconstruction on samples from the same class then refines the task-specific features to improve consistency. To align the cross-domain distributions without target labels, we leverage the Sinkhorn divergence, which can handle non-overlapping supports. Consequently, our CLTC reduces the domain gap while retaining task-specific consistency for effective knowledge transfer. Extensive experiments on four time series benchmarks demonstrate state-of-the-art performance improvements of 0.7-3.6% over existing methods, and an ablation study validates the efficacy of each component.
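The Sinkhorn divergence used for alignment can be computed with standard entropic optimal-transport iterations. A minimal sketch with uniform sample weights and a fixed iteration count follows; the entropic regularizer `eps` is a tuning choice, not a value from the paper.

```python
import torch

def sinkhorn_cost(x, y, eps=0.1, iters=200):
    """Entropic OT cost between two empirical distributions (uniform weights)."""
    C = torch.cdist(x, y) ** 2                     # pairwise squared distances
    K = torch.exp(-C / eps)
    a = torch.full((x.shape[0],), 1.0 / x.shape[0])
    b = torch.full((y.shape[0],), 1.0 / y.shape[0])
    u = torch.ones_like(a)
    for _ in range(iters):                         # Sinkhorn fixed-point updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]                # transport plan
    return (P * C).sum()

def sinkhorn_divergence(x, y, eps=0.1):
    """Debiased divergence: S(x, y) = OT(x, y) - (OT(x, x) + OT(y, y)) / 2."""
    return (sinkhorn_cost(x, y, eps)
            - 0.5 * (sinkhorn_cost(x, x, eps) + sinkhorn_cost(y, y, eps)))
```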
{"title":"Domain adaptation of time series via contrastive learning with task-specific consistency","authors":"Tao Wu, Qiushu Chen, Dongfang Zhao, Jinhua Wang, Linhua Jiang","doi":"10.1007/s10489-024-05799-y","DOIUrl":"10.1007/s10489-024-05799-y","url":null,"abstract":"<div><p>Unsupervised domain adaptation (UDA) for time series analysis remains challenging due to the lack of labeled data in target domains. Existing methods rely heavily on auxiliary data yet often fail to fully exploit the intrinsic task consistency between different domains. To address this limitation, we propose a novel time series UDA framework called CLTC that enhances feature transferability by capturing semantic context and reconstructing class-wise representations. Specifically, contrastive learning is first utilized to capture contextual representations that enable label transfer across domains. Dual reconstruction on samples from the same class then refines the task-specific features to improve consistency. To align the cross-domain distributions without target labels, we leverage Sinkhorn divergence which can handle non-overlapping supports. Consequently, our CLTC reduces the domain gap while retaining task-specific consistency for effective knowledge transfer. Extensive experiments on four time series benchmarks demonstrate state-of-the-art performance improvements of 0.7-3.6% over existing methods, and ablation study validates the efficacy of each component.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 23","pages":"12576 - 12588"},"PeriodicalIF":3.4,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142412770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DAGCN: hybrid model for efficiently handling joint node and link prediction in cloud workflows
Pub Date: 2024-09-18 | DOI: 10.1007/s10489-024-05828-w | Applied Intelligence 54(23): 12505-12530
Ruimin Ma, Junqi Gao, Li Cheng, Yuyi Zhang, Ovanes Petrosian
In the cloud computing domain, significant strides have been made in performance prediction for cloud workflows, yet link prediction for cloud workflows remains largely unexplored. This paper introduces a novel challenge: joint node and link prediction in cloud workflows, with the aim of increasing the efficiency and overall performance of cloud computing resources. GNN-based methods have gained traction in handling graph-related tasks, but the unique structure of the DAG remains an underexplored area for GNN effectiveness. To enhance comprehension of intricate graph structures and interrelationships, this paper introduces two novel models under the DAGCN framework: DAG-ConvGCN and DAG-AttGCN. The former synergizes the local receptive fields of the CNN with the global interpretive power of the GCN, whereas the latter integrates an attention mechanism to dynamically weigh the significance of node adjacencies. Through rigorous experimentation on a meticulously crafted joint node and link prediction task using the Cluster-trace-v2018 dataset, both DAG-ConvGCN and DAG-AttGCN demonstrate superior performance over a spectrum of established machine learning and deep learning benchmarks. Moreover, the application of similarity measures such as the propagation kernel and the innovative GRBF kernel (which merges the graphlet kernel with the radial basis function kernel to accentuate graph topology and node features) reinforces the superiority of DAGCN models over conventional baselines in graph-level prediction accuracy. This paper offers a fresh vantage point for advancing predictive methodologies within graph theory.
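In the spirit of DAG-AttGCN, a single layer that masks attention to DAG edges and dynamically weighs node adjacencies might look like the following sketch; the dimensions, head-free scoring, and handling of sink nodes are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AttGCNLayer(nn.Module):
    """GCN layer with attention restricted to DAG edges; an illustrative sketch."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.att = nn.Linear(2 * out_dim, 1)

    def forward(self, x, adj):
        # adj: (N, N) binary DAG adjacency (1 where edge i -> j exists)
        h = self.W(x)                                    # (N, out_dim)
        n = h.size(0)
        pair = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                          h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        scores = self.att(pair).squeeze(-1)              # (N, N) raw attention
        scores = scores.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=1)             # weigh each node's successors
        alpha = torch.nan_to_num(alpha)                  # nodes with no out-edges
        return torch.relu(alpha @ h)
```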
{"title":"DAGCN: hybrid model for efficiently handling joint node and link prediction in cloud workflows","authors":"Ruimin Ma, Junqi Gao, Li Cheng, Yuyi Zhang, Ovanes Petrosian","doi":"10.1007/s10489-024-05828-w","DOIUrl":"10.1007/s10489-024-05828-w","url":null,"abstract":"<div><p>In the cloud computing domain, significant strides have been made in performance prediction for cloud workflows, yet link prediction for cloud workflows remains largely unexplored. This paper introduces a novel challenge: joint node and link prediction in cloud workflows, with the aim of increasing the efficiency and overall performance of cloud computing resources. GNN-based methods have gained traction in handling graph-related tasks. The unique format of the DAG presents an underexplored area for GNNs effectiveness. To enhance comprehension of intricate graph structures and interrelationships, this paper introduces two novel models under the DAGCN framework: DAG-ConvGCN and DAG-AttGCN. The former synergizes the local receptive fields of the CNN with the global interpretive power of the GCN, whereas the latter integrates an attention mechanism to dynamically weigh the significance of node adjacencies. Through rigorous experimentation on a meticulously crafted joint node and link prediction task utilizing the Cluster-trace-v2018 dataset, both DAG-ConvGCN and DAG-AttGCN demonstrate superior performance over a spectrum of established machine learning and deep learning benchmarks. Moreover, the application of similarity measures such as the propagation kernel and the innovative GRBF kernel-which merges the graphlet kernel with the radial basis function kernel to accentuate graph topology and node features-reinforces the superiority of DAGCN models over graph-level prediction accuracy conventional baselines. This paper offers a fresh vantage point for advancing predictive methodologies within graph theory.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 23","pages":"12505 - 12530"},"PeriodicalIF":3.4,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142265436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive multimodal prompt for human-object interaction with local feature enhanced transformer
Pub Date: 2024-09-18 | DOI: 10.1007/s10489-024-05774-7 | Applied Intelligence 54(23): 12492-12504
Kejun Xue, Yongbin Gao, Zhijun Fang, Xiaoyan Jiang, Wenjun Yu, Mingxuan Chen, Chenmou Wu
Human-object interaction (HOI) detection is an important computer vision task for recognizing the interaction between humans and surrounding objects in an image or video. HOI datasets suffer from a serious long-tailed data distribution problem because it is challenging to build a dataset that contains all potential interactions. Many HOI detectors have addressed this issue by utilizing visual-language models. However, due to the calculation mechanism of the Transformer, visual-language models are not good at extracting the local features of input samples. Therefore, we propose a novel local-feature-enhanced Transformer that motivates the encoders to extract multi-modal features containing more information. Moreover, the application of prompt learning in HOI detection is still at a preliminary stage. Consequently, we propose a multi-modal adaptive prompt module, which uses an adaptive learning strategy to facilitate the interaction of language and visual prompts. On the HICO-DET and SWIG-HOI datasets, the proposed model achieves 24.21% mAP and 14.29% mAP on full interactions, respectively. Our code is available at https://github.com/small-code-cat/AMP-HOI.
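One way an adaptive language-visual prompt interaction could be realized is with gated cross-attention between the two prompt sets. This is a hedged sketch under that assumption, not the paper's module.

```python
import torch
import torch.nn as nn

class AdaptivePromptFusion(nn.Module):
    """Gated interaction between language and visual prompt tokens.

    Each modality's prompts attend to the other's, and a learned gate
    decides how much cross-modal information to mix in.
    """
    def __init__(self, dim, heads=4):
        super().__init__()
        self.l2v = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.v2l = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, lang_p, vis_p):
        # lang_p: (B, Lt, D) language prompts; vis_p: (B, Lv, D) visual prompts
        v_ctx, _ = self.l2v(vis_p, lang_p, lang_p)   # visual prompts query language
        l_ctx, _ = self.v2l(lang_p, vis_p, vis_p)    # language prompts query vision
        vis_out = vis_p + self.gate(vis_p) * v_ctx
        lang_out = lang_p + self.gate(lang_p) * l_ctx
        return lang_out, vis_out
```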
{"title":"Adaptive multimodal prompt for human-object interaction with local feature enhanced transformer","authors":"Kejun Xue, Yongbin Gao, Zhijun Fang, Xiaoyan Jiang, Wenjun Yu, Mingxuan Chen, Chenmou Wu","doi":"10.1007/s10489-024-05774-7","DOIUrl":"10.1007/s10489-024-05774-7","url":null,"abstract":"<div><p>Human-object interaction (HOI) detection is an important computer vision task for recognizing the interaction between humans and surrounding objects in an image or video. The HOI datasets have a serious long-tailed data distribution problem because it is challenging to have a dataset that contains all potential interactions. Many HOI detectors have addressed this issue by utilizing visual-language models. However, due to the calculation mechanism of the Transformer, the visual-language model is not good at extracting the local features of input samples. Therefore, we propose a novel local feature enhanced Transformer to motivate encoders to extract multi-modal features that contain more information. Moreover, it is worth noting that the application of prompt learning in HOI detection is still in preliminary stages. Consequently, we propose a multi-modal adaptive prompt module, which uses an adaptive learning strategy to facilitate the interaction of language and visual prompts. In the HICO-DET and SWIG-HOI datasets, the proposed model achieves full interaction with 24.21% mAP and 14.29% mAP, respectively. Our code is available at https://github.com/small-code-cat/AMP-HOI.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 23","pages":"12492 - 12504"},"PeriodicalIF":3.4,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142265435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improved KD-tree based imbalanced big data classification and oversampling for MapReduce platforms
Pub Date: 2024-09-18 | DOI: 10.1007/s10489-024-05763-w | Applied Intelligence 54(23): 12558-12575
William C. Sleeman IV, Martha Roseberry, Preetam Ghosh, Alberto Cano, Bartosz Krawczyk
In the era of big data, it is necessary to provide novel and efficient platforms for training machine learning models over large volumes of data. The MapReduce approach and its Apache Spark implementation are among the most popular methods for providing high-performance computing to classification algorithms. However, they require dedicated implementations that take advantage of such architectures. Additionally, many real-world big data problems are plagued by class imbalance, posing challenges to the classifier training step, and existing solutions for alleviating skewed distributions do not work well in the MapReduce environment. In this paper, we propose a novel KD-tree based classifier, together with a variation of the SMOTE algorithm dedicated to the Spark platform. Our algorithms offer excellent predictive power and work with both binary and multi-class imbalanced data. Exhaustive experiments conducted on the Amazon Web Services platform showcase the high efficiency and flexibility of our proposed algorithms.
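The KD-tree-backed SMOTE idea can be illustrated on a single node as below; the paper's variant distributes minority samples across Spark partitions, which this sketch does not attempt.

```python
import numpy as np
from sklearn.neighbors import KDTree

def kdtree_smote(X_min, n_new, k=5, seed=0):
    """SMOTE-style oversampling with KD-tree neighbor lookup.

    X_min: minority-class samples, shape (n, d) with n > k.
    Returns n_new synthetic samples interpolated toward nearest neighbors.
    """
    rng = np.random.default_rng(seed)
    tree = KDTree(X_min)
    # k + 1 because each point's nearest neighbor is itself.
    _, nbrs = tree.query(X_min, k=k + 1)
    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        a = rng.integers(len(X_min))
        b = nbrs[a][rng.integers(1, k + 1)]     # a random true neighbor of a
        gap = rng.random()
        synthetic[i] = X_min[a] + gap * (X_min[b] - X_min[a])
    return synthetic
```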
{"title":"Improved KD-tree based imbalanced big data classification and oversampling for MapReduce platforms","authors":"William C. Sleeman IV, Martha Roseberry, Preetam Ghosh, Alberto Cano, Bartosz Krawczyk","doi":"10.1007/s10489-024-05763-w","DOIUrl":"10.1007/s10489-024-05763-w","url":null,"abstract":"<div><p>In the era of big data, it is necessary to provide novel and efficient platforms for training machine learning models over large volumes of data. The MapReduce approach and its Apache Spark implementation are among the most popular methods that provide high-performance computing for classification algorithms. However, they require dedicated implementations that will take advantage of such architectures. Additionally, many real-world big data problems are plagued by class imbalance, posing challenges to the classifier training step. Existing solutions for alleviating skewed distributions do not work well in the MapReduce environment. In this paper, we propose a novel KD-tree based classifier, together with a variation of the SMOTE algorithm dedicated to the Spark platform. Our algorithms offer excellent predictive power and can work simultaneously with binary and multi-class imbalanced data. Exhaustive experiments conducted using the Amazon Web Service platform showcase the high efficiency and flexibility of our proposed algorithms.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 23","pages":"12558 - 12575"},"PeriodicalIF":3.4,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142265136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-objective optimization enabling CFRP energy-efficient milling based on deep reinforcement learning
Pub Date: 2024-09-18 | DOI: 10.1007/s10489-024-05800-8 | Applied Intelligence 54(23): 12531-12557
Meihang Zhang, Hua Zhang, Wei Yan, Lin Zhang, Zhigang Jiang
The expanding application of Carbon Fiber Reinforced Polymer (CFRP) in industry is drawing increasing attention to improving energy efficiency and reducing cost during secondary processing, particularly milling. Machining parameter optimization is a practical and economical way to achieve this goal, but the unclear milling mechanism and dynamic machining conditions of CFRP make it challenging. To fill this gap, this paper proposes a DRL-based approach that integrates physics-guided Transformer networks with the Twin Delayed Deep Deterministic Policy Gradient algorithm (PGTTD3) to optimize CFRP milling parameters with multiple objectives. First, a PG-Transformer-based CFRP milling energy consumption model is proposed, which modifies the existing De-stationary Attention module by integrating external physical variables to enhance modeling accuracy and efficiency. Second, a multi-objective optimization model considering energy consumption, milling time, and machining cost is formulated and mapped to a Markov Decision Process, and a reward function is designed. Third, the PGTTD3 approach is proposed for dynamic parameter decision-making, incorporating a time-difference strategy to enhance agent training stability and online adjustment reliability. Experimental results show that, compared to the actual averages, the proposed method reduces energy consumption, milling time, and machining cost in CFRP milling by 10.98%, 3.012%, and 14.56%, respectively. Compared with state-of-the-art optimization algorithms, the proposed algorithm exhibits excellent performance, with an average improvement in optimization efficiency of over 20% and a maximum enhancement of 88.66%.
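For the MDP formulation, one common reward design is a weighted scalarization of the normalized objectives. The weights and bounds below are purely illustrative; the paper's reward function is more elaborate than this weighted-sum sketch.

```python
import numpy as np

# Hypothetical objective weights and normalization bounds.
WEIGHTS = {"energy": 0.4, "time": 0.3, "cost": 0.3}
BOUNDS = {"energy": (0.5, 5.0),     # kWh
          "time":   (10.0, 120.0),  # s
          "cost":   (1.0, 20.0)}    # currency units

def reward(energy, time, cost):
    """Scalarize three objectives into one reward in [-1, 0] (higher is better)."""
    obs = {"energy": energy, "time": time, "cost": cost}
    r = 0.0
    for key, w in WEIGHTS.items():
        lo, hi = BOUNDS[key]
        norm = (obs[key] - lo) / (hi - lo)        # min-max normalize each objective
        r -= w * float(np.clip(norm, 0.0, 1.0))   # penalize normalized consumption
    return r

# Example: a TD3-style agent would receive this after each milling-parameter action.
print(reward(energy=2.1, time=45.0, cost=7.5))
```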
{"title":"Multi-objective optimization enabling CFRP energy-efficient milling based on deep reinforcement learning","authors":"Meihang Zhang, Hua Zhang, Wei Yan, Lin Zhang, Zhigang Jiang","doi":"10.1007/s10489-024-05800-8","DOIUrl":"10.1007/s10489-024-05800-8","url":null,"abstract":"<div><p>The expanding application of Carbon Fiber Reinforced Polymer (CFRP) in industries is drawing increasing attention to energy efficiency improvement and cost reducing during the secondary processing, particularly in milling. Machining parameter optimization is a practical and economical way to achieve this goal. However, the unclear milling mechanism and dynamic machining conditions of CFRP make it challenging. To fill this gap, this paper proposes a DRL-based approach that integrates physics-guided Transformer networks with Twin Delayed Deep Deterministic Policy Gradient (PGTTD3) to optimize CFRP milling parameters with multi-objectives. Firstly, a PG-Transformer-based CFRP milling energy consumption model is proposed, which modifies the existing De-stationary Attention module by integrating external physical variables to enhance modeling accuracy and efficiency. Secondly, a multi-objective optimization model considering energy consumption, milling time and machining cost for CFRP milling is formulated and mapped to a Markov Decision Process, and a reward function is designed. Thirdly, a PGTTD3 approach is proposed for dynamic parameter decision-making, incorporating a time difference strategy to enhance agent training stability and online adjustment reliability. The experimental results show that the proposed method reduces energy consumption, milling time and machining cost by 10.98%, 3.012%, and 14.56% in CFRP milling respectively, compared to the actual averages. The proposed algorithm exhibits excellent performance metrics when compared to state-of-the-art optimization algorithms, with an average improvement in optimization efficiency of over 20% and a maximum enhancement of 88.66%.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 23","pages":"12531 - 12557"},"PeriodicalIF":3.4,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142265135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}