Pub Date: 2024-10-29 DOI: 10.1016/j.eswa.2024.125583
Text-to-image synthesis aims to generate high-quality, realistic images conditioned on text descriptions. The major challenge of this task lies in the deep and seamless integration of text and image features. In this paper, we therefore present a novel approach, semantic fusion generative adversarial networks (SF-GAN), for fine-grained text-to-image generation, which enables efficient semantic interactions. Specifically, SF-GAN leverages a novel recurrent semantic fusion network to seamlessly manipulate the global allocation of text information across discrete fusion blocks. Moreover, by using a contrastive loss and dynamic convolution, SF-GAN fuses the text and image information more accurately and further improves semantic consistency in the generation stage. In the discrimination stage, we introduce a word-level discriminator designed to give the generator precise feedback on each individual word. Compared with current state-of-the-art techniques, SF-GAN generates realistic and text-aligned images efficiently and outperforms its contemporaries on challenging benchmark datasets.
{"title":"SF-GAN: Semantic fusion generative adversarial networks for text-to-image synthesis","authors":"","doi":"10.1016/j.eswa.2024.125583","DOIUrl":"10.1016/j.eswa.2024.125583","url":null,"abstract":"<div><div>Text-to-image synthesis aims to generate high-quality realistic images conditioned on text description. The major challenge of this task rests on the deep and seamless integration of text and image features. Therefore, in this paper, we present a novel approach, e.g., semantic fusion generative adversarial networks (SF-GAN), for fine-grained text-to-image generation, which enables efficient semantic interactions. Specifically, our proposed SF-GAN leverages a novel recurrent semantic fusion network to seamlessly manipulate the global allocation of text information across discrete fusion blocks. Moreover, with the usage of the contrastive loss and the dynamic convolution, SF-GAN could fuse the text and image information more accurately and further improve the semantic consistency in the generate stage. During the discrimination stage, we introduce a word-level discriminator designed to offer the generator precise feedback pertaining to each individual word. 
When compared to current state-of-the-art techniques, our SF-GAN demonstrates remarkable efficiency in generating realistic and text-aligned images, outperforming its contemporaries on challenging benchmark datasets.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142539020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-29 DOI: 10.1016/j.eswa.2024.125613
The artificial bee colony (ABC) algorithm has shown excellent performance on many single-objective and multi-objective optimization problems (MOPs). However, ABC encounters difficulties when solving many-objective optimization problems (MaOPs) with irregular Pareto fronts (PFs). There are two likely reasons: (1) the population contains many non-dominated solutions, so the selection pressure is too low to move the population toward the PF; and (2) it is hard to maintain population diversity for PFs with irregular geometric structures. To address these issues, this paper proposes a new many-objective ABC variant based on multiple indicators (called MIMaOABC). First, a convergence indicator $I_{\varepsilon+}$ and a diversity indicator ($Div$) based on parallel distance are used, because a single indicator may have preferences and can easily cause the population to converge to a subregion of the PF. Then, a two-stage environmental selection method is designed based on the two indicators. In the first stage, $I_{\varepsilon+}$-based environmental selection is used to improve convergence. In the second stage, $Div$-based environmental selection is employed to maintain diversity and handle irregular PFs. To balance exploration and exploitation during the search, different search strategies are used in different search stages. In the onlooker bee stage, solutions with good convergence are chosen for further search based on a new selection mechanism. To verify the performance of MIMaOABC, a set of well-known benchmark problems with degenerate, discontinuous, inverted, and regular PFs is tested, and MIMaOABC is compared with eight state-of-the-art algorithms. Computational results show that the proposed MIMaOABC is competitive in solving MaOPs with both irregular and regular PFs.
{"title":"Artificial bee colony algorithm based on multiple indicators for many-objective optimization with irregular Pareto fronts","authors":"","doi":"10.1016/j.eswa.2024.125613","DOIUrl":"10.1016/j.eswa.2024.125613","url":null,"abstract":"<div><div>Artificial bee colony (ABC) algorithm has shown excellent performance over many single and multi-objective optimization problems (MOPs). However, ABC encounters some difficulties when solving many-objective optimization problems (MaOPs) with irregular Pareto fronts (PFs). The possible reasons include two aspects: (1) there are many non-dominated solutions in the population and the low selection pressure cannot move the population toward the PF; and (2) it is hard to maintain population diversity for PFs having irregular geometric structures. To address these issues, a new many-objective ABC variant based on multiple indicators (called MIMaOABC) is proposed in this paper. Firstly, a convergence indicator <span><math><msub><mrow><mi>I</mi></mrow><mrow><msub><mrow><mi>ɛ</mi></mrow><mrow><mo>+</mo></mrow></msub></mrow></msub></math></span> and a diversity indicator (<span><math><mrow><mi>D</mi><mi>i</mi><mi>v</mi></mrow></math></span>) based on parallel distance are utilized. A single indicator may have preferences and it easily causes the population to converge to a subregion of the PF. Then, a two-stage environmental selection method is designed based on the two indicators. In the first stage, the <span><math><msub><mrow><mi>I</mi></mrow><mrow><msub><mrow><mi>ɛ</mi></mrow><mrow><mo>+</mo></mrow></msub></mrow></msub></math></span> based environmental selection is used to improve the convergence. In the second stage, the <span><math><mrow><mi>D</mi><mi>i</mi><mi>v</mi></mrow></math></span> based environmental selection is employed to maintain diversity and handle irregular PFs. To balance exploration and exploitation during the search, multiple search strategies are used in different search stages, respectively. 
In the onlooker bee stage, solutions with good convergence are chosen for further search based on a new selection mechanism. In order to verify the performance of MIMaOABC, a set of well-known benchmark problems with degenerate, discontinuous, inverted, and regular PFs are tested. Performance of MIMaOABC is compared with eight state-of-the-art algorithms. Computational results shows that the proposed MIMaOABC is competitive in solving MaOPs with both irregular and regular PFs.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142571808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
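The convergence indicator $I_{\varepsilon+}$ named in this abstract is commonly defined in the indicator-based optimization literature as the smallest shift that makes one objective vector weakly dominate another. The following is a minimal sketch of that pairwise definition, assuming all objectives are minimized; the function name `additive_epsilon` is illustrative, and how MIMaOABC turns the indicator into an environmental selection rule is not described in the abstract and is not shown here.

```python
from typing import Sequence


def additive_epsilon(a: Sequence[float], b: Sequence[float]) -> float:
    """Additive epsilon indicator I_eps+(a, b) for minimization problems.

    Returns the smallest eps such that a[i] - eps <= b[i] holds for every
    objective i, i.e. the largest per-objective gap a[i] - b[i].
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("objective vectors must be non-empty and equally long")
    return max(ai - bi for ai, bi in zip(a, b))


# a is better than b in every objective, so the indicator is negative.
better = additive_epsilon((1.0, 2.0), (2.0, 3.0))
# Neither vector dominates the other, so the indicator is positive both ways.
trade_off = additive_epsilon((1.0, 3.0), (2.0, 2.0))
```

A negative value means the first vector already dominates the second with some margin; a positive value is the amount by which it would have to improve to do so.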
Pub Date: 2024-10-29 DOI: 10.1016/j.eswa.2024.125624
Sterile processing is a critical secondary process and a major cost factor in the processing, acquisition, and storage of costly medical devices. This article aims to improve the performance of sterile processing by developing, implementing, and evaluating a dispatching rule-based algorithm to reduce the time medical devices spend in the central sterile supply department using a two-stage hybrid flow-shop formulation. The algorithm combines dispatching rules with stage decomposition and compatibility conditions. A genetic algorithm is designed to benchmark the performance in addition to an analytic bound. Real-world data from a large German hospital were used to test the effectiveness of the heuristics. The case study demonstrated the practical implications of the approach, leading to a reduction in the time medical devices spend in the system and improved utilization of washer-disinfector machines and sterilizers. It also highlighted the importance of aligning machine capacity with demand and the potential trade-offs associated with batch processing decisions. Our approach can contribute to substantial operational cost savings and efficiency gains, offering significant benefits to decision makers at both the operational and tactical levels.
{"title":"A two-stage hybrid flow-shop formulation for sterilization processes in hospitals","authors":"","doi":"10.1016/j.eswa.2024.125624","DOIUrl":"10.1016/j.eswa.2024.125624","url":null,"abstract":"<div><div>Sterile processing is a critical secondary process and a major cost factor in the processing, acquisition, and storage of costly medical devices. This article aims to improve the performance of sterile processing by developing, implementing, and evaluating a dispatching rule-based algorithm to reduce the time medical devices spend in the central sterile supply department using a two-stage hybrid flow-shop formulation. The algorithm combines dispatching rules with stage decomposition and compatibility conditions. A genetic algorithm is designed to benchmark the performance in addition to an analytic bound. Real-world data from a large German hospital were used to test the effectiveness of the heuristics. The case study demonstrated the practical implications of the approach, leading to a reduction in the time medical devices spend in the system and improved utilization of washer-disinfector machines and sterilizers. It also highlighted the importance of aligning machine capacity with demand and the potential trade-offs associated with batch processing decisions. 
Our approach can contribute to substantial operational cost savings and efficiency gains, offering significant benefits to decision makers at both the operational and tactical levels.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142577872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
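To make the two-stage hybrid flow-shop setting concrete, here is a small list-scheduling sketch with a first-come-first-served dispatching rule. It is an illustration under simplifying assumptions, not the algorithm from the article: batch processing, stage decomposition, and compatibility conditions are all ignored, each machine handles one job at a time, and the function name and job tuple layout are hypothetical.

```python
import heapq
from typing import List, Tuple

# Each job is (release_time, stage1_duration, stage2_duration).
Job = Tuple[float, float, float]


def schedule_two_stage(jobs: List[Job], machines_stage1: int,
                       machines_stage2: int) -> List[float]:
    """Return the stage-2 completion time of every job, in input order."""
    free1 = [0.0] * machines_stage1  # min-heaps of machine-free times
    free2 = [0.0] * machines_stage2
    heapq.heapify(free1)
    heapq.heapify(free2)

    # Stage 1: dispatch jobs in order of release time.
    done1 = {}
    for j in sorted(range(len(jobs)), key=lambda idx: jobs[idx][0]):
        release, p1, _ = jobs[j]
        start = max(release, heapq.heappop(free1))
        done1[j] = start + p1
        heapq.heappush(free1, done1[j])

    # Stage 2: dispatch jobs in order of arrival from stage 1.
    completion = [0.0] * len(jobs)
    for j in sorted(done1, key=done1.get):
        start = max(done1[j], heapq.heappop(free2))
        completion[j] = start + jobs[j][2]
        heapq.heappush(free2, completion[j])
    return completion


jobs = [(0, 2, 3), (0, 2, 3), (1, 1, 1)]
completion = schedule_two_stage(jobs, machines_stage1=1, machines_stage2=1)
# Total time in system: completion time minus release time, summed over jobs.
total_flow_time = sum(c - job[0] for c, job in zip(completion, jobs))
```

Swapping the two `sorted` keys for another priority (for example, shortest processing time) gives a different dispatching rule with the same machine-assignment logic.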
Pub Date: 2024-10-28 DOI: 10.1016/j.eswa.2024.125604
Because industrial robots (IRs) are widely deployed and have low energy efficiency, saving energy in IRs is attracting extensive attention. Accurate energy consumption (EC) models of IRs lay the foundation for energy saving. However, most dynamic and electrical parameters of IRs are not disclosed by manufacturers, which invalidates most model-based EC prediction methods. To bridge this gap, this paper proposes a mechanism-data hybrid-driven method to predict the EC of IRs. First, a joint torque prediction model integrating hybrid-driven parameter identification is developed based on deep reinforcement learning (DRL). The framework for DRL-based parameter identification is constructed through tailored design of interfaces and training mechanisms, in which the DRL agent learns to identify the dynamic parameters from the trajectory database. Second, a deep neural network based on long short-term memory (LSTM) is proposed to predict the EC of IRs from the joint torques and velocities. The nonlinear terms that are not modeled in the robot dynamic equation are also captured in the deep neural network with one-dimensional convolutional neural network (1D-CNN) layers to improve the prediction accuracy. To validate the accuracy and efficacy of the proposed method, experiments are conducted on a KUKA KR60-3 industrial robot with different loads. The results demonstrate that the proposed method can predict EC with a mean absolute percentage error of less than 2% under a fixed load and less than 3% under loads not used for agent training.
{"title":"Industrial robot energy consumption model identification: A coupling model-driven and data-driven paradigm","authors":"","doi":"10.1016/j.eswa.2024.125604","DOIUrl":"10.1016/j.eswa.2024.125604","url":null,"abstract":"<div><div>Due to wide distribution and low energy efficiency, the energy-saving in industrial robots (IRs) is attracting extensive attention. Accurate energy consumption (EC) models of IRs lay the foundation for energy-saving. However, most dynamic and electrical parameters of IRs are not disclosed by manufacturers, which leads to the invalidity of most model-based EC prediction methods. To bridge this gap, a mechanism-data hybrid-driven method is proposed to predict the EC of IRs in this paper. First, a joint torque prediction model integrating a hybrid-driven parameter identification is developed based on deep reinforcement learning (DRL). The framework for DRL-based parameter identification is constructed through tailored design of interfaces and training mechanisms, wherein the DRL agent can learn to identify the dynamic parameters from the trajectory database. And a deep neural network based on long short-term memory (LSTM) is proposed to predict the EC of IRs according to the joint torques and velocities. The nonlinear item, which is not modeled in the robot dynamic equation, are also encapsulated in the deep neural network with one-dimensional convolutional neural network (1D-CNN) layers to improve the prediction accuracy. To validate the accuracy and efficacy of the proposed method, experiments are conducted on a KUKA KR60-3 industrial robot with different loads. 
The results demonstrate that the proposed method can predict EC with a mean absolute percentage error of less than 2% under a fixed load and less than 3% under loads not used for agent training.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142571807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
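The accuracy figures in this abstract are reported as mean absolute percentage error (MAPE). As a reminder of how that metric is computed, here is a minimal sketch using the standard definition; the function name and the sample energy values are made up for illustration and are not data from the article.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero, since each error is
    normalized by the corresponding actual value.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and equally long")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical measured and predicted energy values for two trajectories.
error_percent = mape([100.0, 200.0], [98.0, 204.0])
```

Both sample predictions are off by 2% of the measured value, so the resulting MAPE is about 2%.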
Pub Date: 2024-10-28 DOI: 10.1016/j.eswa.2024.125598
Fraudulent activities on e-commerce platforms, such as spamming product reviews or fake payment behaviors, seriously mislead users’ purchasing decisions and harm platform integrity. To identify fraudsters effectively, recent research mainly employs graph neural networks (GNNs), which aggregate neighborhood features to detect suspected fraud. However, GNNs are vulnerable to carefully crafted perturbations in the graph structure, and the camouflage strategies of collusive fraudsters limit the effectiveness of GNN-based fraud detectors. To address these issues, this paper proposes a novel multiplex graph fusion network with reinforcement structure learning (RestMGFN) to reveal collaborative camouflaged review fraud. Specifically, an adaptive graph structure learning module is designed to generate a high-quality graph representation by utilizing paradigm constraints on the intrinsic properties of the graph. Multiple relation-specific graphs are then constructed using meta-path search to capture the deep semantic features of fraudulent activities. Finally, we incorporate the multiplex graph representation module into a unified framework that jointly optimizes the graph structure and the corresponding embedding representations. Comprehensive experiments on real-world datasets verify the effectiveness and robustness of the proposed model compared with state-of-the-art approaches.
{"title":"Multiplex graph fusion network with reinforcement structure learning for fraud detection in online e-commerce platforms","authors":"","doi":"10.1016/j.eswa.2024.125598","DOIUrl":"10.1016/j.eswa.2024.125598","url":null,"abstract":"<div><div>Fraudulent activities on e-commerce platforms, such as spamming product reviews or fake payment behaviors, seriously mislead users’ purchasing decisions and harm platform integrity. To effectively identify fraudsters, recent research mainly attempts to employ graph neural networks (GNNs) with aggregating neighborhood features for detecting the fraud suspiciousness. However, GNNs are vulnerable to carefully-crafted perturbations in the graph structure, and the camouflage strategies of collusive fraudsters limit the effectiveness of GNNs-based fraud detectors. To address these issues, a novel multiplex graph fusion network with reinforcement structure learning (RestMGFN) is proposed in this paper to reveal the collaborative camouflage review fraud. Specifically, an adaptive graph structure learning module is designed to generate high-quality graph representation by utilizing paradigm constraints on the intrinsic properties of graph. Multiple relation-specific graphs are then constructed using meta-path search for capturing the deep semantic features of fraudulent activities. Finally, we incorporate the multiplex graph representations module into a unified framework, jointly optimizing the graph structure and corresponding embedding representations. 
Comprehensive experiments on real-world datasets verify the effectiveness and robustness of the proposed model compared with state-of-the-art approaches.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142593546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-28 DOI: 10.1016/j.eswa.2024.125621
Automatic epileptic seizure detection has high clinical value because it can alleviate the burden of manual monitoring. Nevertheless, building a reliable system remains technically challenging. In this study, we investigated the significance of the phase information in EEG signals for seizure detection using machine learning. We used the Stockwell transform (S-transform) to extract both the phase and power spectra of the EEG signals of epilepsy patients. A dual-stream convolutional neural network (CNN) model, which takes both spectra as inputs, was adopted as the classifier. We demonstrated on the CHB-MIT and Bonn databases that the phase input allows the CNN model to capture the heightened phase synchronization among EEG channels during seizures and directs network attention to both the low- and high-frequency features of the inputs. Adding phase inputs to the power inputs improved the detection AUC-ROC by 6.68% on the CHB-MIT database. With channel-fusion post-processing applied to the outputs of this CNN model, the system achieves a sensitivity of 79.59% and a specificity of 92.23%, surpassing some state-of-the-art methods. Our results show that the phase inputs are useful features in seizure detection. This finding has significant implications for improving the effectiveness of automatic seizure detection systems.
{"title":"Phase spectrogram of EEG from S-transform Enhances epileptic seizure detection","authors":"","doi":"10.1016/j.eswa.2024.125621","DOIUrl":"10.1016/j.eswa.2024.125621","url":null,"abstract":"<div><div>Automatic epilepsy seizure detection has high clinical value since it can alleviate the burden of manual monitoring. Nevertheless, it remains a technically challenging task to achieve a reliable system. In this study, we investigated the significance of the phase information in EEG signals in seizure detection using machine learning. We used the Stockwell transform (S-transform) to extract both phase and power spectra of the EEG signal in epilepsy patients. A dual-stream convolution neural network (CNN) model was adopted as the classifier, which takes both spectra as inputs. We demonstrated that the phase input allows the CNN model to capture the heightened phase synchronization among EEG channels in seizure and add network attention to both the low- and high-frequency features of the inputs in the CHB-MIT and Bonn databases. We improved the detection AUC-ROC by 6.68% on the CHB-MIT database when adding phase inputs to the power inputs. By incorporating a channel fusion post-processing to the outputs of this CNN model, it achieves a sensitivity and specificity of 79.59% and 92.23%, respectively, surpassing some of the state-of-the-art methods. Our results show that the phase inputs are useful features in seizure detection. 
This discovery has significant implications for improving the effectiveness of automatic seizure detection systems.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142577871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
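The detection results in this abstract are reported as sensitivity and specificity. The sketch below shows how the two metrics follow from binary labels, using their standard definitions; the function name and the example labels are illustrative, with 1 assumed to mark a seizure segment and 0 a non-seizure segment.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels in {0, 1}."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must be equally long")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    sensitivity = tp / (tp + fn)  # fraction of seizure segments detected
    specificity = tn / (tn + fp)  # fraction of normal segments kept clean
    return sensitivity, specificity


# Hypothetical labels: 4 seizure segments followed by 4 non-seizure segments.
sens, spec = sensitivity_specificity([1, 1, 1, 1, 0, 0, 0, 0],
                                     [1, 1, 1, 0, 0, 0, 1, 1])
```

Reporting both numbers matters for seizure detection because the classes are usually heavily imbalanced, so accuracy alone can look high while most seizures are missed.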
Pub Date: 2024-10-28 DOI: 10.1016/j.eswa.2024.125593
The computation of uncertainty is crucial for developing a reliable machine learning model. The natural posterior network (NatPN) provides uncertainty estimation for any single exponential family distribution, but real-world data is often complex. Therefore, we introduce a mixture exponential family posterior network (MEFDPN), which extends the prior distribution to a mixture of exponential family distributions, aiming to fit complex distributions that better represent real data. During network training, MEFDPN independently updates the posterior Bayesian estimates for each prior distribution, and the weights of these distributions are updated based on the forward propagation results. Furthermore, MEFDPN calculates two types of uncertainty (aleatoric and epistemic) and combines them using entropy weighting to obtain a comprehensive confidence measure for each data point. Theoretically, MEFDPN achieves higher prediction accuracy, and experimental results demonstrate its ability to compute a high-quality comprehensive confidence measure for data. Moreover, it shows encouraging accuracy in out-of-distribution (OOD) detection and validation experiments. Finally, we apply MEFDPN to a materials dataset, efficiently filtering out OOD data. This significantly enhances the prediction accuracy of machine learning models. Specifically, removing only 5% of outlier data leads to a 2%–5% improvement in accuracy.
{"title":"MEFDPN: Mixture exponential family distribution posterior networks for evaluating data uncertainty","authors":"","doi":"10.1016/j.eswa.2024.125593","DOIUrl":"10.1016/j.eswa.2024.125593","url":null,"abstract":"<div><div>The computation of uncertainty are crucial for developing a reliable machine learning model. The natural posterior network (NatPN) provides uncertainty estimation for any single exponential family distribution, but real-world data is often complex. Therefore, we introduce a mixture exponential family posterior network (MEFDPN), which extends the prior distribution to a mixture of exponential family distributions, aiming to fit complex distributions that better represent real data. During network training, MEFDPN independently updates the posterior Bayesian estimates for each prior distribution, and the weights of these distributions are updated based on the forward propagation results. Furthermore, MEFDPN calculates two types of uncertainty (aleatoric and epistemic) and combines them using entropy weighting to obtain a comprehensive confidence measure for each data point. Theoretically, MEFDPN achieves higher prediction accuracy, and experimental results demonstrate its capability to compute high-quality data comprehensive confidence. Moreover, it shows encouraging accuracy in Out-of-Distribution(OOD) detection and validation experiments. Finally, we apply MEFDPN to a materials dataset, efficiently filtering out OOD data. This results in a significant enhancement of prediction accuracy for machine learning models. 
Specifically, removing only 5% of outlier data leads to a 2%–5% improvement in accuracy.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142571830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-28 DOI: 10.1016/j.eswa.2024.125605
In social item recommendation, effectively uncovering the implicit relationships between different items plays a crucial role in performance. However, existing social item recommendation systems construct their item graphs using a static method based on item features. Because most items, such as live streams, can hardly be characterized with a limited number of feature tags in reality, static construction methods make it hard to accurately grasp the underlying item–item relationships. To address this problem, we propose an item graph generation method based on Recommendation Feedback and Dynamic Adaptive Training (RFDAT) to achieve efficient social item recommendation. Specifically, a multi-task learning technique is leveraged to concurrently predict the item graph and the user–item interaction graph. This allows the recommendation task itself to participate directly in the dynamic construction of the item graph, which is built adaptively and iteratively from the feedback of recommendation results during training. Compared with static construction methods, this allows us to fully explore item–item relationships and item feature representations, thereby improving recommendation accuracy. Furthermore, a lightweight graph convolutional denoising and fusion method based on a Laplacian smoothing filter is employed to achieve deep interaction and fusion among multi-graph features and to effectively mitigate the influence of noise during feature learning. Finally, extensive experimental results on four public datasets show that, compared with eight state-of-the-art methods, our proposed method achieves improvements of 4.97%, 2.90%, 2.03%, and 4.82% in the important evaluation metric NDCG@10 on the Yelp, Ciao, LastFM, and Douban datasets, respectively. It also shows very competitive performance against these baselines in recommendation accuracy for cold users and recommendation rate for cold items.
{"title":"Recommendation feedback-based dynamic adaptive training for efficient social item recommendation","authors":"","doi":"10.1016/j.eswa.2024.125605","DOIUrl":"10.1016/j.eswa.2024.125605","url":null,"abstract":"<div><div>For the application of social item recommendation, how to effectively dig out the implicit relationships between different items plays a crucial role in its performance. However, existing social item recommendation systems constructed their item graphs using a static method based on item features. Considering the fact that most items, such as live streams, can hardly be characterized with limited number of feature tags in reality, the static construction methods make it hard to accurately grasp the underlying item–item relationships. To address the problem, we propose an item graph generation method based on Recommendation Feedback and Dynamic Adaptive Training (RFDAT) to achieve an efficient social item recommendation. Specifically, a multi-task learning technique is leveraged to concurrently predict the item graph and user–item interaction graph, allowing the recommendation task itself to directly participate in the dynamic construction process of the item graph, which is adaptively constructed based on feedback from recommendation results iteratively during the training procedure. Compared with the static construction methods, this allows us to fully explore item–item relationships and item feature representations, therefore improving recommendation accuracy. Furthermore, a lightweight graph convolutional denoising and fusion method based on Laplacian smoothing filter is employed to achieve deep interaction and fusion among multi-graph features, and effectively mitigate the influence of noise in the process of feature learning. 
Finally, extensive experimental results on four public datasets show that compared with eight state-of-the-art methods, our proposed method achieves improvements of 4.97%, 2.90%, 2.03%, and 4.82% in the important evaluation metric NDCG@10 on Yelp, Ciao, LastFM, and Douban datasets, respectively. It also illustrates very competitive performance against these baselines in the recommendation accuracy for cold users and the recommendation rate for cold items.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
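The improvements above are measured in NDCG@10. The following sketch computes NDCG@k with the common linear-gain, log2-discount definition; the function names are illustrative, and it assumes the input list holds the relevance of every candidate item in ranked order, so the ideal ranking can be obtained by sorting that same list. The exact evaluation protocol used in the article is not stated in the abstract.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain of the top-k items of a ranked list."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# The single relevant item is ranked first: perfect score.
perfect = ndcg_at_k([1, 0, 0])
# The single relevant item is ranked second: discounted by 1 / log2(3).
second_place = ndcg_at_k([0, 1])
```

Because of the logarithmic discount, moving a relevant item from rank 2 to rank 1 changes the score far more than moving it from rank 10 to rank 9.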
Pub Date : 2024-10-28 DOI: 10.1016/j.eswa.2024.125595
Imbalanced data classification is an important research topic in machine learning, because class imbalance strongly affects the classification performance of an algorithm. In this research direction, designing an effective sampling strategy for imbalanced data is a challenging task. Although many methods have been proposed to classify imbalanced data, the problem remains open. A method that reflects the data distribution and removes noisy samples can be expected to classify well. Therefore, this paper proposes a weighted ensemble algorithm based on differentiated sampling rates (KSDE) and applies it to credit risk assessment. KSDE first removes noisy samples using an outlier detection technique. Then, multiple balanced training subsets are generated with differentiated sampling rates and used to train submodels. These training subsets sufficiently represent the distribution of the data. Finally, the well-performing submodels are weighted and integrated to obtain the prediction result. We conducted comprehensive experiments to validate the proposed method, comparing it with 12 state-of-the-art methods on 23 datasets. KSDE outperforms the recently proposed SPE (Self-paced Ensemble) by 12.46% in terms of TPR (True Positive Rate). In addition, KSDE achieves good results on 7 credit risk datasets. The experimental results show that the proposed method is competitive in solving the imbalanced data classification problem.
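The three stages described above (noise removal, balanced subsets drawn at different sampling rates, weighted integration of the well-performing submodels) can be sketched roughly as follows. This is not the authors' implementation: the abstract does not name the outlier detector, the submodel type, or the weighting rule, so the k-nearest-neighbour distance filter, the nearest-centroid submodels, the TPR-based weights, the rate values, and all function names here are assumptions made for illustration.

```python
import math
import random


def remove_outliers(points, k=3, factor=2.0):
    """Drop points whose mean distance to their k nearest neighbours is
    more than `factor` times the average of that score (assumed rule)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    cutoff = factor * sum(scores) / len(scores)
    return [p for p, s in zip(points, scores) if s <= cutoff]


def centroid(points):
    return tuple(sum(c) / len(points) for c in zip(*points))


def train_centroid_model(majority, minority):
    """Stand-in submodel: predict 1 (minority) if closer to the minority centroid."""
    c0, c1 = centroid(majority), centroid(minority)
    return lambda x: 1 if math.dist(x, c1) < math.dist(x, c0) else 0


def true_positive_rate(model, minority):
    return sum(model(x) for x in minority) / len(minority)


def ksde_like_ensemble(majority, minority, rates=(0.6, 0.8, 1.0), min_score=0.5, seed=0):
    rng = random.Random(seed)
    # Stage 1: remove noisy samples from each class.
    majority = remove_outliers(majority)
    minority = remove_outliers(minority)
    members = []
    # Stage 2: one balanced subset per sampling rate, one submodel per subset.
    for rate in rates:
        n = max(1, round(rate * len(minority)))
        sub_min = rng.sample(minority, n)
        sub_maj = rng.sample(majority, min(n, len(majority)))
        model = train_centroid_model(sub_maj, sub_min)
        # Scored on the cleaned minority class for brevity; a held-out set is preferable.
        score = true_positive_rate(model, minority)
        if score >= min_score:  # keep only well-performing submodels
            members.append((score, model))
    total = sum(score for score, _ in members)

    # Stage 3: weighted vote of the retained submodels.
    def predict(x):
        if total == 0:
            return 0
        vote = sum(score * model(x) for score, model in members) / total
        return 1 if vote >= 0.5 else 0

    return predict
```

Calling `ksde_like_ensemble(majority, minority)` with two lists of coordinate tuples returns a `predict` function that maps a point to 1 (minority class) or 0 (majority class).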
{"title":"Weighted ensemble based on differentiated sampling rates for imbalanced classification and application to credit risk assessment","authors":"","doi":"10.1016/j.eswa.2024.125595","DOIUrl":"10.1016/j.eswa.2024.125595","url":null,"PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-28 DOI: 10.1016/j.eswa.2024.125619
Background
Deep learning-based models for atrial fibrillation (AF) detection require extensive training data, which often necessitates labor-intensive professional annotation. While data augmentation techniques have been employed to mitigate the scarcity of annotated electrocardiogram (ECG) data, specific augmentation methods tailored for recording-level ECG annotations are lacking. This gap hampers the development of robust deep learning models for AF detection.
Methods
We propose a novel strategy that combines Class Activation Map-based Slicing-Concatenation (CAM-SC) data augmentation with contrastive learning to address these challenges. First, a baseline model incorporating a global average pooling layer is trained both to classify recordings and to generate class activation maps (CAMs), which highlight indicative ECG segments. Next, each recording is sliced into indicative and non-indicative segments. These segments are then concatenated randomly, using the starting and ending Q points of QRS complexes as boundaries, with the indicative segments preserved to maintain label correctness. Finally, the augmented dataset undergoes contrastive learning to learn general representations, thereby enhancing AF detection performance.
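The slicing-concatenation step can be sketched as follows, under several assumptions that the abstract does not specify: Q-point indices and per-sample CAM scores are taken as already computed, a segment counts as indicative when its mean CAM score reaches a threshold, and a random share of the non-indicative segments is kept. The names `slice_at_q_points` and `cam_sc_augment` and the parameters `threshold` and `keep_ratio` are hypothetical.

```python
import random


def slice_at_q_points(values, q_points):
    """Cut a per-sample sequence into segments between consecutive Q points."""
    return [values[a:b] for a, b in zip(q_points[:-1], q_points[1:])]


def cam_sc_augment(signal, cam, q_points, threshold=0.5, keep_ratio=0.5, seed=None):
    """Build one augmented recording from an ECG signal and its CAM scores.

    q_points must be strictly increasing sample indices.
    """
    rng = random.Random(seed)
    signal_segments = slice_at_q_points(signal, q_points)
    cam_segments = slice_at_q_points(cam, q_points)

    indicative, other = [], []
    for segment, scores in zip(signal_segments, cam_segments):
        mean_cam = sum(scores) / len(scores)
        (indicative if mean_cam >= threshold else other).append(segment)

    # Every indicative segment is preserved so the recording-level label stays valid;
    # only the non-indicative segments are subsampled.
    n_keep = round(keep_ratio * len(other))
    chosen = indicative + rng.sample(other, n_keep)
    rng.shuffle(chosen)  # random concatenation order at Q-point boundaries
    return [sample for segment in chosen for sample in segment]
```

Because every cut falls on a Q point, each concatenated piece starts and ends at the same phase of the cardiac cycle, which is presumably why the boundaries are chosen there.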
Results
Using ResNet-101 as the baseline model, training with the augmented data yielded the highest F1-score, 0.861, on the Computing in Cardiology (CinC) Challenge 2017 dataset, a typical AF dataset with recording-level annotations. This result exceeds those reported in most previous studies.
Conclusions
This study introduces an innovative data augmentation method specifically designed for recording-level ECG annotations, significantly enhancing AF detection using deep learning models. This approach has substantial implications for future AF detection research.
{"title":"Class activation map-based slicing-concatenation and contrastive learning: A novel strategy for record-level atrial fibrillation detection","authors":"","doi":"10.1016/j.eswa.2024.125619","DOIUrl":"10.1016/j.eswa.2024.125619","url":null,"PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142552578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}