Pub Date: 2024-11-07 | DOI: 10.1016/j.engappai.2024.109563
Shipeng Song, Bin Liu, Fei Teng, Tianrui Li
Recommendation systems are a critical application of artificial intelligence (AI), driving personalized user experiences across various platforms. Recent advances in contrastive learning-based recommendation algorithms have led to significant progress in self-supervised recommendation. A key method in this field is Bayesian Personalized Ranking (BPR), which has become a dominant approach for implicit collaborative filtering. However, false-positive and false-negative examples in implicit feedback continue to hinder accurate preference learning. In this study, we introduce an efficient self-supervised contrastive learning framework that enhances the supervisory signal by incorporating positive feature augmentation and negative label augmentation. Our theoretical analysis reveals that this approach is equivalent to maximum likelihood estimation with latent variables representing user interest centers. Additionally, we present a novel negative label augmentation technique that selects unlabeled examples based on their relative ranking positions, enabling efficient augmentation with constant time complexity. Validation on the MovieLens-100k, MovieLens-1M, Yahoo!-R3, Yelp2018, and Gowalla datasets demonstrates that our method achieves over a 5% improvement in precision compared to the widely used BPR optimization objective, while maintaining comparable runtime efficiency.
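For context, the standard BPR objective that this paper builds on maximizes the log-sigmoid of the score margin between an observed (positive) item and an unobserved (negative) item for each user. A minimal NumPy sketch of that baseline loss (variable names are illustrative, not from the paper):

```python
import numpy as np

def bpr_loss(user_emb, pos_emb, neg_emb):
    """Standard BPR loss: mean of -log sigmoid(score(u,i) - score(u,j))
    over a batch of (user, positive item, negative item) triples."""
    pos_scores = np.sum(user_emb * pos_emb, axis=1)  # <u, i>
    neg_scores = np.sum(user_emb * neg_emb, axis=1)  # <u, j>
    margin = pos_scores - neg_scores
    # numerically stable -log(sigmoid(m)) = log(1 + exp(-m))
    return np.mean(np.logaddexp(0.0, -margin))

rng = np.random.default_rng(0)
u = rng.normal(size=(4, 8))   # user embeddings
i = rng.normal(size=(4, 8))   # positive-item embeddings
j = rng.normal(size=(4, 8))   # negative-item embeddings
loss = bpr_loss(u, i, j)
```

The loss shrinks as positive items outscore negatives, which is the property the paper's feature and label augmentations are designed to strengthen.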
Title: Self-supervised contrastive learning for implicit collaborative filtering. Engineering Applications of Artificial Intelligence, vol. 139, Article 109563.
Pub Date: 2024-11-07 | DOI: 10.1016/j.engappai.2024.109586
Juan P. Martinez-Esteso, Francisco J. Castellanos, Adrian Rosello, Jorge Calvo-Zaragoza, Antonio Javier Gallego
Time is a critical factor in maritime Search And Rescue (SAR) missions, during which promptly locating survivors is paramount. Unmanned Aerial Vehicles (UAVs) are a useful tool with which to increase the success rate by rapidly identifying targets. While this task can be performed using other means, such as helicopters, the cost-effectiveness of UAVs makes them an effective choice. Moreover, these vehicles allow the easy integration of automatic systems that can be used to assist in the search process. Despite the impact of artificial intelligence on autonomous technology, there are still two major drawbacks to overcome: the need for sufficient training data to cover the wide variability of scenes that a UAV may encounter and the strong dependence of the generated models on the specific characteristics of the training samples. In this work, we address these challenges by proposing a novel approach that leverages computer-generated synthetic data alongside novel modifications to the You Only Look Once (YOLO) architecture that enhance its robustness, adaptability to new environments, and accuracy in detecting small targets. Our method introduces a new patch-sample extraction technique and task-specific data augmentation, ensuring robust performance across diverse weather conditions. The results demonstrate our proposal’s superiority, showing an average 28% relative improvement in mean Average Precision (mAP) over the best-performing state-of-the-art baseline under training conditions with sufficient real data, and a remarkable 218% improvement when real data is limited. The proposal also presents a favorable balance between efficiency, effectiveness, and resource requirements.
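The paper's exact patch-sample extraction is not specified in the abstract; the generic idea it names — cutting large aerial frames into overlapping patches so small targets keep their native resolution for the detector — can be sketched as follows (parameter names are illustrative):

```python
import numpy as np

def extract_patches(image, patch, stride):
    """Slide a square window over a large frame and collect sub-images,
    remembering each patch origin so detections can be mapped back."""
    h, w = image.shape[:2]
    patches, origins = [], []
    for y in range(0, max(h - patch, 0) + 1, stride):
        for x in range(0, max(w - patch, 0) + 1, stride):
            patches.append(image[y:y + patch, x:x + patch])
            origins.append((y, x))
    return patches, origins

frame = np.zeros((256, 512), dtype=np.uint8)   # toy aerial frame
patches, origins = extract_patches(frame, patch=128, stride=128)
```

With overlapping strides (stride < patch) a target cut at one patch border appears whole in a neighboring patch, which is why patch sampling helps small-target detection.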
Title: On the use of synthetic data for body detection in maritime search and rescue operations. Engineering Applications of Artificial Intelligence, vol. 139, Article 109586.
Pub Date: 2024-11-07 | DOI: 10.1016/j.engappai.2024.109570
Yihao Zheng, Zhuming Wang, Ke Gu, Lifang Wu, Zun Li, Ye Xiang
Existing group activity recognition methods generally use optical flow images to represent motion within videos, which often fail to capture the movements of individuals accurately. In this paper, we explore the effectiveness of additional kinds of motion information for group activity recognition. We propose a novel multi-scale MOtion-based relational reasoning framework for Group Activity Recognition (MOGAR). It combines joint motion (intra-individual level) with trajectory (individual level) and individual position (inter-individual level) to acquire a richer activity representation. Specifically, it involves two branches: the trajectory branch utilizes individuals' trajectories and positions to extract motion features at the individual and inter-individual levels, while the joint branch extracts motion features at the intra-individual level. Furthermore, gated recurrent units (GRUs) and Transformers are employed to enhance the corresponding features through gating and self-attention mechanisms. The features from the two branches are concatenated for group activity recognition.
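The gating mechanism the abstract attributes to the GRUs can be illustrated with a single GRU step in plain NumPy — the update and reset gates decide how much of the previous motion feature to keep versus overwrite. This is the textbook GRU cell, not the paper's specific network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, params):
    """One GRU step over a motion-feature sequence."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(x @ Wz + h @ Uz)            # update gate
    r = sigmoid(x @ Wr + h @ Ur)            # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)  # candidate state
    return (1.0 - z) * h + z * h_tilde      # gated interpolation

rng = np.random.default_rng(1)
d = 6
params = tuple(rng.normal(scale=0.1, size=(d, d)) for _ in range(6))
h = np.zeros((1, d))
for t in range(5):                          # a short trajectory sequence
    x = rng.normal(size=(1, d))
    h = gru_cell(x, h, params)
```

Because the candidate state passes through tanh and the gates interpolate convexly, the enhanced feature stays bounded regardless of sequence length.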
Title: Multi-scale motion-based relational reasoning for group activity recognition. Engineering Applications of Artificial Intelligence, vol. 139, Article 109570.
Pub Date: 2024-11-07 | DOI: 10.1016/j.engappai.2024.109557
Lanjun Wan, Long Fu, Changyun Li, Keqin Li
The flexible job shop scheduling problem (FJSP) is a complex optimization problem in intelligent manufacturing that plays a key role in improving productivity; it is characterized by each operation being processable on multiple machines. Most current research on the FJSP focuses on finding a higher-quality scheduling scheme in a shorter time. However, existing studies struggle to optimize the operation sequencing and machine assignment strategies simultaneously, which is critical for making optimal scheduling decisions. Therefore, a multi-agent-based graph reinforcement learning (MAGRL) method is proposed to effectively solve the FJSP. Firstly, the FJSP is modeled as two Markov decision processes (MDPs), in which an operation agent and a machine agent control the operation sequencing and machine assignment, respectively. Secondly, to effectively predict the operation sequencing and machine assignment strategies, an encoder-double-decoder architecture is designed, including an improved graph attention network (IGAT)-based encoder, an operation strategy network-based decoder, and a machine strategy network-based decoder. Thirdly, an automatic entropy adjustment multi-agent proximal policy optimization (AEA-MAPPO) algorithm is proposed for effectively training the operation and machine strategy networks to optimize the operation sequencing and machine assignment strategies simultaneously. Finally, the effectiveness of MAGRL is verified through experimental comparisons with classical scheduling rules and state-of-the-art methods for solving the FJSP.
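To make the problem concrete: in an FJSP instance each operation carries its own machine-to-processing-time options, and a classical dispatching rule (one of the baselines such learned agents are compared against) greedily picks the machine giving the earliest finish time. A self-contained toy sketch, with hypothetical data:

```python
# Toy flexible job shop: each job is an ordered list of operations,
# and each operation maps its eligible machines to processing times.
jobs = [
    [{0: 3, 1: 5}, {1: 2, 2: 4}],   # job 0: two operations
    [{0: 4, 2: 2}, {0: 3, 1: 3}],   # job 1: two operations
]

def greedy_schedule(jobs, n_machines):
    """Earliest-finish-time dispatching: assign each operation to the
    machine on which it would complete soonest, and return the makespan."""
    machine_free = [0] * n_machines   # when each machine next becomes idle
    makespan = 0
    for job in jobs:
        ready = 0                     # operations within a job run in order
        for op in job:
            m, finish = min(
                ((m, max(machine_free[m], ready) + p) for m, p in op.items()),
                key=lambda t: t[1],
            )
            machine_free[m] = finish
            ready = finish
            makespan = max(makespan, finish)
    return makespan

ms = greedy_schedule(jobs, n_machines=3)   # makespan of the greedy plan
```

Such myopic rules are fast but ignore how one assignment constrains later ones — exactly the coupling between sequencing and assignment that the two learned agents are meant to capture jointly.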
Title: An effective multi-agent-based graph reinforcement learning method for solving flexible job shop scheduling problem. Engineering Applications of Artificial Intelligence, vol. 139, Article 109557.
Pub Date: 2024-11-06 | DOI: 10.1016/j.engappai.2024.109550
Fei Wu, Zhuohang Xiang, Dengyu Xiao, Yaodong Hao, Yi Qin, Huayan Pu, Jun Luo
To address the challenge of obtaining diverse data, domain generalization (DG) methods for fault diagnosis have been developed. Domain adversarial methods are currently the most popular owing to their ability to handle data from unknown domains without requiring target domain information. However, their capacity to extract domain-irrelevant features remains limited, often resulting in accuracy below 90% in many DG scenarios. This limitation stems from their inability to fully capture global dependencies, causing feature entanglement and redundant dependencies. To address these issues, we propose a novel intelligent fault diagnosis method called Adversarial-Causal Representation Learning Networks (ACRLN), which is based on causal learning. Through a spatial-mask domain adversarial method, ACRLN significantly enhances data utilization by fully capturing the global dependencies that are often ignored by domain adversarial algorithms. At the same time, causal learning is integrated into ACRLN to further accomplish feature decoupling and the reduction of redundant dependencies. This is achieved through a channel feature orthogonality method combined with a loss function rooted in correlation analysis. Moreover, it adeptly addresses the spill-over effect often encountered in causal learning.
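The abstract names a channel feature orthogonality method for decoupling features. A common way to express such a penalty — not necessarily the paper's exact formulation — is the off-diagonal energy of the Gram matrix of L2-normalized channel features, which is zero exactly when channels are mutually orthogonal:

```python
import numpy as np

def orthogonality_penalty(features):
    """Off-diagonal energy of the Gram matrix of L2-normalized channels.
    features: (channels, dim) array; returns 0 iff channels are orthogonal."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    f = features / np.maximum(norms, 1e-12)   # guard against zero channels
    gram = f @ f.T                            # (channels, channels) cosines
    off_diag = gram - np.diag(np.diag(gram))
    return float(np.sum(off_diag ** 2))

orthogonal_channels = np.eye(4)     # fully decoupled: penalty should be 0
redundant_channels = np.ones((4, 4))  # identical channels: maximal penalty
```

Adding such a term to the training loss pushes each channel to encode information the others do not, which is the decoupling effect the method aims at.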
Title: Adversarial-Causal Representation Learning Networks for Machine fault diagnosis under unseen conditions based on vibration and acoustic signals. Engineering Applications of Artificial Intelligence, vol. 139, Article 109550.
Pub Date: 2024-11-06 | DOI: 10.1016/j.engappai.2024.109589
Yi Zhu, Ye Wang, Yun Li, Jipeng Qiang, Yunhao Yuan
Short text streams such as real-time news and search snippets have attracted a vast amount of attention and research in recent decades; their high generation velocity, feature sparsity, and high ambiguity accentuate both the importance of, and the challenges posed to, language models. However, most existing short text stream classification methods can neither automatically select relevant knowledge components for arbitrary samples nor expand knowledge internally (rather than relying on an external open knowledge base) to address the inherent limitations of short text streams. In this paper, we propose Soft Prompt-tuning with a Self-Resource Verbalizer (SPSV for short) for short text stream classification, in which soft prompting with self-resource knowledgeable expansion updates the label-word space to address evolving semantic topics in the data streams. Specifically, an automatically constructed prompt is first generated to instruct the model's prediction, optimized to address the high velocity and topic drift of short text streams. Then, in each chunk, the projection between category names and the label-word space — i.e., the verbalizer — is updated, constructed through internal knowledge expansion from the short text itself.
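A verbalizer, in prompt-tuning generally, maps each class to a set of label words and scores the class from the model's logits over those words. The toy sketch below shows only that generic projection with made-up words and logits; the paper's self-resource construction and updating are not reproduced here:

```python
import numpy as np

# Hypothetical vocabulary and verbalizer: each class owns some label words.
vocab = {"sports": 0, "match": 1, "bank": 2, "stock": 3}
verbalizer = {"sport": ["sports", "match"], "finance": ["bank", "stock"]}

def class_scores(word_logits, verbalizer, vocab):
    """Score each class as the mean logit of its label words."""
    return {
        label: float(np.mean([word_logits[vocab[w]] for w in words]))
        for label, words in verbalizer.items()
    }

logits = np.array([2.0, 1.0, -1.0, 0.0])   # toy logits over the vocab
scores = class_scores(logits, verbalizer, vocab)
```

Updating the verbalizer per chunk then amounts to revising which words belong to each class as topics drift, without touching the underlying language model.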
Title: Soft Prompt-tuning with Self-Resource Verbalizer for short text streams. Engineering Applications of Artificial Intelligence, vol. 139, Article 109589.
Pub Date: 2024-11-06 | DOI: 10.1016/j.engappai.2024.109546
Qianlong Dang, Wutao Shang, Zhengxin Huang, Shuai Yang
In the field of constrained multi-objective optimization, constructing auxiliary tasks can guide an algorithm toward an efficient search. Different forms of auxiliary tasks have their own advantages, and a reasonable combination can effectively improve an algorithm's performance. Inspired by this, a Constrained Multi-objective Optimization Evolutionary Algorithm based on Convergence and Diversity auxiliary Tasks (CMOEA-CDT) is proposed. The algorithm achieves an efficient search through simultaneous optimization of, and knowledge transfer among, the main task, a convergence auxiliary task, and a diversity auxiliary task. Specifically, the main task is to find the feasible Pareto front, and knowledge transferred from the convergence and diversity auxiliary tasks improves the algorithm's global exploration and local exploitation. In addition, the convergence auxiliary task helps the main task population traverse infeasible obstacles by ignoring constraints to achieve a global search, while the diversity auxiliary task provides local diversity to the regions around the main task population to exploit promising search regions. The convergence and diversity of the algorithm are significantly improved by knowledge transfer among the convergence auxiliary task, the diversity auxiliary task, and the main task. CMOEA-CDT is compared with five state-of-the-art constrained multi-objective evolutionary optimization algorithms on 37 benchmark problems and a disc brake engineering design problem.
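The two primitives underlying any such constrained multi-objective algorithm are the constraint-violation measure and Pareto dominance; a standard constrained-domination comparison (in the spirit of Deb's feasibility rules, not the paper's specific transfer mechanism) can be sketched as:

```python
def violation(g):
    """Total constraint violation for constraints g_i(x) <= 0:
    sum of the positive parts (0 means feasible)."""
    return sum(max(0.0, gi) for gi in g)

def dominates(f1, f2):
    """f1 Pareto-dominates f2 (minimization): no worse in every
    objective and strictly better in at least one."""
    return all(a <= b for a, b in zip(f1, f2)) and any(a < b for a, b in zip(f1, f2))

def cdp_better(sol1, sol2):
    """Constrained-domination: smaller violation wins; among equally
    feasible solutions, Pareto dominance decides."""
    v1, v2 = violation(sol1["g"]), violation(sol2["g"])
    if v1 != v2:
        return v1 < v2
    return dominates(sol1["f"], sol2["f"])

a = {"f": (1.0, 2.0), "g": (-0.5,)}   # feasible solution
b = {"f": (2.0, 3.0), "g": (0.3,)}    # infeasible solution
```

A convergence auxiliary task that ignores constraints effectively ranks by `dominates` alone, which is what lets its population cross the infeasible regions that `cdp_better` would wall off.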
Title: Constrained multi-objective optimization assisted by convergence and diversity auxiliary tasks. Engineering Applications of Artificial Intelligence, vol. 139.
Pub Date: 2024-11-06 | DOI: 10.1016/j.engappai.2024.109539
Ahed Albadin, Chadi Albitar, Michel Alsaba
In this paper, we propose a model-free method for estimating the height and the Ground Reaction Force (GRF) of the legs of mobile robots using a Long Short-Term Memory (LSTM) network. The method does not require a force sensor at each foot, and it is proven to be robust to changes that may occur in the dynamics. First, we generated a dataset to estimate the state of the legs for the non-damaged robot and for various damage situations: a disabled leg with working joint encoders, a fully disabled leg, and a removed leg. The network was tuned to obtain the highest stable R² score. Then, we studied the effect of the available sensors on the estimation results, which proved the sufficiency of using just the joint encoders and reduced the computational time by 17%. The sequence length required for estimation is also optimized to less than half of the gait period. The estimation results on a simulated hexapod robot and on a dataset recorded with a real four-legged robot proved the effectiveness and reliability of the proposed method, as the R² score reached 94% with the damaged hexapod robot and 92% with the real four-legged robot, which also demonstrated the ability of the proposed method to generalize to different types of robots.
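The R² score used to evaluate the estimator is the standard coefficient of determination, 1 − SS_res / SS_tot; a minimal reference implementation:

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.
    1.0 for a perfect fit, 0.0 for the constant mean predictor."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

perfect = r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

An R² of 94% thus means the LSTM explains all but 6% of the variance in the measured leg state that a mean predictor would leave unexplained.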
{"title":"Estimation of the legs’ state of a mobile robot based on Long Short-Term Memory network","authors":"Ahed Albadin, Chadi Albitar, Michel Alsaba","doi":"10.1016/j.engappai.2024.109539","DOIUrl":"10.1016/j.engappai.2024.109539","url":null,"abstract":"<div><div>In this paper, we propose a model-free method for estimating the height and the Ground Reaction Force (GRF) for the legs of mobile robots using the Long Short-Term Memory network (LSTM). The method does not require the presence of a force sensor at each foot, and it is proven to be robust to the changes that may occur in the dynamics. First, we generated a dataset to estimate the state of the legs for the non-damaged robot and for various types of damage situations; a disabled leg with working joints’ encoders, a fully disabled leg, and a removed leg. The network was tuned to obtain the highest stable <span><math><msup><mrow><mi>R</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span> score. Then, we studied the effect of the available sensors on the results of estimation which proved the sufficiency of using just the joint encoders which led to reducing the computational time by 17%. The sequence length required for estimation is also optimized to less than half of the gait period. 
The estimation results on a simulated hexapod robot and on a dataset recorded using a real four-legged robot proved the effectiveness and reliability of the proposed method as the <span><math><msup><mrow><mi>R</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span> score reached 94% with the damaged hexapod robot and 92% with the real four-legged robot, and that also proved the ability of our proposed method to be generalized to different types of robots.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"139 ","pages":"Article 109539"},"PeriodicalIF":7.5,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142652637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
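The abstract states that windows shorter than half a gait period of joint-encoder readings suffice as LSTM input. A minimal sketch of that sliding-window preparation step (the function name and array shapes are hypothetical, not from the paper):

```python
import numpy as np

def make_windows(encoder_log: np.ndarray, seq_len: int) -> np.ndarray:
    """Slice a (T, n_joints) joint-encoder log into overlapping windows
    of shape (T - seq_len + 1, seq_len, n_joints) for a sequence model."""
    T, n_joints = encoder_log.shape
    if seq_len > T:
        raise ValueError("sequence length exceeds log length")
    # index matrix: row i selects samples i .. i+seq_len-1
    idx = np.arange(seq_len)[None, :] + np.arange(T - seq_len + 1)[:, None]
    return encoder_log[idx]

# Example: a 3-joint leg logged for 100 samples; with a gait period of
# ~40 samples, a window of 18 stays under half a period as the paper reports.
log = np.random.default_rng(0).normal(size=(100, 3))
windows = make_windows(log, seq_len=18)
print(windows.shape)  # (83, 18, 3)
```

Each window would then be fed to the LSTM, which regresses the per-leg height and GRF for the final time step.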
Pub Date : 2024-11-06DOI: 10.1016/j.engappai.2024.109558
Yifan Duan , Xiaojie Liu , Ran Liu , Xin Li , Hongwei Li , Hongyang Li , Yanqin Sun , Yujie Zhang , Qing Lv
Traditionally, assessing the tuyere status relies on manual experience and consumes significant human resources. In the era of intelligent blast furnaces and intensified smelting, this approach struggles to meet the demands for accuracy and real-time assessment, posing challenges to the safety and efficiency of blast furnace production. Tuyere images exhibit high feature similarity, and the number of samples is often limited; simple convolution operations alone therefore struggle to discern differences across images. To address this challenge and meet the requirements of intelligent tuyere status recognition across different steel enterprises, we designed a novel deep neural network algorithm called ES-SFRNet (Enhanced Sequential: Feature Fusion and Recognition Network), building upon our prior research. The algorithm jointly models tuyere images and relevant time series data, and comprises three components: feature pre-extraction, tuyere status recognition, and generalization & robustness. The first two modules focus on feature extraction and fusion of tuyere images; leveraging edge detection information from the images, we developed a mathematical index, the Area Ratio, to serve as an auxiliary criterion for tuyere status recognition. Given the model's future scalability and multi-scenario application, the final module focuses on knowledge integration and parameter control. Test results reveal an overall accuracy of 99.3% for the ES-SFRNet algorithm, effectively capturing key parameters to facilitate on-site operations. In comparison with other mainstream object detection algorithms, our framework excels in tuyere image feature extraction and recognition and can offer broad applications to the Chinese blast furnace ironmaking industry.
{"title":"A novel anomaly detection and classification algorithm for application in tuyere images of blast furnace","authors":"Yifan Duan , Xiaojie Liu , Ran Liu , Xin Li , Hongwei Li , Hongyang Li , Yanqin Sun , Yujie Zhang , Qing Lv","doi":"10.1016/j.engappai.2024.109558","DOIUrl":"10.1016/j.engappai.2024.109558","url":null,"abstract":"<div><div>Traditional relying on manual experience to assess the tuyere status consumes significant human resources. In the era of intelligent blast furnaces and intensified smelting, this approach struggles to meet the demands for accuracy and real-time assessment, posing challenges to safety and efficiency of blast furnace production. Tuyere images exhibit high feature similarity, and the number of samples is often limited. Therefore, if a simple convolution operation is only used, it will be difficult to discern differences across various images. To address this challenge and cater to the requirements of intelligent tuyere status recognition across different steel enterprises, we designed a novel deep neural network algorithm called ES-SFRNet (Enhanced Sequential: Feature Fusion and Recognition Network), building upon our prior research. The algorithm concurrently modeled tuyere images alongside relevant time series data, comprising three components: Feature pre-extraction, Tuyere status recognition, and Generalization & Robustness. The first two modules focus on feature extraction and fusion of tuyere images, while leveraging edge detection information from the image, we developed a mathematical index <span><math><mrow><msub><mi>A</mi><mi>r</mi></msub></mrow></math></span> (Area Ratio) to serve as an auxiliary criterion for tuyere status recognition. Given the model's future scalability and multi-scenario application, the final module focuses on knowledge integration and parameter control. 
Test results reveal an overall accuracy rate of 99.3% for the ES-SFRNet algorithm, effectively capturing key parameters to facilitate on-site operations. In comparison to other mainstream object detection algorithms, our algorithm framework excels in tuyere image feature extraction and recognition, which can offer broad applications to Chinese blast furnace ironmaking industry.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"139 ","pages":"Article 109558"},"PeriodicalIF":7.5,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142652707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
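The Area Ratio index described above is built from edge-detection information; the abstract gives no formula, but one plausible reading is the fraction of the image occupied by the detected bright/edge region. A sketch under that assumption (the function and threshold are illustrative, not the paper's definition):

```python
import numpy as np

def area_ratio(edge_map: np.ndarray, threshold: float = 0.5) -> float:
    """Fraction of pixels above threshold over total pixels.
    A hypothetical stand-in for the paper's Area Ratio (A_r) criterion."""
    binary = edge_map > threshold
    return float(binary.sum()) / binary.size

# Toy example: a 10x10 image with a 4x4 bright patch -> ratio 16/100.
img = np.zeros((10, 10))
img[2:6, 2:6] = 1.0
print(area_ratio(img))  # 0.16
```

Such a scalar index is cheap to compute per frame, which is why it can act as an auxiliary criterion alongside the network's prediction rather than replacing it.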
Pub Date : 2024-11-06DOI: 10.1016/j.engappai.2024.109581
Guiqiang Hu , Yong Hu , Tao Wu , Yushu Zhang , Shuai Yuan
In this work, we investigate the problem of distributed deep learning in the Internet of Things (IoT). The proposed learning framework is built on a fog–cloud computing architecture to overcome the limitations of resource-constrained IoT end devices. Compressive Sensing (CS) is used as a lightweight encryption scheme within the framework to preserve the privacy of training data. Specifically, a chaos-based CS measurement matrix construction mechanism is applied to save storage and transmission costs. With this design, the computational overhead of the learning framework can be successfully offloaded from the IoT end devices to the fog nodes. Theoretical analysis demonstrates that our system guarantees security of the raw data against chosen-plaintext attack (CPA). Experimental and analytical results show that our privacy-preserving proposal significantly reduces communication and computation costs with only a negligible accuracy penalty (91% classification accuracy on the MNIST dataset at a compression rate of 0.5) compared to traditional non-private federated learning schemes. Notably, thanks to the chaos-based CS measurement matrix construction mechanism, the memory requirement on the end-device side can be significantly reduced. This makes our framework well suited to IoT applications in which end devices are equipped with low-spec chips.
{"title":"Lightweight distributed deep learning on compressive measurements for internet of things","authors":"Guiqiang Hu , Yong Hu , Tao Wu , Yushu Zhang , Shuai Yuan","doi":"10.1016/j.engappai.2024.109581","DOIUrl":"10.1016/j.engappai.2024.109581","url":null,"abstract":"<div><div>In this work, we investigate the problem of distributed deep learning in Internet of Things (IoT). The proposed learning framework is constructed in a fog–cloud computing architecture, so as to overcome the limitation of resource constrained IoT end device. Compressive Sensing (CS) is used as a lightweight encryption in the framework to preserve the privacy of training data. Specifically, a chaotic-based CS measurement matrix construction mechanism is applied in the system to save the storage and transmission costs. With this design, the computation overhead of the learning framework in IoT can be successfully offloaded from IoT end device to the fog nodes. Theoretical analysis demonstrates that our system can guarantee security of the raw data against chosen plaintext attack (CPA). Experimental and analysis results show that our privacy-preserving proposal can significantly reduce the communication costs and computation costs with only a negligible accuracy penalty (with classification accuracy 91% testing on MNIST dataset under compression rate 0.5) compared to traditional non-private federated learning schemes. Notably, due to the chaotic-based CS measurement matrix construction mechanism, the memory requirement of end device side can be significantly reduced. 
This makes our framework very suitable for IoT applications in which end devices are equipped with low-spec chips.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"139 ","pages":"Article 109581"},"PeriodicalIF":7.5,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142652638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
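The storage saving claimed above comes from generating the CS measurement matrix from a chaotic orbit, so devices only need to share the map parameters rather than the full matrix. A minimal sketch using the standard logistic map (the parameter choices and normalization are illustrative assumptions, not the paper's exact construction):

```python
import numpy as np

def logistic_matrix(m: int, n: int, x0: float = 0.3, mu: float = 3.99,
                    burn_in: int = 1000) -> np.ndarray:
    """Build an m x n CS measurement matrix from a logistic-map orbit
    x_{k+1} = mu * x_k * (1 - x_k). Only (x0, mu) must be shared."""
    x = x0
    for _ in range(burn_in):  # discard the transient part of the orbit
        x = mu * x * (1 - x)
    vals = np.empty(m * n)
    for i in range(m * n):
        x = mu * x * (1 - x)
        vals[i] = x
    # map orbit values from (0, 1) to zero-mean entries, scale by 1/sqrt(m)
    return (2.0 * vals - 1.0).reshape(m, n) / np.sqrt(m)

phi = logistic_matrix(32, 64)   # compress length-64 signals to 32 samples
y = phi @ np.ones(64)           # a compressed (and lightly obscured) measurement
print(phi.shape, y.shape)
```

Because the matrix is a deterministic function of the seed, the fog node can regenerate it locally, which is what removes the need to store or transmit the matrix from the end device.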