Pub Date: 2024-07-22. DOI: 10.1007/s40747-024-01563-4
Jun Long, Zhuoying Yin, Chao Liu, Wenti Huang
Prompt-tuning has emerged as a promising approach for improving the performance of classification tasks by converting them into masked language modeling problems through the insertion of text templates. Despite its considerable success, applying this approach to relation extraction is challenging. Predicting the relation, often expressed as a specific word or phrase between two entities, usually requires creating mappings from these terms to an existing lexicon and introducing extra learnable parameters. This can reduce the coherence between the pre-training and fine-tuning tasks. To address this issue, we propose a novel method for prompt-tuning in relation extraction that aims to enhance this coherence. Specifically, we avoid the need for a suitable relation word by converting the relation into relational semantic keywords, which are representative phrases that encapsulate the essence of the relation. Moreover, we employ a composite loss function that optimizes the model at both the token and relation levels. At the token level, our approach combines the masked language modeling (MLM) loss and an entity pair constraint loss for predicted tokens. At the relation level, we use both the cross-entropy loss and TransE. Extensive experimental results on four datasets demonstrate that our method significantly improves performance in relation extraction tasks, with an average improvement of approximately 1.6 F1 points over the current state-of-the-art model. The code is released at https://github.com/12138yx/TCohPrompt.
Title: TCohPrompt: task-coherent prompt-oriented fine-tuning for relation extraction. Journal: Complex & Intelligent Systems.
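The relation-level TransE term mentioned in the abstract can be illustrated with the standard TransE idea, in which a head embedding plus a relation embedding should land close to the tail embedding. The following is only a minimal sketch under that assumption: the function names, the toy vectors, and the margin value are hypothetical and are not taken from the TCohPrompt code.

```python
import math


def transe_distance(head, relation, tail):
    """L2 distance ||h + r - t||; a smaller value means the triple fits better."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def margin_ranking_loss(pos_dist, neg_dist, margin=1.0):
    """Hinge loss that wants a true triple to be closer than a corrupted one by `margin`."""
    return max(0.0, margin + pos_dist - neg_dist)


# Toy 2-d embeddings (hypothetical values, for illustration only).
head = [1.0, 0.0]
relation = [0.0, 1.0]
tail = [1.0, 1.0]        # head + relation lands exactly on this tail
wrong_tail = [4.0, 5.0]  # a corrupted tail that should score worse

pos = transe_distance(head, relation, tail)
neg = transe_distance(head, relation, wrong_tail)
loss = margin_ranking_loss(pos, neg)
print(pos, neg, loss)
```

In a composite objective such as the one the abstract describes, a term of this kind would be added to the token-level losses with some weighting; the weighting scheme is not stated in the abstract.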
Pub Date: 2024-07-22. DOI: 10.1007/s40747-024-01561-6
Fan Lin, Jianhua Li
Optical chemical structure recognition (OCSR) is a fundamental task in chemistry that aims to transform intricate chemical structure images into machine-readable formats. Current deep learning-based OCSR methods typically use image feature extractors to extract visual features and employ encoder-decoder architectures for chemical structure recognition. However, the performance of these methods is limited by their image feature extractors and by the class imbalance of elements in chemical structure representations. This paper proposes MPOCSR (multi-path optical chemical structure recognition), which introduces the multi-path Vision Transformer (MPViT) and the class-balanced (CB) loss function to address these two challenges. MPOCSR uses MPViT as an image feature extractor, combining the advantages of convolutional neural networks and Vision Transformers. This provides richer visual information for the subsequent decoding process. Furthermore, MPOCSR incorporates the CB loss function to rebalance the loss weights among different categories. For training and validation of our method, we constructed a dataset that includes both Markush and non-Markush structures. Experimental results show that MPOCSR achieves an accuracy of 90.95% on the test set, surpassing other existing methods.
Title: MPOCSR: optical chemical structure recognition based on multi-path Vision Transformer. Journal: Complex & Intelligent Systems.
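The abstract does not spell out how the CB loss rebalances categories. One widely used formulation weights each class by the inverse of its "effective number of samples", (1 - beta^n) / (1 - beta). The sketch below assumes that formulation; the function name, the `beta` default, and the example counts are illustrative assumptions rather than details of MPOCSR.

```python
def class_balanced_weights(counts, beta=0.999):
    """Per-class weights proportional to (1 - beta) / (1 - beta ** n_c).

    `counts` holds the number of training samples per class (each > 0).
    Weights are rescaled so they sum to the number of classes.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


# Hypothetical token counts: a frequent symbol and a rare one.
weights = class_balanced_weights([1000, 10])
print(weights)  # the rare class receives the larger weight
```

These weights would then multiply the per-class terms of an ordinary cross-entropy loss, so that rare element tokens contribute more to the gradient than their raw frequency would allow.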
Pub Date: 2024-07-22. DOI: 10.1007/s40747-024-01560-7
Xiaoqin Ma, Huanhuan Hu, Qinli Zhang, Yi Xu
Feature selection plays a crucial role in machine learning, as it eliminates data noise and redundancy, thereby significantly reducing computational complexity and enhancing the overall performance of the model. The challenges of feature selection for hybrid information systems stem from the difficulty of quantifying the disparities among nominal attribute values. Furthermore, most current methods are sensitive to noise. This paper introduces techniques that address these issues from the perspective of fuzzy evidence theory. First, a new distance incorporating decision attributes is defined, and then a relation between fuzzy evidence theory and fuzzy $\beta$ covering with an anti-noise mechanism is established. In this framework, two robust feature selection algorithms for hybrid data are proposed based on fuzzy belief and fuzzy plausibility. Experiments on 10 data sets of various types show that, compared with 6 other state-of-the-art algorithms, the proposed algorithms improve the anti-noise ability by at least 6% with higher average classification accuracy. The proposed algorithms therefore offer excellent anti-noise ability while maintaining good feature selection ability.
Title: Feature selection for hybrid information systems based on fuzzy $\beta$ covering and fuzzy evidence theory. Journal: Complex & Intelligent Systems.
Pub Date: 2024-07-22. DOI: 10.1007/s40747-024-01556-3
Yundi Li, Yinlong Yuan, Yun Cheng, Liang Hua
In air combat missions, unmanned combat aerial vehicles (UCAVs) must take strategic actions to establish combat advantages, enabling effective tracking and attacking of enemy UCAVs. Many reinforcement learning algorithms have been applied to air combat missions of unmanned fighter aircraft. However, most algorithms can only select policies based on the current state of both sides, which leaves them unable to track and attack effectively when the enemy performs large-angle maneuvers. Additionally, these algorithms cannot adapt to different situations, so the unmanned fighter aircraft is at a disadvantage in some cases. To solve these problems, this paper proposes a predictive air combat decision model with segmented reward allocation for air combat tracking and attacking. On the basis of the air combat environment, we propose the prediction soft actor-critic (Pre-SAC) algorithm, which combines the predicted enemy states with the states of the UCAV for model training. This enables the UCAV to predict the next move of the enemy UCAV in advance and establish a greater air combat advantage. Furthermore, by adopting a segmented reward allocation model and combining it with the Pre-SAC algorithm, we propose the segmented reward allocation soft actor-critic (Sra-SAC) algorithm, which solves the problem of UCAVs being unable to adapt to different situations. The results show that the prediction-based Sra-SAC algorithm with segmented reward allocation outperforms the traditional soft actor-critic (SAC) algorithm in terms of overall reward, travel distance, and relative advantage.
Title: Predictive air combat decision model with segmented reward allocation. Journal: Complex & Intelligent Systems.
Pub Date: 2024-07-20. DOI: 10.1007/s40747-024-01557-2
Longjie Liao, Qimin Xu, Xinyi Zhou, Xu Li, Xixiang Liu
In the field of autonomous mobile robots, sampling-based motion planning methods have demonstrated their efficiency in complex environments. Although the Rapidly-exploring Random Tree (RRT) algorithm and its variants have achieved significant success in known static environments, achieving optimal motion planning in unknown dynamic environments remains challenging. To address this issue, this paper proposes a novel motion planning algorithm, Bi-HS-RRT$^{\text{X}}$, which facilitates asymptotically optimal real-time planning in continuously changing unknown environments. The algorithm swiftly determines an initial feasible path by employing a bidirectional search. When dynamic obstacles render the planned path infeasible, the bidirectional search is promptly reactivated to reconstruct the search tree in a local area, thereby significantly reducing the planning time. Additionally, this paper adopts a hybrid heuristic sampling strategy to improve the quality of the planned path and the search efficiency. The convergence of the proposed algorithm is accelerated by merging local biased sampling around the nominal path with global heuristic sampling in a hyper-ellipsoid region. To verify the effectiveness and efficiency of the proposed algorithm in unknown dynamic environments, numerous comparative experiments with existing algorithms were conducted. The experimental results indicate that the proposed planning algorithm has significant advantages in planned path length and planning time.
Title: Bi-HS-RRT$^{\text{X}}$: an efficient sampling-based motion planning algorithm for unknown dynamic environments. Journal: Complex & Intelligent Systems.
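Global heuristic sampling in a hyper-ellipsoid region is commonly done in the style of informed sampling: once a path of cost `c_best` is known, new samples are drawn only from the ellipse whose foci are the start and goal and whose major axis has length `c_best`, since no point outside it can shorten the path. The 2-D sketch below assumes that interpretation; the function name and parameters are hypothetical and the abstract does not confirm that Bi-HS-RRT$^{\text{X}}$ uses exactly this construction.

```python
import math
import random


def sample_in_ellipse(start, goal, c_best, rng=random):
    """Draw a uniform sample from the 2-D ellipse with foci `start` and `goal`
    and major-axis length `c_best` (the cost of the current best path)."""
    c_min = math.dist(start, goal)
    if c_best < c_min:
        raise ValueError("c_best cannot be shorter than the straight-line distance")
    semi_major = c_best / 2.0
    semi_minor = math.sqrt(c_best ** 2 - c_min ** 2) / 2.0
    # Uniform point in the unit disk, then stretch it into the ellipse.
    radius = math.sqrt(rng.random())
    angle = 2.0 * math.pi * rng.random()
    x = semi_major * radius * math.cos(angle)
    y = semi_minor * radius * math.sin(angle)
    # Rotate to align with the start-goal axis and translate to the midpoint.
    phi = math.atan2(goal[1] - start[1], goal[0] - start[0])
    cx = (start[0] + goal[0]) / 2.0
    cy = (start[1] + goal[1]) / 2.0
    return (cx + x * math.cos(phi) - y * math.sin(phi),
            cy + x * math.sin(phi) + y * math.cos(phi))


point = sample_in_ellipse((0.0, 0.0), (4.0, 3.0), c_best=6.0)
print(point)
```

As `c_best` shrinks toward the straight-line distance, the ellipse collapses onto the segment between start and goal, which is what concentrates later samples where they can still improve the path.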
Pub Date: 2024-07-19. DOI: 10.1007/s40747-024-01558-1
Haoyuan Zhang
Most existing 3D action recognition work relies on the supervised learning paradigm, yet the limited availability of annotated data constrains the full potential of encoding networks. As a result, effective self-supervised pre-training strategies have been actively researched. In this paper, we explore a self-supervised learning approach for 3D action recognition and propose the Attention-guided Mask Learning (AML) scheme. Specifically, a dropping mechanism is introduced into contrastive learning to develop an Attention-guided Mask (AM) module and a mask learning strategy. The AM module leverages spatial and temporal attention to guide the masking of the corresponding features, producing the masked contrastive object. The mask learning strategy enables the model to discriminate different actions even with important features masked, which makes action representation learning more discriminative. Moreover, to alleviate the strict positive constraint that would hinder representation learning, a positive-enhanced learning strategy is used in the second training stage. Extensive experiments on the NTU-60, NTU-120, and PKU-MMD datasets show that the proposed AML scheme improves performance in self-supervised 3D action recognition, achieving state-of-the-art results.
Title: Attention-guided mask learning for self-supervised 3D action recognition. Journal: Complex & Intelligent Systems.
Pub Date: 2024-07-18. DOI: 10.1007/s40747-024-01531-y
Itilekha Podder, Tamas Fischl, Udo Bub
Micro-electro-mechanical systems (MEMS)-based sensors undergo complex production processes that inherently include high variance. To meet rigorous client demands (such as sensitivity, offset noise, robustness against vibration, etc.), products must go through comprehensive calibration and testing procedures. All sensors undergo a standardized, sequential calibration process with a predetermined number of steps, even though some may reach the correct calibration value sooner. Moreover, the traditional sequential calibration method faces challenges due to specific operating conditions resulting from manufacturing discrepancies. This not only extends the calibration duration but also introduces rigidity and inefficiency. To tackle production variance and long calibration times and to enhance efficiency, we present a novel quasi-parallelized calibration framework aided by an artificial intelligence (AI)-based solution. Our method uses a supervised tree-based regression technique and statistical measures to dynamically identify and optimize the appropriate working point for each sensor. The objective is to decrease the total calibration duration while ensuring accuracy. Our results show a 23.8% reduction in calibration time, leading to substantial cost savings in the manufacturing process. In addition, we propose an end-to-end monitoring system to accelerate the incorporation of our framework into production. This not only ensures the prompt execution of our solution but also enables the identification of process modifications or data irregularities, promoting a more agile and adaptable production process.
Title: Smart calibration and monitoring: leveraging artificial intelligence to improve MEMS-based inertial sensor calibration. Journal: Complex & Intelligent Systems.
Pub Date: 2024-07-18. DOI: 10.1007/s40747-024-01452-w
Shida Liu, Qingsheng Liu, Li Wang, Xianlong Chen
This paper presents a chaotic optimal thermodynamic evolutionary algorithm (COTEA) designed to address the integrated scheduling problems of berth allocation, ship unloader scheduling, and yard allocation at bulk cargo terminals. The proposed COTEA introduces a thermal transition crossover method that effectively circumvents local optima in the scheduling solution process. Additionally, the method combines a good point set with chaotic dynamics within an integrated initialization framework, producing a robust and exploratory initial population for the optimization algorithm. To further enhance the selection process, the paper proposes a refined parental selection protocol that uses a quantified hypervolume contribution metric to identify superior candidate solutions. After evolution, the algorithm applies a neighborhood search based on the Cauchy inverse cumulative distribution to explore and improve the solution space, significantly accelerating convergence during the scheduling solution process. The proposed method achieves multiobjective optimization, simultaneously improving the service level and reducing costs for bulk cargo terminals, which in turn boosts their competitiveness. The effectiveness of COTEA is demonstrated through extensive numerical simulations.
Title: Intelligent bulk cargo terminal scheduling based on a novel chaotic-optimal thermodynamic evolutionary algorithm. Journal: Complex & Intelligent Systems.
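A neighborhood search based on the Cauchy inverse cumulative distribution typically perturbs a solution with steps x = x0 + gamma * tan(pi * (u - 0.5)), where u is uniform on (0, 1). The heavy tails give mostly small moves with occasional long jumps. The sketch below assumes that standard formula; the function names, the `scale` value, and the real-valued encoding of a solution are illustrative assumptions, not details from the COTEA paper.

```python
import math
import random


def cauchy_inverse_cdf(u, location=0.0, scale=1.0):
    """Quantile function of the Cauchy distribution for u in (0, 1)."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    return location + scale * math.tan(math.pi * (u - 0.5))


def cauchy_neighbor(solution, scale=0.1, rng=random):
    """Return a neighbor of `solution` by adding a Cauchy-distributed step to each gene."""
    neighbor = []
    for gene in solution:
        # Clamp away from 0 and 1 so the quantile function stays finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        neighbor.append(gene + cauchy_inverse_cdf(u, scale=scale))
    return neighbor


print(cauchy_neighbor([0.2, 0.5, 0.8], rng=random.Random(1)))
```

For a discrete schedule, the perturbed values would still need to be decoded or repaired into a feasible berth, unloader, and yard assignment; that step is problem-specific and is not shown here.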
Because of their inherent limitations, switched autoregressive exogenous (SARX) models can perform poorly when modeling nonlinear hybrid dynamic systems. To address this problem, this paper develops a robust identification approach for the switched gated recurrent unit (SGRU) model. First, all submodels of the SARX model are replaced by gated recurrent unit neural networks, so the resulting SGRU model has a stronger nonlinear fitting ability than the SARX model. Second, the paper departs from the conventional Gaussian assumption for noise, opting instead for a generalized Gaussian distribution. This enables the proposed model to achieve stable prediction performance under different types of noise. Notably, no prior knowledge of the operating modes is assumed in the proposed switched model, so the expectation-maximization (EM) algorithm is used to estimate the parameters in the presence of hidden variables. Finally, two simulation experiments are performed. The effectiveness of the proposed approach is verified by comparing the nonlinear fitting ability of the SGRU model with that of the SARX model and by evaluating the prediction performance of the SGRU model under different noise distributions.
{"title":"Identification of switched gated recurrent unit neural networks with a generalized Gaussian distribution","authors":"Wentao Bai, Fan Guo, Suhang Gu, Chao Yan, Chunli Jiang, Haoyu Zhang","doi":"10.1007/s40747-024-01540-x","DOIUrl":"https://doi.org/10.1007/s40747-024-01540-x","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems"},"PeriodicalIF":5.8,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141725768","RegionNum":2,"RegionCategory":"Computer Science"}
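The SGRU abstract above replaces the Gaussian noise assumption with a generalized Gaussian distribution and uses EM to handle hidden operating modes. The sketch below is a generic illustration of those two ingredients, not the paper's algorithm. It assumes the common parameterization with location mu, scale alpha, and shape beta, where beta = 2 gives a Gaussian and beta = 1 a Laplace distribution. The function names and the E-step helper are hypothetical.

```python
import math


def gg_logpdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Log-density of the generalized Gaussian distribution.

    f(x) = beta / (2 * alpha * Gamma(1 / beta)) * exp(-(|x - mu| / alpha) ** beta)
    """
    z = abs(x - mu) / alpha
    return (math.log(beta) - math.log(2.0 * alpha)
            - math.lgamma(1.0 / beta) - z ** beta)


def mode_responsibilities(residuals, priors, alpha=1.0, beta=2.0):
    """E-step style posterior over modes for one sample.

    residuals[i] is the prediction error of submodel i on the sample and
    priors[i] is the prior probability of mode i (both assumed inputs).
    """
    scores = [math.log(p) + gg_logpdf(r, 0.0, alpha, beta)
              for r, p in zip(residuals, priors)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]
```

A smaller beta yields heavier tails, so large residuals are penalized less and outliers pull the mode posterior less strongly than under a Gaussian.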
Pub Date : 2024-07-17 DOI: 10.1007/s40747-024-01559-0
Langyue Zhao, Yiquan Wu, Yubin Yuan
Defect detection in photovoltaic (PV) cell images is challenging because the defect features are small and the background is complex. Modern detectors rely mostly on proxy learning objectives for prediction and on manual post-processing components. One-to-one set matching is a critical design choice in the DEtection TRansformer (DETR) because it provides end-to-end capability, so the detector does not need hand-crafted non-maximum suppression (NMS). To detect PV cell defects faster and more accurately, we propose the PV cell Defects DEtection Transformer (PD-DETR). To address the slow convergence caused by DETR’s direct translation of image feature maps into detection results, we design a hybrid feature module. To balance performance and computation, the image features are passed through a scoring network and a dilated convolution, respectively, to obtain the foreground fine feature and the contour high-frequency feature. The two features are then adaptively intercepted and fused. Adding high-frequency information improves the model’s capacity to detect small-scale defects against complex backgrounds. Furthermore, one-to-one set matching assigns too few positive queries to the defect targets, which results in sparse supervision of the encoder and impairs the decoder’s attention learning. Consequently, we improve detection by combining the original DETR with a one-to-many matching branch. Specifically, two Faster R-CNN detection heads are added during training. To maintain the end-to-end benefits of DETR, inference is still performed using the original one-to-one set matching. Our model achieves 64.7% AP on the PVEL-AD dataset.
{"title":"PD-DETR: towards efficient parallel hybrid matching with transformer for photovoltaic cell defects detection","authors":"Langyue Zhao, Yiquan Wu, Yubin Yuan","doi":"10.1007/s40747-024-01559-0","DOIUrl":"https://doi.org/10.1007/s40747-024-01559-0","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems"},"PeriodicalIF":5.8,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141631302","RegionNum":2,"RegionCategory":"Computer Science"}
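The PD-DETR abstract above contrasts one-to-one set matching, which gives each defect a single positive query, with a one-to-many branch that supplies more positives during training. The toy sketch below only illustrates that difference in supervision density and is not the paper's method. It scores matches by IoU alone, finds the one-to-one assignment by brute force rather than with the Hungarian algorithm, and uses an assumed IoU threshold of 0.5 for the one-to-many rule. All function names are hypothetical.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def one_to_one_match(queries, targets):
    """Map each target index to exactly one query index, maximizing total IoU.

    Brute force over assignments, so suitable only for tiny examples.
    Assumes len(queries) >= len(targets).
    """
    best, best_score = (), -1.0
    for perm in permutations(range(len(queries)), len(targets)):
        score = sum(iou(queries[q], targets[t]) for t, q in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score
    return {t: q for t, q in enumerate(best)}


def one_to_many_match(queries, targets, iou_threshold=0.5):
    """Map each target index to every query whose IoU reaches the threshold."""
    return {t: [q for q, box in enumerate(queries)
                if iou(box, target) >= iou_threshold]
            for t, target in enumerate(targets)}


targets = [(0, 0, 10, 10)]
queries = [(0, 0, 10, 10), (1, 1, 10, 10), (0, 0, 9, 9), (20, 20, 30, 30)]
print(one_to_one_match(queries, targets))   # {0: 0}
print(one_to_many_match(queries, targets))  # {0: [0, 1, 2]}
```

In this example the single defect receives one positive query under one-to-one matching and three under the one-to-many rule, which shows why an auxiliary one-to-many branch gives denser supervision during training while inference can still rely on the one-to-one result.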