This study addresses the imperative need for efficient solutions in the context of the dual resource constrained flexible job shop scheduling problem with sequence-dependent setup times (DRCFJS-SDSTs). We introduce a pioneering tri-objective mixed-integer linear mathematical model tailored to this complex challenge. Our model is designed to optimize the assignment of operations to candidate multi-skilled machines and operators, with the primary goals of minimizing operators' idleness cost and sequence-dependent setup time-related expenses. Additionally, it aims to mitigate total tardiness and earliness penalties while regulating maximum machine workload. Given the NP-hard nature of the proposed DRCFJS-SDST, we employ the epsilon constraint method to derive exact optimal solutions for small-scale problems. For larger instances, we develop a modified variant of the multi-objective invasive weed optimization (MOIWO) algorithm, enhanced by a fuzzy sorting algorithm for competitive exclusion. In the absence of established benchmarks in the literature, we validate our solutions against those generated by multi-objective particle swarm optimization (MOPSO) and the non-dominated sorting genetic algorithm II (NSGA-II). Through comparative analysis, we demonstrate the superior performance of MOIWO. Specifically, compared with NSGA-II, MOIWO achieves a success rate of 90.83% and shows similar performance in 4.17% of cases; compared with MOPSO, it achieves a success rate of 84.17% and shows similar performance in 9.17% of cases. These findings contribute significantly to the advancement of scheduling optimization methodologies.
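For illustration, the sketch below shows the general epsilon-constraint idea used for the exact solutions on small instances: one objective is kept as the objective function, the remaining objectives are turned into bounded constraints, and the bound is swept to trace an approximation of the Pareto front. The toy variables, coefficients, and PuLP solver call are assumptions for demonstration only and are not the DRCFJS-SDST model itself.

```python
# Minimal sketch of the epsilon-constraint method on a toy bi-objective MILP.
# All variables, coefficients, and bounds below are illustrative assumptions,
# not taken from the DRCFJS-SDST model described in the abstract.
import pulp

def solve_with_epsilon(eps):
    prob = pulp.LpProblem("toy_epsilon_constraint", pulp.LpMinimize)
    x = pulp.LpVariable("x", lowBound=0, upBound=10, cat="Integer")
    y = pulp.LpVariable("y", lowBound=0, upBound=10, cat="Integer")
    f1 = 3 * x + 2 * y          # primary objective (kept as the objective)
    f2 = -x + 4 * y             # secondary objective (moved into a constraint)
    prob += f1                  # first expression added becomes the objective
    prob += x + y >= 6          # a toy feasibility constraint
    prob += f2 <= eps           # epsilon-constraint on the secondary objective
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return pulp.value(f1), pulp.value(f2)

# Sweeping epsilon traces an approximation of the Pareto front.
for eps in range(0, 25, 5):
    print(eps, solve_with_epsilon(eps))
```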
Recent years have witnessed automated crawlers designed to crack passwords automatically, which puts many aspects of our lives at risk. To prevent passwords from being cracked, image verification codes have been deployed to accomplish human–machine verification. It is important to note, however, that the most widely used image verification codes, especially visual reasoning Completely Automated Public Turing tests to tell Computers and Humans Apart (CAPTCHAs), are still susceptible to attacks by artificial intelligence. Taking visual reasoning CAPTCHAs as representative of image verification codes, this study introduces an enhanced approach for generating image verification codes and proposes an improved Convolutional Neural Network (CNN)-based recognition system. After we add a fully connected layer and briefly address the edge-of-stability issue, the accuracy of the improved CNN model smoothly approaches 98.40% within 50 epochs on four-digit image verification codes using a large initial learning rate of 0.01. Compared with the baseline model, it is approximately 37.82% better in accuracy without obvious curve oscillation. The improved CNN model also smoothly reaches an accuracy of 99.00% within 7500 epochs on six-character image verification codes containing digits, upper-case letters, lower-case letters, and symbols. A detailed comparison between our proposed approach and the baseline one is presented. The relationship between time consumption and the length of the seeds is analyzed theoretically. Subsequently, we derive threat assessments for visual reasoning CAPTCHAs of different lengths based on four machine learning models. Based on these threat assessments, the Kaplan-Meier (KM) curves are computed.
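As a minimal sketch of the kind of recognizer described here, the PyTorch snippet below builds a small CNN with an extra fully connected layer and one softmax head per character position for a four-digit code, trained with a large initial learning rate of 0.01. The layer sizes, the 60x160 input resolution, and the SGD optimizer are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of a CNN with one added fully connected layer for 4-digit
# verification codes (10 classes per digit). Layer sizes and the 60x160 image
# size are illustrative assumptions.
import torch
import torch.nn as nn

class CaptchaCNN(nn.Module):
    def __init__(self, num_positions=4, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 15 * 40, 256), nn.ReLU(),   # the extra FC layer
            nn.Linear(256, num_positions * num_classes),
        )
        self.num_positions = num_positions
        self.num_classes = num_classes

    def forward(self, x):
        logits = self.classifier(self.features(x))
        # One classification head per character position.
        return logits.view(-1, self.num_positions, self.num_classes)

model = CaptchaCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # large initial LR
x = torch.randn(8, 1, 60, 160)                            # dummy batch
out = model(x)                                            # shape (8, 4, 10)
```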
The video-based facial expression recognition (VFER) technique aims to categorize an input video into different kinds of emotions. It remains a challenging problem because of the gap between visual features and emotions, the difficulty of handling subtle muscle movements, and limited datasets. One effective solution is to exploit efficient features that characterize facial expressions. In general, VFER is useful in several areas such as unmanned driving, venue management, urban safety management, and contactless attendance. Recent advances in computer vision and deep learning (DL) techniques enable the design of automated VFER models. In this context, this study establishes a new Marine Predators Optimization with Deep Learning Model for Video-based Facial Expression Recognition (MPODL-VFER) technique. The presented MPODL-VFER technique mainly aims to classify different kinds of facial emotions in video. To accomplish this, it derives features using a densely connected convolutional network (DenseNet) model and employs the MPO technique for hyperparameter tuning of the DenseNet model. Finally, an Elman Neural Network (ENN) model is exploited for emotion recognition. To assess the recognition performance of the MPODL-VFER approach, a comparative study was conducted on a benchmark dataset. The comprehensive results show that the MPODL-VFER model significantly outperforms other approaches.
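A minimal sketch of such a pipeline is shown below: a DenseNet backbone extracts per-frame features, and an Elman-style recurrent layer aggregates them over time for emotion classification. The densenet121 backbone, the 7-class output, and the frame count are assumptions for illustration, and the MPO hyperparameter search is omitted.

```python
# Minimal sketch of the MPODL-VFER idea: DenseNet frame features fed to an
# Elman-style recurrent classifier. densenet121, 7 emotion classes, and the
# clip length are assumptions; the MPO hyperparameter search is not shown.
import torch
import torch.nn as nn
from torchvision import models

class DenseNetENN(nn.Module):
    def __init__(self, num_emotions=7, hidden=128):
        super().__init__()
        backbone = models.densenet121(weights=None)
        backbone.classifier = nn.Identity()      # keep the 1024-d pooled features
        self.backbone = backbone
        self.enn = nn.RNN(1024, hidden, batch_first=True)  # Elman recurrence
        self.head = nn.Linear(hidden, num_emotions)

    def forward(self, clips):                    # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        feats = self.backbone(clips.flatten(0, 1))          # (B*T, 1024)
        _, last = self.enn(feats.view(b, t, -1))            # (1, B, hidden)
        return self.head(last.squeeze(0))                   # (B, num_emotions)

model = DenseNetENN()
logits = model(torch.randn(2, 8, 3, 224, 224))  # 2 clips of 8 frames each
```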
Research on multi-biometric authentication systems using multiple biometric modalities to defend against adversarial attacks is actively being pursued. These systems authenticate users by combining two or more biometric modalities using score- or feature-level fusion. However, research on adversarial attacks and defences against each biometric modality within these authentication systems has not been actively conducted. In this study, we constructed a multi-biometric authentication system using fingerprint, palmprint, and iris information from CASIA-BIT by employing score- and feature-level fusion. We verified the system's vulnerability by deploying adversarial attacks on single and multiple biometric modalities based on the fast gradient sign method (FGSM), with epsilon values ranging from 0 to 0.5. The experimental results show that when the epsilon value is 0.5, the accuracy of the multi-biometric authentication system under adversarial attacks on the palmprint and iris information decreases from 0.995 to 0.018 and 0.003, respectively, and the F1-score decreases from 0.995 to 0.007 and 0.000, respectively, demonstrating susceptibility to adversarial attacks. In the case of fingerprint data, however, the accuracy and F1-score decreased from 0.995 to 0.731 and from 0.995 to 0.741, respectively, indicating resilience against adversarial attacks.
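The FGSM perturbation itself is a single gradient-sign step, x_adv = x + epsilon * sign(dLoss/dx). The sketch below applies it to one modality of a placeholder model; the toy model, input shape, and clipping to [0, 1] are assumptions, while the epsilon of 0.5 matches the upper end of the range explored in the study.

```python
# Minimal FGSM sketch: perturb one biometric modality (e.g. an iris image)
# in the direction of the loss gradient while leaving the others untouched.
# The model and input below are placeholders, not the study's fused system.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, label, epsilon):
    """Return an adversarial copy of x for the given (frozen) model."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), label)
    loss.backward()
    # Core FGSM step: x_adv = x + epsilon * sign(dLoss/dx), clipped to [0, 1].
    with torch.no_grad():
        x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

# Toy usage with a placeholder two-class model on 28x28 inputs.
toy_model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 2))
x = torch.rand(4, 1, 28, 28)
y = torch.randint(0, 2, (4,))
x_adv = fgsm_attack(toy_model, x, y, epsilon=0.5)
```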
Providing the human body with smooth and natural assistance through lower limb exoskeletons is crucial. However, a significant challenge is identifying various locomotion modes so that the exoskeleton can offer seamless support. In this study, we propose a locomotion mode recognition method named Convolution-enhanced Vision Transformer (Conv-ViT). This method maximizes the benefits of convolution for feature extraction and fusion, as well as the self-attention mechanism of the Transformer, to efficiently capture and handle long-term dependencies among different positions within the input sequence. By equipping the exoskeleton with inertial measurement units, we collected motion data from 27 healthy subjects and used it as input to train the Conv-ViT model. To ensure the exoskeleton's stability and safety during transitions between locomotion modes, we examined not only the five typical steady modes (walking on level ground [WL], stair ascent [SA], stair descent [SD], ramp ascent [RA], and ramp descent [RD]) but also eight locomotion transitions (WL-SA, WL-SD, WL-RA, WL-RD, SA-WL, SD-WL, RA-WL, and RD-WL). In recognizing the five steady modes and the eight transitions, the accuracy reached 98.87% and 96.74%, respectively. Compared with three popular algorithms (ViT, convolutional neural networks, and support vector machines), the proposed method achieves the best recognition performance, with highly significant differences in accuracy and F1 score. Finally, we also demonstrate the excellent generalization ability of Conv-ViT.
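As a rough sketch of a convolution-enhanced Transformer over IMU windows, the snippet below uses a 1-D convolutional front end for local feature extraction, a Transformer encoder for self-attention over the sequence, and a linear head over 13 classes (5 steady modes plus 8 transitions). The 6-channel IMU input, window length, and layer sizes are illustrative assumptions rather than the Conv-ViT architecture itself.

```python
# Minimal convolution-enhanced Transformer sketch for IMU-based locomotion
# mode recognition. Channel count, window length, and layer sizes are
# illustrative assumptions.
import torch
import torch.nn as nn

class ConvViT(nn.Module):
    def __init__(self, in_channels=6, d_model=64, num_classes=13):
        super().__init__()
        self.conv = nn.Sequential(                       # local feature extraction
            nn.Conv1d(in_channels, d_model, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # self-attention
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, x):                # x: (batch, time, channels)
        z = self.conv(x.transpose(1, 2)).transpose(1, 2)   # (batch, time, d_model)
        z = self.encoder(z).mean(dim=1)                    # pool over time
        return self.head(z)

model = ConvViT()
logits = model(torch.randn(4, 128, 6))   # 4 windows of 128 IMU samples
```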
Text classification is the process of labelling a given set of text documents with predefined classes or categories. Existing Arabic text classifiers either apply classic machine learning algorithms such as k-NN and SVM or use modern deep learning techniques. The former are assessed on small text collections and their accuracy is still subject to improvement, while the latter are efficient at classifying big data collections but show limited effectiveness in classifying small corpora with a large number of categories. This paper proposes a new approach to Arabic text classification that treats both small and large data collections while improving the classification rates of existing classifiers. We first demonstrate the ability of analogical proportions (AP) (statements of the form ‘x is to