Multiagent hierarchical reinforcement learning (MAHRL) has been studied as an effective means to solve intelligent decision problems in complex and large-scale environments. However, most current MAHRL algorithms follow the traditional way of using reward functions in reinforcement learning (RL), which limits their use to a single task. This study aims to design a multiagent cooperative algorithm with logic reward shaping (LRS), which sets rewards in a more flexible way and thereby allows multiple tasks to be completed effectively. LRS uses linear-time temporal logic (LTL) to express the internal logical relations among the subtasks of a complex task. It then evaluates whether the subformulas of the LTL expressions are satisfied, based on a designed reward structure. This helps agents learn to complete tasks by adhering to the LTL expressions, thus enhancing the interpretability and credibility of their decisions. To enhance coordination and cooperation among multiple agents, a value iteration technique is designed to evaluate the actions taken by each agent. Based on this evaluation, a reward function is shaped for coordination, which enables each agent to evaluate its status and complete the remaining subtasks through experiential learning. Experiments have been conducted on various types of tasks in the Minecraft World and Office World. The results demonstrate that the proposed algorithm can improve the performance of multiple agents when learning to complete multiple tasks.
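The idea of rewarding progress on LTL subformulas can be illustrated with a minimal sketch. It is not the LRS algorithm itself: it assumes a task of the restricted form F(a ∧ F(b ∧ F(c))), encoded as an ordered tuple of propositions, and the names `progress`, `shaped_reward`, the proposition labels, and the reward constants are all hypothetical choices made for illustration.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Assumption: a task F(a and F(b and F(c))) is encoded as the tuple ("a", "b", "c").
Task = Tuple[str, ...]


def progress(remaining: Task, labels: FrozenSet[str]) -> Task:
    """Drop every leading subtask whose proposition holds in the current step.

    F ("eventually") includes the present step, so several consecutive
    subtasks can be discharged by a single observation.
    """
    i = 0
    while i < len(remaining) and remaining[i] in labels:
        i += 1
    return remaining[i:]


def shaped_reward(before: Task, after: Task,
                  step_cost: float = -0.1,
                  subtask_bonus: float = 1.0,
                  done_bonus: float = 10.0) -> float:
    """Hypothetical shaping: step cost, bonus per satisfied subformula, final bonus."""
    reward = step_cost + subtask_bonus * (len(before) - len(after))
    if before and not after:
        reward += done_bonus
    return reward


def run(task: Task, trace: Sequence[FrozenSet[str]]) -> List[float]:
    """Replay a trace of observed proposition sets and collect shaped rewards."""
    rewards = []
    remaining = task
    for labels in trace:
        nxt = progress(remaining, labels)
        rewards.append(shaped_reward(remaining, nxt))
        remaining = nxt
    return rewards


# Illustrative Minecraft-like task; the proposition names are made up.
example_task = ("wood", "workbench", "axe")
example_trace = [frozenset(), frozenset({"wood"}), frozenset(),
                 frozenset({"workbench", "axe"})]
example_rewards = run(example_task, example_trace)
```

Because the remaining formula is part of what the agent tracks, the same learner can in principle be handed a different tuple to pursue a different task, which is the flexibility the abstract attributes to logic-based rewards.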
Enhancing system security under denial-of-service (DoS) attacks requires robust compensation mechanisms. However, existing compensation solutions based on model-free adaptive control are limited to constant reference signals and neglect control optimization, resulting in insufficient tracking performance under dynamic attacks. This study develops a data-driven adaptive dynamic programming (ADP) resilient control scheme for networked control systems under aperiodic DoS attacks. An ADP method with a modified performance index is proposed to derive a globally optimal controller, while a dynamic penalty factor is introduced to accelerate error convergence. Leveraging ADP technology and the latest available control increments, a compensation mechanism for time-varying reference signals is designed to reduce performance degradation. Finally, theoretical proofs ensure error convergence, and comparative simulations verify the strategy's superiority.
Sleep is essential for maintaining human health and quality of life. Analyzing physiological signals during sleep is critical in assessing sleep quality and diagnosing sleep disorders. However, manual diagnoses by clinicians are time-intensive and subjective. Although advances in deep learning have enhanced automation, these approaches remain heavily dependent on large-scale labeled datasets. This study introduces SynthSleepNet, a multimodal hybrid self-supervised learning (SSL) framework designed for analyzing polysomnography (PSG) data. SynthSleepNet integrates masked prediction and contrastive learning to leverage complementary features across multiple modalities, including electroencephalogram (EEG), electrooculography (EOG), electromyography (EMG), and electrocardiogram (ECG). This approach enables the model to learn highly expressive representations of PSG data. Furthermore, a temporal context module based on Mamba was developed to efficiently capture contextual information across signals. SynthSleepNet achieved superior performance compared to state-of-the-art methods across three downstream tasks: sleep-stage classification, apnea detection, and hypopnea detection, with accuracies of 89.89%, 99.75%, and 89.60%, respectively. The model demonstrated robust performance in a semi-SSL environment with limited labels, achieving accuracies of 87.98%, 99.37%, and 77.52% in the same tasks. These results underscore the potential of the model as a foundational tool for the comprehensive analysis of PSG data and suggest that it could set a new standard for sleep disorder monitoring and diagnostic systems. The source code is available at https://github.com/dlcjfgmlnasa/SynthSleepNet.
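One generic way to combine contrastive learning with masked prediction is to sum the two losses, as in the sketch below. It is not SynthSleepNet's actual objective: it assumes an InfoNCE-style contrastive term between embeddings of the same epoch from two modalities and a mean-squared error over masked samples, and the function names, the temperature, and the weighting factor are hypothetical.

```python
import math
from typing import List, Sequence


def _dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def _normalize(v: Sequence[float]) -> List[float]:
    norm = math.sqrt(_dot(v, v)) or 1.0  # guard against zero vectors
    return [a / norm for a in v]


def info_nce(za: Sequence[Sequence[float]], zb: Sequence[Sequence[float]],
             temperature: float = 0.1) -> float:
    """InfoNCE loss: za[k] and zb[k] embed the same epoch in two modalities."""
    za = [_normalize(v) for v in za]
    zb = [_normalize(v) for v in zb]
    total = 0.0
    for k, anchor in enumerate(za):
        logits = [_dot(anchor, other) / temperature for other in zb]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[k]
    return total / len(za)


def masked_mse(target: Sequence[float], prediction: Sequence[float],
               mask: Sequence[bool]) -> float:
    """Mean squared error computed only at positions where mask is True."""
    errors = [(t - p) ** 2 for t, p, m in zip(target, prediction, mask) if m]
    return sum(errors) / len(errors) if errors else 0.0


def hybrid_loss(za, zb, target, prediction, mask, weight: float = 1.0) -> float:
    """Hypothetical hybrid objective: contrastive term plus weighted masked term."""
    return info_nce(za, zb) + weight * masked_mse(target, prediction, mask)
```

In this toy form, the contrastive term is small when matching epochs from the two modalities have the most similar embeddings, and the masked term is small when the hidden samples are reconstructed well.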
In this study, an improved jump model is proposed for the Roesser-type 2-D Markov jump systems (MJSs). We use two independent Markov chains that propagate along the horizontal and vertical directions, respectively, to characterize the switching of system dynamics in those two directions. Compared with the conventional jump model, which uses only one Markov chain to characterize the switching of system dynamics in both directions, the newly proposed 2-D jump model shows better modeling capabilities for real-world applications with abrupt changes while inherently avoiding the mode ambiguity phenomenon. Based on the proposed jump model, we then propose a dual-mode-dependent state feedback control law to stabilize the concerned 2-D MJS. A sufficient criterion, whose feasibility is enhanced via a dual-mode-dependent Lyapunov functional technique, is obtained to ensure the asymptotic mean square stability and $H_{\infty}$ disturbance attenuation level of the resulting closed-loop system. Subsequently, resorting to a novel nonconservative separation principle, two equivalent conditions, one of which is in the form of linear matrix inequalities (LMIs), are developed. Finally, a convex optimization algorithm formulated from the obtained LMIs is proposed to design the control law. An example of the Darboux equation with Markov switching parameters is presented to validate the effectiveness of the obtained results.
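The two-chain jump model can be pictured with a small simulation. This is an illustrative sketch under stated assumptions, not the paper's formulation: the horizontal chain is assumed to be indexed by i, the vertical chain by j, both sub-states are scalars, the system is autonomous (no input or disturbance), and the names `sample_chain`, `simulate_roesser`, and all numerical values are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Matrix2 = Tuple[Tuple[float, float], Tuple[float, float]]


def sample_chain(P: Sequence[Sequence[float]], length: int,
                 rng: random.Random, start: int = 0) -> List[int]:
    """Sample a Markov chain path of the given length from transition matrix P."""
    modes = [start]
    for _ in range(length - 1):
        modes.append(rng.choices(range(len(P)), weights=P[modes[-1]])[0])
    return modes


def simulate_roesser(A: Sequence[Sequence[Matrix2]],
                     P_h: Sequence[Sequence[float]],
                     P_v: Sequence[Sequence[float]],
                     n_h: int, n_v: int, seed: int = 0):
    """Simulate a scalar-state Roesser system whose matrix is A[r(i)][s(j)].

    r is the horizontal mode path and s the vertical mode path; the two
    paths are sampled independently of each other.
    """
    rng = random.Random(seed)
    r = sample_chain(P_h, n_h, rng)
    s = sample_chain(P_v, n_v, rng)
    xh = [[0.0] * n_v for _ in range(n_h + 1)]   # horizontal state x^h(i, j)
    xv = [[0.0] * (n_v + 1) for _ in range(n_h)]  # vertical state x^v(i, j)
    for j in range(n_v):
        xh[0][j] = 1.0  # assumed boundary condition x^h(0, j)
    for i in range(n_h):
        xv[i][0] = 1.0  # assumed boundary condition x^v(i, 0)
    for i in range(n_h):
        for j in range(n_v):
            a = A[r[i]][s[j]]
            xh[i + 1][j] = a[0][0] * xh[i][j] + a[0][1] * xv[i][j]
            xv[i][j + 1] = a[1][0] * xh[i][j] + a[1][1] * xv[i][j]
    return r, s, xh, xv


# Made-up data: two modes per direction, so four system matrices in total.
A_demo = [
    [((0.5, 0.2), (0.1, 0.6)), ((0.3, 0.4), (0.2, 0.5))],
    [((0.6, 0.1), (0.3, 0.3)), ((0.2, 0.2), (0.4, 0.4))],
]
P_h_demo = [[0.9, 0.1], [0.2, 0.8]]
P_v_demo = [[0.7, 0.3], [0.4, 0.6]]
r_demo, s_demo, xh_demo, xv_demo = simulate_roesser(A_demo, P_h_demo, P_v_demo, 8, 6)
```

Indexing the dynamics by the pair (r(i), s(j)) means each grid point has exactly one active matrix, which is one way to read the abstract's remark that the two-chain model avoids mode ambiguity.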

